Current Status

NORMAL


Report a problem

Upcoming Scheduled Maintenance

(All times are Project Time (Pacific))

Start

End

Event

Location

Description

Systems/services that will NOT be available

Status

Third Thursday of every month 06:00

Third Thursday of every month 08:00

Recurring-Monthly

Monthly lsst-dev maintenance

NCSA
  • Routine system updates.
Variable. Do not expect any lsst-dev system to be available during this period.

SCHEDULED

Every Mon. 04:00


Recurring-Weekly

Purge of GPFS /scratch partition
NCSA

Per LSST data management policies, files older than 180 days will be purged from the LSST shared (GPFS) /scratch file system.

Purge logs can be found in /gpfs/fs0/admin/purge_logs/scratch/

No outage or service disruption.

SCHEDULED

Every Tu. 08:00

Every Tu. 10:00

Recurring-Weekly

Weekly Nebula Maintenance

NCSA

Routine system updates. Computational services continue to run.

Horizon and API interfaces.

SCHEDULED

Thursday 2017-12-14 04:00

Thursday 2017-12-14 10:00

Exception to Recurring-Monthly

Monthly lsst-dev maintenance

NCSA
  • Due to holiday schedules, the December maintenance event is being moved up 1 week, from 2017-12-21 to 2017-12-14
  • Routine system updates
  • Network switch replacement
  • lsst-db server replacement
Do not expect any lsst-dev system to be available during this period.

SCHEDULED

2018-01-02 09:00

2018-01-05 17:00

Nebula

NCSA

Nebula (OpenStack) will be shut down for hardware and software maintenance from January 2nd, 2018 at 9am until January 5th, 2018 at 5pm.

All Nebula systems unavailable.

SCHEDULED
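
The monthly lsst-dev maintenance is listed above as falling on the third Thursday of each month. As a rough sketch (not an NCSA tool), that date could be computed as shown below. The function name third_thursday is hypothetical, and announced exceptions, such as the December 2017 event moved from 2017-12-21 to 2017-12-14, are not modeled.

```python
from datetime import date, timedelta

THURSDAY = 3  # date.weekday() numbering: Monday is 0, so Thursday is 3


def third_thursday(year: int, month: int) -> date:
    """Return the date of the third Thursday of the given month."""
    first = date(year, month, 1)
    # Days from the 1st of the month to the first Thursday (0 to 6).
    offset = (THURSDAY - first.weekday()) % 7
    # The third Thursday is two weeks after the first one.
    return first + timedelta(days=offset + 14)


print(third_thursday(2017, 11))  # 2017-11-16
print(third_thursday(2017, 12))  # 2017-12-21
```

The two printed dates match the regular November 2017 event and the original (pre-exception) December 2017 date given on this page.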

Previous Outages & Events

Start

End

Event

Location

Planned Activities

Systems/services that will NOT be available

Status

Tuesday 2017-11-28, 10:00

TBD

Rolling reboots of PDAC qserv nodes

NCSA
  • In order to address a spontaneous rebooting issue with some qserv nodes, firmware upgrades are being performed.
The occasional qserv node will need to be rebooted. Experience with the first couple will allow NCSA to give more precise information on the order and timing of the reboots.

COMPLETED

2017-11-20 07:00

2017-11-20 14:00

Nebula OpenStack cluster

NCSA

Nebula OpenStack cluster will be unavailable for emergency hardware maintenance. A failing RAID controller from one of the storage nodes and a network switch will be replaced.

Not all instances will be impacted. If any running Nebula instances are affected by the outage they will be shut down, then restarted again after we finish maintenance that day.

COMPLETED

Thursday 2017-11-16 06:00

Thursday 2017-11-16 10:00

Extended monthly lsst-dev maintenance

NCSA
  • Routine system updates.
  • Due to the volume of work that needs to be done, this event is being extended by 2 hrs. If systems become available before the end of the maintenance window, we will announce it here.
  • Be aware that this event will include an off-schedule purge of items in /scratch older than 180 days.
Do not expect any lsst-dev system to be available during this period.

COMPLETED

2017-10-31

NFS instability

NCSA

NFS becomes intermittently unresponsive.

~STABLE

We are guardedly optimistic that this problem has been resolved. PDAC is now utilizing native GPFS mounts.

2017-10-24 09:50

LSST GPFS outage

NCSA

All LSST nodes from NCSA 3003 (e.g., lsst-dev01/lsst-dev7) and NPCF (verify-worker, PDAC) that connect to GPFS (as GPFS or NFS) have lost their connection.

GPFS

ONLINE

Storage is working to bring GPFS back online

2017-10-21 17:15

LSST public/protected network switch is down in rack N76 at NPCF


Affected nodes cannot reach DNS, LDAP, etc., so they largely cannot communicate with other nodes. For example, there is no communication between affected verify-worker nodes and the Slurm scheduler on lsst-dev01, and none between affected qserv-db nodes and the rest of qserv.

Effectively, the whole verification cluster

RESTORED

in progress, replacement switch is on order

Workaround in progress. If all goes well, systems should be back online by late afternoon.

2017-10-19 06:00

2017-10-19 14:00

qserv-master replacement

NCSA

qserv-master will be down so that systems engineering can finish configuring the new server and transferring files. Status updates are tracked in IHS-378.

qserv-master will be down for this entire period

COMPLETE
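
Both the weekly purge and the off-schedule purge noted on this page remove files older than 180 days from /scratch. The sketch below shows one way a user might list their own files that are past that age before a purge runs. It assumes that age is measured by modification time (st_mtime); the criterion the actual purge uses is not stated here, and the function name purge_candidates and the directory passed to it are hypothetical.

```python
import os
import time

PURGE_AGE_DAYS = 180  # threshold stated on this page
SECONDS_PER_DAY = 86400


def purge_candidates(root, max_age_days=PURGE_AGE_DAYS, now=None):
    """Yield paths of files under root last modified more than max_age_days ago.

    Assumption: age is judged by modification time; the real purge may use
    a different timestamp.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    # os.walk does not follow symlinked directories by default.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat reports on a symlink itself rather than its target.
                mtime = os.lstat(path).st_mtime
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if mtime < cutoff:
                yield path


# Example use, with a directory of your own choosing:
# for path in purge_candidates("/path/to/your/scratch/dir"):
#     print(path)
```

Passing a smaller max_age_days (for example 170) would give some warning before files reach the 180-day threshold.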

Archived events


Important Project Dates

(those with asterisk* are LSSTC funded):

2017

 

April 24-28

Data Science Fellowship Program – Session 3*, Tucson, AZ

May 1 – 3

NSF Large Facilities Workshop, Baton Rouge and Livingston, LA

May 1 – 5

AURA Board and Member Representatives Annual Meeting, Tucson, AZ

May 12 - 13

LSST Detection of Optical Counterparts of Gravitational Waves*, BNL. Contact Morgan May for additional information.

May 22 – 25

Infrastructure for Time Domain Science in the Era of LSST, Tucson, AZ

May 31 - June 2

Supernovae: The LSST Revolution Workshop*, Northwestern University, Evanston, IL

June 12 – 16

Getting Ready for Doing Science with LSST Data*, IN2P3, Lyon, France

June 19 – 21

AURA Workforce and Diversity Committee (WDC), Maui, HI

July 10 - 14

DESC Meeting, Dark Energy School, and Hack Day*, jointly hosted by Stony Brook University & BNL

July 25 – 27

NSF/DOE Joint Status Review of Data Management, NCSA, IL

August 14 – 18

LSST 2017 Project & Community Workshop, Tucson, AZ

September 6 – 8

NSF/DOE Joint Status Review, Tucson, AZ

September 14 – 15

AURA Management Council for LSST (AMCL) Meeting, Tucson, AZ

October 26 – 28

Society of Women Engineers WE17 Conference, Austin, TX

Get your LSST gear at our storefront: https://business.landsend.com/store/lsst/ 



