Upcoming Scheduled Maintenance

(All times are Project Time (Pacific))

Start

End

What is happening?

Location

Planned Activities

Systems/services that will NOT be available

2017-05-18 (06:00)

2017-05-18 (08:00)

LSST monthly maintenance

NCSA
  • Kernel upgrades and reboots in the LSST dev environment
  • Permanently unmount the old NFS home filesystem. This will complete the decommissioning of the NFS home filesystem.†
  • Install the Unbound local caching resolver software, as recommended by NCSA Security.
  • Remount the remaining NFS exports in read-only mode. Users should migrate any old data off of NFS in preparation for the final NFS decommissioning. NFS is expected to be turned off entirely on July 20, 2017.
  • Upgrade to the latest MySQL 5.5 on lsst-db.ncsa.illinois.edu.

All LSST-dev resources

2017-06-04

2017-06-05

DAQ installation

NCSA

Details to follow

TBD

2017-07-20 (06:00)

2017-07-20 (08:00)

LSST monthly maintenance

NCSA
  • Permanently unmount any remaining NFS filesystems.

All LSST-dev resources







† Users who still need to migrate data must do so before midnight on May 17th.

Previous Outages & Events

Start

End

What happened?

Location

What was affected?

Outcome

2017-04-27 (13:11)

2017-04-27 (14:20)

Unplanned

Nebula outage

glusterfs crashed due to a bug, so no instances could access their filesystems.

All instances running on Nebula

The node that systems were mounting from had to be rebooted. While waiting for the reboot, all gluster clients on other systems were upgraded. Version 3.10.1 fixes the bug. All instances with errors in their logs were restarted.

2017-04-20 (04:30)

2017-04-20 (09:30)

LSST monthly maintenance

NCSA

This event is cancelled so as not to interfere with Early Integration Activity #03 being held at NCSA April 19 & 20.

Nothing bad happened.

2017-04-17 (13:41)

2017-04-17 (13:53)

Unplanned

lsst-dev login node down

NCSA

Users unable to log in to lsst-dev.

The probable cause is that the root file system filled up due to excessive logging.

Fixed
2017-03-27 (22:00)

2017-03-29 (14:00)

Blue Waters maintenance

NCSA

Due to maintenance of cooling infrastructure at NPCF, Blue Waters will be down during this period. Cray will also use this maintenance window to perform some system updates.

Systems that will be down

  • Slurm cluster compute nodes will be powered down for the duration of the outage.

Systems that will remain up

Qserv nodes (lsst-qserv-*), SUI nodes (lsst-sui-*), and the bastion node (lsst-bastion01) should remain online during the outage.

However, if temperatures in the NPCF rise too high, we will be forced to shut these down as well. I've been told that this is a low-probability scenario and we will be given time to do graceful shutdowns. In the unlikely event that this happens, it will be communicated through the DM Slack channel and also posted here.

All systems normal

2017-03-23 (08:00)

2017-03-23 (13:00)

NCSA Nebula Outage

NCSA

Nebula will take an outage to balance and build a more stable setup for the file system. This will require pausing all instances, and Horizon will be unavailable.

Nebula is back to normal.

2017-03-16 (04:30)

2017-03-16 (09:30)

LSST monthly maintenance

NCSA

GPFS filesystems will go offline for the entire duration of the outage. Some systems may be rebooted, especially those that mount one or more of the GPFS filesystems.

2017-02-22 (14:15)

2017-02-22 (16:15)

Nebula Gluster Issues

NCSA

All Nebula instances were paused while gluster was repaired.

Nebula is available.