- Note that the December maintenance window has moved from 12/21 to 12/14.
- Nebula will not be affected. The L1 test stand would have had OS upgrades early Thurs. morning, but I've asked the admins to hold off until Jan. in order to provide maximum stability during the early integration exercise.
- 2017-12-22 to 2018-01-01 (inclusive) is a University holiday period. Services will be operational, and NCSA will respond to incidents based on business criticality. During the holidays, incidents should be submitted as JIRA IHS tickets, as we will not be monitoring Slack channels as closely as usual.
1 created, 1 resolved.
Created & Resolved:
Discussion of Notable Issues
Unexpected reboot of lsst-qserv-db16 (IHS-606).
Firmware upgrades were successful and all nodes were returned to service Thur. 11/30. Since then, there have been no unplanned reboots.
4 created, 3 resolved
This process primarily targets requests that can be handled with current level of effort (LOE) resources. This process is also designed to detect and redirect items to the EVMS process if they exceed LOE resources.
Changes proceed through 5 stages before being closed:
|1||Initial Assessment||Check that the submitter has stated a plausible business case and the relevant T/CAM agrees|
|2||Feasibility Assessment||Is the change well formulated, does it address a project need, and is it cost-effective?|
|3||Planning||A detailed implementation plan is created which takes into account impacts, resource needs, testing and verification.|
|4||Implementation||The plan is executed to implement the change.|
|5||Assessment||Verification of successful change & issues analysis|
|6||Closed||Document and formally close out the request.|
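The stage progression in the table above can be sketched as a small state machine. This is only an illustrative sketch: the names `Stage` and `next_step` are hypothetical, and the assumption that a request can be redirected to EVMS at any stage (rather than at one specific stage) is not stated in the process description.

```python
from enum import IntEnum


class Stage(IntEnum):
    """Stages from the table above, numbered as listed there."""
    INITIAL_ASSESSMENT = 1
    FEASIBILITY_ASSESSMENT = 2
    PLANNING = 3
    IMPLEMENTATION = 4
    ASSESSMENT = 5
    CLOSED = 6


def next_step(current: Stage, exceeds_loe: bool = False) -> str:
    """Return what happens to a request after the given stage.

    Assumption: a request found to exceed LOE resources is redirected
    to the EVMS process regardless of which stage it is in.
    """
    if exceeds_loe:
        return "redirect to EVMS"
    if current is Stage.CLOSED:
        return "done"
    # Otherwise move to the next numbered stage.
    return Stage(current + 1).name


if __name__ == "__main__":
    print(next_step(Stage.INITIAL_ASSESSMENT))
    print(next_step(Stage.PLANNING, exceeds_loe=True))
```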
Open Change Requests
|02/Nov/17||This change has been approved and is tentatively scheduled for early CY18.|
Use case and requirements are being gathered. Unknown User (pdomagala) will discuss this with the PDAC working group this Thursday.
|16/Nov/17||Currently being planned and tested. Tentatively scheduled for deployment before 22/Dec/17|
|04/Dec/17||Approved by M. Butler and completed on 08/Dec/17|
Heard on the Street This Week, but no Ticket Filed
It was suggested that per-user storage usage for each shared fileset be made available, preferably readable by any DM member.
- Several users expressed a desire to have the Intel compiler suite (icc) available on lsst-dev
- Increase ssh idle session timeout, which is currently 1 hr. (John Parejko via Slack)
- Suggestion to deploy Kubernetes on PDAC; it is assumed that this is being handled through the rolling-wave (EVMS) process
- Tools for parallel programming in batch computing environment (GNU parallel and others)
Change Process Notes
- Change process is being refined based on experience and feedback from exercising it over the past month.
Report format under development
Next PDAC meeting
Suspended until after the first of the year since Jeff is in Chile. However, I’m linked in to Chile IT.
The meeting was focused on the upcoming maintenance event.
Proposed that we install a standard Influx/Telegraf/Prometheus stack on the standard Nebula images and install a monitoring system in OpenStack to serve up the data/dashboards.
- Unknown User (pdomagala), get cell phone numbers
- Unknown User (pdomagala), contact Gregory Dubois-Felsmann and/or Unknown User (xiuqin) for approval (phone)
- Unknown User (pdomagala), start monitoring the RFC project
- Unknown User (pdomagala), interface between RFC process and LDMCR process
From last week