
Location

Browser

https://gemini.zoom.us/j/93625401560?pwd=cjdFb1ZWeGx1eVJGSVBUVmpMRUg5UT09

Meeting ID: 936 2540 1560

Password: 161803

Room System

Dial the closest IP: 162.255.36.11 (east coast) or 162.255.37.11 (west coast), then use the Zoom meeting ID 936 2540 1560 as the dialing extension. For example: 936 2540 1560@162.255.37.11 or 162.255.37.11##936 2540 1560

Password: 161803

Phone Dial-in

+1 346 248 7799 (US Toll)

+1 669 900 6833 (US Toll)

Meeting ID: 936 2540 1560

International numbers available: https://gemini.zoom.us/u/adcUNrbXzS

Time

10:15am PT

Attendees

Regrets

Discussion Items

Item | Who | Notes
Notes
Last time: Fritz, I think, but there were no notes ...
Project Updates
  • Kevin Long - sent out request for schedule status.
    • Jira updated by tomorrow please (last day of the month)
  • Summit power outage and stop work
    • Power overdraw from coating chamber - report later today or tomorrow.
    • Critical network systems were on the facility UPS (which failed for some reason).

      • CS: something was wrong; we were supposed to be behind the UPS. 
      • WO: electricians need to investigate
      • CS: We need protection for the coating chamber. If this happens again, it'll be bad. 
    • Still no alarms when the summit connection goes down (working on Opsgenie).

      • WO: need an out-of-band network. I don't know why radio wasn't used in this case. No one went on the radio and alerted them.
    • still no summit-base connection redundancy (LTE/4G, microwave, satellite?). (on the way)

      • CS: got a quote from the vendor last week. Going to use the Gemini antennae.

      • WO: haggle a bit, but get on with it because we need it in place.

    • The procedure to bring network systems up from zero is woefully out of date. (This was the unique post-COVID startup, but it's a fair point and we should work on it.)

      • Lost some hardware. It was carnage for "just a power cut"
    • Chuck - cascading failures
      • It's no longer a pure construction site.
    • Kevin - communications were poor. There was no broadcast on radio and no internet. A stop work was called by email earlier in the week, but people did not stop - (Victor) all points were well understood and a go-ahead was agreed. Email is not the way to issue a stop order (there may also be hesitance to call stop work). The phrasing was "safety hazard I think we should not proceed". Then at the power cut no stop work order was called.
      • Gessner - there was a misunderstanding of what happened (power outage and domino effect); the summit did not think of this as critical (no visible imminent danger). But it was a circuit that could have started a fire.
      • Anyone can call stop work (it does not have to be imminent danger); that should cause people to pause, even if they continue to a safe position.
    • Frossie: who is the designated authority, or on-the-ground ops manager? Chuck states the default person is himself. Chuck Gessner states it's the site manager, Eduardo, or Oscar if he is not there, but Jacques was the one who made the call on the critical lift.
      • Every time we asked, we got another answer, which wasn't confidence inspiring.
      • FE: There needs to be a captain of the ship.
    • Victor says we have an emergency list and we could ask people to work out of hours.
      • Giovanni is working on the call list
      • it has not been updated for a while
  • Veronica - needs approval authority for TER.
  • All hands Dec 9th .. confirmed (right after camera team all hands)
  • AAS -
    • Masks are not expense-able apparently
move to main
KTL: Done. Everything stabilized. Jenkins running. Instructions are out for developers. PR for dev guide is up. PR for templates for new repos. The Jenkins master node is referred to as the "manager node", but Jenkins thinks its own name is the master node, and it needs to be restarted. Will wait until we rebuild Jenkins at SLAC. Looking around for references to branches in them. Waiting for expert advice on that. Haven't heard anything yet. A couple more things could use a restart, like slackbot; they work because the URLs are forwarding, but will stop working if someone pushes a master branch in the future.
CLO update

FE: Major upgrade during Thanksgiving, but we'll need to do another one. Andes (sp?) did not come up from the power outage. Decided not to interrupt Thanksgiving dinners. Yagan (sp?) is coming up today.

CS: Andes is gone and we will never resurrect it. Convo about renaming Yagan to Andes. 

Summit

Especially any help needed (could be TCAM).

Victor asks if everything is now down for the backup link.

CS: lost leaf switches, 3 Andes nodes, and several optic units, but have spares. 

DEI


See  #inclusion and https://www.lsst.org/about/dei

Community link: https://community.lsst.org/c/eji/45

F2F
Please add topics as they come up to DM Leadership Team Face-to-Face Meeting, 2022-02-15 to 17

Level 2 Milestones


See also LSST Verification & Validation Documentation and docsteady.lsst.io.

Test Plans due in the next 45 days


Milestones due in the next 45 days


WO: I thought the EFD test plan was drafted.

KSK: I changed some wording that Sandrine needs to approve.


Risk Review



Next risk meeting is tomorrow at 14:00 - Tim Jenness - this clashes with Ops Exec; not sure if Bob will cancel for JDOR.

TJ: I have a faff (sp?) meeting at the same time 

Risk items needing review


Overdue risks (obligation date passed)


Risk mitigations due at the end of this month


Overdue risk mitigations

 


DMLT Travel & Availability this week


JDOR on Wednesday, Thursday, out brief on Friday

Should be done for Status meeting Thursday.

KT: I'm going to take some time off Thurs/Friday because I worked through Thanksgiving weekend. 

Any Other Business



TCAM Standup

as needed.

No business

Action review


WO: You get a break this week.


RFCs

Overdue


Due this week


Action Items

Confluence Quick Tasks

Description | Due date | Assignee | Task appears on
  • Frossie Economou Will recommend additional Level 3 milestones for implementation beyond just the DAX-9 Butler provenance milestone.
    15 Mar 2022 | Frossie Economou | DM Leadership Team Virtual Face-to-Face Meeting, 2022-02-15 to 17
  • Kian-Tat Lim Convene a meeting with Colin, Tim, Robert, Yusra to resolve graph generation with per-dataset quantities (likely based on Consolidated DB work).
    18 Mar 2022 | Kian-Tat Lim | DM Leadership Team Virtual Face-to-Face Meeting, 2022-02-15 to 17
  • Frossie Economou Write an initial draft in the Dev Guide for what "best effort" support means
    17 Nov 2023 | Frossie Economou | DM Leadership Team Virtual Face-to-Face Meeting - 2023-Oct-24
  • Convene a group to redo the T-12 month DRP diagram and define scope expectations Yusra AlSayyad
    30 Nov 2023 | Yusra AlSayyad | DM Leadership Team Virtual Face-to-Face Meeting - 2023-Oct-24
    11 Dec 2023 | Gregory Dubois-Felsmann | DM Leadership Team Virtual Face-to-Face Meeting - 2023-Oct-24
    02 May 2024 | Frossie Economou | DMLT Meeting - 2024-04-22
    22 May 2024 | | DMLT Meeting - 2024-04-22
  • Richard Dubois USDF part in data facilities for PSTN-017 and distributed processing?
    22 May 2024 | Richard Dubois | DMLT Meeting - 2024-04-22
    22 May 2024 | Fabio Hernandez | DMLT Meeting - 2024-04-22
  • Tim Jenness - section on middleware for PSTN-017
    22 May 2024 | Tim Jenness | DMLT Meeting - 2024-04-22
  • Cristián Silva - section on summit/data acquisition for PSTN-017
    22 May 2024 | Cristián Silva | DMLT Meeting - 2024-04-22

DMLT-relevant Jira Tickets
