Logistics

Date

4-6 June 2019

This meeting will be preceded by a DM-SST meeting on 3 June.

The meeting will finish by lunchtime on 6 June; feel free to arrange travel for that afternoon.

Location

BlueJeans

Slides

Social Events

Participants

Apologies


Agenda

Day 1: 2019-06-04

Time (Local!) | Topic | Chair | Discussion Topics | Notes and Action Items
09:00 | Welcome | Wil O'Mullane
  • Confirm agenda.
  • Review action items.
  • Slides
  • The next DMLT face-to-face meeting will take place in the week of 28 October at SLAC. It remains to be determined (later in this meeting) whether we have an SST meeting on the Monday.
  • Discussion of the (uncertainty about) requirements around L1PublicT. This is becoming an urgent issue since it plays into the construction plans for community brokers.
  • Kian-Tat Lim — confirm Chuck Claver's agreement with DMTN-111.  
  • Leanne Guy — create a Jira ticket to collect obs_lsst/YAMLCamera plans from Robert Lupton.
  • Leanne Guy & Colin Slater — decide whether there will be an SST meeting on 28 October 2019.
  • Wil O'Mullane — submit an LCR setting a lower limit on L1PublicT.  
  • Unknown User (mbutler) — follow up on sizing model issues for PVIs following tickets triggered by RFC-325.  
09:15 | Preparation for this summer's reviews
  • Review document pack; assign actions to complete documents where necessary.
  • Review meeting agenda.
  • Decide on contents of DM sessions; assign actions to prepare.
    • Should this include a demo? If so, of what?
  • Who should be there?


The presentation outline for the JSR is here: https://github.com/lsst/wom_presentations/tree/master/STATUS_REVIEW_2019


  • Agreed that there will not be an attempt to rewrite LDM-564 to reflect real releases; instead, it (and LDM-503) will be rebaselined in the current format with updates from the current PMCS.
  • Unknown User (mbutler) will be proposing a new sizing model to DM management next week.
  • Worried that LDM-522 does not present an SDQA plan as-is. Agreed to present a modest update to that, together with DMTN-085 and a discussion of QA priorities in early Ops.
  • “Verification Elements Baseline Documents” (which effectively track changes to the VEs) are not required for the review.
  • Agreed not to try and rush RFC-600 (alert requirements update) through an LCR before the review.
  • 2 slides per third level of WBS in the DM breakout. Wil will circulate a template in advance of the meeting.
  • Probably no demo at the JDR, but we don't have an agenda yet.
  • Demo: running HSC data processing, having data appear in the LSP. That means that Fritz Mueller (at least; other members of his team welcome) should likely be at the JSR.
  • Not clear if there will be an EV surveillance at the JSR.
10:30 | Break (Refreshments Provided)
11:00 | Middleware Demo
  • Demo of Oracle Registry & large scale test run
  • slides
  • There is no demo, because looking at a large-scale database is not really demo-able.
  • Instead, summary presentations by Jim Bosch and Michelle Gower.
  • Multi-registry functionality makes it possible for users to combine their own data with the data backbone.
  • In the short term at least, it is likely that writing directly to HTCondor is a better use of developer time than trying to interface with Pegasus. We may revisit that in some time.
  • There is currently no capability to restart processing in the event of failures in Gen3.
11:30 | Middleware Planning
  • Timeline & milestones for transition to the Butler Gen 3 / PipelineTask middleware.
  • slides
  • Expect DM developers outside the existing middleware team to start engaging with the “Gen3 porting” at the PCW; no requirement that they begin before that.
  • Expect “DMLT checkpoints” showing status relative to Fritz Mueller's slides at the PCW, and potentially another in ~October.
  • Previously-announced plans for a PipelineTask UI review are on hold for now; the Middleware Team is not convinced they would learn much from such a review. Should aim to solidify plans and set expectations on this by (say) the PCW.
  • Not yet clear if there will be a “user-batch” service that is separate from the batch production service. BaBar experience suggests this is a risky idea.
12:00 | Science Verification & Validation
  • How we plan to move forwards with requirements verification & metrics based on LSE-61, LDM-639 and the Jira LVV project. Who will do what (e.g., new hires, overlap with commissioning efforts).
  • Slides
  • Monet is not relocating to Tucson.
  • Planning an LSST-wide validation effort, rather than something DM specific. A document describing this will be forthcoming, produced in conjunction with Chuck Claver.
  • There is no LDM-240-equivalent “glide path” on metrics; we think this is probably not a problem for science numbers, but we should be careful to ensure we are working towards realistic performance numbers.
12:30 | Lunch (Provided)
13:30 | Networks, Summit and Base Data Center Status and Planning
  • Brief overview of Summit, Summit - Base, and Base - LDF Network status and schedule
  • Brief overview of Summit and BDC construction status and move-in schedule
  • Update on planned deployments of DM equipment to BDC, visits, tests/rehearsals in remainder of FY19 (and IT support required)
  • Summit servers this summer:
    • NFS & NCSA security systems, in anticipation of AuxTel installation.
  • Base:
    • First forwarders coming from NCSA.
    • Storage devices.
  • Base cabling is delayed by:
    • Access & power to the facility.
    • People from other subsystems moving into the base offices.
    • Having been shipped the wrong fiber drops.
  • When will the Commissioning Cluster at the Base be available?
    • Should be up in February.
    • Hope to have this available to support LSP access for AuxTel/ComCam.
  • “Campus” networking goes over AURA/REUNA circuits to Florida, then Internet2 to NCSA.
    • i.e., not over the “commodity internet”.
    • However, it would be possible to arrange one-off experiments for higher bandwidth in conjunction with the networking team (starting in October, when we have 2 x 20 Gbps Florida - Chicago on ESnet).
14:30 | Summary and outcomes of the LSP Final Design Review
  • The outcomes of the LSP FDR.
  • This presentation will either become or form the basis for input to the JSR.
  • Presentation (pdf)
  • Annotated report (pdf)
  • Recommendations (xlsx)
  • Discussion of expectation setting around “lsp-stable”: it is still under development, and stability cannot be guaranteed.
    • No member of the team should currently have to worry about uptime metrics rather than ongoing development.
  • Should look for “in-kind contributions” to include compute support for the users they bring with them.
  • Should update predicted compute requirements based on the assumption that Dask is a widely-used resource.
    • Although some of the Dask usage may be driven by the fact that processed LSST data is not yet available in Qserv.
    • Expecting to have an HSC dataset ingested and available for testing this early in FY20.
    • Need to define Python interface to LSST data in the LSP; at least pyvo as a minimum.
  • As of next week, 30% of Gregory's time will be devoted to SPHEREx.
15:00 | Break (Refreshments Provided)
15:30 | AuxTel data transfers, ingestion and access via LSP
  • By request of the DM-SST, the aim of this session is to clarify the workflow for people (notably Robert Lupton) involved in analysing AuxTel (or other) data during commissioning & I&T.
  • In particular: with what latency is this data made available for analysis at NCSA? By what mechanism?
  • Bandwidth from Tucson may have improved since a new router was installed there about a week ago; this has not been confirmed from the NCSA side.
  • Worst case from data being taken to being available at LDF: 10 minute wait for rsync to start, ~73 second transfer, ~15 minute ingest (so, 25–30 minutes).
  • rsync is only intended to be a temporary mechanism to support this until the DBB is available.
  • Header formats are still changing rapidly; cannot count on LSE-400-style headers yet. This is causing problems with ingest. Hope that will converge in ~a month.
    • Re-ingesting old data to upgrade/improve metadata based on software upgrades will only be handled by request; it will not be a standard service.
    • Failures to ingest should be reported so that we can track down what happened, but it is not an NCSA responsibility to ensure all failures are resolved.
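
The worst-case latency quoted above can be checked by summing the three stages. The following is a minimal sketch in Python using only the figures from these notes; the constant and function names are illustrative and do not correspond to any actual transfer or ingest tooling.

```python
from datetime import timedelta

# Worst-case stage durations as quoted in the notes above.
# The names are hypothetical, chosen for illustration only.
RSYNC_WAIT = timedelta(minutes=10)   # wait for the next rsync to start
TRANSFER = timedelta(seconds=73)     # approximate transfer time
INGEST = timedelta(minutes=15)       # approximate ingest time


def worst_case_latency(*stages: timedelta) -> timedelta:
    """Return the total latency as the sum of sequential stage durations."""
    return sum(stages, timedelta())


total = worst_case_latency(RSYNC_WAIT, TRANSFER, INGEST)
print(f"Worst case: {total.total_seconds() / 60:.1f} minutes")
```

The sum comes to a little over 26 minutes, which is consistent with the 25–30 minute range recorded above.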
16:00

Creating Diverse STEM Workplaces through Intersectionality, Intentionality, & Inclusion

Heather Metcalf
Chief Research Officer
Association for Women in Science


17:00 | Close

Day 2: 2019-06-05

09:00 | Status of requirements flow down, doc tree, product tree
  • An overview of the current status of, and future plans for, the product & document trees, together with an understanding of how requirements flow between them. In particular, this should address questions like:
    • What is the plan for finishing/adopting DMTN-104?
    • Will the document tree be linked to the product tree (e.g., with one requirements doc per product)?
    • Are documents like LDM-602 obsolete, on the basis that there is not an “Alert Production” product? If so, when will new documents be created and where should requirements live?
  • Given that this may be a work in progress, an understanding of goals and timelines is the aim.
  • Should remove the “coming in 2018” labels from the DocTree!
  • While we are happy with the idea that not every component requires a detailed requirement flowdown, we note that product owners for subsidiary components must understand how their work is constrained by higher-level requirements.
  • We note the product-to-requirement mapping is shown in LDM-148.
  • We do require test cases explicitly for each product, even if there is no separate set of flowed-down requirements for that product.
  • We draw a distinction between a “product”, which is a major component which will undergo its own release process, and a “component”, which (in software terms) maps to a single repository, but may not be released independently.
  • We expect design documentation to live separately from the component repository in general (but adding it to the repository may be appropriate on occasion).
  • Kian-Tat Lim — Reconciliation of LDM-148 naming with the product tree.   
  • Kian-Tat Lim — Map products tree to design and requirements documents.   This is now DM-20832.
  • Kian-Tat Lim  — Reorganization of design documents, per DMLT F2F discussion.  
  • Kian-Tat Lim — discuss with Jonathan Sick where software design documentation should live as part of package docs, if provided.  
09:45 | Release Process & Policy (pdf)
  • Summarise the conclusions of LDM-672 and DMTN-106.
  • Outline plans for short-term and longer-term changes to the DM release process.
  • Discussion about which of our existing repositories are well-formed products. Is daf_butler? daf_base? afw? Balancing the desire to reduce the number of products with wanting to make fixes in one repository without forcing releases of many apparently unrelated packages. The answer seems to be that the product tree needs to be designed “intelligently”.
  • There are various concerns about the definitions that Unknown User (gcomoretto) proposes, which may be best addressed in a focused technical follow-up session or breakout meeting.
  • The Commissioning Team is part of the “non-operational” stakeholders per Unknown User (gcomoretto)'s definitions. However, we note that Commissioning will require stable services: arguably, they will require different policies applied to (say) Pipelines (which require quick fixes) and services.
  • At least ~50 of our existing repositories are updated at least every few months.
  • SQuaRE requests that each team nominate a “release engineer” who can help resolve build problems.
10:30 | Photo outside
10:35 | Break (Refreshments provided)
11:00 | QAWG Recommendations
11:45 | LSST Data Framework | Wil O'Mullane
  • What does the recently-announced data framework mean for DM?
  • What sort of in-kind contributions could enhance or add to the expected capabilities of the DM system? We should prepare a list to feed into future discussions.
    • E.g. crowded fields, ...
  • “Contributions that will offset NSF or DOE costs for operations” is the key line.
  • There are things we need in operations which could make useful contributions.
    • E.g. Portal.
  • Worry that often external contributions “mostly work”.
  • We will still need the on-project team to do the core work.
    • One could imagine grey areas around the edges, like QA.
  • New algorithms which expand the scientific output of the survey would be welcome, assuming they don't increase our costs.
  • There is some ambiguity about whether in-kind contributions can apply to construction (per “talking points” memo page 1) or only to operations (per page 2).
    • Blum suggests we would consider innovative contributions to construction, but it's hard to see what that would be.
  • A new data rights policy document will be forthcoming; Bob Blum has authorized the release of a draft to DMLT members.
  • Bob tells us that there will be a process within Ops to identify things which could make useful contributions; the DM list can feed into this. This list will then be audited for risks, and presented to NSF and DOE.
  • Bob says: forget about the LSST Resource Board; it will not approve in-kind contributions.
  • We note that the DM system will need to be technically capable of providing the access specified in the data rights policy.
  • There is scope for both “offset” and “added-value” (upscopes) from the in-kind contributions.
  • Wil O'Mullane — create and circulate a Google Document for DMLT members to suggest possible in-kind contributions.  
12:15 | Update from LSST@Asia
  • The LSST@Asia conference was held in Sydney, Australia at the University of New South Wales from 20-23 May 2019.
  • The project was represented by Tony Tyson, Leanne Guy, Robert Blum and Robert Lupton.
  • Tony presented the project status; Leanne, the LSST science drivers and data products; Bob, the new framework for access to LSST data by the international community; and Robert Lupton, HSC reprocessing using the LSST science pipelines.
  • Robert and Leanne jointly ran a session demoing the LSST Science Platform.
  • It was made clear that going forwards, international access to LSST data would be in the form of ‘in-kind’ contributions and that it would not be possible to pay for data access rights.
  • Most of the discussion focused around the new data access framework and how international communities could get involved, both now during the construction project and in operations.
  • Regional scientists were most interested to understand what would count as an ‘in-kind’ contribution (e.g., telescope time for follow-up), and how in-kind contributions would translate to FTE.
  • Some concern was expressed about what would happen to people with data rights now but for whom no in-kind contribution could be identified in operations.
  • A lot of the regional research interests are focused around galaxy studies and cosmology. Many of the talks from regional scientists focused on synergies with LSST and on what they can contribute, including follow-up ideas and contributions to commissioning. Many have joined science collaborations already.

  • David Trilling of the SSSC pointed out that, as the Solar System data products will be sent to the MPC, effectively all Solar System data is public.
  • European LSST colleagues shared their experiences of building an LSST community outside of the USA; there is enormous interest in the Asia-Pacific region in building a regional LSST community. The community has been named LSST@A^3 (Asia, Australia, Africa).

  • The next LSST@A^3 meeting will take place at Peking University (Kavli), led by Hu Zhan.

  • We note that not all in-kind contributions would necessarily directly integrate with LSST: other contributions of capabilities to the US astronomical community might also be possible.
  • We note that the current baseline does not send all solar system data to the MPC.
  • Date for the next meeting has not yet been set (may be in either one or two years).
12:30 | Lunch (Provided)
13:30 | Science Pipelines project planning
  • Discuss the approach being taken to planning and scheduling Pipelines work during F19.
  • Slides.

13:45 | DM-SST project planning
14:00 | Risks
  • Updates from LSST Project Management about how we ought to be using the risk register.
  • Assign actions (with deadline end of June) to specific individuals to ensure that all mitigations have anticipated completion dates.
  • We can add an “ops” tag to risks; Wil O'Mullane will then discuss with Victor whether they should be moved to the Ops project.
  • Every month, the assignee on the risk should review and (if applicable) adjust exposures.
  • All handling actions must have a date associated by the end of this month.
  • John Swinbank — set the assignee on all unassigned RM-Handling tickets in the DM component to the assignee of the associated risk.  
  • Everybody with RM-Handling tickets assigned — ensure your RM-Handling tickets have “anticipated completion dates” set.  
  • Wil O'Mullane — set up a monthly DM risk review as part of the regular DMLT call.  
14:30 | Clear the milestone backlog
  • In the April monthly report there are over 40 DM milestones listed as being delayed.
  • We will review this list, confirm the status of each milestone and what must be done to accomplish it (or to drop it from the plan), and assign action items to substantially reduce this list ahead of this summer's reviews.

Keep DM-SUIT-5; should be easy when SDMization arrives.

Keep DM-SUIT-8.


15:30 | Break (Refreshments Provided)
16:00 | Clear the milestone backlog (continued)
  • Continuation of the earlier session, if necessary.

17:00 | Close

Day 3: 2019-06-06

09:00 | Plans for the F19 cycle
  • We note that the plan is currently not to use Pegasus as part of the Batch Production Service, but this does not preclude its use elsewhere in the project.
  • The LSP team and SST may set requirements for workflow systems which are not covered by the PipelineTask/HTCondor system.
  • We draw everybody's attention to the Project-level monthly report, of which many people are not well aware.
  • Wil O'Mullane — ensure that the Project monthly report is forwarded to the DMLT.  
  • John Swinbank — add a standing review-milestone-changes-in-the-monthly-report item to the DMLT calls.  
10:30 | Break (Refreshments Provided)
11:00 | Further discussion of the F19 cycle
  • Addressing emergent issues.

12:00 | Review action items & plans for future meeting
  • Next meeting at SLAC, 28-30 October.
    • DMLT Monday through Wednesday lunchtime.
    • DM-SST half-day Wednesday afternoon.
  • Meetings in 2020:
    • Virtual F2F, 24-27 February 2020
    • Seattle, 11-14 May 2020
    • Tucson, 9–12 November 2020
  • Consider an F2F in La Serena in 2021.
  • John Swinbank — create Confluence page for October 2019 DMLT F2F at SLAC.
12:30 | Close

Pre-Meeting Planning

Suggested topics for discussion


Topic | Requested by | Time required (estimate) | Notes
QAWG recommendations | 45 mins | Review progress on QAWG action items as discussed at previous DMLT vF2F.
Butler Gen3
Demo of Oracle Registry and larger scale test run
DMLTf2f 2020
Fix dates for 2020 meetings and decide if some could be virtual
JSR

Unfortunately we need to prepare for JSR and directors review.

  • Demo was well received last year, so shall we demo butler/pegasus/condor?
    • LPG: I like this idea!
  • first Verification Control Doc draft .. ??  We have some test results in ...
  • Otherwise status slides 
  • Try to cover with locals + NCSA reps again - who for Butler Demo if we do that? 
  • Summary of the community broker workshop, conclusions, path forward?
  • Summary of the LSP review (a JSR recommendation), conclusions, path forward. 
Release Process & Policy | 1 hour? Depends how much discussion can converge beforehand. | We have draft documents covering various aspects of the DM release process, but nothing has converged to the extent that we can actually start using it yet. Depending on how far we get before the meeting, the discussion should either be “how do we move forward with drafting DMTN-106 & LDM-672?”, or it should be “does the DMLT agree with those documents?”.
Networks, Summit and Base Data Center Status and Planning | 1 hour
  • Brief overview of Summit, Summit - Base, and Base - LDF Network status and schedule
  • Brief overview of Summit and BDC construction status and move-in schedule
  • Update on planned deployments of DM equipment to BDC, visits, tests/rehearsals in remainder of FY19 (and IT support required)
Changes to DM Science | 15 mins | I will present changes in and to the work processes of the DM Science team.
Summary and outcomes of the LSP Final Design Review | 30 mins | We will present the outcomes of the LSP-FDR. This presentation will either become or form the basis for input to the JSR (Gregory will give this presentation).
AuxTel data transfers, ingestion and access via LSP

John Swinbank (on behalf of Robert Lupton, Leanne Guy and the DM-SST)

30 mins

Following discussion at the 2019-05-10 DM SST meeting, the aim of this session is to clarify the workflow for people (notably RHL) involved in analysing AuxTel (or other) data during commissioning & I&T. In particular: with what latency is this data made available for analysis at NCSA? By what mechanism?

Question: how much of this is addressed in DMTN-111?

Status of requirements flow down, doc tree, product tree

John Swinbank (in discussion with Eric Bellm & Leanne Guy)

30 mins

We request that the Systems Engineering / Architecture team provide an overview of the current status of, and future plans for, the product & document trees, together with an understanding of how requirements flow between them. In particular, this should address questions like:

  • What is the plan for finishing/adopting DMTN-106?
  • Will the document tree be linked to the product tree (e.g., with one requirements doc per product)?
  • Are documents like LDM-602 obsolete, on the basis that there is not an “Alert Production” product? If so, when will new documents be created and where should requirements live?

We understand that this is a work in progress, so a presentation of vision & timeline is fine!

Science Pipelines development plans | 30 mins | Pipelines are trying a more “agile” approach to development this cycle. We'll discuss briefly the aims, implementation and implications.
Update from LSST@Asia | 5-10 mins | If there is interest. This might also include updates about ODF, if we get any before this meeting.
Science Verification & Validation | 20-30 mins | I will present how we plan to move forwards with requirements verification & metrics based on LSE-61, LDM-639 and the Jira LVV project. Who will do what (e.g., new hires, overlap with commissioning efforts).


Attached Documents

Action Item Summary