Date & Time 

  11:00 PDT

Location

Browser

https://bluejeans.com/103664856

Room System

  1. Dial: 199.48.152.152 or bjn.vc
  2. Enter Meeting ID: 103664856 -or- use the pairing code

Phone Dial-in

Dial-in numbers:

  • +1 408 740 7256
  • +1 888 240 2560 (US Toll Free)
  • +1 408 317 9253 (Alternate Number)

Meeting ID: 103664856

Attendees

Regrets

Discussion items

Item | Who | Notes | Conclusions and Action Items
Project/Science Updates

Scarlet workshop in Naples, 7-9 Oct

  • Robert Lupton, Leanne Guy and Peter Melchior were invited speakers. Leanne gave an overview of data management and the science platform, and Robert spoke about the science pipelines and crowded fields. For most of the rest of the week, Peter ran a tutorial on using Scarlet (independently of the LSST stack).
  • Their interests are in crowded stellar fields. Scarlet is one part of this problem, not the whole solution. We could have run this on the LSP, using the Stack.
  • It was stressed that LSST is currently evaluating Scarlet as a deblender and that no decision has yet been made.
  • The Italian community is very keen to maintain and grow their existing MOU with LSST into operations, beyond the current 15 PIs, and is looking for ways to get involved.
  • For us, it was good to get an idea of the interests of the non-DESC community.



Scientific datasets 

DM-15448

Latest version of document: https://dmtn-091.lsst.io/v/DM-15448/index.html

  • What is the motivation for having both a CI and a SMALL dataset? There seems to be a lot of overlap. Do we need something on that intermediate scale, or do we just need CI/MEDIUM/LARGE?
  • The vision is that SMALL is still something that can be run on a single developer machine in a few hrs. MEDIUM is the next scale, requiring a cluster somewhere.
  • Having a dataset that can be run a few times a day while developing (SMALL) is useful, rather than having to wait overnight for results (MEDIUM).
  • Consensus that a bigger MEDIUM, rather than removing SMALL, would differentiate them better and be useful.
  • Is CI_HSC in fact 'SMALL' by this definition? CI_HSC is 8 GB (~30 CCDs from ~12 visits), so yes in that sense, but it currently takes ~8 hrs to run (not SMALL). Most of this is thought to be due to Jenkins processes. On a few cores on most machines it takes 45 mins to an hour, with most of the processing done in ~20 mins in this case. This needs to be profiled to understand; perhaps there are some inefficiencies in I/O. Reported to be faster under Gen3 (30 mins on 8 cores) but no detailed timings have been done.
  • The technote does not address computational performance monitoring, only algorithmic/scientific performance monitoring. Even though we are not running testing in the context of an orchestration workflow on a known hardware configuration, it is nonetheless useful to know and track how long it takes to process these datasets. We should recommend doing that and add it to the document, especially for MEDIUM and LARGE datasets. It would also be useful to know if a SMALL dataset suddenly starts taking twice as long to run.

  • What is the tradeoff between individual developers knowing they should run a CI/SMALL dataset regularly to check they didn’t break something algorithmically, and a regular CI that goes through SQUASH? Is that sufficient to catch regressions? AP team looks weekly, is that sufficient?

  • Agreement on the following for running datasets on Jenkins:
    • CI level is required for a merge.

    • SMALL is at developer discretion, with the understanding that they will fix any breaks.

    • MEDIUM/LARGE for algorithmic or larger scientific changes, but only when a change in output is expected, e.g. don’t run on HSC-RC2 unless an algorithmic change might be expected to produce a different output.

  • This means that the current CI_HSC is 'SMALL' both in runtime and in usage. 
  • Could maybe make CI_HSC 20-50% smaller but need to maintain a sufficiently interesting dataset for testing (area, depth, # epochs/patch)
  • The MEDIUM dataset definition is satisfied by HSC-RC2. Lauren has put a lot of effort into defining this. Details are in the TN. This will be the main dataset that gives a balance between 'scientifically interesting' and 'does not take too long to run'.
  • A key point about this dataset is that it is not representative: Lauren intentionally included more edge cases. This makes it more interesting for scientific development and performance monitoring, but does mean that any predictions will be conservative or non-representative to some degree. We should bear this in mind when doing characterization or commissioning level studies.
  • It currently takes longer than a night to run. It is not clear how much of this is limited by the current middleware's ability to balance over many more cores. It cannot be easily automated at this time as it requires some babysitting. The new middleware will address this, and automation should be possible with Gen3 and a workflow system.
  • Hsin-Fang currently runs these on a monthly basis and is handing over to NCSA. Can we go back to fortnightly?
  • LARGE: We have currently only run once or twice on PDR1. PDR2 still coming out of Japan. 
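The suggestion above to track how long each dataset takes to process, and to notice when a SMALL dataset suddenly starts taking twice as long, could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `runtime_regressed`, the use of a median baseline, the `min_runs` guard and the sample timings are all hypothetical and are not part of the technote, Jenkins or SQUASH.

```python
from statistics import median


def runtime_regressed(history_s, latest_s, factor=2.0, min_runs=3):
    """Return True if the latest runtime is at least `factor` times the
    median of previous runtimes (all values in seconds).

    Hypothetical helper: the median baseline and the default factor of 2
    are assumptions chosen to mirror "suddenly takes twice as long".
    """
    if len(history_s) < min_runs:
        # Too few earlier runs to form a meaningful baseline.
        return False
    return latest_s >= factor * median(history_s)


# Made-up wall-clock times for earlier runs of a SMALL-scale dataset.
history = [1500.0, 1450.0, 1600.0, 1550.0]

print(runtime_regressed(history, 1580.0))  # within the normal range
print(runtime_regressed(history, 3200.0))  # more than twice the median
```

A median baseline is used here so that a single slow run in the history does not shift the threshold much; a real implementation would also need to record the hardware and core count alongside each timing, since the runs are not on a known configuration.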

Simon Krughoff asks, is it within our remit to define a mechanism for identifying exactly what data are to be processed in each context?

AOB

Reminder that we will have a special meeting on LOY1 alerts on Wednesday



List of SST tasks (Confluence)

Description | Due date | Assignee | Task appears on
  • Robert Lupton Clarify the meaning of time in the object table. 1 sentence description in sdm_schemas, can link to a short DMTN. Update 2022-02-09: Meeting to resolve this on 2022-02-21
28 Feb 2022 | Robert Lupton | 2018-11-05 DM SST F2F Agenda and Meeting notes
  • Gregory Dubois-Felsmann check if SDM standardization is adequately represented in project documents, and whether DMTN-067 should be required.
31 Mar 2022 | Gregory Dubois-Felsmann | 2022-02-14 DM-SST Virtual F2F Agenda and Meeting notes
28 Feb 2023 | Leanne Guy | 2023-01-23 DM-SST Agenda and Meeting Notes
  • Leanne Guy talk to Steve R about presenting plans for the ShearObject table to PST and SciCollab chairs
20 Mar 2023 | Leanne Guy | 2023-02-27 DM-SST Agenda and Meeting Notes
31 Mar 2023 | Jim Bosch | 2023-02-27 DM-SST Agenda and Meeting Notes
  • Leanne Guy talk to Gregory Dubois-Felsmann to review the original intent of the AFS-related Portal requirements before deciding on a course of action
29 May 2023 | Leanne Guy | 2023-05-01 DM-SST Focus Meeting - Brokers in Commissioning
  • Leanne Guy Prepare to consult the PST on the question of providing compressed PVIs for AP outputs, to cover the period before the data become available in a DR.
02 Jun 2023 | Leanne Guy | 2023-03-27 DM-SST Agenda and Meeting Notes
  • Jim Bosch Incorporate 30-60 day period for raws on disk into the strawman proposal and present to KT
26 Jun 2023 | Jim Bosch | 2023-05-08 DM-SST Agenda and Meeting Notes
  • Parker Fagrelius, Patrick Ingraham how long will it take to do a scan as described? No need to scan the whole WL range but will require additional points outside nominal lambda range.
30 Jun 2023 | Parker Fagrelius | 2023-03-27 DM-SST Agenda and Meeting Notes
31 Jul 2023 | Colin Slater | 2023-07-10 DM-SST Agenda and Meeting Notes
  • Eli Rykoff, Leanne Guy Develop a proposal for what calibration processing, hardware, data we actually need and what will be needed for DR1. This has implications for the ORR and for prioritisation of work in commissioning
31 Jul 2023 | Eli Rykoff | 2023-01-30 DM-SST Agenda and Meeting Notes
  • Yusra AlSayyad will look to see if there is any effort to help on option 1
28 Aug 2023 | Yusra AlSayyad | 2023-08-14 DM-SST Agenda and Meeting Notes
  • Jim Bosch Provide a physical example of what a  up on cell table would look like for the Colin Slater / DAX team to review
31 Aug 2023 | Jim Bosch | 2023-02-27 DM-SST Agenda and Meeting Notes
  • "What is the pathway to defining the data products that are required to meet DMS-REQ-0266" Jeffrey Carlin
30 Nov 2023 | Jeffrey Carlin | 2023-10-23 DM-SST vF2F Agenda and Meeting Notes
30 Nov 2023 | Gregory Dubois-Felsmann | 2023-10-23 DM-SST vF2F Agenda and Meeting Notes
30 Nov 2023 | Leanne Guy | 2023-10-23 DM-SST vF2F Agenda and Meeting Notes
  • Jeffrey Carlin follow up with KT on DMS-REQ-0176 and DMS-REQ-0315 to update/disaggregate this for latest base/summit infrastructure split.
30 Nov 2023 | Jeffrey Carlin | 2023-10-23 DM-SST vF2F Agenda and Meeting Notes
  • Jim Bosch Follow up on the possibility of investigating further the ability to process 2 collections in parallel.
31 Jan 2024 | Jim Bosch | 2023-12-04 DM-SST Agenda and Meeting Notes
31 Jan 2024 | Jeffrey Carlin | 2023-12-04 DM-SST Agenda and Meeting Notes
Gregory Dubois-Felsmann | 2023-10-23 DM-SST vF2F Agenda and Meeting Notes