Item | Who | Pre-meeting notes | Notes and Action Items

News, announcements and task review

The Director’s Review in preparation for the Joint Status Review is planned for 7-9 Sep 2021. There will be a strong focus on SITCom. We should present the verification status of the performance metrics:
- Chronograf dashboard
- pipeline Characterization Metric Report
- example per-detector metric (e.g., 5 sigma depth)
- documentation
International in-kind contributor Nacho Sevilla is interested in starting to prepare for commissioning SV-related work. What is our approach to incorporating such in-kind contributions?
- What do we need? What do we want?
- Concept: a direct imaging survey with AuxTel
- Getting faro to a stable place
- What is the usage in 6 months' time?
- Capability to define work packages: a specific dataset and specific tests to run, coming back with an analysis.
  - Example: is the stellar locus stable, e.g., in HSC RC2? Can you make sense of the results in a science context?

Plan is to use labels for sandbox activities
Postpone discussion of failure modes
Chronograf dashboard is now set up to show metrics computed with faro for both DC2 and HSC RC2. One can select between native Gen3 processing and Gen2→Gen3 conversion. This is a good milestone.
Nightly dashboard: there was one successful run on 11 August? We thought this job had been disabled.
- validate_drp_gen3 was breaking and returning all NaNs. Replace it with an RC2 subset? Is there value in having another small dataset processed regularly, e.g., ci_imsim?
- We used to run CFHT, DECam, and HSC nightly.
- According to Jenkins, validate_drp_gen3 is still running nightly. Jenkins is happy, but NaN values are not displayed in Chronograf.
Just got faro running as part of ci_imsim
About to deploy bias checks on the mountain. Robert will put the ticket number in the minutes. We will have notebooks associated with the CP verify package. There are many calibration products, so the checks will be delivered as a notebook. The idea is that this will be run every afternoon/night to check that the calibration products were generated.

Bugs, issues, topics, etc of the week

All

Congratulations to Erik Dennihy who has completed his first faro ticket
How do we enable faster access to large numbers of metrics in data repos (e.g., tens of thousands of per-detector metric values could be generated in a single night)?
Some errors were introduced to faro in DM-31381. See Slack threads here and here.
- Homogenization of run methods is a high-priority item (DM-31061)
What guidance do we provide for developers regarding executing faro on real data (and ensuring success) before merging changes? We need to make sure we aren't merging breaking changes – these breakages are often not evident from scons/Jenkins. (See conversation in DM-31382.)
- It seems we need an efficient way to run a pipeline against an actual data repository, to test the pipeline YAML files and runQuantum methods across multiple analysis contexts

We will need a technical solution to access large numbers of metric values. A summary task can write an aggregate statistic (e.g., mean), but could also create an astropy table, for example. Another solution would be to put the results in a relational database (but this functionality is not planned soon: there is no concrete plan to implement it, and it will not happen in the next 3 months).
- Keith to repeat the test with a more recent weekly
- We might need something ad hoc. Might need to compare with metadata.
- Colin has this as high priority
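
A minimal sketch of the aggregate-statistic idea, assuming the per-detector metric values are already in memory as a plain dictionary. The function name `summarize_metric`, the dictionary layout, and the depth values are illustrative assumptions, not faro's API or real measurements.

```python
import math
from statistics import mean, median


def summarize_metric(values_by_detector):
    """Collapse per-detector metric values into one summary row.

    values_by_detector: dict mapping a detector id to a float value
    (hypothetical layout, not faro's actual storage format).
    Non-finite values (NaN, inf) are counted but excluded from the statistics.
    """
    finite = [v for v in values_by_detector.values() if math.isfinite(v)]
    return {
        "n_total": len(values_by_detector),
        "n_finite": len(finite),
        "mean": mean(finite) if finite else math.nan,
        "median": median(finite) if finite else math.nan,
    }


# Made-up 5 sigma depth values for three detectors; one measurement failed.
depths = {0: 24.0, 1: 24.5, 2: math.nan}
summary = summarize_metric(depths)
print(summary)
```

Keeping `n_total` and `n_finite` next to the mean makes it visible when an aggregate is based on only a few valid detectors.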

Keith Bechtol to report on timing tests of loading thousands of metric measurements into memory with more recent weeklies, 31 Aug 2021
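
One way such a timing test could be structured is sketched here. The loader `load_metric_values` is a hypothetical stand-in that generates fake values; in a real test it would be replaced by whatever call reads metric measurements from the data repository. Only the timing harness is the point.

```python
import random
import time


def load_metric_values(n):
    """Stand-in loader (hypothetical): returns n fake metric values.

    A real test would read the values from the data repository instead.
    """
    rng = random.Random(0)
    return [rng.gauss(24.0, 0.2) for _ in range(n)]


def time_load(loader, n, repeats=3):
    """Return the best wall-clock time in seconds over `repeats` calls to loader(n)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        values = loader(n)
        elapsed = time.perf_counter() - start
        # Sanity check that the loader returned everything that was requested.
        assert len(values) == n
        best = min(best, elapsed)
    return best


for n in (1_000, 10_000):
    print(f"{n} values: {time_load(load_metric_values, n):.4f} s")
```

Taking the best of several repeats reduces noise from caching and other processes; scaling `n` shows whether load time grows linearly with the number of measurements.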

Dan has a PR as a short-term solution for consistency of run methods
Integration test: executing faro on real data.
- Need a small CI dataset for faro. Execute faro against some dataset.
- Ideally, we would start from an already processed dataset
- CI HSC has coadd outputs. 2-3 hours.
- Is there a processed ci_hsc dataset? We could set up a cron job?
There are multiple potential solutions to have faro run as part of routine pipeline testing and avoid causing breakages on master, i.e., to make sure that a Jenkins run fails if faro is broken
- Could faro pipelines be merged with DRP pipelines? (obs_lsst/pipelines/imsim/DRP.yaml)
- Can we just add faro to ci_hsc pipelines?
It is possible to run specific analysis contexts of faro (e.g., per detector, per tract) separately
- Hsin-Fang suggests that we separate these out according to the stage of analysis, to make them more modular and able to scale to larger datasets
Adding faro to ci_imsim and ci_hsc would mean that faro is executed as part of Jenkins
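
As an illustration of making a CI run fail when metrics come back as NaN (the situation seen with the nightly job), a guard along these lines could run after the metrics are computed. This is a sketch under assumptions: `check_metrics`, the dictionary layout, and the metric names are hypothetical and not part of faro or Jenkins.

```python
import math


def check_metrics(values, max_nan_fraction=0.0):
    """Raise ValueError if too many metric values are NaN.

    values: dict mapping a metric name to a float (hypothetical layout).
    Returns the NaN fraction when the check passes.
    """
    if not values:
        raise ValueError("no metric values were produced")
    bad = sorted(name for name, v in values.items() if math.isnan(v))
    fraction = len(bad) / len(values)
    if fraction > max_nan_fraction:
        raise ValueError(
            f"{len(bad)} of {len(values)} metric values are NaN: {', '.join(bad)}"
        )
    return fraction


# Placeholder metric names; an all-NaN result like this should break the build.
nightly = {"metric_a": math.nan, "metric_b": math.nan}
try:
    check_metrics(nightly)
except ValueError as err:
    print(f"CI check would fail: {err}")
```

In a CI job the exception would be left uncaught, so the process exits with a non-zero status and the run is marked as failed.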

Reprocessing status and metrics review

Hsin-Fang Chiang would like us to add faro pipelines to the DRP yaml pipeline definition. This means that faro will be run automatically as part of monthly reprocessing.

(Hsin-Fang will join the meeting at 11:30)

Development status

Fall 2021 epic: DM-30748

AOB

Next meeting 24 Aug 2021
