The Director’s Review in preparation for the Joint Status Review is planned for 7-9 Sep 2021. There will be a strong focus on SITCom, and we should present the verification status of the performance metrics.
Pipeline Characterization Metric Report
Example per-detector metric (e.g., 5-sigma depth)
International in-kind contributor Nacho Sevilla is interested in starting to prepare for commissioning SV-related work. What is our approach to starting to incorporate such in-kind contributions?
What do we need? What do we want?
Concept to do direct imaging survey with AuxTel
Getting faro to a stable place
What is the expected usage in 6 months' time?
Capability to define work packages: a specific dataset and specific tests to run, coming back with an analysis.
Example: is the stellar locus stable, e.g., in HSC RC2? Can you make sense of the results in a science context?
Plan is to use labels for sandbox activities
Postpone discussion of failure modes
Chronograf dashboard is now set up to show metrics computed with faro for both DC2 and HSC RC2. One can select between native Gen3 processing and Gen2→Gen3 conversion. This is a good milestone.
Nightly dashboard. There is one successful run on 11 August? We thought we had disabled this job.
validate_drp_gen3. It was breaking, returning all NaNs. Replace with an RC2 subset? Is there value in having another small dataset regularly processed, e.g., ci_imsim?
We used to do CFHT, DECam, and HSC running nightly.
According to Jenkins, validate_drp_gen3 is still running nightly. Jenkins is happy, but the NaN values are not being displayed in Chronograf.
Just got faro running as part of ci_imsim
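The validate_drp_gen3 situation above (job passes in Jenkins while every metric value is NaN) suggests a simple guard. The following is only a minimal sketch, not faro code: it assumes the metric values have already been collected into a plain dictionary mapping a metric name to a list of floats, and the function names `find_all_nan_metrics` and `check_measurements` are hypothetical.

```python
import math


def find_all_nan_metrics(measurements):
    """Return the sorted names of metrics with no non-NaN values.

    `measurements` is assumed (hypothetically) to map a metric name to a
    list of float values; an empty list also counts as a failure.
    """
    bad = []
    for name, values in measurements.items():
        if not values or all(math.isnan(v) for v in values):
            bad.append(name)
    return sorted(bad)


def check_measurements(measurements):
    """Return an exit code: 0 if every metric has a non-NaN value, else 1."""
    bad = find_all_nan_metrics(measurements)
    for name in bad:
        print(f"metric {name!r} has no non-NaN values")
    return 1 if bad else 0
```

A CI job could pass the return value of `check_measurements` to `sys.exit` so that an all-NaN run is reported as a failure rather than a success.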
About to deploy bias checks on the mountain. Robert will put the ticket number in the minutes. We will have notebooks associated with the CP verify package. There are many calibration products. Deliver as a notebook. The idea is that this will be run every afternoon/night to check that the calibration products were generated.
Bugs, issues, topics, etc of the week
Congratulations to Erik Dennihy who has completed his first faro ticket
How do we enable faster access to large numbers of metrics in data repos (e.g., tens of thousands of per-detector metric values could be generated in a single night)?
Homogenization of run methods is a high-priority item
What guidance do we provide for developers regarding executing faro on real data (and ensuring success) before merging changes? We need to make sure we aren't merging breaking changes – these breakages are often not evident from scons/Jenkins. (see conversation in
It seems we need an efficient way to run a pipeline against an actual data repository, to test the pipeline yaml files and runQuantum methods across multiple analysis contexts
We will need a technical solution to access large numbers of metric values. A summary task can write an aggregate statistic (e.g., mean), but could also create an astropy table, for example. Another solution would be to put the results in a relational database (but this functionality is not planned soon: there is no concrete plan to implement it, and it will not happen in the next 3 months).
Keith to repeat test with more recent weekly
We might need something ad hoc. Might need to compare with metadata.
Colin has this as high priority
Keith Bechtol to report on timing tests of loading thousands of metric measurements into memory with more recent weeklies
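To make the two options discussed above concrete (an aggregate statistic versus a relational table of individual values), here is a minimal sketch using only the Python standard library. It is illustrative only and does not reflect any planned implementation: the table name `measurement`, its columns, and the functions `build_metric_table` and `summarize` are all assumptions made for this example.

```python
import sqlite3
import statistics


def build_metric_table(rows):
    """Load (visit, detector, metric, value) rows into an in-memory SQLite table.

    The schema is hypothetical and chosen only for illustration.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE measurement "
        "(visit INTEGER, detector INTEGER, metric TEXT, value REAL)"
    )
    conn.executemany("INSERT INTO measurement VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def summarize(conn, metric):
    """Return (count, mean, median) over the non-null values of one metric."""
    cursor = conn.execute(
        "SELECT value FROM measurement "
        "WHERE metric = ? AND value IS NOT NULL",
        (metric,),
    )
    values = [value for (value,) in cursor]
    if not values:
        return 0, None, None
    return len(values), statistics.mean(values), statistics.median(values)
```

With per-detector values in one table, a summary such as the mean becomes a single query over tens of thousands of rows instead of one read per stored measurement.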
Dan has a PR as a short-term solution to the consistency of run methods
Integration test: executing faro on real data.
Need a small CI dataset for faro. Execute faro against some dataset.
Ideally, we would start from an already processed dataset
CI HSC has coadd outputs. 2-3 hours.
Is there a processed ci_hsc dataset? We could set up a cron job.
There are multiple potential solutions to have faro run as part of routine pipeline testing and avoid causing breakages on master, i.e., make sure that the Jenkins run fails if faro is broken
Could faro pipelines be merged with DRP pipelines? (obs_lsst/pipelines/imsim/DRP.yaml)
Can we just add faro to ci_hsc pipelines?
It is possible to run specific analysis contexts of faro (e.g., per detector, per tract) separately
Hsin-Fang suggests that we separate these out according to the stage of analysis, to make them more modular and able to scale to larger datasets
Adding faro to ci_imsim and ci_hsc would ensure that faro is executed as part of Jenkins