2pm Pacific Time
Zoom: https://washington.zoom.us/j/98846655020
attending:
regrets:
- Ian Sullivan
- Krzysztof Findeisen (vacation)
Topics for discussion:
- Project updates (Eric Bellm, Ian Sullivan):
- PCW TRs
- fill out sprint planning Doodle
- Auxtel processing status (all):
- run coming up July 4; going to try running diffim
- Pipelines Infrastructure (Krzysztof Findeisen, John Parejko):
- John: Wednesday at Science Pipelines, will discuss CalibrateImageTask
- John: did a baseline profiling run ahead of Python 3.11. 160 sec on USDF rome-001 without I/O and transinet. Transinet adds 80 seconds, the largest single time component. DM-39491.
- Eric: various tradeoffs of latency, pipeline performance, hardware to consider
- Precursor processing & Campaign Management (Kenneth Herner, Erin Howard):
- Erin: started many runs but none finished due to database space issues, competition for resources prior to USDF shutdown
- Ken:
- did some diffim sprint runs with sqlite so he could try to make the run comparison utility; some import issues to sort out, but it's close
- clustering seems to be working okay? Erin: 99% sure it's working okay
- Eric Bellm: discuss retention policy for precursor runs in APDB: need to communicate sizing needs to KT/Dan/USDF
- Meredith: keep all the runs!
- Eric: wonders if we can dump old runs out of a db for "cold storage"? Ken: probably possible but somebody's got to do the work.
- John: thinks we should not retain or work with data older than 6 months. db schema changes make it hard to work with old data, and it will slow us down to have to support old schema versions in our analysis code
- Eric: we probably want a minimum number of iterations
- Ken: keeping a minimum number of versions (2-3 old versions) is operationally simpler; can wipe out the oldest as part of new processing
- Eric: Nima, what is needed for some degree of replicability of the ML models? Nima: it's a difficult task, and not common even in publications. Would need to keep the original dataset, which is a few TB. But we're aiming to retrain gradually to tune the model. For reproducibility, either need images or cutouts, plus the labels? Eric: do we need to keep unlabeled cutouts? Nima: for the basic case, no; but for domain adaptation, yes. But in general just need cutouts + labels. Eric: APDB is catalog data; do we need to preserve it? Could also imagine using persisted DIASource catalogs
- Eric Bellm write up retention policy and sizing for APDB
- Image differencing algorithms, DCR (Bruno Sanchez, Ian Sullivan):
- both out
- alert distribution (Brianna Smart, Eric Bellm):
- Bri: talking to Dan Speck about splitting alert broker away from the RSP; running into namespace and security conflicts with the RSP. Alert broker, prompt processing, and one other thing will live separately
- Bri: daily topics? wondering about how to create them. requires external
- transinet (Nima Sedaghat, Harshit Rai, Eric Bellm):
- Harshit working on collecting labels from Zooniverse
- Nima: presented on end-to-end transinet at UIUC
- Solar-system processing (Ari Heinze):
- out today
- Review CI (https://usdf-rsp-dev.slac.stanford.edu/chronograf/):
- USDF is down
- Review outstanding action items
- QA meeting July 10:
- off-chip centroids: resolved by maxDistToPeak, but we don't have a great example before a full dataset run. Bri did the work but is out. Eric or John to present?
- Bruno on fakes progress/updates?
- Hsin-Fang on Auxtel run?
- Eric on detection thresholds/fakes forced photometry?
- streak masking/long trail rejection (Bri & Meredith); Bri out July 10
- AOB
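The retention options raised in the APDB discussion above (keep a minimum number of recent runs, optionally drop anything older than about 6 months) could be sketched roughly as below. This is only an illustration, not a decided policy: the `Run` record, the `select_runs_to_drop` helper, and its parameters are all hypothetical names, and nothing here touches the APDB itself.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass(frozen=True)
class Run:
    """Hypothetical record for one precursor processing run."""
    name: str        # illustrative run label
    processed: date  # date the run was processed


def select_runs_to_drop(
    runs: List[Run],
    keep_latest: int = 3,
    max_age_days: Optional[int] = None,
    today: Optional[date] = None,
) -> List[Run]:
    """Return runs eligible for removal, newest first.

    The `keep_latest` most recent runs are always kept. Of the rest,
    everything is dropped when `max_age_days` is None; otherwise only
    runs processed before the age cutoff are dropped.
    """
    ordered = sorted(runs, key=lambda r: r.processed, reverse=True)
    candidates = ordered[keep_latest:]
    if max_age_days is None:
        return candidates
    cutoff = (today or date.today()) - timedelta(days=max_age_days)
    return [r for r in candidates if r.processed < cutoff]


# Made-up example runs, purely for illustration.
example_runs = [
    Run("a", date(2023, 1, 10)),
    Run("b", date(2023, 3, 1)),
    Run("c", date(2023, 5, 1)),
    Run("d", date(2023, 6, 1)),
    Run("e", date(2023, 6, 20)),
]
drop_by_count = select_runs_to_drop(example_runs, keep_latest=3)
drop_by_age = select_runs_to_drop(
    example_runs, keep_latest=3, max_age_days=150, today=date(2023, 7, 3)
)
```

Combining both rules means a run is only removed if it is outside the most recent `keep_latest` runs and also older than the cutoff, which matches the wish for a guaranteed minimum number of iterations.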
Action Items
| Description | Due date | Assignee | Task appears on |
|---|---|---|---|
| | 17 Jul 2023 | Eric Bellm | AP Pipeline Meeting, 2023-06-26 |
| | 11 Sep 2023 | Eric Bellm | AP Pipeline Meeting, 2023-08-21 |
| | 25 Sep 2023 | Eric Bellm | AP Pipeline Meeting, 2023-08-14 |