Mondays 12pm - 12:50pm ET
Yusra's Zoom: https://princeton.zoom.us/my/yusra
1.1.1. Attendees:
- Lauren MacArthur
- Robert Lupton
- Yusra AlSayyad
- Arun Kannawadi
- Dan Taranu
- Hsin-Fang Chiang
- Jeffrey Carlin
- Keith Bechtol
- Merlin Fisher-Levine
- Erik Dennihy
- Simon Krughoff
- Nate Lust
- Sophie Reed
- Joshua Meyers
- Eli Rykoff
1.1.2. Agenda:
- Announcements:
- Welcome Erik Dennihy!
- Review Action items from last week
- Fakes dashboard: stats have been made, but not the dashboard. Update due date.
- Eric's logs: need to close the loop on this
- Generating sky frames: ticket created, questions on timing and scope. Maybe push off till post-Gen2. Should also consider need to coordinate with work on offsets. Might be pushed off.
- Modify rho statistics: still need to work with Josh. Original code from pipe analysis was ported to faro.
- Investigate rho stats ellipticity residuals: has to do with flags. May need Fred. PSFs with scarlet. There are issues, but Dan has not looked at rho statistics specifically, might be more generally informative on PSF issues
- Add absolute astrometry metric. Will do in faro now that we have the capability to compare with refcats.
- Review w_2021_30
- What was noteworthy in this rerun?
- Gen2 (DM-31184):
- a few run issues noted by Morgan on DM-31184 (but solved with the help of Yusra & Eli). Memory issues with coaddDriver (solved by running with --node 2 --procs 12)
- Lauren noticed a failed patch due to issues of DM-31359, which oddly occurred on a different tract/patch ID than for the gen3 run
- See discussion on #dm-science-pipelines regarding negative pixels
- Gen3 (DM-31182):
- issue of DM-31359
- Brock's pipelines/DRP.yaml#healSparsePropertyMaps,consolidateObjectTable,diffimDRP run is taking a while to start (hanging query).
- Eli reports a query with ~120 collections that is still running and might not complete. It is running healsparse, consolidate, (something else). Eli will post the query on dm-middleware-dev. It is a single database query, and this is a known limitation of the system. Nate offers to help debug.
- What changed? Annotate
- now have GAaP colors: something amiss with "optimal" flux vs. "PsfFlux" (all plots at https://lsst.ncsa.illinois.edu/~lauren/HSC_RC2_gen2/color/):
- Arun Kannawadi to investigate
- Pathologies that are different for galaxies and stars, and different for different datasets. Eli and Lauren have been looking into this. "optimal" = Gaussian weight function matches the size of the source. PSF is optimal for point sources. Suggest using a name other than "optimal" to be clear about the algorithm.
- While investigating an error message (DM-31345), Arun found a minor bug that he is fixing now, before a deeper investigation of the plots.
- What do we expect from w_2021_34?
- Changes toward Gen2-Gen3 parity:
- am making pipe_analysis plots for both gen2 & gen3 runs and gen2 vs. gen3 directly (see plots at https://lsst.ncsa.illinois.edu/~lauren/HSC_RC2_gen2/ and https://lsst.ncsa.illinois.edu/~lauren/HSC_RC2_gen3/ and https://lsst.ncsa.illinois.edu/~lauren/HSC_RC2_gen3NoExtCal/)
- still have parity through SFP [phew!] (all visit-level comparison plots at: https://lsst.ncsa.illinois.edu/~lauren/HSC_RC2_gen3NoExtCal/)
- Still some differences when adding external calibrations:
- jointcal issue noted on DM-29821, whereby a full visit gets 0 matches for every run (could John Parejko or Clare Saunders follow up?)
- Every time it is run with Gen3, an entire visit has zero matches. Input ordering is not deterministic. Might need help from someone who is expert in jointcal.
- fgcm still shows differences (is this just an input order issue that can't be synched?)
- This is an example of a comparison between Gen2 and Gen3. Different pattern for every visit. Is it possible to test that this is an ordering issue? 20 mmag variation peak to peak. Will never get a set order unless you sort it yourself. Algorithm has been updated between Gen2 and Gen3. Ordering matters in FGCM because it randomly chooses 10% of stars that don't go into the fit.
- What new metric can we expect next time
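The FGCM ordering point above (a random 10% of stars is held out of the fit) can be illustrated with a toy sketch. This is not FGCM's actual code: the function names, the seed, and the star IDs are all hypothetical, and it only shows why a seeded random hold-out still depends on input order unless the inputs are sorted first.

```python
import random


def hold_out(star_ids, fraction=0.1, seed=12345):
    """Reserve a random fraction of stars, selected by position in the input list.

    Hypothetical stand-in for a "stars that don't go into the fit" selection.
    """
    rng = random.Random(seed)
    n_reserved = int(len(star_ids) * fraction)
    return set(rng.sample(star_ids, n_reserved))


def hold_out_sorted(star_ids, fraction=0.1, seed=12345):
    """Same selection, but sort by ID first so input order cannot matter."""
    return hold_out(sorted(star_ids), fraction, seed)


# Made-up star IDs, presented in two different orders.
star_ids = list(range(1000, 1100))
shuffled = star_ids[:]
random.Random(7).shuffle(shuffled)

# With a fixed seed the same positions are drawn, so a different input order
# will generally reserve different stars.
print("unsorted inputs agree:", hold_out(star_ids) == hold_out(shuffled))

# Sorting first makes the reserved set independent of input order.
print("sorted inputs agree:", hold_out_sorted(star_ids) == hold_out_sorted(shuffled))
```

A test of the "is this just an ordering issue?" question could follow the same pattern: run both generations on identically sorted inputs and check whether the visit-to-visit differences disappear.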
- Review DC2
- w_2021_28
- lots of processing issues (memory limits, etc.). Mostly resolved. Had to set up the master branch of meas_extensions_gaap
- now have GAaP stellar locus plots: some issues under investigation on DM-31322 and DM-31156
- still have parity through SFP
- w_2021_32:
- running only tract 3829
- this made me sad: gen2 vs. gen3 SFM comparison (should be a flatline):
- perusing the changelog led me to DM-30820. It seems the change did not take effect for gen2 processing (config/imsim/processCcd.py is not importing the newly added config/imsim/characterizeImage.py). Lee Kelvin to confirm?
- Suspect this is related to background modeling, where a step is applied in one of Gen2/Gen3 but not the other. File a ticket to add this step. This was a simple mistake.
- plea to give some thought to changes to gen3 but not gen2 → avoid where possible, but if not feasible, please bring it up before merging (a decision may need to be made as to whether the merge should wait until gen2/gen3 parity is settled).
- When do we get to the stage where we break parity intentionally because of improvements in Gen3? Need to at least look at the coadd once to sign off on parity.
- Two separate issues
- Improvements to both Gen2 and Gen3
- How much do results need to match? jointcal and fgcm are not deterministic. Scientific parity vs. bitwise / numerical / machine precision parity
- It would be a lot of work to make FGCM work the same way in Gen2 and Gen3
- Should jointcal and FGCM be deterministic / repeatable?
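The distinction raised above between bitwise parity and scientific parity can be made concrete with a small sketch. Everything here is an assumption for illustration: the helper names, the 1 mmag tolerance, and the zero-point values are made up, not taken from any Gen2 or Gen3 run.

```python
import math


def exactly_equal(a, b):
    """Strict parity: every value must be identical."""
    return len(a) == len(b) and all(x == y for x, y in zip(a, b))


def within_tolerance(a, b, abs_tol=1.0):
    """Scientific parity: values may differ by up to abs_tol (here in mmag)."""
    return len(a) == len(b) and all(
        math.isclose(x, y, rel_tol=0.0, abs_tol=abs_tol) for x, y in zip(a, b)
    )


# Made-up per-visit zero-point offsets in mmag, for illustration only.
gen2_offsets = [0.0, 3.0, -2.0]
gen3_offsets = [0.5, 2.5, -2.0]

print("exact parity:", exactly_equal(gen2_offsets, gen3_offsets))
print("parity within 1 mmag:", within_tolerance(gen2_offsets, gen3_offsets))
```

If jointcal and FGCM stay non-deterministic, only a tolerance-based check of this kind can pass, so the open question becomes what tolerance counts as scientifically equivalent.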
- AOB
1. Dispatching to SQuaSH
We will start making use of the --date-created option when dispatching metrics to SQuaSH
with dispatch_verify.py. After a lengthy debate on #dm-hsc-reprocessing
(https://lsstc.slack.com/archives/C4JQP6FRS/p1628638062075200?thread_ts=1628633010.069000&cid=C4JQP6FRS),
all 2 votes opted for a format of: 20xx-xx-xxTxx:xx:00Z
where all the x's come from the weeklies tag date found at:
https://eups.lsst.codes/stack/src/tags/ (so, e.g., for w_2021_32 it would be:
--date-created 2021-08-05T08:29:00Z).
Dan Taranu and Lauren MacArthur have created dashboards (https://chronograf-demo.lsst.codes/sources/2/dashboards) for the Gen3 and Gen2 DC2 runs (currently called DRP metrics monthly DC2 (Gen3/2to3, test-med-1) and DRP monthly metrics for DC2 (Gen2), but we will homogenize that!). Simon Krughoff is clearing out all previous dispatched/uploaded metrics so that we can start from a clean slate and use the appropriate --date-created timestamps.
2. Slurm Resources
The resources available for gen2/slurm jobs have been severely restricted (and may
imminently become further so). We are all supposed to be moving from gen2/slurm to
gen3/bps. However, there are still a number of users of the former, so this is a
friendly reminder (and plea) to start making that switch wherever possible. There
is no ban on submitting jobs to the slurm cluster, but given the requirement to
get through the gen2 RC2/DC2 reprocessing (for at least a few more weeklies...),
you may be asked to ease off on personal runs during times of high traffic on the
slurm cluster.
3. Reduced gen2 processing
Partly in light of the above slurm resource limitations, we have decided to
decrease our gen2 processing by one tract per dataset. So, going forward (and
starting with w_2021_32) we plan to fully process the following tracts in gen2:
RC2: 9813 9697
DC2: 3829
If you have any reason to contest this (e.g. you know a given pathology only
ever occurs in one of the omitted tracts), please shout!
- FGCM is a global fitting routine. All the tracts inform all the tracts.
- Dashboards.
- Create date. Use tag date for the weekly for timestamps on chronograf.
- DC2 Gen3 dashboard.
- Is it possible to file an issue with chronograf so that the default time window is something other than the last 15 min?
- Currently on a single dashboard for both DC2 and HSC_RC2
- Label everything with underscores instead of spaces for dataset name
- Should we prune the set of tracts to select from?
- psfPhotRepStar. Like PA1, but for 4 different signal-to-noise bins. The lowest SNR bin is 5-10, then 10-20, ... Not sure why the second bin has NaN values.
- We have a Gen3 dashboard!
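The agreed --date-created format (20xx-xx-xxTxx:xx:00Z, taken from the weekly tag date) could be produced with a small helper like the sketch below. The function name is hypothetical, and it assumes the tag time is known as a timezone-aware datetime. Only the output format and the w_2021_32 example value come from the notes above; the seconds value in the input is made up to show the truncation.

```python
from datetime import datetime, timezone


def date_created_from_tag(tag_time):
    """Format a weekly tag time as YYYY-MM-DDTHH:MM:00Z (seconds forced to 00)."""
    if tag_time.tzinfo is None:
        raise ValueError("tag_time must be timezone-aware")
    # Convert to UTC and format, writing the seconds field as a literal 00.
    utc = tag_time.astimezone(timezone.utc)
    return utc.strftime("%Y-%m-%dT%H:%M:00Z")


# Tag time for w_2021_32 as quoted in the notes; the 41 seconds are invented.
tag_time = datetime(2021, 8, 5, 8, 29, 41, tzinfo=timezone.utc)
stamp = date_created_from_tag(tag_time)
print("--date-created", stamp)
```

Deriving the string from the tag time, rather than typing it by hand, keeps the Gen2 and Gen3 dispatches for the same weekly at the same point on the chronograf time axis.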
1.1.3. Action Items
| Description | Due date | Assignee | Task appears on |
|---|---|---|---|
| | 04 Sep 2020 | Sophie Reed | DRP Metrics Monitoring 2020-08-07 |
| | | | DRP Metrics Monitoring 2024-04-22 |
| | | | DRP Metrics Monitoring 2024-03-18 |
| | | | DRP Metrics Monitoring 2023-06-26 |
| | | | DRP Metrics Monitoring 2023-06-26 |
| | | | DRP Metrics Monitoring 2022-10-31 |
| | | Yusra AlSayyad | DRP Metrics Monitoring 2021-06-14 |
| | | Arun Kannawadi | DRP Metrics Monitoring 2021-04-19 |
| | | Arun Kannawadi | DRP Metrics Monitoring 2021-03-01 |
| | | Yusra AlSayyad | DRP Metrics Monitoring 2021-01-04 |
| | | Jeffrey Carlin | DRP Metrics Monitoring 2021-01-04 |