This document describes the procedure for validating the first non-beta release of the jointcal package as a replacement for meas_mosaic.  Longer-term requirements for jointcal are described in a separate document (draft here).  Because meas_mosaic only runs on Hyper Suprime-Cam data, the scientific performance tests here are also focused on HSC data.  The jointcal beta already supports additional cameras, and we expect it to continue to do so, but because there is nothing to replace for other cameras, there are no short-term requirements for its performance on them.

Basic Requirements

  • jointcal shall be runnable as a command-line Task that takes the src and ref_cat datasets as input and produces (at least) a wcs and photoCalib dataset as outputs, which provide updated astrometric and photometric calibrations.
  • jointcal shall perform a separate fit for each tract, and allow multiple tracts to be run in a single invocation (at least in serial, which is all meas_mosaic does).  Some visits may thus be processed with multiple tracts.
  • jointcal shall fit models that have a level of sophistication similar to those in meas_mosaic (if it fits simpler models, it will be because the tests below demonstrate that they are sufficient to generate similar-quality results).  That involves:
    • astrometry: a full-focal-plane polynomial transform for each visit composed with a rotation and translation for each CCD (which is the same for all visits in the fit).
    • photometry: a full-focal-plane polynomial scaling for each visit multiplied by a constant scaling for each CCD (which is the same for all visits in the fit) and the determinant of the Jacobian of the astrometric model.
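The model structure described above can be sketched in a few lines of Python.  This is only an illustration of how the pieces compose, not jointcal's or meas_mosaic's actual implementation: the function names, the dictionary representation of polynomial coefficients, and the finite-difference Jacobian are all assumptions made for the example.

```python
import math

def ccd_to_focal_plane(x, y, theta, dx, dy):
    """Per-CCD rigid transform (shared by all visits): rotate, then translate."""
    c, s = math.cos(theta), math.sin(theta)
    return c * x - s * y + dx, s * x + c * y + dy

def poly2d(coeffs, u, v):
    """Evaluate sum of coeffs[(i, j)] * u**i * v**j (hypothetical representation)."""
    return sum(a * u**i * v**j for (i, j), a in coeffs.items())

def astrometric_model(x, y, ccd, visit):
    """Per-visit full-focal-plane polynomial composed with the per-CCD transform."""
    u, v = ccd_to_focal_plane(x, y, *ccd)
    coeffs_u, coeffs_v = visit
    return poly2d(coeffs_u, u, v), poly2d(coeffs_v, u, v)

def jacobian_det(x, y, ccd, visit, eps=1e-3):
    """Determinant of the astrometric model's Jacobian, by central differences."""
    xp = astrometric_model(x + eps, y, ccd, visit)
    xm = astrometric_model(x - eps, y, ccd, visit)
    yp = astrometric_model(x, y + eps, ccd, visit)
    ym = astrometric_model(x, y - eps, ccd, visit)
    a = (xp[0] - xm[0]) / (2 * eps)  # d(out_u)/dx
    b = (yp[0] - ym[0]) / (2 * eps)  # d(out_u)/dy
    c = (xp[1] - xm[1]) / (2 * eps)  # d(out_v)/dx
    d = (yp[1] - ym[1]) / (2 * eps)  # d(out_v)/dy
    return a * d - b * c

def photometric_scaling(x, y, ccd_scale, visit_scale, ccd, visit):
    """Per-visit polynomial scaling * per-CCD constant * Jacobian determinant."""
    u, v = ccd_to_focal_plane(x, y, *ccd)
    return poly2d(visit_scale, u, v) * ccd_scale * jacobian_det(x, y, ccd, visit)
```

Here `ccd` is a `(theta, dx, dy)` tuple and `visit` is a pair of coefficient dictionaries, one per output coordinate; in a real fit these would be the free parameters, with the CCD parameters shared across all visits in the tract.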

Science Quality Tests

As of DM-10728 and DM-10729, the validate_drp package can now (optionally) use the wcs and photoCalib datasets to calibrate the src dataset prior to computing its astrometric and photometric accuracy metrics.  To replace meas_mosaic, we require jointcal to match (no significant difference in metric value) or improve upon the following metrics as computed with meas_mosaic on HSC data.

  • AM1, AF1 (astrometric accuracy and outlier fraction on 5 arcmin scales).
  • AM2, AF2 (astrometric accuracy and outlier fraction on 20 arcmin scales).
  • Median astrometric RMS (left plot of validate_drp's check_astrometry plot).
  • PA1 (photometric repeatability).
  • Median photometric RMS for SNR>100 (upper-left plot of validate_drp's check_photometry plot).

Defining "no significant difference" is a bit tricky here because most of these metrics are not accompanied by uncertainty estimates; to estimate them, we'll use the RMS of each meas_mosaic metric over different tracts of HSC Wide (henceforth σ).  Differences smaller than 0.1 mmag for photometry or 0.1 mas for astrometry can also be ignored (even if larger than the cross-tract RMS, though I doubt the RMS across tracts will be that small).
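As a minimal sketch of these two definitions, assuming "RMS over tracts" means the RMS scatter about the cross-tract mean (the text does not say whether the mean is subtracted), and using hypothetical function and constant names:

```python
from statistics import pstdev

# Differences below these floors are ignored regardless of sigma.
# Units: mmag for photometry, mas for astrometry (values from the text above).
FLOOR = {"photometry": 0.1, "astrometry": 0.1}

def cross_tract_sigma(meas_mosaic_values):
    """RMS scatter of one meas_mosaic metric across tracts (assumed: about the mean)."""
    return pstdev(meas_mosaic_values)

def is_ignorable(jointcal_value, meas_mosaic_value, kind):
    """True if the difference is below the absolute floor for this kind of metric."""
    return abs(jointcal_value - meas_mosaic_value) < FLOOR[kind]
```

For example, `cross_tract_sigma` would be called once per metric with the list of per-tract meas_mosaic values, and the resulting σ reused for every tract's comparison of that metric.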

We will run the metric comparison independently on each of at least 19 tracts of HSC-SSP Wide data, as well as the (single-tract) HSC-SSP UDeep COSMOS dataset.  No more than 1 of these 20 tracts shall regress from the meas_mosaic performance in that tract by more than 2σ, and none may regress by more than 3σ.  The number of tracts may be extended if an initial failure to achieve these rates appears to be due to statistical fluctuations.
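The acceptance rule for a single metric could be checked along these lines.  This is a sketch under stated assumptions: the function names are hypothetical, every metric is treated as lower-is-better (so a regression is a positive jointcal minus meas_mosaic difference), and differences below the absolute floor are counted as no regression.

```python
def regression_in_sigma(jointcal_value, meas_mosaic_value, sigma, floor):
    """Regression of jointcal relative to meas_mosaic, in units of sigma.

    Assumes lower metric values are better; improvements and differences
    below the absolute floor count as zero regression.
    """
    diff = jointcal_value - meas_mosaic_value
    if diff < floor:
        return 0.0
    return diff / sigma

def passes(regressions, max_over_2sigma=1):
    """Apply the rule: none above 3 sigma, at most `max_over_2sigma` above 2 sigma."""
    if any(r > 3.0 for r in regressions):
        return False
    return sum(r > 2.0 for r in regressions) <= max_over_2sigma
```

`passes` takes one regression value per tract for a given metric, so with the 20 tracts described above it would be called with a 20-element list for each of the metrics.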

The WCS mappings and photometric scalings output by meas_mosaic and jointcal should also be manually inspected for at least 5 visits in each of 5 tracts (including COSMOS) to look for unusual differences.

Doing the Work

Scheduling work is obviously a T/CAM responsibility, but it may be best to share the generation of the input data repositories for these tests with the re-run of the full HSC-SSP PDR1 (NCSA may have already scheduled some time for this later this cycle).

Scientific validation of the results is the responsibility of the Pipeline Scientist.

3 Comments

  1. Thank you for writing this up. It looks like a reasonable plan to me.

    Are any of the datasets you have selected for comparison in the ~few TB size (instead of >many TB), such that I could test with them on my workstation?

  2. The src files for one HSC-SSP Wide tract (in one band) should be < 15 GB.  calexps (which you may or may not need) look like they'll probably be < 400 GB.  The part that will take a bit of work will be extracting just what you need from the big HSC repo on lsst-dev.  I'd recommend tract=8766 or tract=8767, which are what we're now regularly processing for each weekly.  There's a list of the visits that overlap both tracts at S17B HSC PDR1 reprocessing.

  3. I do need the calexps, because that's where the metadata is stored. Sounds like I can pull down a few of those tracts to play with. Thanks.