
Overview

This page attempts to capture, at a high level, the software and algorithm development necessary to implement the processing of objects detected at the full survey depth (at the time of a particular data release), including the detection, deblending, and measurement of sources too faint to be detected in any individual visit.  The algorithms to be used here are generally poorly understood; we have many options for extending well-understood algorithms for processing single-epoch data to multi-epoch data, and considerable research is needed to find the right balance between computational and scientific performance.  Unfortunately, different algorithmic options may require vastly different parallelization and data flow, so we cannot yet make assertions about even the high-level interfaces and structure of the code.  We do, however, have a good understanding of most of the needed low-level algorithms, so our goal should be to implement these as reusable components that will allow us to quickly explore different algorithmic options.  This will also require early access to parallelization interfaces, test data, and analysis tools that will be developed outside the DRP algorithms team.

Inputs

  • Calibrated Exposures from Visit Processing
  • Final relative astrometric calibration
  • Final relative photometric calibration
  • Moving and transient sources from Image Differencing and MOPS*
  • External Catalogs (e.g. Level 3 inputs or known bright stars)

* We don't need Image Differencing outputs to start the Deep Processing (e.g. we can do Image Coaddition first), and there may be some value in doing the DRP Image Differencing at the same time as some parts of the Deep Processing (Deep Background Modeling, in particular).

Stages/Components

In rough order; the exact flow is very much TBD.

Image Coaddition

We'll almost certainly need some sort of coadded image to detect faint sources and to do at least preliminary deblending and measurement.  We'll use at least most of the same code to generate templates for Image Differencing.
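
As a point of reference, the simplest option is a direct, inverse-variance-weighted mean of exposures that have already been warped onto a common pixel grid and share a photometric zeropoint.  The sketch below (plain NumPy, with illustrative names rather than actual stack API, and ignoring masks and PSF matching) shows only that core combination step:

    import numpy as np

    def weighted_coadd(images, variances):
        """Inverse-variance-weighted mean of warped, calibrated exposures.

        images, variances: sequences of 2-D arrays on the same pixel grid.
        Returns (coadd_image, coadd_variance).
        """
        weights = [1.0 / v for v in variances]   # inverse-variance weights
        wsum = np.sum(weights, axis=0)
        coadd = np.sum([w * im for w, im in zip(weights, images)], axis=0) / wsum
        return coadd, 1.0 / wsum                 # variance of the weighted mean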

Deep Background Modeling

Algorithm

Traditional background modeling involves estimating and subtracting the background from each exposure separately.  While this will still be necessary for visit-level processing, for deep processing we can use a better approach.  We start by PSF-matching and warping all but one of the N input exposures (on a patch of sky) to match the final, reference exposure, then subtract the reference from each, producing N-1 difference images; this reuses much of the code we use for Image Differencing.  We then model the background of each of the N-1 difference images, where, depending on the quality of the PSF-matching, we can fit the instrumental background without interference from astrophysical backgrounds.  We can then combine all N original exposures while subtracting the N-1 background-difference models, producing a coadd that contains the full-depth astrophysical signal from all exposures but the instrumental background of just the reference exposure.  That final background can then be modeled and subtracted using traditional methods, while taking advantage of the higher signal-to-noise ratio of the sources in the coadd.  Finally, we can compute an improved background model for any individual exposure as the combination of its difference background relative to the reference and the background model for the reference.
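
To make the data flow concrete, here is a highly simplified NumPy sketch of the procedure.  The psf_match_to_reference callable stands in for the PSF-matching/warping step (which in practice would reuse the Image Differencing code), the exposures are assumed to be already warped to a common grid, and the first-order plane fit is only a placeholder for whatever background-difference model we eventually adopt; none of these names correspond to actual stack API:

    import numpy as np

    def fit_background(image):
        """Fit a first-order 2-D polynomial (a plane) to an image."""
        ny, nx = image.shape
        y, x = np.mgrid[0:ny, 0:nx]
        # Design matrix for the terms 1, x, y on normalized coordinates.
        A = np.vstack([np.ones(image.size), x.ravel() / nx, y.ravel() / ny]).T
        coeffs, *_ = np.linalg.lstsq(A, image.ravel(), rcond=None)
        return (A @ coeffs).reshape(image.shape)

    def background_matched_coadd(exposures, reference, psf_match_to_reference):
        """Coadd N exposures such that only the reference background remains.

        exposures: the N-1 non-reference exposures; reference: the reference
        exposure; psf_match_to_reference: a (hypothetical) callable that
        PSF-matches an exposure to the reference.
        """
        matched = [psf_match_to_reference(e) for e in exposures]
        # Model the instrumental background of each difference image.
        diff_models = [fit_background(m - reference) for m in matched]
        # Combine all N exposures, removing each background difference so
        # every input carries only the reference exposure's background.
        stack = reference + sum(m - db for m, db in zip(matched, diff_models))
        coadd = stack / (len(matched) + 1)
        # The coadd now holds full-depth signal plus the reference background,
        # which we model and subtract with a traditional (single-image) fit.
        return coadd - fit_background(coadd)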

Status/Challenges

We have prototype code that works well for SDSS data, but experiments on the HSC side have shown that processing non-drift-scan data is considerably more difficult.  One major challenge is selecting or creating a seamless reference exposure across amplifier, sensor, and visit boundaries, especially in the presence of gain and linearity variations.  We also need to think about how the flat-fielding and photometric calibration algorithms interact with background matching: because the sky has a different color than the sources, no single photometric solution can simultaneously produce both a seamless sky and correct photometry.  We also need to be able to generate the final photometric calibration using measurements that rely only on the cruder visit-level background models.

The problem of generating a seamless reference image across the whole sky is very similar to the problem of building a template image for Image Differencing, and in fact the Image Differencing template or some other previously-built coadd may be a better choice for the reference image, once those coadds become available (this would require a small modification to the algorithm summarized above).  Of course, in this case, the problem of bootstrapping those coadds would still remain.

We also don't yet have a good sense of where background matching belongs in the overall processing flow.  It shares many intermediates with either Image Coaddition or Image Differencing, depending on whether the background-difference fitting is done in the coadd coordinate system (most likely; shared work with coaddition) or the original exposure frame (less likely; shared work with Image Differencing).  It is also unclear on what spatial scale the modeling needs to be done, which could affect how we would want to parallelize it.

Dependencies

  • Photometric Self-Calibration: the problems we're likely to see when background matching fully-calibrated inputs may be completely different from those we see with uncalibrated inputs, so we need to have this in place before we settle on a final algorithm.
  • High-Quality ISR for Precursor Datasets: performance will depend on the details of both the astrophysical background and the camera, so tests would ideally be carried out on a combination of HSC, DECam, and PhoSim data.  But trying to do background matching on any of these without first doing a very good job on ISR would be a waste of time.
  • Parallelization Middleware: putting all of the steps together will require at least some sort of scatter-gather (see the sketch after this list), though while prototyping we could continue with our current approach of manually starting tasks that correspond to different parallelization units.  Some algorithmic options may end up requiring even more complex interprocess communication, but it's hard to say at this point.
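
To make the communication pattern concrete, here is a minimal scatter-gather illustration using Python's concurrent.futures as a stand-in for the real parallelization middleware; fit_patch_background and combine_models are hypothetical stage functions, not existing pipeline tasks:

    from concurrent.futures import ProcessPoolExecutor

    def fit_patch_background(patch_id):
        # Placeholder per-patch (scatter) work: fit the background-difference
        # model for one sky patch and return its coefficients.
        return {"patch": patch_id, "coeffs": [0.0, 0.0, 0.0]}

    def combine_models(results):
        # Placeholder gather step: reconcile the per-patch models, e.g. by
        # enforcing continuity across patch boundaries.
        return {r["patch"]: r["coeffs"] for r in results}

    def run(patch_ids):
        with ProcessPoolExecutor() as pool:
            per_patch = list(pool.map(fit_patch_background, patch_ids))  # scatter
        return combine_models(per_patch)                                 # gather

    if __name__ == "__main__":
        print(run(range(8)))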

Effort Required

This is a difficult algorithmic research problem that interacts in subtle ways with ISR, Photometric Self-Calibration, and the data flow and parallelization for Image Coaddition.  It should not be a computational bottleneck on its own, but it will likely need to piggyback on some other processing (e.g. Image Coaddition) to avoid a separate pass over the input exposures.

Scheduling

Because we can use traditional background modeling outputs as a placeholder, and the improvement due to background matching is likely to matter only when we're trying to push the precision of the overall system, we can probably defer the complete implementation of background matching somewhat.  It may be a long research project, however, so we shouldn't delay too long.  We should have an earlier in-depth design period to sketch out possible algorithmic options (and hopefully reject a few) and figure out how background matching will fit into the overall processing.

Deep Detection

Deep Deblending

Deep Measurement

Deep Aperture Corrections
