For several milestones, we define “prototype” and “complete” versions. We expect the prototype to provide an initial version of the described functionality with a plausible interface. Based on experience developing the prototype, we expect to refine or rewrite this initial version to provide the implementation used in production.

02C.04.01 - App framework for catalogs

Bosch's interpretation: "App framework for catalogs" essentially means "afw work done by Princeton".  There's also "App framework for exposures", which means "afw work done by UW".  That border is extremely flexible, and should be based on expertise and effort availability, not whether it's closer to DRP or AP processing.

This is by no means an exhaustive list (it's at most 5% of all the work I think we're likely to do on afw).

FY15 Footprints redesigned and refactored
  • New design proposed through RFC-37.
  • Goals include both a cleaner, easier-to-use API and better efficiency.
  • Essentially a prerequisite for rewriting the low-level deblender code.
FY16 Spherical and image geometry libraries integrated
  • Currently two completely separate libraries, which is bad from an interface standpoint
  • Challenging packaging/dependency issues: high-level code not wanted by qserv would naturally go in the same namespaces as low-level code needed by qserv.
FY16 Psf & Kernel redesigned and refactored; to include chromaticity and uncertainty
  • We'll need very different PSF classes and PSF modeling drivers to move from (DETC) Stage 3 to Stage 4 algorithmic quality.
  • We have long-standing interface issues in the relationship between the PSF and Kernel classes that need to be resolved.
FY16 Support for persistable object fields in tables
  • This is one step towards rolling back the proliferation of table subclasses: currently we need a new subclass every time we want to stuff a new first-class object in a record.  Ultimately I'd like to have just a single Record class (rather than SourceRecord, ExposureRecord, AmpInfoRecord, etc.).
FY16 Support for join iterators in tables
  • We need a way to represent one-to-one, one-to-many, and many-to-many relationships between afw tables.
  • Needed for saving samples in galaxy fitting algorithms
  • Useful for representing spatial matches.
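As a sketch of what a one-to-many join over catalog-like tables might look like, here's a pure-numpy stand-in; the function name, field names, and grouping strategy are all assumptions for illustration, not the planned afw interface:

```python
import numpy as np

def join_one_to_many(parents, children, key="parent_id"):
    """Yield (parent_record, child_subarray) pairs for a one-to-many join.

    `parents` is a structured array with an 'id' field; each `children`
    record points back at a parent via `key`.  Hypothetical sketch only.
    """
    order = np.argsort(children[key], kind="stable")
    sorted_children = children[order]
    # Locate the contiguous slice of children belonging to each parent id.
    starts = np.searchsorted(sorted_children[key], parents["id"], side="left")
    stops = np.searchsorted(sorted_children[key], parents["id"], side="right")
    for parent, lo, hi in zip(parents, starts, stops):
        yield parent, sorted_children[lo:hi]
```

The same sort-and-slice pattern extends to many-to-many joins via an explicit link table; the open design question is what the iterator API should look like on the C++ side.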

02C.04.02 - Calibration products pipeline

I don't think it's worth talking about this at all yet.  Maybe ask Robert to do it, but this section of LDM-240 is essentially up in the air until the calibration plan is solidified.

Cross-talk matrix derivation
Master bias and dark frame
Defect mask construction
Derive illumination correction, pupil ghost, flat field
Pipeline for reduction of calibration telescope spectra
Master flat production from monochromatic flats
Estimation of telescope-camera bandpasses from monochromatic flats
Derivation of atmospheric models
Fringe frame construction and subtraction
Produce initial optical ghost catalog
Radiometer and GPS data processing, incorporation in bandpass estimation
Derive system bandpass given all calibration data
Full pipeline functionality

02C.04.03 - PSF estimation

FY15 Interface for full focal plane PSF estimation
  • We need to be able to build PSF models using information from all CCDs in a visit at once, including information from the telescope beyond what's in the science CCDs (e.g. wavefront information).
  • Flesh out the input and output requirements of advanced PSF estimation algorithms and provide an appropriate interface for implementing them.
  • Existing algorithms are adapted to the new interface, but no functionality changes.
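A minimal sketch of what such an interface could look like, with an adapter that wraps a legacy per-CCD algorithm (the class and method names here are invented for illustration, not the afw API):

```python
import abc

class FullVisitPsfEstimator(abc.ABC):
    """Hypothetical interface: PSF estimation driven by a whole visit at
    once (all science CCDs plus wavefront-sensor information), rather
    than one CCD at a time."""

    @abc.abstractmethod
    def estimate(self, star_catalogs, wavefront_data):
        """Return a visit-level PSF model.

        star_catalogs:  one catalog of PSF candidates per CCD.
        wavefront_data: auxiliary telescope information (may be None).
        """

class SingleCcdAdapter(FullVisitPsfEstimator):
    """Wrap an existing per-CCD algorithm in the new interface without
    changing its behavior (the "no functionality changes" step above)."""

    def __init__(self, per_ccd_algorithm):
        self._fit = per_ccd_algorithm

    def estimate(self, star_catalogs, wavefront_data):
        # Legacy algorithms can't use wavefront_data yet, so ignore it.
        return [self._fit(cat) for cat in star_catalogs]
```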
FY16 Optics & atmosphere PSF modelling
  • Try to build a PSF model with physically motivated parameters (instead of just fitting and interpolating star images).
FY16 Existing wavefront estimation code incorporated
  • An initial Matlab implementation is available from elsewhere in LSST. If possible, we aim to incorporate that into the stack, rather than re-writing from scratch.

Estimation of undersampled PSFs
  • Sometimes the seeing will be good enough that the PSF is undersampled, which breaks many of our algorithmic assumptions.  We'll need new algorithms in order to avoid throwing away that data.

Estimation of camera/telescope contribution to PSF from wavefront sensor and camera metrology
  • Include these additional sources of information in our physically-motivated PSF models.
PSF estimation on simulated data at SRD specification level
  • Get PSF modeling working according to SRD specs on the best simulations we can build, using our physically-motivated PSF models.
Color dependence in PSF model
  • Include chromaticity in the PSF model (due to both atmosphere and optics).
  • Note that this is included in the interface in one of the above 02C.04.01 tasks, but we won't necessarily have a PSF model that makes use of it until here.
PSF estimation on comcam data at SRD level
  • Get PSF modeling working according to SRD specs on commissioning data, using our physically-motivated PSF models.

02C.04.04 - Image coaddition pipeline

FY16 PSF-matched image coaddition
  • Build coadds after matching input PSFs to a common predefined output PSF.
  • We need to figure out when we do this relative to warping in order to minimize noise covariance terms on the coadd.
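The basic combine step can be sketched in a few lines, under big simplifying assumptions (scalar per-epoch weights, matching kernels supplied from elsewhere, no covariance tracking); real code must also propagate the noise covariance the convolution introduces:

```python
import numpy as np

def _convolve_same(image, kernel):
    # Minimal same-size direct convolution (odd-sized kernel assumed).
    kh, kw = kernel.shape
    flipped = kernel[::-1, ::-1]
    padded = np.pad(image, ((kh // 2,) * 2, (kw // 2,) * 2))
    out = np.zeros_like(image, dtype=float)
    for dy in range(kh):
        for dx in range(kw):
            out += flipped[dy, dx] * padded[dy:dy + image.shape[0],
                                            dx:dx + image.shape[1]]
    return out

def psf_matched_coadd(images, variances, matching_kernels):
    """Convolve each warped input to the common target PSF, then take an
    inverse-variance-weighted mean.  Sketch only: weights are per-epoch
    scalars, and the kernels are assumed to be precomputed."""
    num = np.zeros_like(images[0], dtype=float)
    den = 0.0
    for img, var, kern in zip(images, variances, matching_kernels):
        weight = 1.0 / np.mean(var)
        num += weight * _convolve_same(img, kern)
        den += weight
    return num / den
```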
FY16 Likelihood coaddition for point sources (Kaiser coaddition)
  • Likelihood coadds are formally the optimal way to detect sources in multi-epoch data, but the output isn't a traditional coadd, so our traditional algorithms won't work on them.
  • We need to:
    • Implement building likelihood coadds and detection on them.
    • Test detection on likelihood coadds and compare to traditional detection methods
    • Explore options for how to do downstream processing beyond detection
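For orientation, a minimal sketch of the likelihood-coadd construction itself (assuming aligned images, known per-epoch PSFs, and uniform per-epoch variances; function names are mine, not stack code):

```python
import numpy as np

def _correlate_same(image, kernel):
    # Minimal same-size cross-correlation (odd-sized kernel assumed).
    kh, kw = kernel.shape
    padded = np.pad(image, ((kh // 2,) * 2, (kw // 2,) * 2))
    out = np.zeros_like(image, dtype=float)
    for dy in range(kh):
        for dx in range(kw):
            out += kernel[dy, dx] * padded[dy:dy + image.shape[0],
                                           dx:dx + image.shape[1]]
    return out

def likelihood_coadd(images, psfs, variances):
    """Kaiser-style likelihood coadd sketch: each epoch contributes its
    image cross-correlated with its own PSF, weighted by 1/variance.
    The result is a point-source detection map, not an ordinary coadd."""
    score = np.zeros_like(images[0], dtype=float)
    norm = 0.0
    for img, psf, var in zip(images, psfs, variances):
        score += _correlate_same(img, psf) / var
        norm += np.sum(psf ** 2) / var
    return score / norm  # maximum-likelihood point-source flux per pixel
```

This makes the downstream problem concrete: the output's effective PSF is the auto-correlation of the input PSFs, which is why traditional measurement algorithms won't run on it unmodified.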
FY16 Chi^2 and filter-matched coaddition
  • Explore options for multi-band detection that go beyond simply merging independent detections in different bands.
  • Options include chi^2 coadds and SED-weighted combinations of per-band coadds.
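The two options reduce to simple per-pixel formulas, sketched here for aligned per-band images with known noise levels (the SED-weighting scheme shown is one standard matched-filter choice, not a decided design):

```python
import numpy as np

def chi2_coadd(images, sigmas):
    """Chi^2 coadd sketch: sum of squared signal-to-noise over bands.
    Thresholding this finds anything significant in *any* SED, at the
    cost of a non-Gaussian background distribution."""
    return sum((img / s) ** 2 for img, s in zip(images, sigmas))

def sed_weighted_coadd(images, sigmas, sed):
    """SED-matched linear combination: the maximum-likelihood amplitude
    per pixel for a source with per-band fluxes proportional to `sed`."""
    weights = [f / s ** 2 for f, s in zip(sed, sigmas)]
    norm = sum(w * f for w, f in zip(weights, sed))
    return sum(w * img for w, img in zip(weights, images)) / norm
```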
FY16 Initial background-matched coaddition
  • Background matching theoretically gets us free background models on N-1 of N images in a patch of sky.
  • In practice it's been much harder to implement on non-drift-scan cameras than it was on SDSS, and we're not sure what the data flow or parallelization will look like.
FY16 Initial DCR-corrected coaddition
  • We really have no idea how we're going to handle differential chromatic refraction in coadds (it's large enough we won't be able to ignore it; we can probably ignore other chromatic effects in coadds since we won't be using coadds for our highest-precision measurements).
  • We probably want to apply an SED-dependent pixel-level correction to input images, but to determine the SED at each point in the sky we'll need to at least know the SED of objects below the single-epoch detection limit, which requires coadds...
  • Closely related to work in 02C.03 image differencing pipeline; should be scheduled together.
Coaddition of undersampled images
  • As with PSF models, undersampled images break many of our algorithmic assumptions.
  • We don't currently know of any algorithms that are both computationally feasible and good enough from a science standpoint, so there's new work to be done here.
  • It's unlikely we'll find anything close to an optimal approach that's fast enough (very smart people have tried and only come up with very slow algorithms), so we probably want to focus on approximations whose effects we understand.
Coaddition pipeline interface defined
  • At this stage in development, we should have a good idea of the algorithms; here, we define the data flow and parallelization strategy we need to implement them.
Evaluate background matching for artifact identification
  • Both background matching and image differencing could provide ways of identifying image artifacts (ghosts, satellite trails, etc.).  We need to evaluate which of these (maybe both) we want to use.
Complete DCR-corrected coaddition
  • See Initial DCR-corrected coaddition (it's really hard, so it's unlikely the first thing we try will work out of the box).

02C.04.05 - Object detection and deblending

FY16 Prototype/Complete multi-coadd detection
  • We'll be detecting on multiple different coadds, including:
    • different filters (or combinations of filters, for different SEDs)
    • different optimizations of exposure time vs. seeing
    • different detection kernels (i.e. smoothing)
    • likelihood (i.e. Kaiser) vs. traditional coadds
  • We need to work out how to build all the different coadds so as to maximize reuse of intermediate results.  (Is this really a detection issue?  It sounds like it belongs under the coadd pipeline, 02C.04.04.)
  • By the "complete" stage we need to work out which combination of things we actually want to use.
FY16 Prototype/Complete multi-coadd peak association
  • After detecting (generating Footprints and Peaks) on different coadds, we'll need to merge them.
  • We don't expect to have access to image data at this stage (though we think we do need to have the Footprints and Peaks from all detection images in memory).
  • We'll make use of additional information about each Peak's origin saved during detection; need to determine what we need to save.
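A toy greedy version of the merge, just to pin down the data flow (coordinates only, a single match radius, and an "origin" index per detection image; all of that is an assumption, and the real algorithm will need to be order-independent):

```python
def merge_peaks(peak_lists, match_radius):
    """Greedy sketch of cross-coadd peak association: peaks within
    match_radius of an existing merged peak are absorbed into it, and
    each merged peak records which detection images it appeared in."""
    merged = []  # entries: [x, y, set_of_origin_indices]
    r2 = match_radius ** 2
    for origin, peaks in enumerate(peak_lists):
        for x, y in peaks:
            for entry in merged:
                if (entry[0] - x) ** 2 + (entry[1] - y) ** 2 <= r2:
                    entry[2].add(origin)
                    break
            else:
                merged.append([x, y, {origin}])
    return merged
```

Even this toy version shows why the per-Peak origin information has to be saved at detection time: it's the only thing that survives into the merge.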
FY16 Single-epoch deblender overhauled
  • The current single-epoch deblender is a mess and essentially needs a rewrite before we can start making improvements to it (including extending it to a multi-coadd deblender).
  • Many useful tweaks added in the SDSS deblender were never added to the old LSST deblender, and should be, because we've determined that they make a big difference.
  • We can use the single-epoch deblender as a placeholder multi-coadd deblender (see RFC-46), and the limiting component in that procedure is the current single-epoch deblender.
FY16 Prototype/Complete multi-coadd deblender
  • This will start naturally from the single-epoch deblender (see above).
  • We'll somehow need to be able to ensure consistent deblending across multiple bands by sharing pixel-level information (e.g. templates).
  • Major challenge is probably fitting pixel-level information from multiple bands in memory.
    • Change the parallelization to split images across more cores? (by blend family? by child? by pixel?!)
    • Iterate between deblending and peak-merging.
    • We only need to generate HeavyFootprints on the single-band coadds, not all possible detection images - but we may need to make use of more detection images (e.g. better seeing coadds) to do that.
Crowded field deblender
  • Deblending functionality available when PSFs substantially overlap over the field.
  • PSF determination gets harder in crowded fields, so we'll need to find a way to bootstrap that (we may want to have a separate entry in PSF modeling WBS).
  • We'll need some way to divide-and-conquer extremely large blends (this will be necessary sometimes even in non-crowded fields to keep memory use bounded).
    • Try to identify "isthmuses"?
    • Something based on sparse matrix diagonalization algorithms?
Prototype/Complete sky coverage mask
  • Many science cases need to make maps of different kinds of completeness and selection effects.  Frequently these will be used to create random catalogs that can be used in clustering algorithms.
  • External code in this area includes Stomp, Mangle, Venice.
  • We intend to learn from the experience of DES & HSC before making a decision.

02C.04.06 - Object characterization pipeline

FY15 CModel fluxes tested against data from SDSS, HSC & CFHT
  • The current CModel code is a placeholder for the galaxy modeling code, from a photometric standpoint.
  • It's received a lot of testing on the HSC side, but the LSST version is slightly different (theoretically better in some respects, but that's not been verified).
FY15 Galaxy shear fitting performance parameters determined
  • We can estimate the performance of the final multifit galaxy modeling if we can measure a few numbers on simulations (those numbers are essentially factors that are multiplied to determine the computation time).
  • One number - the number of Monte Carlo samples - is extremely hard to put firm requirements on, because measuring it requires essentially a full system and a huge number of simulations.  But I think we can estimate it to order of magnitude (just from intuition, really) at 20-200 samples.
  • This effort measures the other two sets of unknown numbers - the number of pixels typically included in the fit, and the order and number of shapelet terms in the PSF, using simulations and a placeholder galaxy fitting algorithm.
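The cost model here is just a product of factors, which is worth writing down because it shows how much the unconstrained sample count matters (every input value below is a placeholder, not a measurement):

```python
def multifit_cost_estimate(n_objects, n_epochs, n_samples,
                           pixels_per_fit, cost_per_pixel_eval):
    """Order-of-magnitude multifit compute model: total cost is the
    product of the factors discussed above.  All arguments are
    assumptions to be measured; only the multiplicative structure
    is the point."""
    return (n_objects * n_epochs * n_samples
            * pixels_per_fit * cost_per_pixel_eval)

# Bracketing the sample count at the guessed 20-200 range moves the
# total by exactly 10x, which is why pinning it down matters.
low = multifit_cost_estimate(1e9, 100, 20, 500, 1e-8)
high = multifit_cost_estimate(1e9, 100, 200, 500, 1e-8)
```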
FY15 Measurement framework overhauled
  • Definition of the meas_base framework and porting of all algorithms to run within it.
FY16 Morphological star-galaxy separation overhauled
  • Placeholder code for this exists (and it's not bad), but new techniques are becoming available and need to be incorporated.
  • We need to decide how to move from single-epoch classifiers to multi-band, multi-image classifiers.
FY16 Galaxy model evaluation code micro-optimized
  • Aim to implement or estimate ultimate galaxy model evaluation compute requirements to within a factor of two to drive procurement.  Largest uncertainty should be number of samples (see above).
FY16 Plugin framework and driver for multi-epoch fitting
  • We actually have low-level code that could do maximum-likelihood multi-epoch fitting today, but we don't have the drivers (bookkeeping and plugin framework) for running it.
  • DES has an approach we could adopt with some modification that would be useful in development (but probably not in production, as there we'll need to be more careful about I/O and parallelization).
  • We need to think about how we want to do the parallelization in production before we write this (or just decide we're not going to try to make it at all future-proof for now).
FY16 Single-epoch plugin for Monte Carlo galaxy fitting
  • Monte Carlo galaxy fitting produces outputs (samples) that don't fit into our current table library, and hence we can't implement it within the current plugin system.  Once we have the support for samples in the table library, then writing the plugin is relatively simple.
  • Testing, debugging, and tuning the plugin is not simple.  It's a lot of work.
FY16 Support for locally non-affine coordinate distortions in model fitting code
  • Some kinds of sensor effects (tree rings and edge distortions) break our assumption that coordinate transforms are linear over the scale of objects.
  • To handle those correctly, we need to transform the PSF to celestial coordinates (or a locally equivalent tangent plane), convolve in that space, and then do the nonlinear transformation to pixel coordinates.
  • Might get in the way of using shapelets to speed things up; need to think about how to handle this.
FY16 Prototype/Complete probabilistic star/galaxy separation
  • Rather than a binary classification, we assign a probability to the outcome.
  • True probabilities depend on both magnitude and where we're looking on the sky (extremely important priors).  But we also want to give users the option to supply their own priors on these quantities (with us providing some defaults).
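The two-class Bayesian combination is simple enough to sketch directly; the function name and the scalar two-class setup are illustrative only, and in practice the prior would be a function of magnitude and sky position rather than a number the user passes in:

```python
def p_star(like_star, like_galaxy, prior_star):
    """Posterior star probability from per-object model likelihoods and
    a prior P(star), which can depend on magnitude and position.  Users
    could override the prior while we supply defaults."""
    num = like_star * prior_star
    return num / (num + like_galaxy * (1.0 - prior_star))
```

For example, equal likelihoods return the prior unchanged, which is exactly why getting the magnitude- and position-dependent priors right matters for faint objects.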
Initial moving point source object fit
  • Fit moving point-source models to multi-epoch data to measure proper motion and parallax.
  • Need to consider whether and how to use priors on motion based on magnitude.
  • Need to consider whether to allow for variability.
  • Do we Monte Carlo sample here too?
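Without priors or variability, the fit for one coordinate is linear least squares; a one-dimensional sketch (a real fit is two-dimensional, with per-epoch uncertainties and possibly the magnitude-based motion priors mentioned above):

```python
import numpy as np

def fit_moving_point_source(times, positions, parallax_factors):
    """Linear least-squares sketch for one coordinate:
    position(t) = x0 + mu * t + plx * f(t), where f(t) is the
    parallax factor for this line of sight at epoch t."""
    design = np.column_stack(
        [np.ones_like(times), times, parallax_factors])
    coeffs, *_ = np.linalg.lstsq(design, positions, rcond=None)
    return coeffs  # (x0, proper motion, parallax)
```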
Chromaticity included in multi-epoch model fitting
  • PSFs, WCSs, and galaxy morphology all have color dependency, and our transmission curves won't be the same for every object.
  • Modeling those correctly is an extremely hard problem from both algorithmic and bookkeeping standpoints.
Simultaneous model fitting of blended objects using only galaxy models
  • We'll almost certainly want to fit some groups of objects together.
  • Need to solve the same divide-and-conquer problem we encounter in the deblender.
Simultaneous fitting of joint galaxy/moving-star models
  • Stars can be blended with galaxies, but we don't want to simultaneously fit all combinatorial model assignment possibilities for each blend group.
  • Want a model that transitions smoothly from galaxy to star models (i.e. at zero radius, we free up the proper motion and parallax).
Complete moving point source object fit
  • Test and improve the moving point source models (which are now just a special case of the joint galaxy/moving-point-source model).
Forced photometry mode for multi-epoch model fitting
  • Rather than running forced photometry with exposures as the parallelization axis, it makes more sense to just tweak the multifit code to free up the per-exposure fluxes (and maybe fit some aperture fluxes, too?) since we'll have all the necessary data in memory there.

