Data Release Production WIP S15 release notes

These draft notes cover the major updates made by Data Release Production to the LSST stack since release 10.1 (Winter 2015). Please record significant updates here so that they can ultimately be incorporated into the notes accompanying the Summer 2015 release.

Major Functionality and Interface Changes

Improved semantics for loading `Exposure`s and `MaskedImage`s from arbitrary FITS files

Exposure and MaskedImage objects represent image data with associated mask and variance information. When serialized to FITS, these are stored as three consecutive extensions in the FITS file. It is possible to load Exposures and MaskedImages from multi-extension FITS files which were not generated by LSST, but, due to the limitations of the FITS data model, it is not possible to ensure that the creator of the file adhered to the LSST convention: while an image object may be successfully instantiated, its contents may not be logically consistent.

We now go to greater lengths to check that the information in the file is consistent with the LSST standard, warning the user – and in some cases refusing to proceed – if it is not.

(DM-2599)

Improved support for non-standard FITS headers

The LSST stack is now capable of loading FITS files which contain non-standard headers of the form PVi_nn (i=1..x, nn=5..16), as written by SCAMP, and EQUINOX headers with a "J" prefix, as written by SkyMapper.
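
As a rough illustration of the two header conventions mentioned above, the following sketch recognizes PVi_nn keywords and strips a leading "J" from an EQUINOX value. The helper names (`parse_equinox`, `split_pv_card`) are hypothetical and are not part of the LSST stack; the real handling lives in the stack's FITS reading code.

```python
import re

# Hypothetical helpers for illustration only; not the LSST implementation.
PV_CARD = re.compile(r"^PV(\d+)_(\d+)$")


def parse_equinox(value):
    """Return the equinox as a float, accepting an optional leading 'J'."""
    if isinstance(value, str):
        text = value.strip()
        if text[:1].upper() == "J":
            text = text[1:]
        return float(text)
    return float(value)


def split_pv_card(keyword):
    """Return (i, nn) for a PVi_nn keyword, or None if it does not match."""
    match = PV_CARD.match(keyword)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

For example, `parse_equinox("J2000.0")` and `parse_equinox(2000)` both yield the same float, and `split_pv_card("PV1_5")` separates the axis from the coefficient index.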

(DM-2883, DM-2924)

It is now possible to perform instrument signal removal on an `Exposure` which has no `Detector`

FakeAmp, a Detector-like object which supports returning gain and saturation level, was added to make it possible to run updateVariance and saturationDetection if required.

(DM-2890)

`PVi_j` header cards are correctly saved to FITS files

This makes it possible to round-trip e.g. TPV headers.

(DM-2926)

Changes to compound fields and delimiters in Catalog Schemas.

In the older ("version 0") approach to table schemas, we had several compound field types (Point, Moments, Covariance, Coord) which behaved differently from other field types - the square bracket [] operators could not be used to access them, and they could not be accessed as columns (though their scalar subfields – e.g. "x" and "y" for Point – could be). In version 0, we used periods to separate both words and namespace elements in field names, but converted periods to underscores and back when writing to FITS. These schemas were mostly produced by the old measurement framework in meas_algorithms' SourceMeasurementTask, which was removed in the 10.1 release.

In the new ("version 1") approach, compound objects are simply stored in catalogs as their constituent scalars, with helper classes called FunctorKeys provided to pack and unpack them from Records (the FunctorKeys that replace the old compound fields are all in afw/table/aggregates.h). Unlike the original compound fields, there's no limit to how many types of FunctorKey we can have, or what package they can live in, making the system much more extensible. By making the constituent scalar objects what the Schema object knows about, it will be much easier to map a Schema to other table representations that don't know about LSST classes (e.g. SQL or Pandas). Most FunctorKeys can be used anywhere a regular Key can be used. Also, in version 1, we use underscores as namespace separators, and CamelCase to separate words, eliminating some ambiguity between word and namespace boundaries. The new measurement framework in meas_base's SingleFrameMeasurementTask and ForcedMeasurementTask uses version 1 tables exclusively.

In previous releases of the pipeline, version 0 schemas were deprecated but still supported. They have now been removed, but old catalogs saved as version 0 will still be readable - they will be converted to version 1 on read, with period delimiters converted to underscores, and all compound fields unpacked into scalar fields that can be used with a corresponding FunctorKey. This procedure obviously does not preserve field names, but all slot definitions will be preserved, so code that only relies on slot or minimal schema accessors (getCoord(), getCentroid(), getPsfFlux(), etc.) should not need to be modified.
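
The idea behind the conversion and the FunctorKey pattern can be sketched in plain Python. This is a toy model only: records are ordinary dicts, `PointKey` is a hypothetical stand-in for a FunctorKey, the `_x`/`_y` suffixes are an assumption, and the field name used in the usage note is purely illustrative.

```python
def convert_v0_name(name):
    """Replace version 0 period delimiters with underscores."""
    return name.replace(".", "_")


class PointKey:
    """Toy stand-in for a FunctorKey: packs and unpacks (x, y) scalar fields."""

    def __init__(self, base):
        # Assumed naming: scalar subfields are the base name plus _x and _y.
        self.x_name = base + "_x"
        self.y_name = base + "_y"

    def get(self, record):
        """Assemble a point from the two scalar fields of a record."""
        return (record[self.x_name], record[self.y_name])

    def set(self, record, point):
        """Store a point as two scalar fields of a record."""
        record[self.x_name], record[self.y_name] = point
```

With a key built from `convert_v0_name("centroid.sdss")`, setting a point on an empty dict creates two scalar entries, which is the property that makes such a table straightforward to map onto representations that know nothing about LSST classes.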

(DM-1766)

Allow for use of Approximate (Chebyshev) model in background estimation

In previous releases, the only method for background estimation was to use an interpolation scheme (constant, linear, or various splines). These schemes tend to lead to over-subtraction of the background near bright objects. The Approximate (Chebyshev) approach to background estimation greatly improves the background subtraction around bright objects. The relevant code to use this latter approach (including persistence and backwards compatibility issues) is now in place.

While the intention is to eventually make the Approximate background subtraction scheme the default, some clean-up and restructuring is needed first (which may also require adjusting some defaults in the calibrate stage to be more appropriate for the approximation, as opposed to interpolation, scheme). Therefore, the default setting has not been changed (i.e. the default is still to use an interpolation scheme for background estimation). The Chebyshev approximation can be selected for background estimation through configuration parameters in the obs_CAMERA packages, i.e. useApprox=True and, optionally, approxOrderX (approximation order in X for background Chebyshev), approxOrderY (approximation order in Y for background Chebyshev: currently approxOrderY must be equal to approxOrderX), and weighting (if True, use inverse variance weighting in calculation).
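
The relationship between these parameters can be sketched with a small stand-in class. `BackgroundSettings` is hypothetical and is not the stack's configuration class; the numeric defaults and the `weighting` default are placeholders, and only `useApprox=False` reflects the default described above.

```python
from dataclasses import dataclass


@dataclass
class BackgroundSettings:
    """Hypothetical stand-in for the background configuration."""

    useApprox: bool = False   # interpolation remains the default
    approxOrderX: int = 6     # placeholder value, not the stack default
    approxOrderY: int = 6     # placeholder value, not the stack default
    weighting: bool = True    # placeholder value, not the stack default

    def validate(self):
        """Enforce the current restriction on the Y order."""
        if self.useApprox and self.approxOrderY != self.approxOrderX:
            raise ValueError("approxOrderY must currently equal approxOrderX")
```

Calling `validate()` on `BackgroundSettings(useApprox=True, approxOrderX=4, approxOrderY=5)` raises `ValueError`, mirroring the restriction that the two orders must currently match.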

(DM-2778)

Multi-Band processing for coadds

The motivation for, and a detailed description of, the functionality added here are given in:
https://confluence.lsstcorp.org/display/DM/S15+Multi-Band+Coadd+Processing+Prototype

Essentially, four new command-line Tasks have been added for consistent multi-band coadd processing:

DetectCoaddSourcesTask
- Detect sources (generate Footprints for parent sources) and model background for a single band.
MergeDetectionsTask
- Merge Footprints and Peaks from all detection images into a single, consistent set of Footprints and Peaks
MeasureMergedCoaddSourcesTask
- Deblend and measure on per-band coadds, starting from consistent Footprints and Peaks for parent objects.
MergeMeasurementsTask
- Combine separate measurements from different bands into a catalog suitable for driving forced photometry. Essentially, this catalog must have a centroid, shape, and CModel fit for all objects, even for objects that were not detected on the canonical band. The task assumes that all input catalogs already have consistent object lists.

(DM-1945, DM-3139)

Add `NO_DATA` mask plane

Previously, we have used the EDGE mask plane to indicate both pixels which are off-the-edge of the detector, and hence have no data available, and pixels near the edge which cannot therefore be properly searched for sources. Here, we introduce the NO_DATA plane to refer to the former case and now use EDGE strictly for the latter.
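
The distinction can be pictured as two independent bits in a pixel's mask value. The bit positions below are invented for this illustration and are not the values used by the stack, where mask plane bits are looked up by name.

```python
# Hypothetical bit assignments, chosen only for this illustration.
EDGE = 1 << 0      # near the edge: cannot be properly searched for sources
NO_DATA = 1 << 1   # off the edge of the detector: no data available


def classify(mask_value):
    """Return the names of the planes set in a pixel's mask value."""
    labels = []
    if mask_value & NO_DATA:
        labels.append("NO_DATA")
    if mask_value & EDGE:
        labels.append("EDGE")
    return labels
```

Because the planes are separate bits, code that previously tested EDGE to find pixels with no data would need to test NO_DATA instead.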

(DM-3136)

Add slot for flux used in photometric calibration

We define a new slot, CalibFlux, on SourceRecords. This slot may be used to refer to the flux used in photometric calibration, rather than hard-coding the name of a particular algorithm in the PhotoCal task.
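
The benefit of a slot is the indirection it provides, which can be sketched with plain dicts. The field names and the `get_slot_value` helper are hypothetical; in the stack, slots are defined on the source table rather than in a separate mapping.

```python
def get_slot_value(record, slots, slot_name):
    """Look up a value through a slot alias instead of a hard-coded field name."""
    return record[slots[slot_name]]


# Hypothetical field names, for illustration only.
record = {"algorithmA_flux": 12.5, "algorithmB_flux": 13.0}
slots = {"CalibFlux": "algorithmA_flux"}
```

Pointing the `CalibFlux` entry at a different field changes which flux the calibration code sees, without any change to the code that calls `get_slot_value(record, slots, "CalibFlux")`.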

(DM-3106)

Faster astrometry reference catalog loading

We ported to LSST some code that has been used by HSC for some time to optimise loading the reference catalog. Running processCcd.py on some HSC data with the SDSS DR9 catalog was taking 177 sec, with 144 sec being spent reading the reference catalog. This made it rather annoying for tracking down problems in astrometry. The problem was that we were reading the entire catalog in order to determine the healpix for each of the reference catalog files. The solution is to cache the healpix identifiers for each catalog file; this reduces the runtime to 45 sec.
The cache is saved as andCache.fits in the astrometry catalog directory.
The use of the cache can be disabled through the andConfig.py file (or the AstrometryNetDataConfig) by setting “allowCache” to False.
To generate a cache, setup astrometry_net_data and use the generateANetCache.py bin script that now comes in meas_astrom.
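
The caching strategy can be sketched as follows. This is a simplified model, not the meas_astrom code: `scan` stands in for the expensive step of reading a whole catalog file to find its healpix identifier, the cache is an in-memory dict rather than andCache.fits, and `allow_cache` only mirrors the intent of the "allowCache" setting.

```python
def build_index(catalog_files, scan, cache=None, allow_cache=True):
    """Return {filename: identifier}, consulting the cache before scanning."""
    if cache is None:
        cache = {}
    index = {}
    for name in catalog_files:
        if allow_cache and name in cache:
            # Cheap path: reuse the stored identifier.
            index[name] = cache[name]
        else:
            # Expensive path: read the file to determine its identifier.
            index[name] = scan(name)
            if allow_cache:
                cache[name] = index[name]
    return index
```

On a second call with the same `cache` dict, `scan` is not invoked for files already seen, which is where the saving described above comes from.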

(DM-3142)

Bug Fixes

The following fixes resolve problems visible to end users.

Doxygen documentation now correctly includes LaTeX formatting

Correctly referring to MathJax means that LaTeX markup in documentation is nicely formatted.

(DM-2545)

Performance regression in `Footprint` dilation resolved

The previous release included improved algorithms for dilating Footprints. Unfortunately, in some circumstances (notably when dealing with particularly large Footprints) this code could actually perform more slowly than the previous implementation. This could have significant performance implications for many image processing operations. This regression has now been rectified, and the new dilation operations are significantly faster than the old ones in all circumstances tested.

(DM-2787)

Footprint fixes

The following updates/fixes to Footprint handling have been made:

The default 32-bit heap space used to store FITS variable-length arrays isn't large enough to store some of our extremely large HeavyFootprints. This persistence issue has been fixed by switching to 64-bit heap descriptors, which are now supported by FITS.
Footprint::transform is now properly copying peaks over to the new footprint.
Footprint::clipTo is now properly removing those peaks lying outside the desired region.
Several parts of the pipeline assume peaks are sorted from most positive to most negative. We now ensure the cross-band merge code maintains this ordering as much as possible (even though the sorting may not be consistent across different bands).
The merging of a parent and its children’s Footprints was failing in cases where one or more child Footprints were themselves noncontiguous. This has been fixed by adapting the mergeFootprints code in afw such that it combines all the Footprints in the FootprintSet it uses in its implementation (instead of requiring that the FootprintSet have only one Footprint).
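
Two of the peak-handling behaviours above, clipping and ordering, can be sketched with plain tuples. This is an illustration only: peaks are modelled as hypothetical `(x, y, value)` tuples and the bounding box as inclusive integer limits, neither of which matches the afw classes.

```python
def clip_peaks(peaks, bbox):
    """Keep only peaks inside bbox = (xmin, ymin, xmax, ymax), inclusive."""
    xmin, ymin, xmax, ymax = bbox
    return [p for p in peaks if xmin <= p[0] <= xmax and ymin <= p[1] <= ymax]


def sort_peaks(peaks):
    """Order peaks from most positive to most negative peak value."""
    return sorted(peaks, key=lambda p: p[2], reverse=True)
```

Applying `sort_peaks(clip_peaks(peaks, bbox))` gives a peak list that lies inside the region and satisfies the ordering that other parts of the pipeline assume.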

(DM-2606)

Fixed error in memory access in interpolation

An off-by-one error resulted in an attempt to read beyond the allocated memory.

(DM-3112)

Fixed truncated write of certain WCS information to FITS

(DM-2931)

Build and code improvements

These improvements should not usually be visible to end users. They may be important for developers, however.

Backend-agnostic interface to displays

The image display code no longer makes the assumption that display is carried out using ds9. Rather, an API is available which is independent of the particular image viewer in use. A backwards compatibility layer ensures that display through ds9 is still supported, while other backends will be added in future.

(RFC-42, DM-2709, DM-2849)

Measurement framework compiler warnings resolved

The measurement framework was refactored to avoid a series of warnings produced by the clang compiler.

(DM-2131)

Unsanctioned access to the display by tests suppressed

Some unit tests were attempting to write to a display, even when no display was available. On some systems, this directly caused test failures; on others, it could obscure the true cause of failures when a test did fail.

(DM-2492, DM-2494)

Unused & obsolete code has been removed from the `datarel` package

This package is effectively obsolete, but is still used in documentation generation which makes removing it entirely complex. For now, therefore, it has simply been trimmed of all unused functionality; it may be removed entirely following DM-2948.

(DM-2949)
