Preamble

During measurement, a series of plug-in algorithms are used to measure the raw properties of each source (e.g. fluxes, positions in pixel coordinates). The "calibration and ingest" system provides a means of transforming those raw measurements to calibrated units, such as magnitudes or positions in celestial coordinates.

An important consideration is that the information required to perform the transformation may not be available at measurement time. For example, it may depend on a global astrometric or photometric calibration which has not yet been performed. For this reason, measurement algorithms cannot be expected to simply write calibrated measurements to their output.

This work is tracked as DM-1074 and DM-1598.

Goals

A task should be available which takes as input:

  • An lsst::afw::table::SourceCatalog describing raw measurements of sources from a particular image;
  • An lsst::afw::image::Calib describing the photometric calibration of that image;
  • An lsst::afw::image::Wcs describing the world coordinate system of the image.

The task should produce an lsst::afw::table::BaseCatalog containing calibrated measurements. Note that:

  • The transformation from raw to calibrated units is not known a priori, but must rather be defined on a per-measurement plugin basis;
  • The relationship between input and output fields is not necessarily one-to-one – rather, some raw measurements may be combined to produce derived quantities;
  • The transformation from raw to calibrated units may depend on the photometric and WCS information supplied to the task, and on the configuration of the plugin which performed the raw measurement;
  • We do not copy slots from the input SourceTable.

In addition, one or more command line tasks should be produced which provide an appropriate interface between the calibration transformation and the end user, making it possible to specify the appropriate input data.

Design

We tackle this problem by:

  1. Defining "transformation plugins", which describe the means of transforming the output of a measurement plugin.
  2. Augmenting measurement plugins with a method which provides the caller with a transformation plugin appropriate to the measurement.
  3. Providing a task which transforms an input catalog to an output catalog using the infrastructure defined above.

Transformation Plugins

Transformation plugins are written in Python and inherit from the base class TransformPlugin. All transformation plugins adhere to the following interface:

class TransformPlugin(object):
    def __init__(self, name, mapper, cfg, wcs, calib):
        ...

    def __call__(self, oldRecord, newRecord):
        ...

The name and cfg arguments to __init__ describe the name and configuration of the measurement plugin whose results we will transform. The wcs and calib arguments describe the WCS and calibration which will be available for use in the transformation. These four arguments are stored as instance variables within the object. The mapper argument is a SchemaMapper which describes the input (raw measurement list) and output (calibrated) schemas and the relationship between them. The __init__ method should:

  1. Add mappings between fields which should be directly copied from input to output;
  2. Add field definitions for those quantities which will be calculated during transformation and store their keys.

The __call__ method will be invoked once with each pair of raw and corresponding calibrated table records. It should perform whatever transformation is necessary to populate the latter given the former as well as the stored information in the transformation plugin (measurement plugin name & configuration, WCS, calibration).

For example, the following would:

  1. Copy the contents of all fields beginning with the name of the measurement plugin from the raw source list to the calibrated output;
  2. Add an additional field with the name example and the value 10.0 to all output records.
class Example(TransformPlugin):
    def __init__(self, name, mapper, cfg, wcs, calib):
        TransformPlugin.__init__(self, name, mapper, cfg, wcs, calib)
        # Map across all fields whose names begin with the measurement plugin name
        for item in mapper.getInputSchema().extract(name + "*").itervalues():
            mapper.addMapping(item.key)
        # Define the new output field, storing its key for use in __call__
        outputSchema = mapper.editOutputSchema()
        self.newKey = outputSchema.addField("example", type="D")

    def __call__(self, oldRecord, newRecord):
        newRecord.set(self.newKey, 10.0)

Note that hierarchies of transformation plugins can be built up in this way – for example, a FluxTransformer plugin could provide some basic transformation for flux fields which could be inherited & augmented by PsfFluxTransformer, etc.
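
For instance, such a hierarchy might be sketched as follows. This is illustrative only: the "_flux" and "_flag" field suffixes and the class bodies are assumptions, not existing plugins.

class FluxTransformer(TransformPlugin):
    """Hypothetical base class handling a generic flux field."""
    def __init__(self, name, mapper, cfg, wcs, calib):
        TransformPlugin.__init__(self, name, mapper, cfg, wcs, calib)
        self.fluxKey = mapper.getInputSchema().find(name + "_flux").key
        self.magKey = mapper.editOutputSchema().addField(name + "_mag", type="D")

    def __call__(self, oldRecord, newRecord):
        # Use the stored Calib to convert the raw flux to a calibrated magnitude
        newRecord.set(self.magKey, self.calib.getMagnitude(oldRecord.get(self.fluxKey)))

class PsfFluxTransformer(FluxTransformer):
    """Hypothetical derivative: reuse the flux logic and also copy a flag field."""
    def __init__(self, name, mapper, cfg, wcs, calib):
        FluxTransformer.__init__(self, name, mapper, cfg, wcs, calib)
        mapper.addMapping(mapper.getInputSchema().find(name + "_flag").key)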

Mapping measurement plugins to transformations

Measurement plugins are expected to provide a static method which returns the class of the transformation plugin to be applied to their outputs. We modify BasePlugin to return NullTransform, a transformation which copies no data to the output:

class NullTransform(TransformPlugin):
    def __call__(self, oldRecord, newRecord):
        pass

class BasePlugin(object):
    ...
    @staticmethod
    def getTransformClass():
        return NullTransform

This null operation is then the default for any measurements which do not define their own transformations.
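
For example, a measurement plugin could declare its own transformation as in the following sketch (ExampleCentroidPlugin and CentroidTransformer are illustrative names, not existing classes):

class ExampleCentroidPlugin(BasePlugin):
    ...
    @staticmethod
    def getTransformClass():
        # CentroidTransformer (defined elsewhere) would use the stored Wcs to
        # convert pixel positions to celestial coordinates
        return CentroidTransformer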

Transformation task design

TransformTask defines an __init__ which takes two additional arguments: the configuration of the task which was used to perform the raw measurements, and the registry of available plugins. The former is used to derive the list of plugins which performed the measurements, together with their configurations; this list is stored as an instance variable.

class TransformTask(pipeBase.Task):
    def __init__(self, *args, **kwargs):
        # Pop our extra arguments before initializing the base Task
        measConfig = kwargs.pop('measConfig')
        self.pluginRegistry = kwargs.pop('pluginRegistry')
        pipeBase.Task.__init__(self, *args, **kwargs)
        self.measPlugins = [(name, measConfig.value.plugins.get(name))
                            for name in measConfig.value.plugins.names]
 

The run method of TransformTask takes as arguments the list of sources to be transformed and the WCS and calibration to be applied. It constructs a mapper and adds some (configurable) fields which are always copied – in this way, data which is not the output of a measurement plugin can be preserved in the output. It then uses the registry to look up each measurement plugin by name, retrieve the corresponding TransformPlugin, and configure it appropriately:

class TransformTask(pipeBase.Task):
    def run(self, sourceCat, wcs, calib):
        mapper = afwTable.SchemaMapper(sourceCat.schema)
        mapper.addMapping(sourceCat.schema.find('id').key)
        transforms = [self.pluginRegistry.get(name).PluginClass.getTransformClass()(name, mapper, cfg, wcs, calib)
                      for name, cfg in self.measPlugins]
 

Finally, we iterate over all sources, using a combination of the mapper and the transformation plugins to transform old to new:

class TransformTask(pipeBase.Task):
    def run(...):
        ...
        newSources = afwTable.BaseCatalog(mapper.getOutputSchema())
        newSources.reserve(len(sourceCat))
        for oldSource in sourceCat:
            newSource = newSources.addNew()
            newSource.assign(oldSource, mapper)  # copy directly-mapped fields
            for transform in transforms:
                transform(oldSource, newSource)  # populate derived fields
        return newSources
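
Putting the pieces together, using the task might look like the following sketch, assuming the measurement configuration, plugin registry, source catalog, WCS and calibration have all been obtained elsewhere:

task = TransformTask(measConfig=measConfig, pluginRegistry=pluginRegistry)
calibratedCat = task.run(sourceCat, wcs, calib)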

Command line tasks

A series of command line tasks can be defined which feed the appropriate inputs to TransformTask, loading whatever plugin registry, source table, calibration and WCS the end user requires.
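
For instance, a driver operating on single-frame measurements might look something like this sketch. SrcTransformTask and its configuration are illustrative assumptions; subtask creation and error handling are omitted.

class SrcTransformTask(pipeBase.CmdLineTask):
    """Hypothetical driver: transform the 'src' catalog of a single calexp."""
    _DefaultName = "transformSrc"
    ...

    def run(self, dataRef):
        calexp = dataRef.get("calexp")  # provides the WCS and photometric calibration
        sources = dataRef.get("src")    # raw measurement catalog
        # self.transform is a TransformTask created as a subtask in __init__
        return self.transform.run(sources, calexp.getWcs(), calexp.getCalib())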

Discussion & criticisms

The way in which the __init__ method of TransformPlugin derivatives takes a reference to a mapper and modifies it is ugly: it's unfortunate for a constructor to modify an object other than the one it's constructing, and, if a function modifies something, it would ideally return the thing being modified. An alternative would be to add another method (TransformPlugin.configure(mapper), say) which avoids the above, but this involves more code for little practical benefit.

All transformations (other than a trivial copy) are performed by Python code. It is assumed that this is not a major bottleneck. However, future improvements to SchemaMapper could increase the variety of operations that can be performed directly inside the mapper by C++ code, thereby mitigating this. It will likely always be necessary to iterate over the source list in Python, since it will be desirable to continue to define some transformations in Python code.

Implementation

A prototype (lacking documentation, tests, etc) implementation of the above system is available on the u/swinbank/DM-1598 branch in meas_base and pipe_tasks.


21 Comments

  1. Joshua Hoblitt writes (on Hipchat):

    I don't have much constructive to say other than it would be nice for "dummy" lsst::afw::image::Calib & lsst::afw::image::Wcs objects to be available that are essentially no-ops. Maybe something that is automatically used if None is passed in? I suspect this might be an API that users will want to test by hand to evaluate the transforms, particularly users in the context of level 3/external processing.

    1. Unfortunately I think that will be difficult, as both Calib and Wcs are hard-coded to change the units as well as the values, and at least in the case of Wcs, those units are captured in the input types of the classes they deal with.

      1. It depends what the detailed requirements are here, but doing this at least on a superficial level is pretty easy. In Python code, you can always mock up something that looks enough like a Wcs or Calib to enable you to test a simple transformation. If you need to test a transformation which is dependent on the detailed characteristics of either of the above, you'll presumably need to construct a "real" one with the properties you need.

        The tests for my code will include generating trivial/default Wcs and Calib objects to demonstrate that the interface works anyway.
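
        For reference, constructing such trivial objects only takes a few lines – a sketch against the afw API of the era, with arbitrary values:

          import lsst.afw.coord as afwCoord
          import lsst.afw.geom as afwGeom
          import lsst.afw.image as afwImage

          # A trivial Calib: a single magnitude zero point for the whole image
          calib = afwImage.Calib()
          calib.setFluxMag0(1e12)

          # A simple TAN Wcs: 0.2 arcsec pixels, centred on an arbitrary position
          crval = afwCoord.IcrsCoord(45.0 * afwGeom.degrees, 45.0 * afwGeom.degrees)
          crpix = afwGeom.Point2D(0.0, 0.0)
          wcs = afwImage.makeWcs(crval, crpix, 0.2 / 3600.0, 0.0, 0.0, 0.2 / 3600.0)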

  2. You can access a SourceTable by row or by column, and it's a good deal faster to do so by column (and it's easier too: no need to save Keys for efficiency).  Could your calibration loop work by column?

    1. While operating by column might still be a bit faster, I think it'd be cleaner to keep the current design of row-by-row manipulation, but allow TransformPlugins to be implemented in C++ (and in fact to implement all the standard ones we'll use 99% of the time in C++).  That might already be the case, given that Python will just duck-type them unless you have explicit isinstance checks.  Being able to use column-based setters in analysis code is very useful, but I sort of dislike the fact that being efficient in NumPy forces us to write things in a way that is less natural to read.  Of course, I could be an outlier here, as someone who is pretty comfortable with C++.

      1. Adapting my current prototype to implement Robert's idea is very straightforward; I think you should simply be able to change the definition of __call__ in the transformation plugins so it looks something like this:

        class Example(TransformPlugin):
            def __call__(self, oldCatalog, newCatalog):
                newCatalog.getColumnView()[self.newKey] = 10.0

        And then the task becomes even simpler:

        class TransformTask(...):
            def run(...):
                ...
                newSources = afwTable.BaseCatalog(mapper.getOutputSchema())
                newSources.extend(sourceCat, mapper=mapper)
                for transform in transforms:
                    transform(sourceCat, newSources)

        (I've not checked if I need to worry about deep-copies in extend when using a mapper, but that's beside the point for now.)

        While I broadly agree with Jim that for performance we should use C++ rather than sacrifice readability, I don't actually find this any less natural (indeed, perhaps even more natural) than the row-based approach.

        1. I agree - now that I see it, it's not as bad as I thought, and I think it's probably best to just go with this interface.

        2. Actually, can we do this interface and implement the predefined standard ones in C++?  In C++, we could do the loop over rows first (with probably marginally better performance than doing it in columns, because it traverses the memory in the right order).

          Now that I think about it, though, we can't just write a C++ class that has the same interface and use it, because the config instance the constructor takes is pure Python, and even when that's based on a C++ control class, we'd need to call makeControl() on the config object and pass the result to the C++ code.  So, instead, I think we should probably just provide some C++ classes that Python TransformPlugin classes could delegate to via composition - there'd logically be one of these for each of the existing Result/ResultMapper pairs in meas_base, which I think would make this system very intuitive.

          Of course, none of that affects the interface you've described here, except that the typical example subclass would now look something like this:

          class Example(TransformPlugin):
           
              def __init__(self, name, mapper, cfg, wcs, calib):
                  TransformPlugin.__init__(self, name, mapper, cfg, wcs, calib)
                  self.fluxTransformer = lsst.meas.base.FluxTransformer(cfg.makeControl(), name, wcs, calib)
                  # add flags to mapper here
                  
              def __call__(self, oldCatalog, newCatalog):
                  self.fluxTransformer(oldCatalog, newCatalog)
  3. This is a very minor issue, but when a class constructor takes a config object, it's conventional to make it the first argument, and call it "config".

  4. When writing the example code for my last comment, I realized that the interface as you have it won't quite work: the wcs and calib arguments need to be passed to __call__, not __init__, as they'll be different for every catalog processed.

    1. I think this isn't a problem in the current design: __init__ on the transformation plugins is called in run on the task, which is called once per catalog.

      In the original design, __call__ was called once per row; rather than passing in wcs and calib on every call, it seemed better to store them in the plugin during construction. Further, the overhead of constructing the plugins should be pretty insignificant compared to everything else that happens when the task is run.

      However, assuming that Robert's suggestion above is adopted, there's only one call to __call__ per run. Assuming that the source tables being processed in subsequent calls to run have the same schema (which I think is a reasonable assumption), it should be possible to move the plugin construction into __init__ on the task and move the wcs and calib arguments to __call__ as you suggest.
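
      Concretely, that revised interface would look something like this sketch:

        class TransformPlugin(object):
            def __init__(self, name, mapper, cfg):
                ...  # schema manipulation only; no per-catalog state

            def __call__(self, oldCatalog, newCatalog, wcs, calib):
                ...  # wcs and calib are now supplied with each catalog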

      1. Ah, of course.  But I do think that now that we have __call__ operating on a whole catalog at a time, it makes more sense to move the wcs and calib arguments there and move TransformPlugin initialization to the task constructor.  You'll find that when it comes time to write the command-line task that calls these, you'll actually need that, because command-line tasks that create catalogs are required to save the schema they'll use before any data is actually processed (we actually guarantee that all source tables being processed in subsequent calls to run must have the same schema).

        1. I prototyped this, and it works.

          However, in this scheme I need to have the schema of the input catalog available in the task __init__, which means I can't simply read it from the first dataref I come across. My workaround was this, which is functional but not elegant. Suggestions for a better approach welcome.

          By the way, I am continuing to push reworked prototypes to the u/swinbank/DM-1598 branches, but I'm not updating the document above on the assumption that it'll rapidly get confusing if these comments refer to different revisions of the doc.

          1. You should be able to use ButlerInitializedTaskRunner directly for this; it's basically what it was designed for.  I think the missing piece is that for any catalog dataset (e.g. "src") there's a corresponding schema dataset ("src_schema"), and you can load that from the butler with no data ID required.  So in the constructor for the command-line driver, you get a butler from the keyword arguments, you can use that to get the input schema, and then pass that on to the TransformTask constructor.
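
            In other words, the driver constructor would look something like this sketch (the 'transform' subtask name and the inputSchema argument are assumptions about the prototype):

              class SrcTransformTask(pipeBase.CmdLineTask):
                  RunnerClass = pipeBase.ButlerInitializedTaskRunner
                  ...
                  def __init__(self, *args, **kwargs):
                      butler = kwargs.pop('butler')
                      pipeBase.CmdLineTask.__init__(self, *args, **kwargs)
                      # No data ID is needed to load the schema dataset
                      inputSchema = butler.get('src_schema', immediate=True).schema
                      self.makeSubtask('transform', inputSchema=inputSchema)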

            1. That was indeed the missing link – thanks!

  5. FWIW, I'm not too bothered by the fact that mapper is an input/output argument for the constructor - while I agree that it's a bit unusual, especially in Python, it's at least not a problem for this design review to address, as it's a very common pattern for Schema and SchemaMapper objects throughout our codebase.

  6. For TransformPlugin: I agree that:

    • It is better to pass the full catalogs to __call__ instead of single records
    • __init__ should be called once from the task's initialization, if at all possible, and that wcs and calib should be passed to __call__
    • The config argument should be named "config", not "cfg", and should come first in __init__

    Would it make any sense for these plugins to be subclasses of Task? This gives you a log attribute and a time-and-resource measuring decorator. If you go this route I would not add a "run" method.

    1. Thanks for your comments; I certainly agree with the first three points.

      I also quite like the suggestion of subclassing Task. However, I'd like to make it possible to write "first class" transformations in C++ – that is, they have exactly the same capabilities as those written in Python. This is currently done with a tiny Python wrapper (which effectively just calls makeControl() on the config object). It wouldn't be possible to maintain this equivalence if we make the Python code into a derivative of Task, and for that reason I'm not going to implement this idea unless there's a big demand for it.

  7. I don't like the name TransformPlugin (transform what? and everything's a plugin in our system); would prefer something like MeasurementTransformer or MeasurementCalibrater.

    How does this account for position-dependent photometric calibration (e.g., output from son-of-meas_mosaic)?  Through a subclass of Calib?  That might require updating the Calib interface to accept x,y.

    1. +1 on a name change.

      I think Calib definitely should be the place we put position-dependent photometric effects, and yes, that will require a different interface on the base class.  I think that means we don't worry about it for this particular design, since it just passes the Calib down to the plugins, and it will be their job to use them correctly.
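
      For illustration only, such an interface might look like the following purely hypothetical sketch – no such class exists in afw today:

        class SpatialCalib(Calib):
            """Hypothetical Calib subclass with a spatially-varying zero point."""
            def getMagnitude(self, flux, x, y):
                """Return the calibrated magnitude for a flux measured at pixel (x, y)."""
                ...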

    2. Agree re naming. I was trying to avoid the term "calibration", since we already have a calibration task; I quite like MeasurementTransformer though.

      I'll defer to Jim's expertise re your second point.