Overview
The last design review for the C++ part of the new measurement framework was essentially a delayed rejection of the design, which was considered too complicated. Much of the complexity came from an unnecessary focus on giving algorithms an interface that would let them be used without afw::table Schema and Record objects. (For instance, we frequently run SdssShape on PSF postage stamps, and in that context it's a lot of extra overhead to set up a Schema, Table, and Record just to get the results.) It was noted at the review that most algorithms don't need to support that usage (and did not in the old framework), and this has allowed us to simplify the C++ interface considerably. The new interface also has much more in common with the old one.
All code in this design is taken from branch u/jbosch/DM-829 of meas_base, which includes a working prototype of the API and wrappers for two plugins using the new system, PsfFlux and SdssShape. Note, however, that it also contains the interface and utility classes for the last design, and all other plugins are still using that.
Python Plugin Interface
While the Python plugin interface has already been approved and is not being reviewed here, it's a necessary starting point for the C++ interface. Plugins will only be called by the measurement framework through this Python interface, so the C++ design we are proposing here is not binding (plugin developers can always provide their own Python class that calls whatever C++ code they like). Instead, the goal of the C++ API is to allow them to implement new plugins in C++ with a minimum amount of boilerplate and complexity.
Here is a synopsis of the base Python plugin class for single-frame measurement (from sfm.py):
class SingleFramePlugin:
    def __init__(self, config, name, schema, metadata): ...
    def measure(self, measRecord, exposure): ...
    def measureN(self, measCat, exposure): ...
    def fail(self, measRecord, error=None): ...
The __init__ method is called when the plugin is initialized, giving it an opportunity to add its fields to the schema. The measure method is then called once on each detected source, and (if enabled) measureN is called on each deblend family. The measurement framework wraps these calls in a try block, and calls fail on any records that were part of a failure.
The forced measurement interface is very similar - the method signatures are slightly different, but the logic is the same (from forcedMeasurement.py):
class ForcedPlugin:
    def __init__(self, config, name, schemaMapper, metadata): ...
    def measure(self, measRecord, exposure, refRecord, refWcs): ...
    def measureN(self, measCat, exposure, refCat, refWcs): ...
    def fail(self, measRecord, error=None): ...
C++ Abstract Base Classes
The C++ interface is essentially the same as the Python interface. The fail method shared by both single-frame and forced measurement is defined in a base class, BaseAlgorithm:
class BaseAlgorithm {
public:
    virtual void fail(
        afw::table::SourceRecord & measRecord,
        MeasurementError * error=NULL
    ) const = 0;
    virtual ~BaseAlgorithm() {}
};
This class and all other classes in this section are defined in Algorithms.h.
SingleFrameAlgorithm and ForcedAlgorithm add the measure and measureN methods, inheriting fail from BaseAlgorithm:
class SingleFrameAlgorithm : public virtual BaseAlgorithm {
public:
    virtual void measure(
        afw::table::SourceRecord & measRecord,
        afw::image::Exposure<float> const & exposure
    ) const = 0;
    virtual void measureN(
        afw::table::SourceCatalog const & measCat,
        afw::image::Exposure<float> const & exposure
    ) const;
};
class ForcedAlgorithm : public virtual BaseAlgorithm {
public:
    virtual void measure(
        afw::table::SourceRecord & measRecord,
        afw::image::Exposure<float> const & exposure,
        afw::table::SourceRecord const & refRecord,
        afw::image::Wcs const & refWcs
    ) const = 0;
    virtual void measureN(
        afw::table::SourceCatalog const & measCat,
        afw::image::Exposure<float> const & exposure,
        afw::table::SourceCatalog const & refRecord,
        afw::image::Wcs const & refWcs
    ) const;
};
These use virtual inheritance from BaseAlgorithm, to allow concrete algorithm classes to inherit from both SingleFrameAlgorithm and ForcedAlgorithm, as we expect many plugins to be sufficiently simple that there's no need for different classes. In fact, in many cases, the algorithm won't need the extra reference arguments passed to ForcedAlgorithm, because the forced measurement framework will already ensure that the slot centroid and shape on measRecord correspond to the centroid and shape from the reference catalog. For these algorithms, we have provided an additional intermediate base class for convenience:
class SimpleAlgorithm : public SingleFrameAlgorithm, public ForcedAlgorithm {
public:
    using SingleFrameAlgorithm::measure;
    using SingleFrameAlgorithm::measureN;
    virtual void measure(
        afw::table::SourceRecord & measRecord,
        afw::image::Exposure<float> const & exposure,
        afw::table::SourceRecord const & refRecord,
        afw::image::Wcs const & refWcs
    ) const {
        measure(measRecord, exposure);
    }
    virtual void measureN(
        afw::table::SourceCatalog const & measCat,
        afw::image::Exposure<float> const & exposure,
        afw::table::SourceCatalog const & refRecord,
        afw::image::Wcs const & refWcs
    ) const {
        measureN(measCat, exposure);
    }
};
Algorithms that inherit from SimpleAlgorithm need only implement the single-frame overloads of measure (and, if desired, measureN), and the forced measurement overloads will be provided automatically by SimpleAlgorithm's implementations.
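The forwarding pattern above can be tried outside the stack with a minimal, self-contained sketch. Everything here is a stand-in: Record, Exposure, Wcs, and ToyFluxAlgorithm are hypothetical types invented for illustration, not the afw or meas_base classes, and measureN is left out for brevity.

```cpp
#include <cassert>

// Hypothetical stand-ins for afw::table::SourceRecord, afw::image::Exposure<float>,
// and afw::image::Wcs; they are NOT the real afw classes.
struct Record { double flux = 0.0; bool failed = false; };
struct Exposure { double pixelSum = 0.0; };
struct Wcs {};

class BaseAlgorithm {
public:
    virtual void fail(Record & measRecord) const = 0;
    virtual ~BaseAlgorithm() {}
};

class SingleFrameAlgorithm : public virtual BaseAlgorithm {
public:
    virtual void measure(Record & measRecord, Exposure const & exposure) const = 0;
};

class ForcedAlgorithm : public virtual BaseAlgorithm {
public:
    virtual void measure(Record & measRecord, Exposure const & exposure,
                         Record const & refRecord, Wcs const & refWcs) const = 0;
};

// Forwards the forced overload to the single-frame one, ignoring the reference arguments.
class SimpleAlgorithm : public SingleFrameAlgorithm, public ForcedAlgorithm {
public:
    using SingleFrameAlgorithm::measure;
    virtual void measure(Record & measRecord, Exposure const & exposure,
                         Record const &, Wcs const &) const {
        measure(measRecord, exposure);  // resolves to the two-argument overload
    }
};

// Toy algorithm: implements only the single-frame overload and fail().
class ToyFluxAlgorithm : public SimpleAlgorithm {
public:
    using SimpleAlgorithm::measure;
    virtual void measure(Record & measRecord, Exposure const & exposure) const {
        measRecord.flux = exposure.pixelSum;
    }
    virtual void fail(Record & measRecord) const { measRecord.failed = true; }
};
```

Calling measure through a ForcedAlgorithm reference on a ToyFluxAlgorithm ends up in the two-argument override, which is the behavior the paragraph above describes. Because BaseAlgorithm is a virtual base, the single fail override satisfies both branches of the hierarchy.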
Example 1: PsfFlux
Here is an example of the simplest possible implementation of an algorithm, which we expect will be the approach used for the vast majority of algorithms. We inherit from SimpleAlgorithm, then implement just measure and fail (from PsfFlux.h):
class PsfFluxAlgorithm : public SimpleAlgorithm {
public:
    enum {
        FAILURE=FlagHandler::FAILURE,
        NO_GOOD_PIXELS,
        EDGE,
        N_FLAGS
    };
    typedef PsfFluxControl Control;
    PsfFluxAlgorithm(Control const & ctrl, std::string const & name, afw::table::Schema & schema);
private:
    virtual void measure(
        afw::table::SourceRecord & measRecord,
        afw::image::Exposure<float> const & exposure
    ) const;
    virtual void fail(
        afw::table::SourceRecord & measRecord,
        MeasurementError * error=NULL
    ) const;
    Control _ctrl;
    FluxResultKey _fluxResultKey;
    FlagHandler _flagHandler;
    SafeCentroidExtractor _centroidExtractor;
};
There are a few features here worth discussing:
- We've defined an anonymous enum with a general FAILURE flag and several more flags corresponding to specific failure modes. All algorithms must have the FAILURE flag first, as this is used by the FlagHandler class (which we'll discuss later) and the MeasurementError exception (which can be thrown by algorithms to indicate a known failure mode that should not generate a warning).
- The constructor takes a control object, a string name that is used as the prefix for all field names, and a Schema object the algorithm's output fields should be added to. We'll discuss constructors more later as well.
- As data members, the algorithm holds:
  - its Control object; after an algorithm is constructed, it cannot be reconfigured, and it's responsible for saving those configuration options.
  - a FluxResultKey, a FunctorKey that handles the output fields. This object creates a standard set of output fields for a flux algorithm (just a flux and its uncertainty), then stores the returned Keys for later use. Within the measure implementation, we save the flux and flux uncertainty into a FluxResult, then use the FluxResultKey to transfer those values to the record.
  - a FlagHandler, which plays the same role for the flag fields that the FluxResultKey does for the flux and uncertainty fields, but it's not a FunctorKey because algorithms typically want to set flags one or two at a time, instead of transferring all flag values at once from another object.
  - a SafeCentroidExtractor, which is used to get the centroid of the object from the measRecord argument. This essentially just calls measRecord.getCentroid() to access the slot centroid value, but it also provides consistent error handling for all algorithms that need to get a centroid from a previously-run centroider.
- We've defined the measure and fail method overrides as private. That's just to keep Swig from being confused about method shadowing and what overloads of these methods to wrap - it should be equivalent to define them as public methods, and add a using declaration for the overload (which would be the standard way to do it), but Swig doesn't seem to understand that.
You can see precisely how these are used in PsfFlux.cc.
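To make the role of the result key concrete, here is a minimal sketch of the pattern: an object that adds its fields to a schema when constructed, remembers the keys, and later moves a plain result struct to and from a record. ToySchema, ToyRecord, ToyFluxResult, and ToyFluxResultKey are hypothetical stand-ins written for this example, and the "_flux" and "_fluxSigma" suffixes are assumed names; none of this is the actual FluxResultKey code.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins for afw::table::Schema and SourceRecord (not the real classes).
struct ToySchema {
    std::vector<std::string> fields;
    std::string addField(std::string const & name) {
        fields.push_back(name);
        return name;  // in this toy, the "key" is just the field name
    }
};
typedef std::map<std::string, double> ToyRecord;

// Plain struct holding the outputs of a flux measurement.
struct ToyFluxResult {
    double flux;
    double fluxSigma;
};

// Adds the flux fields once, remembers their keys, and moves values to and from records.
class ToyFluxResultKey {
public:
    // The field-name suffixes are assumptions made for this sketch.
    ToyFluxResultKey(ToySchema & schema, std::string const & name)
        : _flux(schema.addField(name + "_flux")),
          _fluxSigma(schema.addField(name + "_fluxSigma")) {}

    // Copy both values from the result struct into the record.
    void set(ToyRecord & record, ToyFluxResult const & value) const {
        record[_flux] = value.flux;
        record[_fluxSigma] = value.fluxSigma;
    }

    // Read both values back; at() throws std::out_of_range if a field is missing.
    ToyFluxResult get(ToyRecord const & record) const {
        ToyFluxResult result;
        result.flux = record.at(_flux);
        result.fluxSigma = record.at(_fluxSigma);
        return result;
    }

private:
    std::string _flux;
    std::string _fluxSigma;
};
```

In an algorithm following this pattern, the key would be built in the constructor (when the schema is available) and set would be called at the end of measure.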
To make this algorithm available in Python, we just need to Swig it as we normally would, then add the following line to meas_base (currently in plugins.py):
wrapSimpleAlgorithm(PsfFluxAlgorithm, Control=PsfFluxControl)
This function is defined in wrappers.py, where you can find more information on the many options we aren't using. It creates subclasses of SingleFramePlugin and ForcedPlugin that delegate to PsfFluxAlgorithm, then registers these plugins with the appropriate registry for each. There are similar functions for wrapping algorithms that are only for single-frame or forced measurement.
Example 2: SdssShape
The implementation for SdssShape is significantly more complex, for two reasons:
- It has multiple outputs - a centroid, a shape, and a flux, as well as some other things that aren't covered by our predefined Result and ResultKey classes.
- We want to provide an interface that doesn't require users to use SourceRecord objects for inputs and outputs. This will take the form of a static method, which returns an SdssShapeResult object. The measure method will be implemented by delegating to this static method.
The interface for SdssShape involves several classes. We'll skip the Control object, which is pretty standard (and you can find it, and all the other classes here, in SdssShape.h). We'll start with the algorithm class itself, even though it depends on several other classes, to show the big picture first:
class SdssShapeAlgorithm : public SimpleAlgorithm, public SdssShapeFlags {
public:
    typedef SdssShapeControl Control;
    typedef SdssShapeResult Result;
    typedef SdssShapeResultKey ResultKey;
    SdssShapeAlgorithm(Control const & ctrl, std::string const & name, afw::table::Schema & schema);
    template <typename T>
    static Result apply(
        afw::image::MaskedImage<T> const & image,
        afw::detection::Footprint const & footprint,
        afw::geom::Point2D const & position,
        Control const & ctrl=Control()
    );
    template <typename T>
    static Result apply(
        afw::image::Image<T> const & exposure,
        afw::detection::Footprint const & footprint,
        afw::geom::Point2D const & position,
        Control const & ctrl=Control()
    );
private:
    virtual void measure(
        afw::table::SourceRecord & measRecord,
        afw::image::Exposure<float> const & exposure
    ) const;
    virtual void fail(
        afw::table::SourceRecord & measRecord,
        MeasurementError * error=NULL
    ) const;
    Control _ctrl;
    ResultKey _resultKey;
    SafeCentroidExtractor _centroidExtractor;
};
Most of this is straightforward: the apply methods provide the SourceRecord-free interface, and measure and fail implement the plugin interface. Like PsfFluxAlgorithm, SdssShapeAlgorithm contains a control object instance, a FunctorKey, and a SafeCentroidExtractor. Unlike PsfFluxAlgorithm, the FunctorKey is a custom class, and there's no FlagHandler (at least not directly - as we'll see later, it's held by the custom FunctorKey). We also inherited from a struct that defines the failure flags, rather than defining them inline:
struct SdssShapeFlags {
    enum {
        FAILURE=FlagHandler::FAILURE,
        UNWEIGHTED_BAD,
        UNWEIGHTED,
        SHIFT,
        MAXITER,
        N_FLAGS
    };
};
We've moved these flags into a separate class just so we can inherit from them in both SdssShapeAlgorithm and SdssShapeResult, putting the enum values in both class scopes.
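As a small illustration of what inheriting from the flags struct buys, the sketch below uses toy classes (ToyShapeFlags, ToyShapeResult, ToyShapeAlgorithm, all hypothetical) in which both the result and the algorithm can name the flag enumerators without qualification. It assumes FAILURE is 0, which stands in for FlagHandler::FAILURE, and the iteration-limit logic is invented purely to have something to flag.

```cpp
#include <bitset>
#include <cassert>

// Toy copy of the flag-sharing pattern; FAILURE = 0 is an assumption standing in
// for FlagHandler::FAILURE.
struct ToyShapeFlags {
    enum { FAILURE = 0, UNWEIGHTED_BAD, UNWEIGHTED, SHIFT, MAXITER, N_FLAGS };
};

// Inheriting from the flags struct puts the enumerators in this class's scope...
struct ToyShapeResult : public ToyShapeFlags {
    std::bitset<N_FLAGS> flags;
    bool getFlag(int index) const { return flags[index]; }
};

// ...and in this one, so both can refer to e.g. MAXITER without qualification.
struct ToyShapeAlgorithm : public ToyShapeFlags {
    // Invented logic: flag the result if the iteration limit was reached.
    static ToyShapeResult apply(int iterations, int maxIterations) {
        ToyShapeResult result;
        if (iterations >= maxIterations) {
            result.flags[MAXITER] = true;
            result.flags[FAILURE] = true;
        }
        return result;
    }
};
```

Callers can then write either ToyShapeResult::MAXITER or ToyShapeAlgorithm::MAXITER, and both refer to the same enumerator.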
The next class is that result object, which we use to return the outputs in the interface that doesn't use SourceRecords. It inherits from several of our predefined result classes (see FluxUtilities.h, CentroidUtilities.h, ShapeUtilities.h), and adds POD data members for additional outputs not included in these. Flags are held in a std::bitset, but Swig doesn't understand that, so we also provide an accessor method for the flags.
class SdssShapeResult : public ShapeResult, public CentroidResult, public FluxResult, public SdssShapeFlags {
public:
    ShapeElement xy4;        ///< A fourth moment used in lensing (RHL needs to clarify; not in the old docs)
    ErrElement xy4Sigma;     ///< 1-Sigma uncertainty on xy4
    ErrElement flux_xx_Cov;  ///< flux, xx term in the uncertainty covariance matrix
    ErrElement flux_yy_Cov;  ///< flux, yy term in the uncertainty covariance matrix
    ErrElement flux_xy_Cov;  ///< flux, xy term in the uncertainty covariance matrix
#ifndef SWIG
    std::bitset<N_FLAGS> flags;  ///< Status flags (see SdssShapeFlags).
#endif
    /// Flag getter for Swig, which doesn't understand std::bitset
    bool getFlag(int index) const { return flags[index]; }
    SdssShapeResult();  ///< Constructor; initializes everything to NaN
};
The custom FunctorKey, SdssShapeResultKey, maps the custom result object to afw::table record classes. We'll use that in the implementation, to connect the apply method that returns SdssShapeResult to the measure method required by the SingleFrameAlgorithm.
class SdssShapeResultKey : public afw::table::FunctorKey<SdssShapeResult>, public SdssShapeFlags {
public:
    static SdssShapeResultKey addFields(
        afw::table::Schema & schema,
        std::string const & name
    );
    SdssShapeResultKey() {}
    SdssShapeResultKey(afw::table::SubSchema const & s);
    virtual SdssShapeResult get(afw::table::BaseRecord const & record) const;
    virtual void set(afw::table::BaseRecord & record, SdssShapeResult const & value) const;
    bool operator==(SdssShapeResultKey const & other) const;
    bool operator!=(SdssShapeResultKey const & other) const { return !(*this == other); }
    bool isValid() const;
    FlagHandler const & getFlagHandler() const { return _flagHandler; }
private:
    ShapeResultKey _shapeResult;
    CentroidResultKey _centroidResult;
    FluxResultKey _fluxResult;
    afw::table::Key<ShapeElement> _xy4;
    afw::table::Key<ErrElement> _xy4Sigma;
    afw::table::Key<ErrElement> _flux_xx_Cov;
    afw::table::Key<ErrElement> _flux_yy_Cov;
    afw::table::Key<ErrElement> _flux_xy_Cov;
    FlagHandler _flagHandler;
};
Most of SdssShapeResultKey is just the standard interface for a FunctorKey - equality comparison, isValid, accessors. It's essentially all boilerplate, aside from addFields, get, and set, and those would need to be implemented inside SdssShapeAlgorithm itself if they weren't implemented here. Whether the boilerplate involved in providing a custom FunctorKey for an algorithm is worthwhile depends on how much we expect that algorithm to be used outside the plugin framework. For SdssShape, that's often enough that the convenience to users outweighs the boilerplate. For PsfFlux and most other algorithms, it's not worthwhile. As we mentioned above, it's this class that holds the FlagHandler, which it uses to manage the keys for the flags.
The full implementation for SdssShape can be found in SdssShape.cc.
Finally, the line to add Python plugins for SdssShape and register them is almost identical to that for PsfFlux:
wrapSimpleAlgorithm(SdssShapeAlgorithm, Control=SdssShapeControl, executionOrder=1.0)
The only real difference is the executionOrder argument, which sets when this algorithm will be run relative to the others. The default is 2.0, which is appropriate for most flux algorithms, which generally expect centroid and shape algorithms to be run first. Centroid algorithms usually have executionOrder=0.0, while shape algorithms like this one typically have executionOrder=1.0.
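The effect of executionOrder can be pictured as a sort over the registered plugins. The sketch below is only an illustration of that idea in C++: the real ordering happens in the Python framework, ToyPlugin and sortedByExecutionOrder are hypothetical names, and keeping registration order for ties is an assumption of this sketch, not documented behavior.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record of a registered plugin; the real framework keeps this in Python.
struct ToyPlugin {
    std::string name;
    double executionOrder;
};

// Strict ordering on executionOrder only.
bool runsBefore(ToyPlugin const & a, ToyPlugin const & b) {
    return a.executionOrder < b.executionOrder;
}

// Returns plugin names in the order they would run. Ties keep registration
// order here (std::stable_sort); that tie rule is an assumption of this sketch.
std::vector<std::string> sortedByExecutionOrder(std::vector<ToyPlugin> plugins) {
    std::stable_sort(plugins.begin(), plugins.end(), runsBefore);
    std::vector<std::string> names;
    for (std::size_t i = 0; i < plugins.size(); ++i) {
        names.push_back(plugins[i].name);
    }
    return names;
}
```

With a flux plugin at 2.0, a shape plugin at 1.0, and a centroid plugin at 0.0, the centroid runs first and the flux last regardless of registration order.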
Algorithm Construction
Both of our example algorithms have the same constructor signature:
(Control const & ctrl, std::string const & name, afw::table::Schema & schema)
We expect most algorithms to use this signature, and it's what wrapSimpleAlgorithm and wrapSingleFrameAlgorithm expect by default. If an algorithm needs to add something to the metadata saved with the catalog (a common case is saving the radii used in aperture fluxes), it can pass needsMetadata=True to the algorithm wrapper generator (e.g. wrapSimpleAlgorithm/wrapSingleFrameAlgorithm/wrapForcedAlgorithm), which will cause it to expect a different signature:
(Control const & ctrl, std::string const & name, afw::table::Schema & schema, daf::base::PropertySet const & metadata)
Algorithms that implement the measureN method should also pass hasMeasureN to the algorithm wrapper generator, which will cause it to pass a bool doMeasureN argument to the constructor:
(Control const & ctrl, std::string const & name, afw::table::Schema & schema, bool doMeasureN)
or, if needsMetadata=True is also passed,
(Control const & ctrl, std::string const & name, afw::table::Schema & schema, daf::base::PropertySet const & metadata, bool doMeasureN)
The doMeasureN argument informs the algorithm up-front whether the measureN method will be called, allowing it to allocate additional output fields only when that is true. Passing hasMeasureN to the algorithm wrapper generator also causes a doMeasureN config option to be added, which is what's used to feed the constructor: even if an algorithm supports simultaneous measurement on multiple sources, we may not want to run it all the time.
Error Handling
When an algorithm encounters a problem, it can handle it in one of several ways:
1. It can set flag fields directly, probably using a FlagHandler data member, and return. Algorithms should always set a failure-mode-specific flag here when setting the general failure flag, but need not set the general failure flag if the problem is sufficiently minor that it is not expected to affect the outputs at all. In the future, we may add a general "suspect" flag that should be set in intermediate cases, but we'll defer that question to a future review.
2. It can throw a MeasurementError, passing an enum value that will be carried by the exception and hence passed to fail, where the flag field can be set.
3. It can throw a FatalAlgorithmError, which indicates a serious logic error in the configuration of the pipeline, such as running an algorithm that requires a Wcs in a way that doesn't provide one. This exception will not be caught by the measurement framework, and hence will generally propagate up and kill any calling code.
4. It can throw some other exception (though this is strongly discouraged except for extremely rare errors). This will call fail with a null error argument, allowing it to set a general failure flag, but nothing specific. This will also cause the measurement framework to output a warning to the logs containing the exception message. The main purpose of this behavior is to ensure that unexpected failure modes result in the general failure flag being set, while being noisy enough to ensure that "discovery" of a new failure mode results in a change to the code to avoid future warnings in the logs, in favor of setting a new flag field that indicates the reason for the failure.
Either (1) or (2) can be used to handle known failure modes that represent the inability to process a single source, and at present we do not express a preference between them. Setting flag fields directly is a bit more flexible (albeit in a way we expect would be rarely used), and would be hard to forbid without changing other aspects of the algorithm API. Throwing MeasurementError allows error-handling code to be gathered into one location, without requiring algorithm developers to wrap their entire implementation in a try/catch block.
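The exception-based cases can be summarized with a sketch of how a framework-side wrapper might dispatch them. The real wrapper is Python code in the measurement framework; this C++ version is only an illustration, and ToyMeasurementError, ToyFatalAlgorithmError, ToyRecord, ToyAlgorithm, and callMeasure are all hypothetical names. The mode argument exists only to trigger each case.

```cpp
#include <cassert>
#include <cstddef>
#include <exception>
#include <iostream>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins; the real MeasurementError and FatalAlgorithmError live in meas_base.
class ToyMeasurementError : public std::runtime_error {
public:
    ToyMeasurementError(std::string const & message, int flagBit)
        : std::runtime_error(message), _flagBit(flagBit) {}
    int getFlagBit() const { return _flagBit; }
private:
    int _flagBit;
};

class ToyFatalAlgorithmError : public std::logic_error {
public:
    explicit ToyFatalAlgorithmError(std::string const & message) : std::logic_error(message) {}
};

struct ToyRecord {
    bool generalFailure;
    int specificFlag;  // -1 means no failure-mode-specific flag is set
    ToyRecord() : generalFailure(false), specificFlag(-1) {}
};

class ToyAlgorithm {
public:
    enum { FAILURE = 0, NO_GOOD_PIXELS, EDGE, N_FLAGS };
    explicit ToyAlgorithm(int mode) : _mode(mode) {}
    // mode 0 succeeds; modes 1-3 trigger the three exception cases.
    void measure(ToyRecord &) const {
        if (_mode == 1) throw ToyMeasurementError("no good pixels", NO_GOOD_PIXELS);
        if (_mode == 2) throw ToyFatalAlgorithmError("no Wcs available");
        if (_mode == 3) throw std::runtime_error("unexpected failure");
    }
    void fail(ToyRecord & record, ToyMeasurementError * error = NULL) const {
        record.generalFailure = true;
        if (error) record.specificFlag = error->getFlagBit();
    }
private:
    int _mode;
};

// Calls measure and routes failures to fail. Returns true if a warning was logged.
bool callMeasure(ToyAlgorithm const & algorithm, ToyRecord & record) {
    try {
        algorithm.measure(record);
    } catch (ToyFatalAlgorithmError &) {
        throw;  // rethrown untouched, so it propagates to the caller
    } catch (ToyMeasurementError & error) {
        algorithm.fail(record, &error);  // known failure mode: no warning
    } catch (std::exception & error) {
        algorithm.fail(record);  // null error: only the general flag is set
        std::cerr << "WARNING: " << error.what() << std::endl;
        return true;
    }
    return false;
}
```

The handler order matters in this sketch: the fatal and known-failure handlers come before the catch-all std::exception handler, since both toy exception types derive from std::exception.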
In some cases, an algorithm's failure is the direct result of the failure of a dependency: if an algorithm depends on the slot Centroid value, for instance, it can fail when the centroider fails. In this case, the SafeCentroidExtractor and SafeShapeExtractor classes use aliases to flag fields to connect the failure in the dependency to a failure in the dependent (see InputUtilities.h for more information). This alias makes it appear like there's an extra failure-mode-specific flag for the dependent algorithm (e.g. "base_PsfFlux_flag_badCentroid"), which is just an alias to "slot_Centroid_flag", which is itself an alias to the slot centroid's general failure flag (e.g. "base_SdssCentroid_flag").
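The two-step alias chain in that example can be illustrated with a toy resolver. This is a simplified model that assumes exact-name matching in a plain map; ToyAliasMap and resolveAlias are hypothetical and do not reproduce how afw::table actually stores or resolves aliases.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical alias table: maps an alias field name to the name it stands for.
typedef std::map<std::string, std::string> ToyAliasMap;

// Follows aliases until reaching a name that is not itself an alias.
// The hop count is bounded by the table size, so a cyclic table cannot loop forever.
std::string resolveAlias(ToyAliasMap const & aliases, std::string name) {
    for (std::size_t hops = 0; hops < aliases.size(); ++hops) {
        ToyAliasMap::const_iterator it = aliases.find(name);
        if (it == aliases.end()) break;  // not an alias: this is the real field
        name = it->second;
    }
    return name;
}
```

With the two aliases from the paragraph above, looking up "base_PsfFlux_flag_badCentroid" ends at "base_SdssCentroid_flag", so a reader of the catalog sees the dependent algorithm's flag track the centroider's general failure flag.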
2 Comments
Kian-Tat Lim
One minor thing: I'm not sure I like the inheritance of the flags enum. I'm not sure it's worth it to make the usage shorter. (Now, with C++11, we should be using enum class more often, but this one needs to go to a std::bitset, so I think it has to stay a traditional enum.)
Jim Bosch
That's easy enough to change, and I don't feel particularly strongly about it either way. But it might be worth noting that I consider this to really just be a workaround for Swig: if Swig could deal with inner classes, the Result, Control, and ResultKey classes would be inner classes of the Algorithm, and then they'd all have access to the flags if they were just defined in the Algorithm class scope.