
This page is currently under development: text may change at any time! The content has not yet been reviewed for accuracy and completeness.

The daf_butlerUtils package is a DMS middleware component that abstracts data retrieval and storage from pipelines and other tasks. The following subsections describe key concepts of this package. 

Butler

The Butler provides a generic mechanism for persisting and retrieving data using mappers. A Butler manages a collection of datasets known as a repository. Each dataset has a type representing its intended usage and a location. Note that the dataset type is not the same as the C++ or Python type of the object containing the data. For example, an ExposureF object might be used to hold the data for a raw image, a post-ISR image, a calibrated science image, or a difference image. These would all be different dataset types. 

A Butler can produce a collection of possible values for a key (or tuples of values for multiple keys) if given a partial data identifier. It can check for the existence of a file containing a dataset given its type and data identifier. The Butler can then retrieve the dataset. Similarly, it can persist an object to an appropriate location when given its associated data identifier.
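The four operations described above (query with a partial data identifier, existence check, retrieval, and persistence) can be illustrated with a small in-memory stand-in. This is only a sketch: `ToyRepository` and its method names are hypothetical and are not the daf_butlerUtils or Butler API, and a real repository stores datasets in files located by a mapper rather than in a dictionary.

```python
class ToyRepository:
    """Hypothetical in-memory stand-in for a repository (not the real Butler)."""

    def __init__(self):
        # Maps (dataset type, sorted data-identifier items) -> stored object.
        self._datasets = {}

    @staticmethod
    def _key(dataset_type, data_id):
        return dataset_type, tuple(sorted(data_id.items()))

    def put(self, obj, dataset_type, data_id):
        """Persist an object under its dataset type and data identifier."""
        self._datasets[self._key(dataset_type, data_id)] = obj

    def exists(self, dataset_type, data_id):
        """Check whether a dataset of this type and identifier is stored."""
        return self._key(dataset_type, data_id) in self._datasets

    def get(self, dataset_type, data_id):
        """Retrieve a stored dataset."""
        return self._datasets[self._key(dataset_type, data_id)]

    def query(self, dataset_type, keys, partial_id):
        """Return sorted tuples of values for `keys` among the datasets of
        `dataset_type` whose identifiers match `partial_id`."""
        found = set()
        for stored_type, items in self._datasets:
            full_id = dict(items)
            if stored_type != dataset_type:
                continue
            if all(full_id.get(k) == v for k, v in partial_id.items()):
                found.add(tuple(full_id[k] for k in keys))
        return sorted(found)


repo = ToyRepository()
# The same kind of object can be stored under different dataset types.
repo.put("pixels-a", "raw", {"visit": 1, "ccd": 0})
repo.put("pixels-b", "raw", {"visit": 1, "ccd": 1})
repo.put("pixels-c", "calexp", {"visit": 2, "ccd": 0})

ccds_in_visit_1 = repo.query("raw", ["ccd"], {"visit": 1})
```

Here `ccds_in_visit_1` lists the `ccd` values available for visit 1, which is the kind of answer a partial data identifier yields; the full identifier can then be passed to `exists` and `get`.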

Note that the Butler has two more advanced features when retrieving a dataset. First, the retrieval is lazy: input does not occur until the dataset is actually accessed. This allows datasets to be retrieved and placed on a clipboard prospectively at little cost, even if the algorithm of a stage ends up not using them. Second, the Butler calls a standardization hook upon retrieval of the dataset. This function, contained in the input mapper object, must perform any manipulations needed to make the retrieved object conform to standards, including translating metadata.
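A minimal sketch of these two features follows. It assumes a hypothetical `LazyDataset` wrapper with an explicit `resolve()` method; the actual Butler mechanism is not shown in this page and may defer input differently. The reader and standardization functions are also invented for illustration.

```python
class LazyDataset:
    """Hypothetical proxy that defers reading until the value is needed."""

    def __init__(self, read, standardize):
        self._read = read                # callable that performs the input
        self._standardize = standardize  # hook applied once after reading
        self._loaded = False
        self._value = None

    def resolve(self):
        """Read and standardize on first access; reuse the result afterwards."""
        if not self._loaded:
            self._value = self._standardize(self._read())
            self._loaded = True
        return self._value


read_log = []  # records each time input actually happens


def read_raw():
    # Stand-in for reading a file; the metadata keys are made up.
    read_log.append("read")
    return {"FILTER": " R ", "EXPTIME": "30.0"}


def std_raw(item):
    # Stand-in for a standardization hook that translates metadata.
    return {
        "filter": item["FILTER"].strip().lower(),
        "exptime": float(item["EXPTIME"]),
    }


# Creating the proxy is cheap: nothing has been read yet.
proxy = LazyDataset(read_raw, std_raw)
```

Calling `proxy.resolve()` triggers the read and the standardization hook exactly once; a stage that never calls it pays no input cost.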

Camera Mapper

The focal plane of a physical camera is assumed to consist of one or more sections (rafts, in LSST-speak), each composed of multiple sensors (CCDs). Each CCD in turn contains one or more read-out amplifiers (amps). A raw image is the result of obtaining data from the camera after an exposure, as assembled into a file. Relevant camera attributes are described in policy files, including a geometry description (CameraGeom object) and a filter description (Filter class static configuration); an optional sensor defects description directory may also be provided. Information from the camera geometry and defects descriptions is inserted into all images (Exposure objects) that are returned.

The mapper uses one or two registries to retrieve metadata about the images. The first is a registry of all raw exposures, which must contain the time (epoch) of the observation. One or more tables (or the equivalent) within the registry are used to look up data identifier components that are not specified by the user (e.g. filter) and to return results for metadata queries. The second is an optional registry of all calibration data. It should record the validity start and end times of each calibration dataset, and these validity ranges should cover the times of the observations.
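The two lookups described above can be sketched with an in-memory SQLite database. The table and column names used here (`raw`, `bias`, `taiObs`, `validStart`, `validEnd`, `path`) are assumptions made for illustration, as are the sample rows; the actual registry schema depends on the camera and is not specified in this page.

```python
import sqlite3


def make_registry():
    """Build a toy registry holding raw exposures and bias calibrations."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE raw (visit INTEGER, ccd INTEGER, filter TEXT, taiObs TEXT)"
    )
    conn.execute(
        "CREATE TABLE bias (ccd INTEGER, validStart TEXT, validEnd TEXT, path TEXT)"
    )
    conn.executemany(
        "INSERT INTO raw VALUES (?, ?, ?, ?)",
        [
            (100, 0, "r", "2012-03-10T04:00:00"),
            (200, 0, "g", "2012-06-01T05:30:00"),
        ],
    )
    conn.executemany(
        "INSERT INTO bias VALUES (?, ?, ?, ?)",
        [
            (0, "2012-01-01T00:00:00", "2012-04-30T23:59:59", "bias/early/c00.fits"),
            (0, "2012-05-01T00:00:00", "2012-12-31T23:59:59", "bias/late/c00.fits"),
        ],
    )
    return conn


def lookup_filter(conn, visit):
    """Fill in a data identifier component the user did not specify."""
    row = conn.execute(
        "SELECT DISTINCT filter FROM raw WHERE visit = ?", (visit,)
    ).fetchone()
    return row[0] if row else None


def find_bias(conn, visit, ccd):
    """Pick the calibration whose validity range contains the observation time."""
    obs = conn.execute(
        "SELECT taiObs FROM raw WHERE visit = ? AND ccd = ?", (visit, ccd)
    ).fetchone()
    if obs is None:
        return None
    # ISO 8601 timestamps of equal format compare correctly as strings.
    cal = conn.execute(
        "SELECT path FROM bias WHERE ccd = ? AND validStart <= ? AND validEnd >= ?",
        (ccd, obs[0], obs[0]),
    ).fetchone()
    return cal[0] if cal else None


registry = make_registry()
```

With this data, `lookup_filter(registry, 100)` supplies the missing filter for visit 100, and `find_bias(registry, 200, 0)` selects the calibration valid at the time of visit 200.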

The CameraMapper manages datasets within a root directory. It can also be given an outputRoot. If so, the input root is linked into the outputRoot directory using a symlink named "_parent"; writes go into the outputRoot, while reads can come from either the root or the outputRoot. When an outputRoot is in turn used as the input for further processing, the resulting chain of _parent links allows any dataset to be retrieved. Note that writing a dataset that is already present in the input root hides the existing dataset but does not overwrite it.
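The read behavior described above can be sketched as a search that starts in the outputRoot and follows "_parent" links until the file is found. The `locate` helper and the file names in `demo` are hypothetical; this illustrates the lookup order only and is not the CameraMapper implementation. It assumes a platform that supports symlinks.

```python
import os
import tempfile


def locate(root, relpath):
    """Return the first existing path for relpath, searching root and then
    each directory reached by following the chain of _parent links."""
    current = root
    while True:
        candidate = os.path.join(current, relpath)
        if os.path.exists(candidate):
            return candidate
        parent = os.path.join(current, "_parent")
        if not os.path.exists(parent):
            return None  # end of the chain: dataset not found
        current = parent


def _write(path, text):
    with open(path, "w") as f:
        f.write(text)


def _read(path):
    with open(path) as f:
        return f.read()


def demo():
    """Build an input root and an outputRoot, then read through the chain."""
    with tempfile.TemporaryDirectory() as tmp:
        input_root = os.path.join(tmp, "input")
        output_root = os.path.join(tmp, "output")
        os.makedirs(input_root)
        os.makedirs(output_root)
        os.symlink(input_root, os.path.join(output_root, "_parent"))

        _write(os.path.join(input_root, "raw.txt"), "from input")
        _write(os.path.join(input_root, "calexp.txt"), "old")
        # Writes go to the outputRoot; the input copy is hidden, not replaced.
        _write(os.path.join(output_root, "calexp.txt"), "new")

        return {
            "raw": _read(locate(output_root, "raw.txt")),
            "calexp": _read(locate(output_root, "calexp.txt")),
            "calexp_in_input": _read(os.path.join(input_root, "calexp.txt")),
            "missing": locate(output_root, "nothing.txt"),
        }
```

In `demo`, the raw file is found through the _parent link, while the calexp read returns the copy written to the outputRoot and leaves the input copy untouched.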

The mapper's behavior is largely specified by a policy file. See MapperDictionary.paf for descriptions of the available items. The 'exposures', 'calibrations', and 'datasets' subpolicies configure mappings (see Mappings class). Functions to map (i.e., provide a path to the data given a dataset identifier dictionary) and standardize (i.e., convert data into some standard format or type) may be provided in the subclass as "map_{dataset type}" and "std_{dataset type}", respectively. If non-Exposure datasets cannot be retrieved using standard daf_persistence methods alone, a "bypass_{dataset type}" function may be provided in the subclass to return the dataset instead of using the "datasets" subpolicy. Implementations of map_camera and std_camera that should typically be sufficient are provided in this base class.
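The naming convention above lends itself to dispatch by method name. The following is a simplified sketch under stated assumptions: `ToyMapper`, `MyCameraMapper`, the path templates, and the method signatures are all hypothetical and deliberately reduced; the real CameraMapper methods take additional arguments and are configured from the policy file rather than from a plain dictionary.

```python
class ToyMapper:
    """Hypothetical base class that dispatches on map_/std_/bypass_ names."""

    def __init__(self, templates):
        # Stand-in for the policy: dataset type -> path template.
        self.templates = templates

    def map(self, dataset_type, data_id):
        """Return a path, preferring a subclass map_{dataset type} method."""
        override = getattr(self, "map_" + dataset_type, None)
        if override is not None:
            return override(data_id)
        return self.templates[dataset_type] % data_id

    def standardize(self, dataset_type, item, data_id):
        """Apply a subclass std_{dataset type} method if one exists."""
        hook = getattr(self, "std_" + dataset_type, None)
        return hook(item, data_id) if hook is not None else item

    def retrieve(self, dataset_type, data_id, read):
        """Use bypass_{dataset type} if present; otherwise map, read, standardize."""
        bypass = getattr(self, "bypass_" + dataset_type, None)
        if bypass is not None:
            return bypass(data_id)
        path = self.map(dataset_type, data_id)
        return self.standardize(dataset_type, read(path), data_id)


class MyCameraMapper(ToyMapper):
    def map_raw(self, data_id):
        return "raw/v%(visit)d/c%(ccd)02d.fits" % data_id

    def std_raw(self, item, data_id):
        # Translate metadata into a standard form.
        return dict(item, filter=item["filter"].strip().lower())

    def bypass_ccdList(self, data_id):
        # Returned directly, without consulting any path template.
        return [0, 1, 2]


mapper = MyCameraMapper({"calexp": "calexp/v%(visit)d.fits"})
raw_path = mapper.map("raw", {"visit": 3, "ccd": 7})
calexp_path = mapper.map("calexp", {"visit": 3})
```

Here `raw_path` comes from the subclass method and `calexp_path` from the template, mirroring how a subclass overrides only the dataset types that need special handling.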


