This page addresses the current state of data identification in the Generation 3 middleware, and attempts to map this to the needs of our IVOA and IVOA-style image and data services.

The discussion is framed in the context of implementing data services with a new flavor of Pipeline[Task] "activator" that acts as a web service and invokes the appropriate PipelineTasks and/or Pipelines to perform the algorithmic work. Such a service would follow at least the lower-level DALI, VOSI, and UWS conventions, if it is not an actual standard data service such as SODA.

Overview of IDs

The following table presents an approximate mapping between Butler concepts and ObsCore concepts. It is helpful to review it before reading the implementation notes on the use cases below.

For additional Butler documentation in this area, see Organizing and Identifying Datasets on lsst.io.

| Plain language concept | Butler concept | ObsCore attribute | ObsCore definition | Comments |
|---|---|---|---|---|
| Physical identifier | URI | obs_publisher_did | "This is the identifier the publisher provides for this observation. It is generally different from the original identifier given by the creator of the dataset. (new reduction, new calibration, etc..). The corresponding Utype mapped from the Spectrum DM is Curation.PublisherDID and relates to the same definition. This field contains the IVOA dataset identifier (Plante and al. 2007) for the published data product. This value must be unique within the namespace controlled by the dataset publisher (data center). It will also be globally unique since each publisher has a unique registered publisher ID. The same dataset may however have more than one publisher dataset identifier if it is published in more than one location (the creator DID, if defined for the given dataset, would be the same regardless of where the data is published)." | |
| Logical identifier | UUID | obs_creator_did | "IVOA dataset identifier given by its creator. See definition in the SpectrumDM specification" SpectrumDM: "The [obs_creator_did] is the dataset ID defined internally by the creator and may be entirely different from the DatasetID described above. It is used to identify a particular original exposure in an archive and will not necessarily change even if the VO object in question is a cutout or is otherwise further processed." | |
| Observation identifier | DataID | obs_id | | |
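
As an illustration of the mapping in the table, the following Python sketch fills the three ObsCore identifier columns from the three Butler-side identifiers. It is only a sketch under stated assumptions: the DatasetIds class, the to_obscore_ids function, the key=value serialization of the DataID, and all example values are hypothetical and are not part of the Butler API or of the ObsCore standard. A real service would likely need to wrap the UUID and URI in registered IVOA identifiers.

```python
from dataclasses import dataclass
from typing import Dict, Mapping, Union
from uuid import UUID


@dataclass(frozen=True)
class DatasetIds:
    """Hypothetical container for the three Butler-side identifiers."""

    uri: str                                 # physical identifier
    uuid: UUID                               # logical identifier
    data_id: Mapping[str, Union[int, str]]   # observation identifier (DataID)


def to_obscore_ids(ids: DatasetIds) -> Dict[str, str]:
    """Map Butler identifiers onto ObsCore columns, following the table above."""
    # Assumption: the DataID is flattened to "key=value" pairs, sorted by key
    # so that the same DataID always yields the same obs_id string.
    obs_id = "/".join(f"{key}={value}" for key, value in sorted(ids.data_id.items()))
    return {
        "obs_publisher_did": ids.uri,
        "obs_creator_did": str(ids.uuid),
        "obs_id": obs_id,
    }


# Made-up example values, for illustration only.
example = DatasetIds(
    uri="file:///repo/example/dataset.fits",
    uuid=UUID("12345678-1234-5678-1234-567812345678"),
    data_id={"instrument": "HSC", "visit": 903334, "detector": 16},
)
record = to_obscore_ids(example)
```

Sorting the DataID keys is a design choice made here so that two equal DataIDs cannot produce different obs_id strings; any other stable serialization would serve the same purpose.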

Use Cases for Data Services
