This page captures all the requirements at the SRD, LSR, OSS, and DMSR levels that directly relate to Level 3 capabilities, updated as of March 2018 from the baselined documents. As usual with Confluence, this page is itself not under change control and therefore is not the final word on anything.

Orange highlighting has been used to indicate places where a requirement applies more broadly than Level 3 but has a specific mention of Level 3 in it.

"Level 3" has been interpreted as referring to all three of "user data product support and storage", "resources for user computing", and "provision of a coding environment to users".

What were originally called the "Level 3" capabilities are now essentially a subset of the capabilities of the LSST Science Platform (LSP), together with requirements on the database and data access systems beneath it to support users' file-based and catalog-based data.

NB: We now say "User-Generated Data Products" instead of "Level 3 Data Products" (per LPM-231), and the "Level 3" term is no longer used at all to refer to user computing resources or to the provision of an LSST DM coding environment (e.g., preinstalled stack releases in JupyterLab) to users. However, for the time being we are not planning to remove this language from the LSR or OSS; it is too pervasive.

SRD (LPM-17): Science Requirements Document

Excerpts:

From Section 3.5, Data Processing and Management Requirements:

• Level 3 data products will be created by the community, including project teams, using suitable Applications Programming Interfaces (APIs) that will be provided by the LSST Data Management System. The Data Management System will also provide at least 10% of its total capability for user-dedicated processing and user-dedicated storage. The key aspect of these capabilities is that they will reside “next to” the LSST data, avoiding the latency associated with downloads. They will also allow the science teams to use the database infrastructure to store their results.

and

The catalogs will be released in a format that will allow efficient data access and analysis (such as a database and query system). 

LSR (LSE-29): LSST System Requirements

LSR-REQ-0032: Organization of Data Products

Requirement: The LSST data processing system shall provide the means for organizing the production of three classes of science data products: Level 1 (nightly cadence), Level 2 (data release cadence), and Level 3 (user-specified).

LSR-REQ-0041: Level 3 Data Products

Specification: The LSST Observatory shall support Level 3 Data Products that are the result of processing based on Level 1 and Level 2 Data Products, of a nature specified by users (by the provision of code and/or processing configuration data).

LSR-REQ-0106: Level 3 Data Processing

Specification: The LSST Observatory shall provide software, services, and hardware resources to enable the production and storage of Level 3 Data Products. It shall be possible to produce Level 3 Data Products using LSST computing resources or elsewhere, and bring them into federation with Level 1 and 2 Data Products at the LSST data center.

Discussion: Level 3 Data Products are the result of processing that utilizes Level 1 and Level 2 Data Products, of a nature specified by users (by the provision of code and/or processing configuration data).

LSR-REQ-0107: Level 3 Data Product Federation

Specification: The manner of production of Level 3 Data Products shall facilitate their federation with related Level 1 and Level 2 Data Products, when archived.

Discussion: The LSST project may, over time, promote selected Level 3 Data Products and their production to Level 2 or Level 1, subject to scientific justification and the availability of resources, and with the agreement of their originators.

LSR-REQ-0050: Level 3 Data Product Archiving 

Specification: Level 3 Data Products shall be archived, subject to project approval, based on user applications. An administrative mechanism shall be established to allocate a certain fraction of project resources for this purpose and to allocate that fraction to approved user requests based on their assessed usefulness to the project and the achievement of its science goals, and their value to the LSST user community.

LSR-REQ-0052: Public Data Release

Requirement: The LSST System shall provide open access to all LSST Level 1 and Level 2 Data Products, in accordance with LSST Corporation Board approved policies. This shall include access to all engineering, environmental, and ancillary data required for scientific interpretation of the Data Products.

Discussion: Level 3 Data Products may or may not be available for open access, depending on agreements with their creator. Whether the creator is willing to accept open access is a criterion that may be used to determine how the project's resources for Level 3 Data Product archiving and service are allocated.

The LSST Corporation reserves the right to retain confidential business records, proposals, personnel files, medical records, or other confidential documents, obtained from others.

LSR-REQ-0054: Data Product Access Interface

Requirement: The LSST shall provide access to all its public data products through an interface that utilizes, to the maximum practicable extent, community-based standards such as those for pixel-based images (e.g. FITS), as well as those being developed by the Virtual Observatory (VO) community, and that facilitates user data analysis and the production of Level 3 and other user-defined data products at LSST-provided facilities and at remote sites.

LSR-REQ-0055: Community Computing Services

Requirement: The LSST shall provide and maintain an amount of computing capacity equivalent to at least userComputingFraction (10%) of the total LSST data processing capacity (computing and storage) for the purpose of scientific analysis of LSST data and the production of Level 3 Data Products by external users.

Discussion: The detailed scope of this service is to be determined based on a representative set of system queries and analyses assembled from community input and based on MOUs with other organizations willing to serve part of the public access distribution.

The fraction set by this requirement refers only to project funded resources. The LSST Observatory expects and will facilitate community use of grid, peta-scale computing centers, etrc... (sic)

Notes:

  1. The variant use of the introductory words "Requirement" vs. "Specification" in the LSR seems to be without significance.
  2. I've used Confluence level 2 headers for the main requirement IDs and titles, with level 3 headers used to show cases where additional requirements that are relevant here are directly nested, in the LSR tree structure, under others that are shown here.

OSS (LSE-30): Observatory System Specifications

OSS-REQ-0118: Consistency and Completeness

Specification: The LSST data management system shall ensure that internal processing tasks are carried out on self-consistent and complete inputs, and that means are provided for users to achieve this in their own processing tasks.

Discussion: it should not be possible to inadvertently mix data, calibrations, and code from different data releases. All available data shall be used, and no piece of input data shall be inadvertently double-counted.

OSS-REQ-0121: Open Source, Open Configuration

Specification: All LSST-written data processing software shall be released under an open-source license. All configuration information necessary for users to be able to apply the software to reproduce LSST's processing shall also be made publicly available.

Discussion: The LSST software is permitted to depend on other open-source software packages, and will establish a configuration control mechanism for determining which are acceptable for use in the project.

Discussion: This specification does not prohibit the LSST production system from using infrastructure with a proprietary component, if that is justified by a cost-benefit analysis. The software itself must be open-source, and must be able to be run in at least small-scale production on open platforms.

OSS-REQ-0391: Data Product Conventions

Specification: LSST Data Products shall follow the conventions defined in LSE-163.

Discussion: LSE-163, the Data Products Definition Document, describes conventions for data products that cross level 1/2/3 boundaries.

OSS-REQ-0139: Level 3 Data Products

Specification: The LSST Observatory shall support Level 3 Data Products that are the result of processing based on Level 1 and Level 2 Data Products, of a nature specified by users (by the provision of code and/or processing configuration data).

Discussion: This is flowed down from LSR-REQ-0041.

The conceptual design description of the delivered data products is defined in LSE-163 Data Products Definition Document.

OSS-REQ-0140: Production

Specification: It shall be possible to create Level 3 Data Products either using external or internal (Data Access Center) resources, provided they meet certain requirements. LSST shall provide a set of specifications and a software toolkit to facilitate this.

Level 3 Data Products may consist of new catalogs, additional data to be federated with existing catalogs, or image data.

OSS-REQ-0141: Storage

Specification: The LSST Data Management system shall provide for the archiving of Level 3 Data Products that meet project-specified requirements.

OSS-REQ-0142: Access

Specification: Archived Level 3 Data Products shall be capable of being federated with and analyzed in conjunction with Level 1, Level 2, and other Level 3 Data Products. The LSST project shall support access controls for Level 3 Data Products that allow them to be restricted to specific individuals or groups as well as released for public access.

OSS-REQ-0143: Resource Allocation

Specification: The LSST project shall define a resource allocation policy and mechanism for arbitrating among the calls on Level 3 Data Product production, archiving, and analysis resources.

OSS-REQ-0176: Data Access

Specification: The LSST Data Management System shall provide open access to all LSST Level 1 and Level 2 Data Products, as defined in the LSST System Requirements and herein, in accordance with LSSTC Board approved policies. The LSST project shall make available open-source software for querying and processing the data products and for generating Level 3 Data Products, and limited computing and storage resources for performing such analyses and productions. 

OSS-REQ-0179: Data Products Processing Infrastructure

Specification: The Data Management System shall provide at least a fraction userComputingFraction (10%) of its total capability for user-dedicated processing and user-dedicated storage, including for the generation of Level 3 data products.

Discussion: This allocation does not include the resources needed to support the expected load of queries against the catalog database.
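
For illustration only, the userComputingFraction arithmetic can be written out as below. This is a minimal sketch, assuming that "total capability" is measured in any one consistent unit and already includes the user-dedicated share; the function names are hypothetical and are not taken from any LSST document. Per the Discussion above, the catalog query load is outside this budget and is not modeled.

```python
# userComputingFraction as quoted in OSS-REQ-0179 and LSR-REQ-0055.
USER_COMPUTING_FRACTION = 0.10


def min_user_capability(total_capability: float,
                        fraction: float = USER_COMPUTING_FRACTION) -> float:
    """Smallest user-dedicated share that would satisfy the requirement.

    Assumption: `total_capability` is the whole DMS capability (user share
    included), expressed in any consistent unit (cores, TB, ...).
    """
    if total_capability < 0:
        raise ValueError("total_capability must be non-negative")
    return fraction * total_capability


def meets_requirement(user_capability: float, total_capability: float,
                      fraction: float = USER_COMPUTING_FRACTION) -> bool:
    """True when the user-dedicated share is at least `fraction` of the total."""
    return user_capability >= min_user_capability(total_capability, fraction)
```

Processing and storage would each be checked separately with the same function, since the requirement names both.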

OSS-REQ-0187: Information Security

Specification: The LSST project shall ensure that Personally Identifiable Information (PII) and other sensitive data relating to individuals or business relationships are protected from unauthorized disclosure, as required by law and applicable standards.

Discussion: Data of this nature is not expected to be part of the science data set, but could arise in the engineering and facilities data collected as part of Observatory operations (e.g., information associated with the operations personnel). PII may also be associated with the resource management in Data Access Centers (e.g. names, addresses, etc. for researchers producing Level 3 data products).

Notes:

  1. I've used Confluence level 2 headers for the main requirement IDs and titles, with level 3 headers used to show cases where additional requirements that are relevant here are directly nested, in the OSS tree structure, under others that are shown here.

DMSR (LSE-61): Data Management System Requirements

DMS-REQ-0340: Access Controls of Level 3 Data Products

Specification: All Level 3 data products shall be configured to have the ability to have access restricted to the owner, a list of people, a named group, or be completely public.

Discussion: These features are supported by VOSpace.

Derived from Requirements: OSS-REQ-0142: Access; OSS-REQ-0176: Data Access; OSS-REQ-0187: Information Security
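
The four access modes named in the specification (owner, a list of people, a named group, public) can be pictured with a small data model. This is only a sketch under stated assumptions: the class, field, and method names are hypothetical, and it is not how VOSpace or the DMS actually implements access control.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Visibility(Enum):
    OWNER = "owner"    # restricted to the owner
    USERS = "users"    # restricted to an explicit list of people
    GROUP = "group"    # restricted to a named group
    PUBLIC = "public"  # completely public


@dataclass
class DataProductAcl:
    """Hypothetical access-control record for one user data product."""
    owner: str
    visibility: Visibility = Visibility.OWNER
    allowed_users: frozenset = frozenset()
    allowed_group: Optional[str] = None

    def can_read(self, user: str, user_groups: frozenset = frozenset()) -> bool:
        # The owner can always read their own product.
        if user == self.owner:
            return True
        if self.visibility is Visibility.PUBLIC:
            return True
        if self.visibility is Visibility.USERS:
            return user in self.allowed_users
        if self.visibility is Visibility.GROUP:
            return self.allowed_group is not None and self.allowed_group in user_groups
        return False  # Visibility.OWNER and not the owner
```

Making owner-only the default reflects a guess that products start out private; the requirement itself does not state a default.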

DMS-REQ-0290: Level 3 Data Import

Specification: The DMS shall be able to ingest tables from common file formats (e.g., FITS tables, CSV files with supporting metadata) to facilitate the loading of external catalogs and the production of Level-3 data products.

Derived from Requirements: OSS-REQ-0140: Production
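
As a rough picture of what "CSV files with supporting metadata" could mean in practice, the sketch below parses CSV text using a separate column-type mapping as the metadata. The function name, the form of the metadata, and the column names in the usage note are assumptions made for illustration; the DMSR does not specify an ingest interface.

```python
import csv
import io


def ingest_csv(text: str, column_types: dict) -> list:
    """Parse CSV text into typed rows.

    `column_types` stands in for the "supporting metadata": it maps each
    column name to a callable (int, float, str, ...) used to convert values.
    Columns present in the file but absent from the metadata are dropped.
    """
    reader = csv.DictReader(io.StringIO(text))
    missing = set(column_types) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"columns missing from file: {sorted(missing)}")
    rows = []
    for raw in reader:
        rows.append({name: cast(raw[name]) for name, cast in column_types.items()})
    return rows
```

For example, `ingest_csv("objectId,ra,dec\n1,10.5,-30.25\n", {"objectId": int, "ra": float, "dec": float})` yields one row with an integer identifier and two float coordinates. A FITS table would need a third-party reader and is not covered here.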

DMS-REQ-0119: DAC resource allocation for Level 3 processing

Specification: The DMS shall provide a resource allocation mechanism for the DACs that allows the prioritization and allocation of resources to a variety of Level 3 processing activities.

Discussion: It is assumed that the DAC Level 3 processing resources will likely be oversubscribed, making this necessary.

Derived from Requirements: OSS-REQ-0143: Resource Allocation

DMS-REQ-0120: Level 3 Data Product Self Consistency

Specification: The DMS shall provide a means for ensuring that users’ Level 3 processing tasks can be carried out on self-consistent inputs - i.e., catalogs, images, metadata, calibrations, camera configuration data, etc., that match each other and all arise from consistent Level 1 and Level 2 processings.

Derived from Requirements: OSS-REQ-0120: Consistency; OSS-REQ-0118: Consistency and Completeness

DMS-REQ-0121: Provenance for Level 3 processing at DACs

Specification: The DMS shall provide a means for recording provenance information for Level 3 processing that is performed at DACs, covering at least all the DMS-provided inputs to the processing (e.g., catalog data used as inputs, dataset metadata, calibrations and camera data from the EFD).

Discussion: The DMS should also provide an optional means for Level 3 processing users at DACs to maintain basic provenance information on their own inputs to a processing task, such as code or additional calibration data.

Rationale: the DMS should facilitate Level 3 processing users in being able to carry out their work in a reproducible way.

Derived from Requirements: OSS-REQ-0122: Provenance (extended to cover Level 3)
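
To make the split between DMS-provided inputs and optional user-provided inputs concrete, here is a minimal sketch of a provenance record. The field names and the SHA-256 fingerprint are illustrative assumptions, not a design taken from the DMSR.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ProvenanceRecord:
    """Hypothetical provenance record for one user processing task."""
    task_name: str
    data_release: str
    # DMS-provided inputs (the part the specification requires to be covered).
    catalog_inputs: list = field(default_factory=list)
    dataset_metadata: dict = field(default_factory=dict)
    calibration_inputs: list = field(default_factory=list)
    # Optional user-provided inputs, e.g. code version or extra calibrations.
    user_inputs: dict = field(default_factory=dict)

    def fingerprint(self) -> str:
        """Stable digest of the record, so two runs can be compared cheaply."""
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()
```

Two runs with identical recorded inputs get the same fingerprint, which is one simple way to check that a rerun used the same inputs. List order is significant in this sketch, so inputs would need to be recorded in a canonical order.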

DMS-REQ-0125: Software framework for Level 3 catalog processing

Specification: The DMS shall provide a software framework that facilitates Level 3 processing of catalogs. This framework shall provide a means for applying user-provided processing to catalog data, including measuring and ensuring the completeness of the application - i.e., that the specified processing was applied to all of, and only, the entire contents of the desired catalog(s).

Derived from Requirements: DMS-REQ-0120: Level 3 Data Product Self Consistency; OSS-REQ-0121: Open Source, Open Configuration; OSS-REQ-0122: Provenance
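
The "all of, and only" completeness condition can be expressed as a comparison between the identifiers that were supposed to be processed and those that actually were. The sketch below is an assumption about how such a check might look, with hypothetical function names; it says nothing about the real framework. The same check would apply to image datasets.

```python
from collections import Counter


def completeness_report(expected_ids, processed_ids) -> dict:
    """Compare what should have been processed with what actually was."""
    expected = set(expected_ids)
    counts = Counter(processed_ids)
    seen = set(counts)
    return {
        "missing": sorted(expected - seen),       # not "all of"
        "unexpected": sorted(seen - expected),    # not "only"
        "duplicated": sorted(i for i, n in counts.items() if n > 1),
    }


def is_complete(report: dict) -> bool:
    """True when nothing is missing, unexpected, or processed twice."""
    return not any(report.values())
```

Reporting duplicates separately also covers the OSS-REQ-0118 concern that no input be inadvertently double-counted.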

DMS-REQ-0128: Software framework for Level 3 image processing

Specification: The DMS shall provide a software framework that facilitates Level 3 processing of image data. This framework shall provide a means for applying user-provided processing to image data, including measuring and ensuring the completeness of the application - i.e., that the specified processing was applied to all of, and only, the entire contents of the desired dataset.

Derived from Requirements: DMS-REQ-0120: Level 3 Data Product Self Consistency; OSS-REQ-0121: Open Source, Open Configuration; OSS-REQ-0122: Provenance

DMS-REQ-0294: Processing of Datasets

Specification: The DMS shall process all requested datasets until either a successful result is recorded or a permanent failure is recognized. If any dataset is processed, in part or in whole, more than once, only one of the wholly processed results will be recorded for further processing.

Discussion: The criteria may be specified by DMS processing software, or by a scientist end-user for Level-3 production.

Derived from Requirements: OSS-REQ-0117: Automated Production; OSS-REQ-0118: Consistency and Completeness; OSS-REQ-0119: Completeness; OSS-REQ-0120: Consistency
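
One way to read this specification is as a retry loop that stops on success or on a recognized permanent failure, and that records at most one outcome per dataset. The sketch below assumes the task signals failures by raising one of two hypothetical exception types, and it treats exhausting `max_attempts` as a permanent failure; both are illustrative choices, since the criteria are left to the processing software or the end-user.

```python
class TransientFailure(Exception):
    """Hypothetical: the task failed but a retry might succeed."""


class PermanentFailure(Exception):
    """Hypothetical: retrying cannot help."""


def process_all(dataset_ids, task, max_attempts=3):
    """Run `task` on each dataset until success or permanent failure.

    Returns (results, failures); each dataset appears in exactly one of them.
    """
    results, failures = {}, {}
    for dataset_id in dataset_ids:
        if dataset_id in results or dataset_id in failures:
            continue  # an outcome is already recorded; never record a second one
        for attempt in range(1, max_attempts + 1):
            try:
                results[dataset_id] = task(dataset_id)
                break
            except PermanentFailure as exc:
                failures[dataset_id] = str(exc)
                break
            except TransientFailure as exc:
                if attempt == max_attempts:
                    # Assumption: running out of attempts counts as permanent.
                    failures[dataset_id] = f"gave up after {attempt} attempts: {exc}"
    return results, failures
```

A result is only stored once the task returns, so a partially processed attempt that raises leaves nothing behind in `results`.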

DMS-REQ-0308: Software Architecture to Enable Community Re-Use

Specification: The DMS software architecture shall be designed to enable high throughput on high-performance compute platforms, while also enabling the use of science-specific algorithms by science users on commodity desktop compute platforms.

Discussion: The high data volume and short processing timeline for LSST Productions anticipates the use of high-performance compute infrastructure, while the need to make the science algorithms immediately applicable to science teams for Level-3 processing drives the need for easy interoperability with desktop compute environments.

Derived from Requirements: OSS-REQ-0121: Open Source, Open Configuration

DMS-REQ-0185: Archive Center

Specification: The Archive Center shall provide computing, storage, and network infrastructure to support, simultaneously: nightly processing including image processing, detection, association, and moving object pipelines, and the generation of all time-critical data products, i.e. alerts; the data release production, including Level-2 data product creation, permanent storage for all data products (with provenance), including federated Level-3 products; and serve data for replication to data centers and end user sites.

Derived from Requirements: DMS-REQ-0163: Re-processing Capacity; OSS-REQ-0004: The Archive Facility

DMS-REQ-0122: Access to catalogs for external Level 3 processing

Specification: The DMS shall facilitate Level 3 catalog processing that may take place at external facilities outside the DACs. This will principally be by facilitating the export of catalogs and the provision of tools for maintaining and validating exported data.

Derived from Requirements: OSS-REQ-0140: Production; OSS-REQ-0180: Data Products Query and Download Availability

DMS-REQ-0126: Access to images for external Level 3 processing

Specification: The DMS shall facilitate Level 3 image processing that may take place at external facilities outside the DACs. This will principally be by facilitating the export of image datasets and the provision of tools for maintaining and validating exported data.

Derived from Requirements: OSS-REQ-0140: Production; OSS-REQ-0180: Data Products Query and Download Availability

DMS-REQ-0123: Access to input catalogs for DAC-based Level 3 processing

Specification: The DMS shall provide access to all Level 1 and Level 2 catalog products through the LSST project’s Data Access Centers, and any others that have been established and funded, for Level 3 processing that takes place at the DACs.

Derived from Requirements: OSS-REQ-0140: Production

DMS-REQ-0127: Access to input images for DAC-based Level 3 processing

Specification: The DMS shall provide access to all Level 1 and Level 2 image products through the LSST project’s Data Access Centers, and any others that have been established and funded, for Level 3 processing that takes place at the DACs.

Derived from Requirements: OSS-REQ-0140: Production

DMS-REQ-0124: Federation with external catalogs

Specification: The DMS shall provide a means for federating Level 1, 2, and 3 catalogs with externally provided catalogs, for joint analysis. The DMS shall provide specifications for how external data must be provided in order for this to be achieved. The DMS shall strive to support community standards in this regard, including, but not limited to, virtual observatory facilities that may be available during the project lifetime.

Derived from Requirements: DMS-REQ-0125: Software framework for Level 3 catalog processing; OSS-REQ-0140: Production

DMS-REQ-0106: Coadded Image Provenance

Specification: For each Coadded Image, DMS shall store: the list of input images and the pipeline parameters, including software versions, used to derive it, and a sufficient set of metadata attributes for users to re-create them in whole or in part.

Discussion: Not all coadded image types will be made available to end-users or retained for the life of the survey; however, sufficient metadata will be preserved so that they may be recreated by end-users.

Derived from Requirements: OSS-REQ-0122: Provenance; DMS-REQ-0104: Produce Co-Added Exposures

DMS-REQ-0335: PSF-Matched Coadds

Specification: One (ugrizy plus multi-band) set of PSF-matched coadds shall be made but shall not be archived.

Discussion: These are used to measure colors and shapes of objects at “standard” seeing. Sufficient provenance information will be made available to allow these coadds to be recreated by Level 3 users.

Derived from Requirements: OSS-REQ-0133: Level 2 Data Products


