For LSST Science Pipelines documentation visit pipelines.lsst.io.
The LSST Stack consists of dozens of packages, and depends upon a large number of third-party libraries and utilities. The following guide to the packages in active use or development should be useful to both beginners and seasoned developers. This tour of the LSST Stack introduces its capabilities, explains how it is organized and built, and provides rough guidance on the maturity of the code base.
The LSST Software Stack consists of software written by the DMS team, as well as third-party libraries and utilities. The DM-authored code is written in two languages: Python (currently v2.7) and C++11. All high-level code is written in Python, which is the preferred language unless performance demands otherwise. The C++ code is made available to Python via SWIG. The Stack also makes extensive use of selected external libraries, utilities, Python modules, and tools.
The reference operating system for the LSST Stack is Red Hat Enterprise Linux (RHEL), using the gcc compiler, though this will soon change to CentOS. The Stack is built regularly on a number of platforms (see the list of supported platforms), and has been built successfully on a few other Unix-based platforms. Indeed, the DM Team develops on several of these platforms.
The build system is SCons, and version control is managed with git. The LSST code repository is hosted on GitHub. All code is wrapped into EUPS packages, which allow users to install multiple versions of a package side by side and to mix and match them. This is very useful for developing code, for testing and enforcing the compatibility of package versions, and for managing package dependencies.
Source code-level documentation, which is intended to describe the APIs, classes, and basic task usage, is generated using Doxygen. Higher-level documentation is contained in this LSST Software User Guide, and in the companion LSST DM Developer Guide.
The LSST Stack is organized into a large number of packages that include applications and supporting libraries that are logically connected. The following diagram shows the architecture of the package categories and, for DM-authored packages, the primary implementation language:
These are Python scripts for processing data that may be run from the command line. Individual tasks are described elsewhere. The packages may include one or more tasks (configurable applications that produce some scientifically useful product), which are the primary unit of code reuse for non-developers.
Tour of the Packages
The packages that have been developed for the LSST Data Management System are for the most part organized by function; a goal of this organization is to manage dependencies. See the following resources for details:
- The list of LSST packages and tasks in active use
- Task-level (doxygen) documentation: http://lsst-web.ncsa.illinois.edu/doxygen/x_masterDoxyDoc/
- LSST/DM software repositories at: http://github.com/LSST
Certain groups of packages aggregate capabilities that are essential for building or using applications. These include the following:
The Applications Framework (afw) package provides basic functionality for an image processing system: the representation of an image; methods for operating on images, displaying them, and accessing their pixels; and methods for detecting and cataloging sources within them.
The Data Access Framework (daf_*) packages provide utilities for managing access to images and calibration files, as well as persisting output products, in a way that is transparent to the image processing code. While these utilities are not strictly required for processing general user data with the LSST Stack (there are file-based alternatives), the Butler and related utilities are essential for large LSST productions.
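The idea of keeping file locations out of the processing code can be illustrated with a small sketch. This is not the Butler API; the class `ToyButler`, the method `path_for`, the dataset type name, the path template, and the data ID keys are all hypothetical and exist only to show how a data identifier might be resolved to a file without the caller knowing the directory layout.

```python
import posixpath


class ToyButler(object):
    """Hypothetical stand-in for a data-access layer (NOT the LSST Butler).

    It maps a dataset type plus a data ID (a dict of keys) to a file path,
    so that processing code never builds paths itself.
    """

    def __init__(self, root, templates):
        self.root = root            # top-level directory of the repository
        self.templates = templates  # dataset type -> path template

    def path_for(self, dataset_type, data_id):
        """Return the path for one dataset; raises KeyError if the type is unknown."""
        template = self.templates[dataset_type]
        return posixpath.join(self.root, template.format(**data_id))


# Illustrative layout only; the template below is an assumption.
butler = ToyButler(
    root="/data/repo",
    templates={"calexp": "calexp/v{visit:07d}/c{ccd:03d}.fits"},
)
calexp_path = butler.path_for("calexp", {"visit": 1234, "ccd": 5})
```

Because the layout lives in one table of templates, it could be changed (or replaced by a database lookup) without touching the code that asks for a dataset.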
Skymap is a class that can represent pixellated data covering most or all of the sky. The sky is divided into large, rectangular tracts that may overlap. Each tract is, in essence, a single large exposure though it may have been created by combining many overlapping exposures. Tracts are subdivided into patches that may also overlap. (In practice the patch size is selected to be manageable in-memory; imaging data are typically saved as one FITS file per patch.) Image templates and various deep co-adds are stored with this all-sky tessellation, and transformed into the geometry of single-visit images as necessary during processing.
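The subdivision of a tract into overlapping patches can be sketched in a few lines. This is a toy model, not the Skymap class: the function `make_patches`, the `Patch` tuple, and the parameters `tract_size`, `patch_inner`, and `border` are names assumed here for illustration, and the tract is treated as a flat square pixel grid with no sky projection.

```python
from collections import namedtuple

# Boxes are (x0, y0, x1, y1) in tract pixel coordinates, half-open.
Patch = namedtuple("Patch", ["index", "inner", "outer"])


def make_patches(tract_size, patch_inner, border):
    """Tile a square tract with patches.

    Each patch has an 'inner' box (the inner boxes tile the tract without
    overlap) and an 'outer' box, which is the inner box grown by `border`
    pixels on every side and clipped to the tract. Neighbouring outer
    boxes therefore overlap.
    """
    n = -(-tract_size // patch_inner)  # ceiling division: patches per side
    patches = []
    for iy in range(n):
        for ix in range(n):
            x0, y0 = ix * patch_inner, iy * patch_inner
            x1 = min(x0 + patch_inner, tract_size)
            y1 = min(y0 + patch_inner, tract_size)
            outer = (
                max(x0 - border, 0),
                max(y0 - border, 0),
                min(x1 + border, tract_size),
                min(y1 + border, tract_size),
            )
            patches.append(Patch((ix, iy), (x0, y0, x1, y1), outer))
    return patches


# A 1000-pixel tract cut into 400-pixel patches with a 50-pixel border.
patches = make_patches(tract_size=1000, patch_inner=400, border=50)
```

In this sketch the patch size plays the role described above: it is chosen so that one patch fits comfortably in memory, and each patch would correspond to one file on disk.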
Managing Task Execution
The ctrl_* packages control the LSST DMS software when processing in parallel on a large cluster of machines, including XSEDE platforms. Various native tasks (particularly, pipelines) can be invoked concurrently on multiple cores on many machines, and the communication and task management is performed with these packages. Many tasks can be run over multiple datasets in parallel, on a single machine, using the "
The LSST Stack is a prototype of the system that is intended to process, archive, and serve data from LSST at the start of survey operations (nearly a decade hence, as this is written). Thus, the maturity of the code base is uneven. The maturity of the components is noted below, with the following meanings:
| Maturity | Meaning |
|---|---|
| Good | Code has been exercised in science productions, and can produce technically viable results when configured properly. Enhancements and refactoring are likely. |
| Fair | Code has been exercised, though significant shortcomings are known; substantial enhancement, refactoring or replacement is expected. |
| Primitive | Code has been implemented as a proof-of-concept, or prototype, for a capability that has yet to be fully developed. Replacement is expected. |
Subject to Change!
The organization of the LSST Stack, as well as its dependencies on third-party software, is under constant review. Refactoring is to be expected from one tagged release to the next.
[Architecture diagram labels: Config & Data Access; Pipeline Execution Middleware & Workflow Management]