Date
Attendees
Goals
This is a design sprint to kick off the LSST Science Pipelines documentation reboot. Our goal is to a create a tangible vision of what the Science Pipelines documentation will be. Questions we want to answer are:
- Who are the users of Science Pipelines documentation? What does each group want to get out of the Science Pipelines and its documentation? Do those needs conflict? Do we need to prioritize one user group in the initial implementation?
- What are the boundaries of the Science Pipelines documentation (the site at https://pipelines.lsst.io)? What are adjacent documentation projects that the Science Pipelines documentation might link against?
- What's the curriculum for learning the Science Pipelines? What are the concepts that the Science Pipelines documentation site needs to cover? How are these concepts organized (hierarchically or as a bottom-up information network). Do different types of users need specific entry points into the documentation and Science Pipelines itself?
- What kinds of content are we going to be producing? What do the templates of these topic types look likes?
- How are concepts unique to Science Pipelines, like tasks and command line tasks, documented in both a code and information architecture sense?
Intended Sprint Products
These are suggested products and outcomes from the sprint:
- A map of the science pipelines site. This map should resolve into individual HTML/reStructuredText documents (topics in Every Page is Page One terminology). Each topic should be annotated with:
- Topic name.
- Content purpose and scope.
- Topic type (i.e., template).
- Adjacent topics (topics that link into this page; topics this page will link out to).
- Topic types and templates. Each template shapes how different types of topics are written. Examples can be: API reference, task, command line task, tutorial project, conceptual overview, recipe. See Every Page is Page One Chapter 9: EPPO Topics Conform to a Type.
- Timelines. Timeline for content and for documentation infrastructure.
Prep Work/Background Reading Material
- Read Every Page is Page One.
- Jonathan Sick's "Pipelines Documentation Site Organization Sketch" is on clo.
- LDM-493: Data Management Documentation Architecture.
- Potentially relevant design docs, which may be cross-referenced with or otherwise relate to Science Pipelines docs:
- validate-base documentation
- Astropy documentation
Meeting Logistics
- Tuesday December 6: campfire chat at Bentley's or elsewhere.
- Wednesday December 7. 9:00 am to 5:00 pm. LSST Workroom.
- Thursday December 8. 9:00 am to 5:00 pm. LSST Workroom.
- Friday December 9. 9:00 am to 5:00 pm (or as participants depart). LSST Workroom.
Discussion items
Time | Item | Who | Notes |
---|---|---|---|
What is the scope of the "Science Pipelines" documentation site?
|
| ||
Boundary between Pipelines docs and the Developer Guide Should the pipelines documentation cover developer and build-oriented topics currently in the DM Developer Guide? Do pipelines users need to be able to create Stack packages to make Level 3 data products?
|
| ||
Science Pipelines docs and LDM-151 |
| ||
Who are our users?
|
Summary
| ||
EUPS Packages as units of organization
|
|
| |
What is the structure of the documentation homepage?
| Frameworks.
Twinkles workflow.
Homepage structure. | ||
Where should concepts of science interest (such as algorithm details) be documented?
|
| ||
How should examples and tutorials be produced?
| We need additional prototyping and design discussion before we identify a pattern for producing and testing examples in documentation. | ||
How should C++/Python API reference documentation be produced? |
| ||
Listing topic types and templates
| Preliminary listing.
Task topic type.
README topic type + GitHub summary line.
Measurement framework topic example
Butler framework topic example. | ||
Community.lsst.org and the docs |
| ||
Tagging command line tasks | We'll have lots of lists of command line tasks in two places: module topic pages and in processing context sections of the home page.
| ||
Task configuration and re-targetting | |||
Command line task topic types vs task topic types | Task framework documentation should document the philosophy of tasks vs command line tasks
| ||
Measurement extensions listing | We can look at the registry of measurement plugins (extensions) | ||
Important frameworks | Important/interesting frameworks are the ones that span multiple modules
| ||
Implementation plan |
LSST Science Pipelines
- Installation and setting up
- Processing data: a tutorial
- Release Notes
- Community, and getting help
- How to report issues
- How to contribute
Processing Data
In the beginning, this will be a single page that describes each measurement context and the main processing tasks that are done here.
****
The Processing Data section is oriented around command line entrypoints (command line tasks or supertasks) and documents processing patterns and algorithmic considerations.
The sections are patterned around typical user pipelines and processing/measurement contexts (single frames, coadds, difference imaging, and multi-epoch datasets). Contexts are slightly different from LDM-151 Section 5 headers. For example, we treat coaddition and difference imaging as different contexts.
In each section, there will be:
- Overview pages that provide a narrative to command line processing and algorithms.
- Tutorials that illustrative command line tasks with realistic datasets.
- Lists of command line tasks, linking to their reference pages. Command line task reference pages are hosted inside package documentatation. Command line task reference pages also link to task reference pages. Organize command line tasks between:
- processing data
- measuring data
Data ingest
- Overview
- tutorials
Single Frame
- Overview — what do we do in a single frame context. Then link to processCcd.
- Tutorials
Coaddition
- Overview topic
- Tutorials
Difference imaging
- Overview
- Tutorial
Multi-epoch object characterization
- Overview. E.g. https://lsst-web.ncsa.illinois.edu/doxygen/x_masterDoxyDoc/pipe_tasks_multi_band.html
- Tutorial
Postprocessing
- Overview(s)
- tutorials
- May need finer grained organization
Frameworks
- Measurement framework
- Butler framework
- task framework
- obs framework
- modelling framework
- geometry framework
- validation framework
- Build, packaging and utility framework
API modules
- lsst.afw.image - Image data structures
- ...
lsst.module.name — Readable name
Context establishment paragraph.
Links to related modules, framework pages, and disambiguation.
Design/High Level Overview
If necessary?
Tasks
- Listing of tasks (autogenerated; alphabetical)
Using the lsst.module.name API
- Links to API concept pages
- If it has a C++/Python API
Python API reference
- list of API object reference pages
C++ API reference
- list of API object references pages
Packaging
- Link to EUPS package/GitHub repository
- Dependencies: auto-generated graph/list of EUPS dependencies
Related documentation
- Linked design documents
- Linked technotes
- Linked papers
- Linked Community conversations
TaskName
Summary/context (1 sentence).
Summary of logic/algorith in a paragaph and/or bullet list. Include a sentence about each step, which can be either a) retargetable sub-task, or b) method within task.
Configuration
- Document fields in associated config class
- For subtasks, provide list of everything to which this could be retargeted.
Entrypoint
- Link to API page for the "run" method
Butler Inputs
- dataset type + description of Butler gets()
- Best effort for now; hopefully auto-doc'd in SuperTask framework
Butler Outputs
- dataset type + description of Butler puts()
- Best effort for now; hopefully auto-doc'd in SuperTask framework
Examples
- self-contained example of using this task that can be tested
Debugging
- Debugging framework hooks
Algorithm details
- Extended description with mathematical details
Measurement Framework
Context sentence/short paragraph
Framework concepts
- Overview
- Measurement contexts
- ...
- Style guide (rules for creating measurement plugins)
Tutorials
- Simple tutorial for creating a measurement plugin.
- Another tutorial with a more complicated aspect tutorial.
- C++ based tutorial
- ...
Measurement plugins
- a measurement plugin; linking to its class API
- ...
Modules
- list of modules that build up the framework
- ...
Butler framework
Context sentence/short paragraph.
Framework concepts
- Overview
- Datasets
- DataIds
- Composite Datasets
- What it does when I get() and put()
- (there might be need for some concept pages that dive into internals)
Tutorials
- ...
Modules
- ...
package_name
Desciption from summary line in bold weight.
This package is part of the LSST Science Pipelines: https://pipelines.lsst.io.
Join us at https://community.lsst.org.
Module Documentation
- module homepage link
- for each module in the package
LSST Science Pipelines: Descriptive sentence.
https://pipelines.lsst.io
Command line tasks
- IngestCatalogTask
- IsrTask
- MeasurementDebuggerTask
- ProcessImageForcedTask
- DeblendAndMeasureTask
- BaseMeasureTask
- DumpTaskMetadataTask
- ReportImagesInPatchTask
- ReportImagesToCoaddTask
- ReportPatchesTask
- ReportTaskTimingTask
- Generating coadds:
- AssembleCoaddTask
- SafeClipAssembleCoaddTask
- GetRepositoryDataTask
- ImageDifferenceTask
- MakeDiscreteSkyMapTask
- MakeSkyMapTask
- MockCoaddTask
- Multi-band processing:
- DetectCoaddSourcesTask
- MergeSourcesTask
- MeasureMergedCoaddSourcesTask
- ProcessImageTask
- RunTransformTaskBase
- CoaddAnalysisTask
- CompareAnalysisTask
- ColorAnalysisTask
- ctrl_pool middleware tasks:
- BatchCmdLineTask
- BatchPoolTask
- ProcessCcdTask
More things to discuss/design
- Task list topic types
- Tutorials
- Troubleshooting (when something goes wrong). -> integrate into task lists, and into task reference pages.
Engineering needs
- Turn pipelines_docs into an EUPS package so it can use lsst.utils.getPackageDir rather than assuming that packages are in lsstsw
- Integrate doc builds with sconsUtils
- Branch dashboard pages