Checklist: Gen3 ready for general use (DM-DAX-12 Feature Parity)
Tickets indicate planned work. Checkboxes without tickets are (hopefully) all statements about functionality that is available.
- Gen3 schema stability has been reached (changes requiring wholesale re-ingest could still occur, but sparingly; migration scripts may be possible)
- - DM-26407
- - DM-21231
- - DM-24432
- - DM-22370
- - DM-26476
- - DM-26630
- - DM-26600
- - DM-21898
- - DM-24329
- - DM-21860
- - DM-26692
- Auto-ingest of new raw data is presumably now possible in Gen3 (currently blocked by the ticket for expressing Filter in BOT data: DM-21706)
- BPS Features needed for developers (single/multi-node)
- Gen2 runs can be recast/converted to Gen3 (so downstream development can proceed)
- RC2 Gen2→Gen3 conversion possible
- Gen3 running of RC2- and DC2-like processing should be possible (i.e. not at full scale but at tract scale)
- DRP equivalent available (same thing runs in Gen3 and Gen2) (recipe + example)
- cp_pipe equivalent available (recipe + example)
- capability to run on single-node
- list of features for CommandLineTask that are needed?
- able to run in a notebook
- can run RC2-like processing (not necessarily full-scale)
- can run DC2-like processing (not necessarily full-scale)
- can run CPP at the scales corresponding to the above
- Assumes shared everything (schema, repository, data)
- list of needed Gen3 Butler functions/command-line tools
- - DM-26684: simple command-line tool (prune-collection), high priority
- - DM-26685: more complex command-line tool (query-datasets), unblocks many others, only slightly lower priority
- - DM-26690: more complex command-line tool (query-data-ids), may not unblock as much
- - DM-26600: butler method and command-line tool (prune/unregister/delete/remove-dataset-type)
- Documentation
- How to make a PipeTask
- How to make a pipeline
- How to find/identify/retrieve data in repo?
Below this point is open for discussion but pertains more to Gen2 deprecation (list of appropriate tickets)
- DM-26173 Write plan for Gen2 deprecation
Remaining Checklist: DM-DAX-13 (Gen3 ready for Gen2 Deprecation)
Select any items below that are required for Gen2/3 parity and hence need to be completed by Nov 1st.
- All schema changes are accompanied by a migration mechanism (or have appropriate CCB approval).
- - DM-26406
- Additional command-line tooling (functionality already accessible from Python, which we think is adequate for declaring feature parity in these cases)
- Gen3 continued development (would be ongoing and even supplemented/aided by general DM users/development)
- - DM-21333
- - DM-21832
- - DM-23985
- - DM-26277
- - DM-20695
- - DM-21871
- - DM-19470
- - DM-21904
- - DM-21872
- - DM-26483
- - DM-15257
- - DM-25013
- Support for core testing/development
- BPS continued/parallel development needed
- Full Gen3 (weekly-scale) runs of RC2 and DC2 occur and are now the source of truth for weekly regressions.
BELOW THIS POINT are the initial points that were discussed regarding Gen2/3 parity and Gen2 deprecation.
They have been kept here as there may still be some content that can inform the above.
This page details the criteria/actions needed to complete the DM-DAX-12 (feature parity) and DM-DAX-13 (Gen2 deprecation) milestones, i.e. the remaining development needed for the DM team to switch from using Butler Gen2 to Butler Gen3. In practice this should be broken down a little further than the two milestones. The current thinking along those lines follows:
Gen3 ready for pipeline developers:
- ci_hsc runs (within a Gen3 Butler)
- personal repos are supported
- some documentation exists to help developers begin to switch between Gen2 and Gen3
- basic schema should be relatively stable (but changes on a weekly timescale can and will occur)
- single-node execution of pipelines
- - A log must be written out somewhere.
- - The configs with which the tasks were run must be written out somewhere.
- - - Configs writeable/trackable by Butler (similar to pipeline schema files)
- - The stack version must be written out somewhere.
- - Task metadata must be written out somewhere.
- - The input repos must be written out somewhere (waiting clarification of this requirement)
- Conversion of bi-weekly RC2 Gen2 repositories to Gen3 repository with registry inside Oracle
- - Shared r/w Oracle schema with pre-friendly user authentication
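The "must be written out somewhere" requirements above (log, configs, stack version, task metadata, input repos) can be sketched as a single provenance record written next to a run's outputs. This is only an illustration in plain Python under stated assumptions: `RunProvenance`, `write_provenance`, `missing_items`, and the `provenance.json` file name are hypothetical and are not part of the Butler or pipe_base APIs; in Gen3 these items would presumably be stored as Butler datasets instead.

```python
import json
from dataclasses import asdict, dataclass, field
from pathlib import Path
from typing import Dict, List


@dataclass
class RunProvenance:
    """Hypothetical record of what a single-node run should leave behind."""

    stack_version: str                      # e.g. a weekly tag
    input_repos: List[str]                  # repositories the run read from
    configs: Dict[str, dict]                # task label -> config values used
    task_metadata: Dict[str, dict] = field(default_factory=dict)
    log_file: str = "run.log"               # where the log was written


# Items the checklist says must be written out somewhere.
REQUIRED = ["stack_version", "input_repos", "configs", "task_metadata", "log_file"]


def write_provenance(record: RunProvenance, out_dir: Path) -> Path:
    """Serialize the record as JSON inside out_dir and return the file path."""
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / "provenance.json"
    path.write_text(json.dumps(asdict(record), indent=2, sort_keys=True))
    return path


def missing_items(path: Path) -> List[str]:
    """Return the required provenance items that are absent or empty."""
    data = json.loads(path.read_text())
    return [key for key in REQUIRED if not data.get(key)]
```

A check like `missing_items` could be run after a ci_hsc-style job to confirm that every required item was actually recorded.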
Gen3 ready for friendly users:
- RC2 (and DC2) "can" run in Gen3 (to zeroth order, as well as Gen2)
- - DC2 data works in Gen3
- - Ingestion
- - QuantumGraph generation
- afw table environment is stable
- Full pipeline runs in both Gen2 and Gen3
- - jointcal (DM-24300)
- - pipe_analysis tasks "can" work (i.e. stability is sufficient that development necessary to bring these tasks forward can occur)
- Multi-node processing possible on NCSA HTCondor pool with shared filesystem and shared database (NCSA staff)
- - Run a 3-tract RC2 in under a week.
- - Ability to use the intermediate outputs of an RC2 as the inputs to some one-off processing (not just repo chaining, but different users)
- Multi-node processing possible on NCSA HTCondor pool with shared filesystem and shared database (friendly users) (i.e., a ctrl_pool replacement)
- - LIST STILL BEING GENERATED
- - Use new pipeline description YAML instead of previous command line
- - Friendly-user HTCondor pool
- - Shared r/w Oracle schema with friendly user authentication
- - Notable features from single-node processing that must continue to be available:
- - - A log must be written out somewhere (at this point it is not a requirement to be anything other than file(s))
- - - The configs with which the tasks were run must be written out somewhere
- - - The stack version must be written out somewhere
- - - Task metadata must be written out somewhere
- - - The input repos must be written out somewhere (waiting clarification of this requirement)
- - Extra HTCondor user documentation (e.g., know why job isn't starting or why job was killed)
- Raw files can be auto-ingested into friendly-user shared Gen3 repositories usable on LSP (and HTCondor pool)
- - Single-file ingestion
- - Chained repositories (? Users will not have write access to raw file storage area.)
- Raw files can be auto-ingested into friendly-user Gen3 OODS (Maybe this moved to Deprecation Begins?)
- - TBD
Gen2 Deprecation Begins (equivalent to DM-DAX-12):
- Minimal changes to repository schema are expected (changes must come with upgrade mechanism or project agreement to start over from scratch)
- Gen2 + Gen3 runs can be compared as a sanity check (RC2/DC2 level)
- Multi-user registry (no more user-writeable production repositories)
- Updates to and development using Gen2 should cease
- ci_hsc equivalents exist and run successfully (w/ extensions to show that critical datatypes are supported)
- - HSC
- - DECam
- - LATISS
- - ComCam
- Gen3 versions of the following pipelines, equivalent to their RC2 versions, run successfully:
- - Calibration Pipeline
- - DRP
- - Alert Processing???
- PDR2 (scale) processing should be possible:
- - Quantum Graph generation
- - Conversion to Workflow Graph
- - TBD if running of part/all PDR2 processing
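To make the "Quantum Graph generation" and "Conversion to Workflow Graph" items above more concrete, here is a toy sketch of the general idea: each unit of work declares the datasets it reads and writes, dependencies follow from matching outputs to inputs, and a workflow system then runs jobs in dependency order. Everything here is an assumption for illustration; the `Quantum` tuple, the function names, and the dataset labels are hypothetical and do not reproduce the pipe_base or BPS implementations.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, NamedTuple, Set


class Quantum(NamedTuple):
    """Hypothetical stand-in for one unit of work (a task run on one data ID)."""

    name: str
    inputs: Set[str]    # dataset labels this unit reads
    outputs: Set[str]   # dataset labels this unit writes


def build_job_dependencies(quanta: List[Quantum]) -> Dict[str, Set[str]]:
    """Map each unit of work to the units that produce its inputs."""
    producer = {ds: q.name for q in quanta for ds in q.outputs}
    return {
        q.name: {producer[ds] for ds in q.inputs if ds in producer}
        for q in quanta
    }


def job_order(quanta: List[Quantum]) -> List[str]:
    """Return one valid execution order (producers before consumers)."""
    return list(TopologicalSorter(build_job_dependencies(quanta)).static_order())


# Illustrative two-visit example; labels are made up for the sketch.
example = [
    Quantum("coadd", {"calexp_1", "calexp_2"}, {"coadd_tract"}),
    Quantum("calibrate_1", {"postISR_1"}, {"calexp_1"}),
    Quantum("calibrate_2", {"postISR_2"}, {"calexp_2"}),
    Quantum("isr_1", {"raw_1"}, {"postISR_1"}),
    Quantum("isr_2", {"raw_2"}, {"postISR_2"}),
]
order = job_order(example)
```

Inputs with no producer in the list (the raw files here) are treated as pre-existing data, which is why they add no dependency edges.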
RIP Gen2 (equivalent to DM-DAX-13):
Note: There will still be Gen3 ToDo's, but they should not be known blockers to the removal of Gen2.
- Removal of Gen2 ingest of raw images into DBB
- - Deprecate any Gen2 ingestion-specific code
- - Remove Gen2 repository
- Removal of Gen2 ingest of raw images into OODS
- Stop running RC2 and DC2 in Gen2
- Remove production Gen2 repositories (users must remove their own)