Checklist: Gen3 ready for general use (DM-DAX-12 Feature Parity)

Tickets indicate planned work.  Checkboxes without tickets are (hopefully) all statements about functionality that is available.

  • Gen3 schema stability has been reached (changes requiring wholesale re-ingest could still occur, but sparingly; migration scripts may be possible)
    • DM-26407
    • DM-21231
    • DM-24432
    • DM-22370
    • DM-26476
    • DM-26630
    • DM-26600
    • DM-21898
    • DM-24329
    • DM-21860
    • DM-26692
    • presumably auto ingest of new raw data is now possible in Gen3
      (currently blocked by DM-21706, the ticket for expressing Filter in BOT data)
  • BPS Features needed for developers (single/multi-node)
    • DM-26397
    • DM-24414
    • DM-26458
    • DM-26402
    • capability to run on multi-node
  • Gen2 runs can be recast/converted to Gen3 (so downstream development can proceed)
    • RC2 Gen2→Gen3 conversion possible
  • Gen3 running of RC2- and DC2-like processing should be possible (i.e. not at full scale but at tract scale)
    • DM-26371
    • Run a 3 tract RC2 in under a week.
  • DRP equivalent available (same thing runs in Gen3 and Gen2)
    • recipe + example
  • cp_pipe equivalent available (recipe + example)
  • capability to run on single-node
    • list of features for CommandLineTask that are needed?
  • able to run in a notebook
  • can run RC2-like processing (not necessarily full-scale)
  • can run DC2-like processing (not necessarily full-scale)
  • can run CPP at the scales corresponding to the above
  • Assumes shared everything (schema, repository, data)
  • list of needed Gen3 Butler functions/command-line tools
    • DM-26684: simple command-line tool (prune-collection), high priority
    • DM-26685: more complex command-line tool (query-datasets), unblocks many others, only slightly lower priority
    • DM-26690: more complex command-line tool (query-data-ids), may not unblock as much
    • DM-26600: butler method and command-line tool (prune/unregister/delete/remove-dataset-type)
  • Documentation
    • How to make a PipeTask
    • How to make a pipeline
    • How to find/identify/retrieve data in repo?

Below this point is open for discussion but pertains more to Gen2 deprecation (list of appropriate tickets)


  • DM-26173 Write plan for Gen2 deprecation

Remaining Checklist: DM-DAX-13 (Gen3 ready for Gen2 Deprecation)

Select any items below that are required for Gen2/3 parity and hence need to be completed by Nov 1st.

  • All schema changes are accompanied by a migration mechanism (or have appropriate CCB approval).
  • DM-26406
  • Additional command-line tooling (functionality already accessible from Python, which we think is adequate for declaring feature parity in these cases)
  • Gen3 continued development (would be ongoing and even supplemented/aided by general DM users/development)
  • Support for core testing/development
  • BPS continued/parallel development needed
  • Full Gen3 (weekly-scale) runs of RC2 and DC2 occur and are now the source of truth for weekly regressions.

BELOW THIS POINT are the initial points that were discussed regarding Gen2/3 parity and Gen2 deprecation.
They have been kept here as there may still be some content that can inform the above.

This page details the criteria/actions needed to complete the DM-DAX-12 (feature parity) and DM-DAX-13 (Gen2 deprecation) milestones, i.e. the remaining development needed for the DM team to switch from using Butler Gen2 to Butler Gen3. In practice this should be broken down a little further than the two milestones. The current thinking along those lines follows:


Gen3 ready for pipeline developers:

  • ci_hsc runs (within a Gen3 Butler)
  • personal repos are supported
  • some documentation exists to help developers begin to switch between Gen2 and Gen3
  • basic schema should be relatively stable (but changes on a weekly timescale can and will occur)
  • single-node execution of pipelines
    • A log must be written out somewhere.
    • The configs with which the tasks were run must be written out somewhere
      • Configs writeable/trackable by Butler (similar to pipeline schema files)
    • The stack version must be written out somewhere.
    • Task metadata must be written out somewhere.
    • The input repos must be written out somewhere (waiting clarification of this requirement)
  • Conversion of bi-weekly RC2 Gen2 repositories to Gen3 repository with registry inside Oracle 
    • Shared r/w Oracle schema with pre-friendly user authentication
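
The single-node execution items above (log, task configs, stack version, task metadata, input repos each "written out somewhere") can be illustrated with a minimal sketch. This is not Butler or pipe_base code: `RunRecord`, `write_run_record`, `missing_items`, the JSON layout, and all values shown are hypothetical names and placeholders chosen for illustration, using only the Python standard library.

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class RunRecord:
    """Hypothetical per-run provenance record (not an LSST class)."""
    stack_version: str   # the stack version the run used
    task_configs: dict   # configs with which each task was run
    task_metadata: dict  # per-task metadata
    input_repos: list    # input repositories (requirement awaiting clarification)
    log_path: str        # where the log file was written


def write_run_record(record, out_dir):
    """Write the record as JSON under out_dir and return the file path."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    target = out_dir / "run_record.json"
    target.write_text(json.dumps(asdict(record), indent=2, sort_keys=True))
    return target


def missing_items(record):
    """Return the names of checklist items left empty in the record."""
    return [name for name, value in asdict(record).items() if not value]


if __name__ == "__main__":
    # All values below are placeholders, not real run outputs.
    record = RunRecord(
        stack_version="placeholder-version",
        task_configs={"exampleTask": {"threshold": 5}},
        task_metadata={"exampleTask": {"runtime_s": 1.5}},
        input_repos=[],
        log_path="run.log",
    )
    with tempfile.TemporaryDirectory() as tmp:
        print(write_run_record(record, tmp).name)
    print(missing_items(record))
```

A check like `missing_items` is one way to turn the "must be written out somewhere" bullets into something a CI job could assert on; whether such records would live in files or be tracked by the Butler is left open here, as it is in the list above.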

Gen3 ready for friendly users:

  • RC2 (and DC2) "can" run in Gen3 (to zeroth order as well as Gen2)
    • DC2 data works in Gen3
      • Ingestion
      • QuantumGraph generation
    • afw table environment is stable
    • Full pipeline runs in both Gen2 and Gen3
      • jointcal (DM-24300)
      • pipe_analysis tasks "can" work (i.e. stability is sufficient that development necessary to bring these tasks forward can occur)
    • Multi-node processing possible on NCSA HTCondor pool with shared filesystem and shared database (NCSA staff)
    • Run a 3 tract RC2 in under a week.
    • Ability to use the intermediate outputs of an RC2 run as the inputs to some one-off processing (not just repo chaining, but by different users)
  • Multi-node processing possible on NCSA HTCondor pool with shared filesystem and shared database (friendly users) (i.e., a ctrl_pool replacement)   
    LIST STILL BEING GENERATED
    • Use new pipeline description yaml instead of previous command-line
    • Friendly-user HTCondor Pool
    • Shared r/w Oracle schema with friendly user authentication
    • Notable features from single-node processing that must continue to be available:
      • A log must be written out somewhere (at this point it is not a requirement to be anything other than file(s))
      • The configs with which the tasks were run must be written out somewhere
      • The stack version must be written out somewhere
      • Task metadata must be written out somewhere
      • The input repos must be written out somewhere (waiting clarification of this requirement)
    • Extra HTCondor user documentation (e.g., know why job isn’t starting or why job was killed)

  • Raw files can be auto ingested into friendly-user shared Gen3 repositories usable on LSP (and HTCondor Pool)
    • Single-file ingestion
    • Chained repositories (?  Users will not have write access to raw file storage area.)
  • Raw files can be auto ingested into friendly user Gen3 OODS (Maybe this moved to Deprecation Begins?)
    • TBD 

Gen2 Deprecation Begins (equivalent to DM-DAX-12):

  • Minimal changes to repository schema are expected (changes must come with upgrade mechanism or project agreement to start over from scratch)
  • Gen2 + Gen3 runs can be compared as a sanity check (RC2/DC2 level)
  • Multi-user registry (no more user-writeable production repositories)
  • Updates to and development using Gen2 should cease
  • ci_hsc equivalents exist and run successfully (w/ extensions to show that critical datatypes are supported)
    • HSC
    • DECam
    • LATISS
    • ComCam
  • Gen3 versions of the following pipelines, equivalent to their RC2 versions, run successfully
    • Calibration Pipeline
    • DRP
    • Alert Processing
    • ???
  • PDR2 (scale) processing should be possible:
    • Quantum Graph generation
    • Conversion to Workflow Graph
    • TBD whether part or all of the PDR2 processing is actually run

RIP Gen2 (equivalent to DM-DAX-13):

  • Note: There will still be Gen3 ToDo's, but they should not be known blockers to the removal of Gen2.
  • Removal of Gen2 ingest of raw images into DBB
    • Deprecate any Gen2 ingestion-specific code
    • Remove Gen2 repository
  • Removal of Gen2 ingest of raw images into OODS
  • Stop running RC2 and DC2 in Gen2
    • Remove production Gen2 repositories (users must remove their own)