Archive Ingest and SDQA

Ingest

Ingested into temporary database
- For use in other DRP steps or by SDQA
- "Patch" updates happen by execution of new Tasks
  - Must be possible to reproduce entire DRP, including any "patches"
Level 1 products ingested into internal Level 1 database
Level 2 products ingested into Qserv database
- A batch can be removed (with queries disabled), then reingested after a "patch" update Task has run
- Need to track batches and their status
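
The batch handling above (remove with queries disabled, patch, reingest, and track status) can be sketched as follows. This is a minimal illustration only: `BatchRegistry`, `BatchStatus`, and `reingest_after_patch` are hypothetical names, and an in-memory dict stands in for whatever status table the real system would use. It is not the Qserv API.

```python
from dataclasses import dataclass, field
from enum import Enum


class BatchStatus(Enum):
    INGESTED = "ingested"
    REMOVED = "removed"


@dataclass
class BatchRegistry:
    """Hypothetical in-memory stand-in for a batch-status table."""
    queries_enabled: bool = True
    status: dict = field(default_factory=dict)   # batch id -> BatchStatus
    history: list = field(default_factory=list)  # (batch id, event) audit trail

    def ingest(self, batch_id):
        self.status[batch_id] = BatchStatus.INGESTED
        self.history.append((batch_id, "ingest"))

    def remove(self, batch_id):
        # Removal is only allowed while queries are disabled.
        if self.queries_enabled:
            raise RuntimeError("disable queries before removing a batch")
        if self.status.get(batch_id) is not BatchStatus.INGESTED:
            raise KeyError(f"batch {batch_id!r} is not currently ingested")
        self.status[batch_id] = BatchStatus.REMOVED
        self.history.append((batch_id, "remove"))

    def reingest_after_patch(self, batch_id, patch_task):
        """Remove a batch, run the patch Task, then reingest the batch."""
        self.queries_enabled = False
        try:
            self.remove(batch_id)
            patch_task(batch_id)  # assumed callable that regenerates the batch
            self.history.append((batch_id, "patch"))
            self.ingest(batch_id)
        finally:
            # Queries come back on even if the patch fails; the batch then
            # stays in REMOVED status until it is ingested again.
            self.queries_enabled = True
```

Recording each event in `history` is one possible way to keep "patch" updates reproducible, since the sequence of ingests, removals, and patches can be replayed later.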

EFD

Transformed within 24 hours into Level 1 Science Data Archive EFD
- Transformation includes removal of sensitive data (e.g. personnel-relevant log entries)
- Transformation includes restructuring schema to be more science-query-friendly
  - Adding join keys
  - Denormalizing
  - Creating views
- Note: the Science Data Archive EFD might not be in relational form. A NoSQL document database or a Bigtable/Hypertable-style store might be more appropriate.
Cleansed and transformed for Level 2 EFD as part of annual CPP
- Cleansing includes flagging of invalid data
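
The transformation and cleansing steps above can be illustrated with a small sketch. Everything here is an assumption made for illustration: the record fields (`sensor_id`, `time`, `value`, `category`), the `SENSITIVE_CATEGORIES` set, and the exposure time ranges used to derive a join key are hypothetical, not the actual EFD schema. View creation is omitted because it depends on the storage backend.

```python
# Hypothetical category labels marking entries that must not reach the archive.
SENSITIVE_CATEGORIES = {"personnel"}


def transform_efd(records, sensors, exposures):
    """Drop sensitive records, denormalize sensor metadata, add a join key.

    records:   iterable of dicts with 'sensor_id', 'time', 'value', 'category'
    sensors:   dict mapping sensor_id -> {'name': ..., 'unit': ...}
    exposures: list of (start, end, exposure_id) time ranges
    """
    rows = []
    for rec in records:
        # Removal of sensitive data.
        if rec["category"] in SENSITIVE_CATEGORIES:
            continue
        meta = sensors[rec["sensor_id"]]
        rows.append({
            "sensor_id": rec["sensor_id"],
            "time": rec["time"],
            "value": rec["value"],
            # Denormalizing: copy sensor metadata onto every row.
            "sensor_name": meta["name"],
            "unit": meta["unit"],
            # Join key: the exposure whose time range contains this record.
            "exposure_id": next(
                (eid for start, end, eid in exposures
                 if start <= rec["time"] < end),
                None,
            ),
        })
    return rows


def flag_invalid(rows, valid_ranges):
    """Cleansing: mark rows outside the valid range instead of deleting them."""
    flagged = []
    for row in rows:
        low, high = valid_ranges[row["unit"]]
        flagged.append({**row, "invalid": not (low <= row["value"] <= high)})
    return flagged
```

Flagging rather than deleting keeps the invalid readings available for later inspection, which matches the "flagging of invalid data" wording above.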

SDQA

Zeroth pass is metrics produced by Tasks
- Automatic flagging of metrics outside threshold
First pass is in temporary database
- Automatic metric generation (for metrics using data from multiple Task executions)
Second pass is in the Qserv database for Level 2 products and in the internal Level 1 database for Level 1 products
- Verification of completeness and consistency
- Large-scale analyses across entire dataset
- Verification of performance for end users
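
The zeroth and first passes above can be sketched as two small functions: one flags per-Task metrics outside a threshold, the other derives a metric from several Task executions. The metric names, threshold values, and function names are hypothetical, and the mean is just one example of a multi-execution metric.

```python
from statistics import mean


def flag_out_of_threshold(metrics, thresholds):
    """Zeroth pass: return the names of metrics outside their (low, high) range.

    Metrics with no configured threshold are never flagged. NaN values are
    flagged, because every comparison against NaN is false.
    """
    flagged = []
    for name, value in metrics.items():
        low, high = thresholds.get(name, (float("-inf"), float("inf")))
        if not (low <= value <= high):
            flagged.append(name)
    return flagged


def aggregate_across_tasks(executions, name):
    """First pass: mean of one metric over many Task executions.

    executions is a list of per-Task metric dicts; executions that did not
    report the metric are skipped. Returns None if no execution reported it.
    """
    values = [m[name] for m in executions if name in m]
    return mean(values) if values else None
```

For example, with a threshold of `(0.3, 1.2)` on a hypothetical `seeing` metric, a Task reporting `{"seeing": 1.4}` would be flagged, while the same metric averaged over several executions would be computed in the temporary database during the first pass.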