The test data repositories we currently have are difficult to use for testing Gen3 functionality:

  • small obs-package test datasets like obs_test and testdata_subaru don't contain enough data or enough kinds of data to cover much Gen3 Butler functionality (e.g. they have no coadds, and nowhere near enough data to make them).
  • ci_hsc (after execution) contains everything we want, but is too big and slow for an efficient low-level test workflow, and its idiosyncratic build system makes it hard to extend to new tests.  The logic in that idiosyncratic build system is also precisely what PipelineTask is designed to supplant.


Long-Term Testing in Gen3-only

  • Bulky data should be in a package that contains as little logic and has as few dependencies as possible to minimize version bumps and hence reinstalls in e.g. Jenkins.
  • We want a package with a frozen copy of the Registry DB that can appear earlier in the dependency tree than the PipelineTasks that are used to construct it.  This need not be accompanied by Datastore content, and would ideally be an (optional) dependency of daf_butler itself to allow Registry unit tests to be written there.
  • The suite of PipelineTasks for e.g. DRP (i.e. each Pipeline) should be executed using a real activator, and ideally validated with additional PipelineTasks of the sort that could someday feed SQuASH and/or other QA systems.
  • The outputs of each tested Pipeline should go somewhere that allows downstream test packages to depend on them.  That could just be EUPS install locations, or it could be a data repository filesystem accessible from test servers.
  • The centralized nature of Gen3 Registries actually makes it a little harder to split repos up across different packages; a downstream-processing test package would naturally want to add entries to the Registry database provided by the raw-data package it depends on, rather than creating an independent one (i.e. the Gen3 Registry DB and EUPS DB both want to be the one to manage the filesystem).  But Registry DBs are sufficiently small that we can probably sidestep this just by copying them, either at the file level or by using Gen3 subset/transfer functionality.  I imagine we'd use symlinks to accomplish the same at the Datastore level.  When the Gen3 middleware and the testing dashboard infrastructure are more mature, we could start moving these tests out of the build/unit-test system and into a completely separate integration-test system (with Gen3 middleware providing workflow and (data) dependency management), but I imagine we'll want to continue to piggyback on the build/unit-test system for integration tests for a while to come.
  • We want new tests that provide high-level coverage for everything in obs_test, so we can retire all or most of that package instead of trying to translate its tests one-by-one (as many of these assume Gen2 design and wouldn't translate well).
  • We want new tests that provide high-level coverage for everything in the SDSS demo package, and these should be integrated with the more ci_hsc-like full-Pipeline tests.  For example, we should package an expected output file for some small subset of the outputs with each Pipeline test package.  That means those packages should include scripts for updating the output files and comparing them to verify expected changes.
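
The last bullet above asks each Pipeline test package to ship scripts for comparing outputs against packaged expected files and for updating those files.  The following is only a rough sketch of what such a helper might look like, not an existing LSST tool.  The function names (compare_outputs, update_expected) and the directory layout are hypothetical, and it assumes a byte-for-byte comparison is acceptable; real outputs such as catalogs would probably need a tolerance-based or content-aware comparison instead.

```python
import shutil
from pathlib import Path
from typing import List


def compare_outputs(actual_dir: Path, expected_dir: Path) -> List[str]:
    """Return relative paths of expected files that are missing or differ.

    Hypothetical helper: only files present under expected_dir are checked,
    so the package can track a small subset of the Pipeline outputs.
    """
    mismatches = []
    for expected in sorted(expected_dir.rglob("*")):
        if not expected.is_file():
            continue
        rel = expected.relative_to(expected_dir)
        actual = actual_dir / rel
        # Assumption: exact byte equality is the criterion for "unchanged".
        if not actual.is_file() or actual.read_bytes() != expected.read_bytes():
            mismatches.append(rel.as_posix())
    return mismatches


def update_expected(actual_dir: Path, expected_dir: Path, names: List[str]) -> None:
    """Overwrite the packaged expected files with freshly produced outputs.

    Raises FileNotFoundError if a named output was not actually produced.
    """
    for name in names:
        dst = expected_dir / name
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(actual_dir / name, dst)
```

A developer who has deliberately changed an algorithm would run the comparison, inspect the reported differences, and then call update_expected on just the files whose changes are understood, so that the update itself shows up as a reviewable change in the test package.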

Transition Testing

  • Major changes to ci_hsc itself seem difficult; we should try to leave it as-is until we can retire it.  At most we could try to make it depend on a raw data repo provided by some other package.
  • Testing of the gen2convert tooling can probably only be done in the current ci_hsc.  This is already in place, and there is some testing that the conversion worked, but it remains pretty hard to add new tests to compare the Gen2 repo to its Gen3 view directly.
  • We could consider testing isolated PipelineTasks by adding them to the current ci_hsc, but this is at least a bit difficult (see "idiosyncratic build system" above).  The alternative is to defer automated testing until we can run them in a from-scratch Gen3 repo, which requires completing earlier PipelineTasks before we can test downstream ones.  Since we hope to have all of them (for DRP) done by end of January, that is probably feasible.
