The Gen3 versions of the AP and DRP pipelines are already able to do at least as much as their Gen2 counterparts, and we currently have no reason to believe that they perform any worse at anything.

We also now have the green light to start removing the Gen2 versions of things, at least according to the release stability promised we've made in the past.

At present, however, some important test coverage still relies on Gen2 middleware and CmdLineTasks - this is a problem in its own right, because it means we're testing things under unrealistic, non-production conditions, and reliance on those tests make it harder to change our tasks in Gen3-only ways. This page attempts to capture all cases in which we use Gen2 as a test harness, and transform this information into a plan for removing different aspect of Gen2 middleware.

Known Gen2 integration tests and multi-task unit tests

ci_hsc_gen2

This:

Runs the full Gen2 DRP pipeline on HSC data.
Tests that the expected output datasets are actually produced and meet very minimal expectations.
Runs and tests the Gen2 → Gen3 repository conversion tools.
Tests the pipe.base.ShimButler transition convenience code.

(1) is already done by ci_hsc_gen3.

(2) is not, and this is probably the single biggest blocker for removing CmdLineTask inheritance from most hybrid PipelineTask/CmdLineTasks; we cannot remove ci_hsc_gen2 until ci_hsc_gen3 includes those checks ( DM-28806 - Getting issue details... STATUS ).

(3) and (4) are worth keeping around as long as long as the rest of ci_hsc_gen2 has to stay around, since they provide coverage for tools that may still be useful while we are converting Gen2 test code to Gen3.

pipe.tasks.mocks and nopytest_test_coadds.py

This test utility subpackage and the (only) test that uses it provide a lot of early coverage for our coaddition and coadd-processing tasks. They use very simple (i.e. Gaussian stars, little or no noise) simulations, in order to check that the bookkeeping for CoaddPsf is exactly right on a small amount of data, rather than statistically right on a lot of noisy data. There's a lot of infrastructure here just to mimic a Gen2 CameraMapper, but it's also tied up with the code that does the simulations.

The test script has also attracted a number of miscellaneous test methods that simply want to check something about the outputs of the tasks the test runs.

Porting all of this coverage to Gen3 would be a lot of work, and I'm skeptical that it's all useful:

The zero-noise, simple-PSF simulation framework was intended to be useful for more than CoaddPsf, but no other tests made use of that (that I know of, at least).
While that test was helpful in getting CoaddPsf up and running initially, CoaddPsf still shipped with a critical bug in weights that the test was too simple to catch, and a much bigger problem with CoaddPsf involving sigma-clipping plagued us for a while, too. My takeaway from all of this is that for something like CoaddPsf, large-scale integration tests on realistic data (~RC2 or test-med-1, not just ci_hsc or ci_imsim) are necessary; if we have those, they are also sufficient, and a smaller test like this is only useful as a way to catch some problems early, especially when there is high churn.
CoaddPsf isn't experiencing high churn at all, and in fact we hope to drop it in favor of cell-based coadds someday instead of develop it further.
The coverage from running the other tasks here is again mostly just coverage that they run at all, and hence it duplicates coverage we get from ci_hsc_gen3 and ci_imsim. Nobody should be making changes to these tasks without running those ci_ packages before merge, and is probably developing against real data, not this test, so I don't think discovering failures "earlier" from this test is a strong argument.

So, my tentative recommendation here is to port the test methods that perform checks on the processing outputs to ci_hsc_gen3 and/or ci_imsim, and scrap the rest.

pipe_tasks' test_processCcd.py

This runs ProcessCcdTask on data from obs_test. It seems to mostly check that the tasks run at all, but also has some purely empirical regression tests ("we got this value once; check that the new one isn't worse").

obs_test is a pure-Gen2 thing, and the pipelines_check package runs the exact same tasks already. Let's move the checks to tests in that package, and drop this test entirely.

Space shortcuts

Page tree

Known Gen2 integration tests and multi-task unit tests

ci_hsc_gen2

pipe.tasks.mocks and nopytest_test_coadds.py

pipe_tasks' test_processCcd.py

Guidelines for Gen2 content removal

Removing CmdLineTask inheritance from hybrid tasks

Removing Gen2 support from obs packages

Removing Gen2 support from obs_base

Removing Gen2 support from pipe_base

Removing daf_persistence

Post-Removal cleanups

Space shortcuts

Page tree

Gen2 Middleware Removal Planning

Known Gen2 integration tests and multi-task unit tests

ci_hsc_gen2

pipe.tasks.mocks and nopytest_test_coadds.py

pipe_tasks' test_processCcd.py

Guidelines for Gen2 content removal

Removing CmdLineTask inheritance from hybrid tasks

Removing Gen2 support from obs packages

Removing Gen2 support from obs_base

Removing Gen2 support from pipe_base

Removing daf_persistence

Post-Removal cleanups