Meeting at 14:00 (Project time).

https://bluejeans.com/426716450

Participants

Meeting Notes

As we discussed, I think our next step is to write a design sketch covering the variety of QA systems we have been considering.

I suggest that we split into small groups to do this. Each group is responsible for coming up with a few pages capturing as much detail as it can about the areas of the system identified. Try to form a coherent narrative, so that the material is readable, but be as inclusive as possible — where you can be specific about the design, that's great, but also make suggestions and raise questions for discussion in the group. It would also be useful to identify where we already have tools that address the topic under discussion, and to identify gaps where new tools need to be added.

I'm listing three categories below with a request for who should work on each. I expect there to be some overlap between them (in particular, I'm guessing the drill-down group will ultimately drill down to a lot of overlap with pipeline debugging), but that's fine; we'll resolve overlaps when we meet as a group.

I've tried to provide guidance about what each subject area covers, but again, be inclusive; don't limit yourself to my suggestions here.

I've made a Confluence page for each group to collect their notes. As we discussed earlier, it might make more sense to work on some of this material when Tim visits Tucson next week, but let's do what we can to get started over the next few days so we can discuss progress and how useful an exercise this is when we meet next Wednesday.

Drill down

This will cover everything needed to go from high-level aggregated metrics (e.g. as displayed by SQuaSH) to an interactive environment (e.g. qa_explorer). It should cover:

  • The types of metrics that should be extracted from running pipelines (e.g. scalars, vectors, spatially binned quantities, etc.; see the sketch after this list);
  • How those metrics are displayed on a dashboard (e.g. is the timeseries-per-metric view shown by SQuaSH adequate, or do we also need other types of plotting, all sky views, etc?);
  • How the user would drill down from these aggregated metrics to find the source of the problem (e.g. how far do they click around in a dashboard, vs. getting dumped into a Jupyter notebook with some preselected data?);
  • Assuming the user ends up in an interactive environment, what are its capabilities?
  • What can we learn from the above about the data products that the pipelines need to persist (both the metrics posted to SQuaSH and regular pipeline outputs: Parquet tables, HDF5 files, etc.)?
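
To make the scalar / vector / spatially binned distinction and the persistence question more concrete, here is a purely illustrative sketch. The metric names, column layouts, and file name are invented for this note; this is not an existing SQuaSH or lsst.verify interface.

    # Illustrative only: the three metric types mentioned above, and one possible
    # persisted form (a per-source Parquet table). All names are placeholders.
    import numpy as np
    import pandas as pd

    # Scalar metric: one number per pipeline run, suitable for a SQuaSH timeseries.
    scalar_metric = {"metric": "medianPsfFwhm", "value": 0.72, "unit": "arcsec"}

    # Vector metric: one value per source, suitable for histograms in a notebook.
    vector_metric = pd.DataFrame({
        "sourceId": np.arange(5),
        "psfMagErr": np.random.normal(0.02, 0.005, 5),
    })

    # Spatially binned metric: one value per sky bin, for all-sky dashboard views.
    binned_metric = pd.DataFrame({
        "healpixId": [1024, 1025, 1026],
        "astromOffsetRmsMas": [9.1, 11.3, 8.7],
    })

    # One option for the "regular pipeline outputs" above: persist the per-source
    # table as Parquet so the interactive environment can load it on demand.
    vector_metric.to_parquet("psfMagErr_example.parq")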

People to work on this: Angelo, Eric, Tim
https://confluence.lsstcorp.org/display/DM/Drill+Down+Design

Pipeline debugging

What tools do we need to help pipeline developers with their everyday work? How do you go about debugging a task that is crashing?

  • Is lsstDebug adequate? (The current lsstDebug/afwDisplay pattern is sketched after this list for reference.)
  • Do we need an afwFigure, for generating plots, to go alongside afwDisplay, for showing images?
  • What additional capabilities are needed for developers running and debugging at scale, e.g. log collection, identification of failed jobs, etc.
  • What's needed from an image viewer for pipeline developers? Is DS9 or Firefly adequate? Is there value to the afwDisplay abstraction layer, or does it simply make it harder for us to use Firefly's advanced features?
  • How do we view images which don't fit in memory on a single node?
  • How do we handle fake sources? Can we simply say this is a provenance issue?
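
For reference when discussing the first two bullets, the sketch below shows roughly the pattern that lsstDebug and afwDisplay support today, as I understand it. The task name, debug variable, display backend, repository path, and dataId are all placeholders; each task documents its own debug variables.

    # debug.py -- picked up when a command-line task is run with --debug.
    import lsstDebug

    def DebugInfo(name):
        di = lsstDebug.getInfo(name)
        if name == "lsst.pipe.tasks.characterizeImage":  # placeholder task name
            di.display = True                            # placeholder debug variable
        return di

    lsstDebug.Info = DebugInfo

and, for interactive display through the abstraction layer:

    # Interactive image display via afwDisplay; backend, repo, and dataId are placeholders.
    import lsst.afw.display as afwDisplay
    import lsst.daf.persistence as dafPersist

    afwDisplay.setDefaultBackend("ds9")                  # or "firefly"
    butler = dafPersist.Butler("/path/to/repo")          # placeholder repository
    exposure = butler.get("calexp", visit=1234, ccd=42)  # placeholder dataId

    disp = afwDisplay.Display(frame=1)
    disp.mtv(exposure, title="calexp 1234/42")
    disp.dot("o", 100.0, 200.0, size=10, ctype=afwDisplay.RED)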

People to work on this: Lauren, Simon, Trey
https://confluence.lsstcorp.org/display/DM/Pipeline+Debugging+Design

Unit, integration and large scale tests

  • Do we need any changes to the way we handle unit tests? (The current boilerplate is sketched after this list for reference.)
  • How are datasets made available to developers? Git LFS repositories? Only on the verification cluster? Where does a developer who just wants “some data” go? (This covers how datasets are managed, not what the contents of those datasets should be).
  • What's an appropriate cadence for running small/medium/large scale integration tests and reprocessing of known data?
  • How is the system for tracking verification metrics (“KPMs”, if you must) managed? (Not in the sense of what SQuaSH does, but who is running the jobs to calculate verification metrics? How often? etc)
  • How should we monitor run-time performance?
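
For reference on the first bullet, this is roughly the boilerplate the developer guide prescribes for unit tests today (the test case itself is a dummy); it may help frame what, if anything, needs to change.

    # Dev-guide-style unit test skeleton; the test case itself is a placeholder.
    import unittest
    import lsst.utils.tests


    class PlaceholderTestCase(lsst.utils.tests.TestCase):
        def testSomething(self):
            self.assertEqual(1 + 1, 2)


    class MemoryTester(lsst.utils.tests.MemoryTestCase):
        """Checks for leaked C++ objects after the other tests have run."""
        pass


    def setup_module(module):
        lsst.utils.tests.init()


    if __name__ == "__main__":
        lsst.utils.tests.init()
        unittest.main()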

People to work on this: Hsin-Fang, John
https://confluence.lsstcorp.org/pages/viewpage.action?pageId=73581056

Next Meeting

Wednesday 2018-05-16, 14:00 (Project), https://bluejeans.com/426716450