There is ongoing discussion at the senior level on requirements on minimum values for L1PublicT. Agreed that there's no need to push this at the DM level until those discussions have converged.
Next JSR is tentatively scheduled for the last week of August (2 weeks post-JSR).
Discuss milestones and procedures for arranging testing of LSP deployments by substantial numbers of users (DM staff? Sci Collab members?)
To what extent is the existing (or some future) LSP a “production” service for the use of DM staff / other project members / outsiders, vs. a construction project?
Other datasets, beyond the list in Gregory Dubois-Felsmann's slides, which might be considered for testing:
Kepler (& K2?)
Something for alert production?
This could mean diffims / DIASources / DIAObjects, and/or it could mean realistic tools for working with the alert stream / alert DB.
No consensus that the former needs anything beyond the diffim that will be carried out in DRP.
We could explore things like UI to the alert filtering service using the ZTF alert stream.
Solar system objects?
Most LSP requirements on SSOs were deleted in the replan.
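On the alert-filtering idea above: a user-defined filter over ZTF-style packets is ultimately just a small function, which is what a filtering-service UI would wrap. A minimal sketch, stdlib only; the packet fields follow the spirit of the ZTF avro schema, but treat the names and thresholds as illustrative rather than definitive:

```python
# Minimal sketch of a user-defined alert filter, as might run behind an
# alert-filtering service fed by the ZTF alert stream.  Field names
# (candidate.magpsf, candidate.rb) echo the ZTF avro schema but are
# illustrative; the thresholds are invented.

def bright_and_real(alert):
    """Keep alerts that are bright and have a good real/bogus score."""
    cand = alert.get("candidate", {})
    return cand.get("magpsf", 99.0) < 19.0 and cand.get("rb", 0.0) > 0.65

def apply_filter(stream, predicate):
    """Lazily filter an iterable of alert packets."""
    return (alert for alert in stream if predicate(alert))

if __name__ == "__main__":
    stream = [
        {"objectId": "ZTF19aaaaaaa", "candidate": {"magpsf": 17.2, "rb": 0.90}},
        {"objectId": "ZTF19bbbbbbb", "candidate": {"magpsf": 20.5, "rb": 0.95}},
    ]
    passed = list(apply_filter(stream, bright_and_real))
    print([a["objectId"] for a in passed])  # only the first packet survives
```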
Agreed:
We will plan on using LSST-processed HSC data, DR1 for now and DR2 when available
For externally-sourced processed data we will continue to use WISE/NEOWISE, and...
We will add Gaia DR2, as this dataset is now essential both for internal use and as context for anyone trying to do scientific work with survey data.
There is no requirement to continue to support the 2013 SDSS processing, but we'll keep it around until it becomes impractical. (This was not clearly brought out in the wrap-up, though.)
Requesting any further datasets will be handled by RFC.
Note that ingesting these datasets still requires data model work; see discussion tomorrow.
Can we expose some of this data by TAP queries to other archives, rather than copying the data to our own systems?
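On the TAP question: querying a remote archive in place requires only the standard IVOA TAP protocol over HTTP, so no data copy is needed. A sketch of building a synchronous query request with the stdlib; the service URL is a placeholder, not a real deployment:

```python
# Sketch of querying a remote archive via TAP instead of copying the
# data to our own systems.  Uses the standard IVOA TAP synchronous
# endpoint (/sync); the service URL here is a placeholder.
from urllib.parse import urlencode
from urllib.request import Request

def build_tap_sync_request(service_url, adql):
    """Build a POST request for a synchronous TAP query."""
    params = {
        "REQUEST": "doQuery",
        "LANG": "ADQL",
        "FORMAT": "votable",
        "QUERY": adql,
    }
    body = urlencode(params).encode("utf-8")
    return Request(service_url.rstrip("/") + "/sync", data=body, method="POST")

req = build_tap_sync_request(
    "https://archive.example.org/tap",  # placeholder service
    "SELECT TOP 10 source_id, ra, dec FROM gaiadr2.gaia_source",
)
print(req.full_url)  # https://archive.example.org/tap/sync
```

In practice a client library such as pyvo wraps exactly this exchange; the point is that the protocol cost of "query in place" is small.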
Discussion around general “providing a sandbox” testing vs. focused tests.
“Victims of our own success” — everybody wants to use notebooks. How are we supporting that? What are the resource implications?
Also impacts the priority of technical work.
Current tutorials are not providing useful technical input; they are just outreach to the community. Agreement from Wil O'Mullane that not every science collaboration should have effort allocated to support tutorial sessions (but e.g. the PCW will get supported).
Discussed support for “internal users” — Stack Club and/or DM developers. What are the expectations here? Suggestion that T/CAMs can intercede on the part of their developers, and hence there's less pressure (and less reputational risk) on LSP support. (Do we agree with this?)
Wil O'Mullane to discuss plans for future LSP demos and tutorials (which take more effort) with Leanne Guy.
Note that Leanne Guy is already in the process of a grand rebranding.
Gregory Dubois-Felsmann Also note that we are not planning to expose either "SUIT" or "DAX" - the equivalents for the other two Aspects - in the UI or in the user manual. The names, or their software-prefix equivalents, e.g., `dax_`, will appear in implementation documentation and in code.
The immediate aim here is just to have a name we can use in code and reporting; ultimate branding of the product release to the public will follow from Leanne Guy 's process (or some equivalent).
All DMLT: Send suggestions for renaming “Jellybean” to Simon Krughoff
Simon Krughoff — create a poll for the new name for Jellybean on #dm-camelot, with a closing date of Monday.
It seems likely that some of those recommendations correspond to work which is already planned; some should be added to the plan; and some should be ignored.
We will review each QAWG recommendation in turn and decide:
Whether this is something that the DMLT wishes to schedule work to address as a matter of urgency;
If so, which T/CAM is responsible for managing the work;
The delivery timescale which is necessary to make this exercise useful.
At 3 minutes per recommendation, we'll need about 2 hours for this exercise.
Gregory Dubois-Felsmann and John Swinbank: re QAWG-REC-20, we should revisit RFC-243 and capture what is now actually being done in that respect (e.g., via Hsin-Fang Chiang's regular runs), what parts can be achieved by improving the documentation of and access to the output data from those runs, and what parts might still require substantive work (e.g., regularly running the SDM-Standardization afterburner on those outputs).
Jim Bosch and Simon Krughoff should talk about the development path for metrics re: QAWG-REC-34; i.e., usually a plot motivates a metric, not the other way around. How are metrics captured in this workflow?
Following an impressive demo in early February, what are the next steps for middleware development?
A rough timeline was presented for BG2 deprecation, which relies on effort being allocated by various T/CAMs. Have those resources been secured? Is this timeline now solid?
What longer-term development effort is needed? Where will long term maintenance responsibility lie?
And how does that relate to available developer effort / funding?
Are the S19 budgets that Fritz mentions still valid?
Yes, but note that this leaves Simon Krughoff fully budgeted for the next 3 months.
There will also be a request for resources from AP.
Commissioning team involvement?
Depends mostly on CPP and obs_lsst.
Need to capture the requirement for Butler access to prompt data on distributors for AP.
This does not block conversion of AP pipeline tasks to PipelineTasks.
Aiming to have an early milestone for PipelineTask design review; after that, interfaces should be stable and porting can start in earnest.
General agreement that there should be a “portathon” at the PCW; likely policy is that beyond that there will be no new code developed in the Gen2 middleware (although old code will likely be supported until ~November).
Futures:
General agreement that the tight loop between DRP and LDF is essential; not a strong appetite to have Architecture in the loop.
The DRP team suggests that the weight of development should move to NCSA; that could be supported by a new hire at NCSA, who wouldn't necessarily need to jump directly into the middleware “czar” role.
We will use this session as a very focused discussion around timelines for how versioned calib repositories, yaml camera, and general Gen3 improvements will be integrated.
The result of this session will be a set of broad guidelines for how to order effort in this area. It's expected that Simon Krughoff and Tim Jenness will do the work.
In addition, we would like to identify a specific obs package to act as the exemplar.
Identify an obs package to use as “the exemplar” to convert to new standards.
Note this is using the current implementation as the reference, but using this package as the basis of work to develop a future reference implementation. “Patient 0”.
Simon & Tim will then handle that, then we can figure out how to convert other packages.
obs_subaru? obs_lsst? obs_decam?
Discussion of the “special” aspects of all the various packages — test stands, versioned cameras, etc.; all of them have different concerns.
Also calibrations from external pipelines on DECam and CFHT.
Consensus seems to be that obs_decam is the right choice.
Some concern that it might be more efficient to simply fork and drop Gen2 support; Tim & Simon to play this by ear.
YAMLCamera is a fact of life, but it needs more testing before we rely on it; development is ongoing, and its schema needs to be formalized.
Four tasks to convert and update obs_decam:
Conversion to YAMLCamera
Versioned cameras
Integration of user generated calibs
Deal with config overrides in obs_ packages.
The useful end product is a document describing how an obs_ package works.
This is not blocked on other work; Tim, Simon and their T/CAMs can schedule this based on their availability.
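On the config-override task: obs_ packages carry per-camera overrides as Python fragments under config/, which pex_config executes with `config` bound to the task configuration. A self-contained sketch of that mechanism; the stub and the field names are illustrative, not actual DECam settings:

```python
# Schematic of how an obs_ package supplies per-camera config overrides.
# In the real stack, files like obs_decam/config/processCcd.py are
# executed by pex_config with `config` bound to the task configuration;
# the stub below only imitates that mechanism so the fragment can run
# standalone.  Field names and values are illustrative.
from types import SimpleNamespace

OVERRIDE = """
# obs_decam/config/processCcd.py (illustrative)
config.isr.doFringe = True
config.charImage.detection.thresholdValue = 5.0
"""

# Stand-in for the task's default configuration tree:
config = SimpleNamespace(
    isr=SimpleNamespace(doFringe=False),
    charImage=SimpleNamespace(detection=SimpleNamespace(thresholdValue=10.0)),
)

# pex_config's load() works in the same spirit: exec the override file
# with `config` in scope.
exec(OVERRIDE, {"config": config})
print(config.isr.doFringe)  # True
```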
Leanne Guy — work with Robert Lupton to extract a todo-list for YAMLCamera/obs_lsst, as well as an idea of whether he is planning to act on it.
Wil O'Mullane, Unknown User (gruendl) — review all dates for ops rehearsal #1 at (or in follow-on to) DMLT telecon; this will cover everything except pipelines, which will follow.
"SDM standardization" is part of a long chain of transfer of data organization / schema metadata from the point of generation in AP/DRP code (and image ingest) through databases to external service via TAP/SIAv2 and use in the Portal.
A complete architecture for this is not yet RFCd.
Concrete milestone: ingest HSC data.
Note that Object tables resulting from the current code are already ingestible (modulo concerns to come later).
Concerns are that some effort is needed to reconcile the DPDD and what the pipelines are actually producing.
The same basic model is meant to be used for image metadata (exposure, visit, coadd patch); this needs to be harmonized with Butler Gen3 development. The Gen3 database schema should be migrated to be derived from a Felis representation.
The DPDD does not currently specify the content of such tables (it just says they should exist). Something equivalent to such a specification is probably needed in order to define some effective requirements on this metadata (in part this is necessary in order to guarantee that our metadata can support the generation of ObsCore and CAOM2 representations).
The mid-2019 deadlines of the Gen3 project for convergence on its DB schema must be kept in mind.
Getting end-to-end processing of HSC data through to TAP_SCHEMA and into the Portal/TOPCAT is a high priority; this should be associated with a level 3 milestone (or milestones; not necessarily a new one).
Qserv ingest improvements to support this are expected by the end of the cycle.
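On deriving the Gen3 database schema from a Felis representation: Felis describes tables as data, from which tooling can emit DDL, TAP_SCHEMA rows, and (eventually) ObsCore/CAOM2 mappings. A toy sketch of the principle only; the column set and type map are invented, and real Felis is a much richer YAML model with IVOA annotations:

```python
# Toy sketch of deriving DDL from a Felis-style schema description.
# Real Felis files are YAML with a richer model (constraints, indexes,
# IVOA annotations); this invented fragment only shows the principle
# that one description can feed both the database and TAP_SCHEMA.
schema = {
    "name": "example_dr",
    "tables": [
        {
            "name": "Object",
            "columns": [
                {"name": "objectId", "datatype": "long", "nullable": False},
                {"name": "ra", "datatype": "double"},
                {"name": "decl", "datatype": "double"},
            ],
        }
    ],
}

SQL_TYPES = {"long": "BIGINT", "double": "DOUBLE PRECISION"}

def to_ddl(table):
    """Render one table description as a CREATE TABLE statement."""
    cols = ", ".join(
        "{} {}{}".format(
            c["name"],
            SQL_TYPES[c["datatype"]],
            "" if c.get("nullable", True) else " NOT NULL",
        )
        for c in table["columns"]
    )
    return "CREATE TABLE {} ({})".format(table["name"], cols)

print(to_ddl(schema["tables"][0]))
# CREATE TABLE Object (objectId BIGINT NOT NULL, ra DOUBLE PRECISION, decl DOUBLE PRECISION)
```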
CSCs at the Summit will be using DM code and are likely to expect to use Data Butler interfaces. Are we ready to support this?
Scripts for the Script Queue are going to combine commands to CSCs with data processing. How and where should that processing occur? Where should the data being processed live?
A desirement has been articulated to execute DM code “directly in the script queue”.
Assertion is that e.g. CBP scripts could be run in OCS Controlled Batch to meet these goals.
We “hope” that this is not much more overhead.
But this will be required in ~July this year, before the OCS Controlled Batch service exists. This is “a worry”; the script queue machine will need access to a Data Butler.
Note that no database services are expected on the summit; Butler G3 repositories will need SQLite (except the DBB, which will use the Consolidated DB).
How to handle script execution for AuxTel (before OCS Batch)?
Can the AuxTel have an OODS Butler?
If you're willing to go to the Base to get the data.
Seems unlikely that networks are a limiting factor.
K-T is reluctant to allow direct access to the script queue.
Seems convenient to have the script queue access the Camera Diagnostic Cluster Butler.
Commissioning Cluster is not due until 2020 at earliest.
Kian-Tat Lim — produce a DMTN describing DM summit services.
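On the no-database-services constraint: a SQLite-backed Butler registry is just a file, which is why it suits the Summit. A stub illustration of that point; the table layout is invented for the sketch and is not the real Gen3 registry schema:

```python
# Stub illustrating why SQLite suits summit Butler repositories: the
# registry is a plain file needing no database service, so a
# script-queue machine can open it directly.  The table layout here is
# invented and is not the actual Gen3 registry schema.
import os
import sqlite3
import tempfile

repo = tempfile.mkdtemp()
db = os.path.join(repo, "gen3.sqlite3")

conn = sqlite3.connect(db)
conn.execute(
    "CREATE TABLE dataset (id INTEGER PRIMARY KEY, "
    "dataset_type TEXT, exposure INTEGER, path TEXT)")
conn.execute(
    "INSERT INTO dataset (dataset_type, exposure, path) "
    "VALUES ('raw', 2019031900001, 'raw/2019-03-19/000001.fits')")
conn.commit()

# A script-queue script could now locate prompt data by querying the file:
row = conn.execute(
    "SELECT path FROM dataset WHERE dataset_type='raw'").fetchone()
print(row[0])  # raw/2019-03-19/000001.fits
conn.close()
```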
(NB I — John Swinbank — don't think it's useful for the whole DMLT to spend 8 hours on this! A smaller splinter session may be more effective.) Leanne Guy- I agree
NB Per DMLT discussion of 2019-01-28, we should expect this not just to be a 45 minute presentation of the report, but to grow into a larger & longer discussion of what we're going to do about it. Details TBD.
obs_lsst provides an updated vision of obs_ packages for the BG3 era. Can the relevant decisions be captured in a design document to guide potential updates to other obs_packages?
Gen3 middleware will also change how obs_ packages work; for that, obs_subaru provides more of a prototype than obs_lsst. We need to integrate these mostly-orthogonal changes.
Release maintenance, including back-porting of bug fixes to stable releases for science users