For reference, our view of the get()/read capabilities that the Butler currently provides, or that we would like it to provide:
1. Reconstitution of Python-domain objects
2. Abstraction of storage (file formats, storage mechanisms, and naming) - currently only really abstracts naming
3. Understanding of associations between datasets (e.g., temporal and spatial)
   - Find synthetic flat appropriate to a raw image
   - Find calibrated images contributing to a coadd patch
4. Wildcarding (partially specified DataIds)
Capability 1 must, by definition, be provided by a Python-language API. The others could be provided as a service; see the "remote Butler" item below.
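As an illustration of the wildcarding capability, the sketch below expands a partially specified DataId against a list of known, fully specified DataIds. It is a minimal sketch only: `expand_data_id`, the `KNOWN` registry, and the `visit`/`ccd`/`filter` keys are hypothetical names chosen for this example and are not the Butler's actual API.

```python
from typing import Dict, Iterable, List

# Hypothetical representation: a DataId is a plain dict of key/value pairs.
DataId = Dict[str, object]


def expand_data_id(partial: DataId, known: Iterable[DataId]) -> List[DataId]:
    """Return every known DataId that agrees with all keys given in `partial`."""
    return [d for d in known if all(d.get(k) == v for k, v in partial.items())]


# Hypothetical registry contents, invented for this example.
KNOWN: List[DataId] = [
    {"visit": 1, "ccd": 0, "filter": "r"},
    {"visit": 1, "ccd": 1, "filter": "r"},
    {"visit": 2, "ccd": 0, "filter": "i"},
]

# Only "visit" is specified, so "ccd" and "filter" act as wildcards.
matches = expand_data_id({"visit": 1}, KNOWN)
```

An empty partial DataId matches everything, which is one reason the result of a wildcard can be lengthy.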
SUI feedback:
- When the Butler returns multiple results (e.g., from a wildcard or a 1:N association), we would like them to be available as a list of references in a form usable directly, or after only a trivial transformation, with the LSST data retrieval Web APIs (DAX). Wildcard and 1:N resolution are useful in the SUI, but we then need to be able to operate on the result, which may be lengthy, at the metadata table level.
- We would like API support for retrieving a dataset in multiple Python forms: community-standard forms (e.g., pandas, astropy tables) as well as the "native" LSST forms, where these differ.
  - npease: DM-4551
- We need to understand how put()/writing works when multiple repositories are made visible through a single Butler. For get()/reading, a single search order makes sense. For put(), it may be desirable to support alternative destinations (local disk, user workspace, Level 3 DB) or even multiple destinations for a single put().
- We need to understand how authentication information is passed through the Butler to background services it may access.
  - npease: DM-4552
- We would be very interested in "remote Butler" functionality, in which capabilities 2, 3, and 4 above are provided by a well-defined Web API, allowing capability 1 to be provided by a thinner Python wrapper around a Web API call. This would be directly useful to non-Python components of the SUI, and would make it easier to provide Butler functionality to Level 3 users running remotely. In principle, this could be done by ensuring that the capabilities exposed by the DAX interfaces include Butler capabilities 2, 3, and 4. Alternatively, the DAX interfaces could sit at a lower level, providing capability 2 and perhaps capability 4 for specific repositories, with the "remote Butler" providing capability 3.
  - npease: DM-4554
- Do get and put operations need to take versioning of an entity into account?
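To make the multi-repository question above concrete, here is a minimal sketch of one possible behaviour: get() walks a single search order, while put() accepts an optional list of destinations and falls back to a default output. The names `DictRepository`, `MultiRepoButler`, `search_order`, `default_outputs`, and `destinations` are all hypothetical and are not taken from the existing Butler. The in-memory repositories stand in for local disk, a user workspace, or a Level 3 DB.

```python
from typing import Dict, Hashable, List, Optional, Sequence


class DictRepository:
    """Hypothetical stand-in for a repository; a real one would wrap disk or a DB."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._store: Dict[Hashable, object] = {}

    def has(self, key: Hashable) -> bool:
        return key in self._store

    def read(self, key: Hashable) -> object:
        return self._store[key]

    def write(self, key: Hashable, obj: object) -> None:
        self._store[key] = obj


class MultiRepoButler:
    """Hypothetical facade: one search order for get(), selectable targets for put()."""

    def __init__(self, search_order: Sequence[DictRepository],
                 default_outputs: Sequence[DictRepository]) -> None:
        self._search_order = list(search_order)
        self._default_outputs = list(default_outputs)

    def get(self, key: Hashable) -> object:
        # The first repository in the search order that holds the dataset wins.
        for repo in self._search_order:
            if repo.has(key):
                return repo.read(key)
        raise LookupError(f"no repository holds {key!r}")

    def put(self, key: Hashable, obj: object,
            destinations: Optional[Sequence[DictRepository]] = None) -> List[str]:
        # Write to every requested destination, or to the defaults if none is given.
        targets = self._default_outputs if destinations is None else list(destinations)
        for repo in targets:
            repo.write(key, obj)
        return [repo.name for repo in targets]


local_disk = DictRepository("local_disk")
workspace = DictRepository("user_workspace")
butler = MultiRepoButler(search_order=[workspace, local_disk],
                         default_outputs=[workspace])

# Hypothetical dataset key: a dataset type plus DataId items.
key = ("calexp", ("visit", 1), ("ccd", 0))
written_to = butler.put(key, "image-v1", destinations=[local_disk, workspace])
```

This sketch leaves open the questions raised above, such as how authentication would be passed to each destination and whether versioning should affect which copy get() returns.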
SuperTask feedback:
- We would like to decouple put() calls from persistence, allowing deferred writes and/or administrative control of whether put() just generates an in-memory temporary or actually writes to persistent store.
  - npease: Does allowing the in-memory destination (in DM-4542) satisfy this need? If not, we should have a voice meeting to discuss the details.
- Thinking about the execution interface of the SuperTask, we believe we need something more sophisticated than just DataRef as a layer above the Butler.
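The decoupling of put() from persistence requested above could look roughly like the following sketch, in which put() only records an in-memory temporary and a separate flush() decides whether anything reaches the persistent store. `DeferredPutButler`, `write_fn`, `persist`, and `flush` are hypothetical names for illustration; this is not how the Butler or the DM-4542 in-memory destination is implemented.

```python
from typing import Callable, Dict, Hashable


class DeferredPutButler:
    """Hypothetical wrapper that buffers put() calls until flush() is called."""

    def __init__(self, write_fn: Callable[[Hashable, object], None],
                 persist: bool = True) -> None:
        self._write_fn = write_fn      # performs the actual persistent write
        self._persist = persist        # administrative switch: write or discard
        self._pending: Dict[Hashable, object] = {}

    def put(self, key: Hashable, obj: object) -> None:
        # Nothing is written here; the object is kept as an in-memory temporary.
        self._pending[key] = obj

    def get(self, key: Hashable) -> object:
        # Pending objects remain readable before (or without) persistence.
        return self._pending[key]

    def flush(self) -> int:
        """Write pending objects if persistence is enabled; return the number written."""
        written = 0
        if self._persist:
            for key, obj in self._pending.items():
                self._write_fn(key, obj)
                written += 1
        self._pending.clear()
        return written


# A dict stands in for the persistent store in this example.
store: Dict[Hashable, object] = {}
butler = DeferredPutButler(store.__setitem__)
butler.put("coadd", [1, 2, 3])
pending_value = butler.get("coadd")   # available before any write happens
count = butler.flush()                # the deferred write occurs here
```

Constructing the wrapper with `persist=False` turns every put() into a pure in-memory temporary, which is one way an administrator could control whether outputs are written at all.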