DM-15475

Due to the portal de-scope, it is not very useful to make a proposal for AuxTel data access for users. But for historical reasons, the email exchanges between Robert Lupton, Gregory, and Xiuqin in July 2018 are collected here for future reference.

Xiuqin:

Looking at the LSST Science Platform requirements (LDM-554), spectral data display is not specifically called out. But there is one requirement for SUIT to be able to display calibration image data products. According to LSE-163 (DPDD) section 4.4.2, both raw and processed images with spectra will be preserved and made available for download.

I feel SUIT should be able to display those data. Of course, the final decision should be made by SST since we are here to support science. 

Robert:

Thanks for the DPDD reference:

All auxiliary telescope data, both raw (images with spectra) and processed (calibrated spectra, derived atmosphere models), will be preserved and made available for download.


I'm not sure that this was well motivated, but we shouldn't hide anything so OK.  The raw data looks like any other imaging data.  The auxiliary data calibration frames are more complicated (as it's a slitless spectrograph) but are all 4kx4k images.  At the very least bias, dark, and some sort of flat or flats, and direct and dispersed spectra [probably in more than one configuration].

As regards extracted spectra, these will be low-resolution (R c. 100), wavelength-calibrated, flux-calibrated spectra, but probably with a variety of spectral coverage (due to the use of blocking filters) and possibly with calibration features superimposed (think notch filters).  It is not clear what models for these objects we'll use (or even if we will use models explicitly), but you should think about serving estimated spectra at some other resolution -- either higher, equal, or lower (sorry), and possibly based on models.  The number of models will be much smaller than the number of auxTel "spectra" as we expect to observe c. 2000 standard stars (but the details are TBD).

We will also provide the derived atmospheric absorption profiles.  These will be both synthetic absorption spectra with R c. 2000 and sets of c. 6 numbers which will have names such as "Oxygen" and "Ozone" and will generate the models (but should not be interpreted directly as atmospheric parameters, as it's the models that we fit and which matter).  We expect roughly one auxTel spectrum for each 8.4m visit.

As regards visualisation, I'm not sure that any will be useful in firefly.  We will provide tools using LSP.  If you want to use firefly, you'd want to provide access to the flat fielded data frames (dispersed and non-dispersed) and the extracted calibrated spectra (a graph of intensity v. wavelength).  Depending on exactly how we handle the data it may make sense to overplot the model.  People may also want to look at the derived atmospheric parameters.

I suppose people could use SUIT tools to look at the ensemble of spectra taken of a given object, but I'd be surprised.  Using SUIT to look at ensembles of synthetic absorption spectra grouped in interesting ways (e.g. by derived tau_a or (az, alt, t)) is possible, but again I'd really expect people to use raw python.

Is that a place to start?

Gregory:

There are two relevant perspectives:

* What will users actually want/need to do?
* What capabilities may already be documented as a requirement?

The first may evolve from the internal/expert users of the calibration-pipeline-development and commissioning eras to the operational era.

I'll try to deal with the existing requirements first, in this email (this is mostly for Xiuqin and for the record as my interpretation).  I'll write another email about user needs/wants.

The starting point is these two:

DMS-LSP-REQ-0001
Specification: The LSP shall provide the capability to access all the Project’s released data products, including, but not limited to, the data products enumerated in the DPDD (LSE-163), as well as all user data products to which a user has access.

DMS-LSP-REQ-0002
Specification: The LSP shall provide a Web-based "Portal" means of access to all the LSST data products, and to user storage resources.


The key word here is "access".  Before going on to any specifics, I would say that there are clearly cases where I would be willing to argue that the Portal provides "access" simply by letting users determine that a particular type and instance of data exists, and assisting them in loading that data into a Python (notebook) environment.  That could be as simple as providing a dataId that could be copy/pasted or dragged into a Butler.get() call in Python.  Obviously this is inadequate for the mainstream astronomical data products, but for specialty items where there is no obvious or trivial default visualization available, it seems like it meets the requirements.

In addition, if there is a persisted artifact corresponding to that data product (e.g., an afw.table-format FITS file or an HDF5 file), a further basic "access" can be as simple as allowing someone to download it.  But in an LSST context, it seems more useful to facilitate Python Butler access in the Notebook just by providing the appropriate ID.
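
As a rough illustration of the "copy/paste a dataId" idea, the Portal could emit a ready-to-paste line of Python for the selected data item. The sketch below is hypothetical: the helper name `butler_get_snippet`, the dataset type `raw`, and the dataId keys `visit` and `ccd` are assumptions made for illustration, and the generated line presumes that a `butler` object already exists in the user's notebook.

```python
def butler_get_snippet(dataset_type, data_id, butler_name="butler"):
    """Return a line of Python that a user could paste into a notebook.

    All names here are illustrative; the real dataset types and dataId
    keys depend on how the AuxTel data are registered with the Butler.
    """
    # Sort the keys so that the same dataId always yields the same text.
    args = ", ".join(f"{key!r}: {data_id[key]!r}" for key in sorted(data_id))
    return f"{butler_name}.get({dataset_type!r}, dataId={{{args}}})"


# Hypothetical dataId for a single AuxTel exposure.
snippet = butler_get_snippet("raw", {"visit": 1234, "ccd": 0})
print(snippet)
```

The helper only builds text and never touches the data, which matches the minimal reading of "access" above: the Portal hands over a reference and the retrieval happens in the Notebook.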


The relevant requirement is:

DMS-LSP-REQ-0010
Specification: The LSP shall facilitate the transfer to the Notebook aspect of references allowing retrieval in a notebook of the data explored in the Portal session.
Discussion: This allows a user to locate and preview data in the Portal environment and then readily transfer their work to the Notebook aspect for detailed analysis.


One naturally goes on to data discovery from there.  For this we have these:

DMS-LSP-REQ-0008
Specification: The LSP shall support the identification of linkages between data items that reflect their provenance and data dependencies.
Discussion: For instance, from a calibrated image it should be possible to identify the raw image from which it was generated, and the calibration data used in its processing; from a catalog entry it should be possible to identify the image(s) on which the measurement was made.

DMS-PRTL-REQ-0002
Specification: The Portal aspect shall provide the capability to discover and access all the Project’s released data products, including, but not limited to, the data products enumerated in the DPDD (LSE-163), the calibration database, and the Reformatted EFD, as well as all user data products to which a user has access.
Discussion: The Portal’s workflows should allow a user to learn what data exist: what data releases are available, what image and catalog data they contain, the names of all databases, tables, and columns, etc. For all tabular data products the Generic Query requirements below cover the basic level of access provided.


As Xiuqin said, we already have a requirement that is specific to calibration as well:


DMS-PRTL-REQ-0005
Specification: The Portal aspect shall enable access to Project calibration data products, both directly and via linkages from science data products generated using them.


For the AuxTel raw images and calibration frames, I think as long as the AuxTel image metadata is handled in generally the same way by the databases and DAX services as the main-camera data, we would expect that the Portal would provide essentially the same level of service as we provide for the main camera for discovering that these observations exist and, say, querying them by coordinates, date, time, and observation type.  For the raw frames, the Portal can then easily also provide its usual pixel-data visualization.


For processed data, it's clear to us that there are going to be certain types of processed data objects that are neither naturally images nor naturally tabular, and the Portal is not required to provide a custom visualization for every LSST data product data type.  For data which _is_ naturally tabular, we are required to provide at least a "generic" visualization, i.e., display of the table itself, and generation of plots from combinations of table columns.  Whether this is useful or not in any particular case will vary from data type to data type and user to user.  For certain types of tabular data we make special efforts to go beyond "generic" display, e.g., for time series we won't just treat MJDs as numbers, but will allow the user to request a plot with human-readable dates.
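
The MJD example can be made concrete with a minimal sketch of turning an MJD column into human-readable axis labels. The function names are hypothetical and this is not the Portal's implementation; it also treats the MJD as UTC and ignores leap seconds and any TAI/UTC offset, which is an assumption that a real service would need to revisit.

```python
from datetime import datetime, timedelta, timezone

# MJD 0 corresponds to 1858-11-17 00:00.
MJD_EPOCH = datetime(1858, 11, 17, tzinfo=timezone.utc)


def mjd_to_datetime(mjd):
    """Convert an MJD to a datetime, treating it as UTC (no leap seconds)."""
    return MJD_EPOCH + timedelta(days=mjd)


def mjd_axis_labels(mjds, fmt="%Y-%m-%d %H:%M"):
    """Format a column of MJDs as human-readable labels for a plot axis."""
    return [mjd_to_datetime(m).strftime(fmt) for m in mjds]


labels = mjd_axis_labels([58300.0, 58300.5])
print(labels)  # ['2018-07-01 00:00', '2018-07-01 12:00']
```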


So I think this means that if the processed data object in the archive is, say, a key-value representation of the parameters of an atmospheric model, we are _not_ required to provide a fancy visualization of the absorption spectrum represented by that model.  We _could_ do so by calling the appropriate Python code to generate a sampling of that spectrum over the appropriate wavelength range at appropriate intervals, and then using our standard plotting tools to display it, but I think that is an "optional upgrade" unless you all explicitly decide we should.  We _are_ required, in my interpretation, to let the user see that a model has been calculated for a certain date and time, transfer an ID for that model to the Notebook environment, or download the opaque persisted file that contains the model (or, if it's persisted in a database table, to see that table in the table viewer).


The relevant requirements (with incorporated leniency) are:

DMS-PRTL-REQ-0042
Specification: The Portal aspect shall provide the capability to visualize all tabular and image data products in the DPDD, as well as user data products.
Discussion: The products in the DPDD are the primary data products for use by the LSST users. The "tabular and image" qualification indicates that the Portal is not required to provide a dedicated visualization for all data products that do not naturally fall into one of those categories. For user data products, the amount of detail and labeling, and the amount of UI support, will be less if they lack the full level of metadata that comes with the Project’s own data products.


DMS-PRTL-REQ-0043
Specification: The Portal aspect shall include the ability to visualize selected ancillary information produced by the LSST pipeline including, but not limited to, image regions, image bit-planes, survey footprints, focal-plane footprints and PSF representations.
Discussion: The intent here is to call attention to the fact there is more than just the survey images and coadds that have a “2-dimension” form that need to be visualized and presented to the user in the interface. The specific ancillary data products to visualize will be determined during construction, based in part on feedback received during PDAC operation and the use of the Portal tools by developers. It is desirable that custom visualizations be available for important and frequently used ones such as Footprints (which can readily be displayed as pixel overlays). Where dedicated Portal visualizations are not available, however, users should be able to use either LSST-provided or community libraries in the Notebook aspect to create custom visualizations.



Note that none of the special calibration data products are called out in the "including, but not limited to" list at this time, but it explicitly says that we may add more during construction based on "the use of the Portal tools by developers".


So for Xiuqin's initial milestone I think the requirements can be satisfied just by providing access to the observational metadata tables, to tables of processed outputs, and to bare-bones display of the pixel data.  Depending on how the processed spectra are represented it may be ~trivial to display them as well.


I assume there will be a table somewhere of the O(2000) standard stars, so again as long as that table is accessible via DAX it will just directly work to provide in the Portal a visualization of those points on the sky, projected over main-camera images or over a HiPS map, allowing users to see what calibration stars are available within or near selected main-camera observations.
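
Selecting the calibration stars "within or near" an observation amounts to a cone search around the pointing. The following is a minimal sketch under stated assumptions: the star table is represented as a list of dicts with hypothetical `name`, `ra`, and `dec` fields in degrees, the star entries are made up for illustration, and the 1.75 degree radius is only a stand-in for the main-camera field of view. A real Portal would issue the equivalent query through DAX rather than filter in memory.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = math.sin((dec2 - dec1) / 2.0)
    sin_dra = math.sin((ra2 - ra1) / 2.0)
    a = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    # Clamp to 1.0 to guard against rounding just above the asin domain.
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(a))))


def stars_near(stars, ra, dec, radius_deg):
    """Return the stars lying within radius_deg of the pointing (ra, dec)."""
    return [
        star for star in stars
        if angular_separation_deg(star["ra"], star["dec"], ra, dec) <= radius_deg
    ]


# Made-up entries standing in for the standard-star table.
standard_stars = [
    {"name": "star_A", "ra": 10.0, "dec": -30.0},
    {"name": "star_B", "ra": 11.0, "dec": -30.5},
    {"name": "star_C", "ra": 50.0, "dec": 10.0},
]

nearby = stars_near(standard_stars, ra=10.2, dec=-30.1, radius_deg=1.75)
print([star["name"] for star in nearby])  # ['star_A', 'star_B']
```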

Just as we provide a way to go from an Object to the list of single-epoch observations of that Object, I think it'll be easy to provide a link from a calibration star to the list of spectra taken of that star.  This is the kind of "semantic linkage" that the above requirements envision.
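
A link from a calibration star to its spectra is, at its simplest, a grouping of the spectrum table by star identifier. The sketch below assumes hypothetical column names (`spectrum_id`, `star_id`, `mjd`) and made-up rows; the actual schema for AuxTel spectra was not defined in this discussion.

```python
from collections import defaultdict


def spectra_by_star(spectra_rows):
    """Group spectrum IDs by the star they were taken of, preserving row order."""
    index = defaultdict(list)
    for row in spectra_rows:
        index[row["star_id"]].append(row["spectrum_id"])
    return dict(index)


# Made-up rows standing in for a table of extracted spectra.
spectra = [
    {"spectrum_id": 101, "star_id": "star_A", "mjd": 58300.1},
    {"spectrum_id": 102, "star_id": "star_B", "mjd": 58300.2},
    {"spectrum_id": 103, "star_id": "star_A", "mjd": 58301.1},
]

links = spectra_by_star(spectra)
print(links["star_A"])  # [101, 103]
```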

