Much more interesting use cases will emerge in DP0.2, with the presence of image metadata tables and a real time-dependent forced-photometry table from which light curves can be retrieved. The use cases on this page are somewhat contrived and are also limited by some temporary issues in Qserv and Firefly.

The proposal envisions a set of DataLink-related microservice endpoints under data-int.lsst.cloud/api/datalink/*. This note does not constrain the implementation to either a single server at the datalink/ node with multiple endpoints, or multiple micro-servers one level down. For planning purposes, I expect that by DR1 there will be 10-20 such services, each performing some URL translation to enable service-descriptor access to the underlying TAP and image services. As discussed in DMTN-076, we expect all these services to migrate at some point to something like api.data-int.lsst.cloud/datalink/* as we implement the transition to per-Aspect hostnames for security purposes.

1) Put a service descriptor on the dp01_dc2_catalogs.object table's ra and dec columns

The purpose of this is to trigger a search on the same table, or on another table, in a region around the selected row.

Implementation depends on rewriting the URL from the service descriptor into a synchronous TAP query URL (or employing a proxy service to do the same thing).  After discussion on   the near-term consensus is to deploy true microservices (not, e.g., nginx rewrite rules) that return redirects to the TAP or other appropriate endpoints. As these microservices do not themselves return Rubin data, they need not themselves be limited to authorized users, though in general the targets of the redirects will be limited to data rights holders with authorization tokens. Client code will be expected to recognize the need to include authorization data in the calls to the redirected URLs.
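As an illustration of the redirect approach, here is a minimal sketch of such a microservice as a plain WSGI application. The factory name make_redirect_app, the build_target callable, and the choice of a 302 status are all assumptions made for this example; the note itself only says that the services return redirects.

```python
from urllib.parse import parse_qs


def make_redirect_app(build_target):
    """Return a WSGI app that redirects to the URL produced by build_target.

    build_target is a hypothetical callable that takes a dict of query
    parameters and returns the target URL (e.g. a TAP sync URL) as a string.
    """

    def app(environ, start_response):
        # Flatten ?A=1&B=2 into {"A": "1", "B": "2"} (last value wins).
        raw = parse_qs(environ.get("QUERY_STRING", ""))
        params = {key: values[-1] for key, values in raw.items()}
        try:
            location = build_target(params)
        except (KeyError, ValueError) as exc:
            # Missing or malformed parameters: report them, do not redirect.
            body = f"bad request: {exc}\n".encode("utf-8")
            start_response("400 Bad Request", [
                ("Content-Type", "text/plain; charset=utf-8"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        # No Rubin data is returned here, only a pointer to the real service,
        # so this sketch performs no authorization check of its own.
        start_response("302 Found", [
            ("Location", location),
            ("Content-Length", "0"),
        ])
        return [b""]

    return app
```

For local experimentation, an app built this way could be served with wsgiref.simple_server.make_server from the standard library. The client is still responsible for attaching its authorization token when it follows the Location header.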

Thus, we want to deploy a service descriptor that takes, e.g.,

  • https://data-int.lsst.cloud/api/datalink/dp01_cone_tableX?RA=ra_val&DEC=dec_val&RADIUS=rad_deg

and transforms it into

  • https://data-int.lsst.cloud/api/tap/sync?LANG=ADQL&REQUEST=doQuery&QUERY=SELECT+*+FROM+dp01_dc2_catalogs.tableX+WHERE+CONTAINS(POINT('ICRS',ra_col,dec_col),CIRCLE('ICRS',ra_val,dec_val,rad_deg))=1  

with a separate service descriptor for each target tableX in the following table (note that the coordinate columns have different names in the different target tables):

Description                             | tableX string | Actual TAP table name         | ra_col name | dec_col name
Query for nearby objects                | object        | dp01_dc2_catalogs.object      | ra          | dec
Query for detailed position information | position      | dp01_dc2_catalogs.position    | coord_ra    | coord_dec
Query for simulation truth              | truth_match   | dp01_dc2_catalogs.truth_match | ra          | dec
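The translation described above can be sketched as a small function that looks up the target table and its coordinate columns and assembles the synchronous TAP URL. This is only an illustration: the names CONE_TARGETS and cone_query_url are invented for the example, and the numeric validation is an added assumption rather than part of the proposal.

```python
import math
from urllib.parse import urlencode

TAP_SYNC = "https://data-int.lsst.cloud/api/tap/sync"

# tableX string -> (TAP table name, ra column, dec column), from the table above.
CONE_TARGETS = {
    "object": ("dp01_dc2_catalogs.object", "ra", "dec"),
    "position": ("dp01_dc2_catalogs.position", "coord_ra", "coord_dec"),
    "truth_match": ("dp01_dc2_catalogs.truth_match", "ra", "dec"),
}


def cone_query_url(table_x, ra_val, dec_val, rad_deg):
    """Build the TAP sync URL for a cone search on one of the target tables."""
    table, ra_col, dec_col = CONE_TARGETS[table_x]  # KeyError if unknown
    # float() plus the finiteness check keep non-numeric input out of the
    # ADQL string (an assumption of this sketch, not a stated requirement).
    ra, dec, rad = float(ra_val), float(dec_val), float(rad_deg)
    if not all(math.isfinite(v) for v in (ra, dec, rad)):
        raise ValueError("RA, DEC and RADIUS must be finite numbers")
    adql = (
        f"SELECT * FROM {table} WHERE CONTAINS("
        f"POINT('ICRS',{ra_col},{dec_col}),"
        f"CIRCLE('ICRS',{ra!r},{dec!r},{rad!r}))=1"
    )
    # urlencode percent-encodes quotes, commas and parentheses and turns
    # spaces into '+', matching the form of the example URL.
    query = urlencode({"LANG": "ADQL", "REQUEST": "doQuery", "QUERY": adql})
    return f"{TAP_SYNC}?{query}"
```

For example, cone_query_url("position", "62.0", "-37.5", "0.05") yields a URL whose QUERY parameter selects from dp01_dc2_catalogs.position using coord_ra and coord_dec.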

Note the service endpoints, e.g., /dp01_cone_object, have a "dp01_" prefix. This is acceptable, and in fact preferable, because DataLink endpoints are somewhat specific to the data model being served and should not be regarded as an entirely stable part of the long-term guaranteed user-facing API. We will guarantee, rather, that appropriate service descriptors will be provided in the query results from the "main" services. That said, we realize that users will reverse-engineer the simple APIs of the DataLink services, may find them useful, and may therefore come to refer to them directly. Some care should be taken to avoid capricious redefinition of these services after, roughly, DP1.

2) Put a service descriptor on the dp01_dc2_catalogs.object table's parentObjectId, ra, and dec columns

The purpose of this is to trigger a search for the parent object for a blended object.

This contains workarounds for three current issues in the RSP:

  1. Qserv cannot efficiently search on just a parentObjectId; a spatial restriction is necessary in order to make the search performant.
  2. Firefly handles 64-bit integers incorrectly in some situations (there's a round-trip through a 64-bit float) which causes a loss of precision.  This is fixed in the current development HEAD of Firefly, which is not yet deployed in the RSP.
  3. The parent objects are apparently only in the dp01_dc2_catalogs.position table.
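The second issue can be illustrated without reference to Firefly itself. The sketch below mimics a 64-bit integer making a round trip through a 64-bit float; the helper name through_float64 and the example ID are hypothetical, chosen only to show the mechanism.

```python
def through_float64(value: int) -> int:
    """Mimic a 64-bit integer making a round trip through a 64-bit float."""
    return int(float(value))


# Every integer up to 2**53 survives the round trip; above that, a 64-bit
# float can no longer represent every integer exactly.
example_id = 2**53 + 1  # hypothetical ID, 9007199254740993
mangled = through_float64(example_id)

print(example_id, "->", mangled, "difference:", example_id - mangled)
# prints: 9007199254740993 -> 9007199254740992 difference: 1
```

Between 2**53 and 2**54 representable values are 2 apart, so the error is at most 1; the spacing doubles with each further power of two. The sketch makes no assumption about the magnitude of real object IDs.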

This requires a microservice (or a more complex rewrite rule) that does the following:

Take:

  • https://data-int.lsst.cloud/api/datalink/dp01_object_parent?PARENT=parent_id&RA=ra_val&DEC=dec_val

and transform it into

  • https://data-int.lsst.cloud/api/tap/sync?LANG=ADQL&REQUEST=doQuery&QUERY=SELECT+*+FROM+dp01_dc2_catalogs.position+WHERE+CONTAINS(POINT('ICRS',coord_ra,coord_dec),CIRCLE('ICRS',ra_val,dec_val,0.02))=1+AND+parent+BETWEEN+(parent_id-2)+AND+(parent_id%2B2)

(This pulls in a partial spatial restriction, large enough to cover a substantial blend, but small enough to still allow the query to execute efficiently, and establishes a range around the parent ID large enough to capture the current rounding in Firefly.)
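A minimal sketch of this workaround as a URL-building function follows. The function name parent_query_url and its keyword defaults are assumptions for illustration; the 0.02 degree radius and the window of 2 around the parent ID come from the text above, and the coordinate columns are the coord_ra and coord_dec names listed for the position table.

```python
import math
from urllib.parse import urlencode

TAP_SYNC = "https://data-int.lsst.cloud/api/tap/sync"


def parent_query_url(parent_id, ra_val, dec_val, radius_deg=0.02, id_slop=2):
    """Build the TAP sync URL for the parent-object lookup with workarounds."""
    # int() on the decimal string is exact, so no precision is lost here.
    pid = int(parent_id)
    ra, dec = float(ra_val), float(dec_val)
    if not (math.isfinite(ra) and math.isfinite(dec)):
        raise ValueError("RA and DEC must be finite numbers")
    # The ID bounds are computed here rather than in ADQL, which also avoids
    # putting a literal '+' (decoded as a space) into the query string.
    adql = (
        "SELECT * FROM dp01_dc2_catalogs.position WHERE CONTAINS("
        "POINT('ICRS',coord_ra,coord_dec),"
        f"CIRCLE('ICRS',{ra!r},{dec!r},{radius_deg!r}))=1 "
        f"AND parent BETWEEN {pid - id_slop} AND {pid + id_slop}"
    )
    query = urlencode({"LANG": "ADQL", "REQUEST": "doQuery", "QUERY": adql})
    return f"{TAP_SYNC}?{query}"
```

Computing the bounds in the service is a design choice of this sketch; the parenthesized arithmetic in the example URL would work equally well provided the plus sign is percent-encoded.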

3) Put a service descriptor on the dp01_dc2_catalogs.object table's parentObjectId column

The purpose of this is to trigger a search for the parent object for a blended object.

This is what it should look like in the absence of the Qserv and current Firefly issues.  It's worth including just to see what a "proper" service descriptor would look like.

Take:

  • https://data-int.lsst.cloud/api/datalink/dp01_object_parent?PARENT=parent_id

and transform it into

  • https://data-int.lsst.cloud/api/tap/sync?LANG=ADQL&REQUEST=doQuery&QUERY=SELECT+*+FROM+dp01_dc2_catalogs.position+WHERE+parent=parent_id

At the moment this would only work about half the time, and would trigger a table scan and therefore be slow even when the ID was passed correctly.  This service should not be deployed until the back-end and Firefly issues have been addressed.
