The following are two charts. The first is my understanding of the current API design for end-user access into LSST; it does not include L3 access. The second is a cleaned-up proposal.

Concerns about the above design:

  • 3 Groups Supporting 6 APIs into the System
  • User must learn a minimum of 4 APIs or use pure VO
  • Which APIs support SSO Login? All but QServ? Direct QServ is not designed to support SSO login.
  • Who will have a VO table option? Only the VO? All HTTP-based? VO and SUI Enhance?
  • How are we going to manage parameter naming conventions?
  • Are all the SLAC & SUI APIs going to be REST-based? It appears the Image Meta Data Search will be. Is that a decision?
  • Who supports backgrounding? Who should not background?

Comments:

  • The SLAC layers are treated as primitive layers
  • SSO and file formatting responsibilities are clear.
  • Backgrounding for all services is handled in one place.
  • The user's main entry points are the service layers.
  • While it would be preferable for the VO layer to use the main service layer, it might have to interact with the primitives directly.
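
To make the proposed layering concrete, here is a minimal sketch of a service layer that does the SSO check once and then passes the call straight through to a non-public primitive. Everything named here (`ServiceLayer`, `AuthError`, `fake_image_metadata_search`, the token set) is hypothetical and only illustrates the shape of the idea; it is not an existing LSST, SLAC, or SUI API, and the token set stands in for a real SSO check.

```python
from typing import Callable, Dict, List, Set

Row = Dict[str, object]
Primitive = Callable[[dict], List[Row]]


class AuthError(Exception):
    """Raised when the (hypothetical) SSO check fails."""


class ServiceLayer:
    """Single entry point: authenticate once, then pass through to a primitive."""

    def __init__(self, primitives: Dict[str, Primitive], valid_tokens: Set[str]):
        self._primitives = primitives
        # Stand-in for a real SSO validation service (assumption).
        self._valid_tokens = valid_tokens

    def call(self, token: str, name: str, params: dict) -> List[Row]:
        if token not in self._valid_tokens:
            raise AuthError("SSO token rejected")
        # Pass-through: the primitive sees the same parameters the user sent.
        return self._primitives[name](params)


def fake_image_metadata_search(params: dict) -> List[Row]:
    """Hypothetical primitive; a real one would query the image metadata."""
    return [{"ra": params["ra"], "dec": params["dec"], "image_id": 1}]


service = ServiceLayer(
    {"image_metadata_search": fake_image_metadata_search},
    {"token-abc"},
)
rows = service.call("token-abc", "image_metadata_search", {"ra": 10.0, "dec": -5.0})
print(rows)
```

In this sketch the primitive never sees the token, which is the point of the proposal; if primitives turn out to need credentials for resource management, the token would have to be forwarded as well.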

 

 

 

 

 


15 Comments

  1. Yes, the second picture is more along the lines of what Gregory and I were thinking, I believe, except the following:

    • I think we're hoping that much of the green service layer can be a simple pass-through.
    • Many of the primitives will still need access to authentication and authorization credentials to do detailed resource management.  I'm not sure this can all be centralized at the service layer.
    • I'm not sure we would want to forbid advanced users from bypassing the service layer to directly access the primitives, particularly from code (which is what I guess you mean by L3 access not being included).  That means each of the primitives might have a bypass like the Qserv one.
    • Primitives will always provide a blocking-call (RPC-like) model but may also expose a "background" query model that may provide for more interaction.  As diagrammed, the service layer can turn blocking calls into "background" queries, of course, but it should take advantage of "background" interfaces to primitives where they exist.
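
As a rough illustration of the last point, a service layer could turn a blocking primitive call into a "background" query along these lines. This is only a sketch using Python's standard thread pool; `BackgroundRunner` and `blocking_cutout` are made-up names, and a real service would need persistent job state, cancellation, and would prefer a primitive's native background interface where one exists.

```python
import uuid
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Dict, Optional


class BackgroundRunner:
    """Wraps blocking (RPC-like) calls so that callers get a job id back at once."""

    def __init__(self, max_workers: int = 4):
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._jobs: Dict[str, Future] = {}

    def submit(self, blocking_call: Callable[..., object], *args, **kwargs) -> str:
        job_id = uuid.uuid4().hex
        self._jobs[job_id] = self._pool.submit(blocking_call, *args, **kwargs)
        return job_id

    def is_done(self, job_id: str) -> bool:
        return self._jobs[job_id].done()

    def result(self, job_id: str, timeout: Optional[float] = None) -> object:
        # Blocks until the job finishes (or the timeout expires).
        return self._jobs[job_id].result(timeout=timeout)


def blocking_cutout(ra: float, dec: float, size: int) -> dict:
    """Hypothetical blocking primitive; a real one would return image data."""
    return {"ra": ra, "dec": dec, "size": size}


runner = BackgroundRunner()
job = runner.submit(blocking_cutout, 10.0, -5.0, size=64)
print(runner.result(job, timeout=5))
```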
    1. K-T wrote:

      "from bypassing the service layer to directly access the primitives, particularly from code (which is what I guess you mean by L3 access not being included)."

      I think - Trey should correct me if necessary - that Trey is talking only about callable interfaces here.  So I think all the user accesses in his pictures are in principle "from code".  Of course, the client-side LSST-provided code in a web GUI will very likely use the same callable interfaces.

       

    • Under SLAC DB API I'd add "Level 1 DB". 
    • I'd rename "Cutout/Image Retrieve" to "ImageServ"
    • I believe "Image Meta Data Search" and "QServ Catalog Search" are what we have been calling "MetaServ" so far.
    • I thought ImageServ and MetaServ would be "public"; the "SLAC Non-public HTTP API..." label indicates they won't be.
  2. For this API discussion, let's be sure to distinguish "public" as in "open source" from "public" as in "general science users can invoke these APIs against production systems" (presumably at a DAC).  Also, presumably all our APIs (i.e., in all the open source code) are well-documented (and therefore public in yet another sense), but perhaps there is a further level of user support provided for the "public for use against production systems" subset of the APIs (tutorials, hand-holding, ...?).  I don't think we've been particularly clear about that.

  3. Jacek:

    • I updated the diagrams with the correct naming
    • In the Alternate view, I am just proposing making the SLAC MetaServ and ImageServ to be non-public primitives. This would allow us to:
      • Do much of the backgrounding in one place.
      • Not have to worry about SSO at the SLAC layer.
      • Not have to worry about the catalog data file format (CSV, TSV, VO, etc.) at the SLAC layer.
      • Pass through mostly the exact same API.
      • The service layer does not have to be written at IPAC; it could be written at SLAC.

    Gregory:

    • I am only talking about callable interfaces here.

    K-T:

    • Most of the service layer would be pass-through.  I don't think there is a significant performance penalty here.
    • The service layer also gives us a single place to put common functions such as backgrounding, formatting, and SSO.
    • With my proposed diagram, I don't see a reason for a user to ever hit the primitives.  He should be able to do every web API operation at the service layer.
    • The DB layer is a different story.  It has its own passwords, so it should be secure. The very advanced users would want to use the DB API directly.
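
For the formatting part, a minimal sketch of what "format in one place" could look like: the primitive returns plain rows and the service layer serializes them to whatever the user asked for. The function name `format_rows` and the sample column names are hypothetical, and VO table output is left out here because it would need a dedicated library.

```python
import csv
import io
from typing import Dict, List

# Delimiter per supported output format (CSV and TSV only in this sketch).
DELIMITERS = {"csv": ",", "tsv": "\t"}


def format_rows(rows: List[Dict[str, object]], fmt: str) -> str:
    """Serialize rows returned by a primitive into the requested text format."""
    if fmt not in DELIMITERS:
        raise ValueError(f"unsupported format: {fmt}")
    if not rows:
        return ""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf,
        fieldnames=list(rows[0].keys()),
        delimiter=DELIMITERS[fmt],
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


rows = [{"objectId": 1, "ra": 10.5}, {"objectId": 2, "ra": 11.25}]
print(format_rows(rows, "csv"))
print(format_rows(rows, "tsv"))
```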

     

     

     

  4. One more thought: to me ImgServ and QServ should be in the same box, as they both deal with data stores (one DB, one image). So I guess I'd move ImgServ to the left box, and rename that box to SLAC DB&Image API.

    1. The problem with that is that the same layer would have an HTTP API and a database API.  They could have two different forms of user validation.

      I was thinking about this.  Isn't ImageServ really a front-end to the butler?

  5. What does Level 1 DB stand for? Is it going to store metadata for images, as opposed to QServ, which stores metadata for catalogs?

    Then ImageServ (Image Retrieve and Cutout Services) might need to access it too.

     

  6. Level 1 = *real time nightly alert pipeline* products. That includes difference images and a database catalog (that is updated in real time every night).

    Level 2 = products produced by annual *data release*

    Level 3 = data produced (or brought from outside) *by users*

    1. I am trying to understand where on the diagram above is the metadata store we were talking about last week.

      Do I understand correctly that QServ will be serving data from the Level 1 DB and Level 2 DB, and ImageServ will be serving images (including difference images) using the Metadata Store?

      So Level1 DB and Metadata Store should be both in the left-most box?

       

       

  7. Should we discuss it at our next (after New Year) data-access meeting?

    There are several things that are not clear to me.

    1. Which data are going to be qserv-managed (populated into qserv), and which data will be populated into metadata store?
      a. Which level 1 products will be accessible from qserv and which from metadata store?
      b. For example, how do I search for calibrated science exposures that cover a particular area on the sky?
      c. Another example: how do I find sources (timeseries of positions or magnitudes), associated with a given object?
    2. What is the relationship between the qserv-managed data and the data in metadata store?
      Example: How do I find the image from which a given catalog source was extracted?
    3. How much of the QServ and Metadata Store schema and functions will we expose through the Web API? Will we allow arbitrary SQL SELECT statements to be executed through the Web API?
    4. Can a user without access permissions to a particular dataset (image files) access the metadata about this dataset (like WCS information) or information derived from this dataset (like sources)? (If yes, the data in QServ and Metadata Store DB – except file location – should not be access-controlled.)

     

      1. a, b) Qserv will contain only Level 2 data products (from the annual Data Releases) and Level 3 catalogs.  This will include information (metadata) about calibrated science exposures that are part of the release.  Each data release will potentially have different metadata for the same exposures as calibrations are refined and algorithms improve.  Level 1 data products will be stored in a separate, non-Qserv database.  That database will also include (preliminary) information (metadata) about calibrated science exposures.  Any and all of this exposure metadata could be copied to the metadata store or otherwise made available through the metadata service.  I think it will ultimately be up to the user which metadata to query, although we can choose appropriate defaults depending on what else the user is trying to do.
         c) Sources (if we have them at all) and ForcedSources are obtained from Qserv.  For variable objects (transients, variables, asteroids), DiaSources are obtained from either the Level 1 database (in near-realtime, not thoroughly checked) or the Qserv Level 2 database (for annual Data Releases).
      2. Qserv will contain all information necessary to understand the Level 2 data products stored in it.  A Source (or ForcedSource) will contain a ccdVisitId that can be used to look up a particular visit in the CcdExposure table.  The information from that table might also be made available through the metadata service.
      3. We must provide at least one interface that allows almost all SQL queries to be issued against Qserv and the Level 1 database.  We can restrict the queries to ones that are legal for the underlying database and not ridiculously expensive.  We do not have to allow direct SQL access to the metadata service back-end database, but it is not expected to have any useful science data that is not already in the Qserv or Level 1 databases.
      4. All LSST access APIs provided at the Data Access Centers, SUI or otherwise, data or metadata, are restricted to those with data rights, although in some cases the Education and Public Outreach system may make use of those APIs to provide limited access to the general public.
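
To illustrate answer 2, the lookup from a Source to its exposure via ccdVisitId could look roughly like the following. This is a sketch against an in-memory SQLite database, not Qserv; only the Source and CcdExposure table names and the ccdVisitId key come from the discussion above, while the other columns (`sourceId`, `filterName`, `obsStart`, `ra`, `decl`) and the sample values are invented for the example.

```python
import sqlite3

# In-memory SQLite stands in for Qserv here (assumption for illustration only).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE CcdExposure (
        ccdVisitId INTEGER PRIMARY KEY,
        filterName TEXT,
        obsStart   TEXT
    );
    CREATE TABLE Source (
        sourceId   INTEGER PRIMARY KEY,
        ccdVisitId INTEGER,
        ra         REAL,
        decl       REAL
    );
    INSERT INTO CcdExposure VALUES (1001, 'r', '2022-01-01T03:00:00');
    INSERT INTO CcdExposure VALUES (1002, 'i', '2022-01-02T03:10:00');
    INSERT INTO Source VALUES (1, 1001, 10.5, -5.25);
    INSERT INTO Source VALUES (2, 1002, 10.6, -5.30);
    """
)


def exposure_for_source(connection: sqlite3.Connection, source_id: int):
    """Return the exposure row a given source was measured on, or None."""
    return connection.execute(
        """
        SELECT e.ccdVisitId, e.filterName, e.obsStart
        FROM Source AS s
        JOIN CcdExposure AS e ON e.ccdVisitId = s.ccdVisitId
        WHERE s.sourceId = ?
        """,
        (source_id,),
    ).fetchone()


print(exposure_for_source(conn, 2))
```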

      1. Thank you very much for the thorough explanation. So should the Metadata Store be among the SLAC DB APIs (in addition to Level 1 DB and QServ), or is it not there because it exists for performance optimization only and its schema does not have to be published?

  8. Just to come back to this page for a moment: is it now accurate to say that the API page that's being elaborated in the Monday meetings represents the layer described as the "SLAC HTTP APIs" in the above diagrams?

  9. Yes, I think we have a working plan now. We still have not dealt with login or VO.  However, we have discussed delivering any SUI API functions (not sure what those would be yet) as part of the system.