(Kian-Tat Lim will try to consolidate and rewrite this in the next day or two.)
Comm Cluster needs to access raw data with minimal latency as well as past raw images and calibrations
Comm Cluster needs to produce outputs like new master calibration images
Comm Cluster outputs are not usually, and definitely not automatically, part of the permanent record of the survey
OODS is accessible by Comm Cluster
But the OODS is not a place for temporary user outputs
SRP: OODS is a file cache; clients read things out
  • Also has an expiration manager
  • Additional complexity is from the Butler access
  • Requires understanding how to delete files from the Butler
  • Wants to keep things simple
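A minimal sketch of what an expiration manager for a file cache could look like, assuming age-based expiry on file modification time. The function name, the `forget` hook, and the age policy are hypothetical and are not the actual OODS or Butler interface; `forget` stands in for whatever step removes the file's entry from the Butler.

```python
import os
import time
from pathlib import Path
from typing import Callable, List, Optional


def expire_files(cache_root: Path, max_age_seconds: float,
                 forget: Callable[[Path], None],
                 now: Optional[float] = None) -> List[Path]:
    """Remove files older than max_age_seconds from cache_root.

    `forget` is a hypothetical hook that drops the file's entry from
    whatever registry tracks it. It is called before the file is
    unlinked, so the registry never points at a missing file.
    """
    now = time.time() if now is None else now
    expired = []
    for path in sorted(cache_root.rglob("*")):
        # Expire on modification time; directories are left in place.
        if path.is_file() and now - path.stat().st_mtime > max_age_seconds:
            forget(path)
            path.unlink()
            expired.append(path)
    return expired
```

Keeping the registry cleanup behind a single callback is one way to keep the cache itself simple: the expiry loop does not need to know how the Butler records files.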
Jim: What systems or people are getting data from the OODS?
  • Comm Cluster
  • AOS, Camera Diagnostic Cluster, Visualization on Summit
GPDF: Need Butler access
  • Does the Gen3 database know of all the data or only the data that is in the OODS?
Almost all clients will use Butler
Gen3 Registry and Database
  • One issue is the relationship between the raw and the calibration
  • Latest master calibration files are not sufficient; might want to use candidate files as well as past versions
  • The files themselves are not sufficient to give this relationship in all cases:
    • Raw files are sufficient from the headers; have code to insert into Registry
    • Need to have externally-provided metadata for calibrations (e.g. from DBB)
  • Outputs from the Comm Cluster need to go to the same DBMS as the raw files and calibrations
Comm Cluster does not need to access DBB
There will be Consolidated DB instances in both locations
Questions about where OODS database needs to go
  • Would be different than Consolidated DB entirely because of service levels
  • Things will appear and disappear
  • Has to be able to generate its own ids and extract metadata itself
Image metadata tables
  • Are the Gen3 Butler image metadata tables sufficient for imgserv?
    • Should be
Requirements document
Have a secondary mechanism to deal with DBB downtime; use Consolidated DB with MyDB space in normal case
Does Chile need to know about all datasets, even if they're not in Chile?  Probably yes
Are we sure that we don't need multiple database servers in the Butler?
  • Main issue is that one type of dataset is created in one place; another type is created in another place; these need to be related, so a join will be required between the two Registries
  • OK that users take snapshot subsets (with transitive closures of potential relationships in the database), with the ability to manually download additional files and their database entries later
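The "snapshot subset with transitive closure" idea can be sketched as a graph traversal: start from the datasets a user asks for and follow relationships until nothing new is added. This is only an illustration; the dataset names and the `related` mapping are invented, and in practice the relationships would come from the Registry.

```python
from collections import deque
from typing import Dict, Iterable, Set


def snapshot_closure(seeds: Iterable[str],
                     related: Dict[str, Set[str]]) -> Set[str]:
    """Return the seeds plus everything reachable from them via `related`."""
    selected = set(seeds)
    queue = deque(selected)
    while queue:
        dataset = queue.popleft()
        for dep in related.get(dataset, ()):
            if dep not in selected:
                selected.add(dep)
                queue.append(dep)
    return selected


# Hypothetical relationships: raws point at calibrations, and one
# calibration points at another.
related = {
    "raw_1": {"bias_A", "flat_B"},
    "flat_B": {"bias_A"},
    "raw_2": {"bias_C"},
}
subset = snapshot_closure(["raw_1"], related)
```

Here `subset` contains `raw_1`, `bias_A`, and `flat_B`; `raw_2` and `bias_C` are left out and would be fetched later if needed.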
Is the DBB file store a filesystem or an object store? Initially a filesystem
SRP: Why can't OODS backing store be the DBB?
High service availability is for Comm Cluster
Summit systems have to copy data locally for their observing purposes
Ingestion into the DBB:
  • Would be nice to ingest at NCSA, which adds latency
  • Current Butler Schema design may also make multi-master replication difficult
Small number of Comm Cluster use cases that require greater DBB uptime
To support rest of the use cases, DBB needs to have low latency but not necessarily high uptime
  • When down, could do things like manually ingest into a local Registry
Linking the schemas of the DBB Registry and the online system is the major problem
  • But it may be possible to have multiple schema versions at the same time
Visualization system needs imgserv, which uses a Butler
  • You already know concrete identifiers for the image, either directly or via ObsTAP
Camera diagnostic cluster
  • Doesn't necessarily need DBB access
  • Files should look the same
Summit ad hoc analysis during AuxTel commissioning is a separate use case that can be on a separate system
Summit systems are using CCS image files
OODS was invented to decouple online and offline
Conditions for OODS to be simple/tiny:
  • Ad hoc processing use cases need to be met by low-latency raw data
  • Non-ad-hoc processing doesn't need a sophisticated database
Summit systems
  • Doesn't need to be observing-critical
Ad hoc use cases
  • What is the real need for uptime and latency?
Archiver registration
  • Is this pull or push? Is this to DBB or OODS or both?
Could the OODS be another "replica" of the DBB?
  • Maybe, but could be excessive
Does ingestion into the DBB imply ingestion into the Butler?
  • If DBB schema could be independent of the Butler schema, via views or something, then that would allow decoupling
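One way to picture decoupling via views, as a sketch only: clients query a stable view while the underlying table is free to change. The table and column names below are invented for illustration and do not reflect the real Butler or DBB schemas; SQLite is used just to keep the example self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical internal table; stands in for whatever the Butler schema uses.
    CREATE TABLE butler_dataset_v2 (
        dataset_id INTEGER PRIMARY KEY,
        exposure INTEGER,
        detector INTEGER,
        uri TEXT
    );
    -- Stable, DBB-facing view; only this definition changes when the table does.
    CREATE VIEW dbb_file AS
        SELECT dataset_id AS file_id, exposure, detector, uri AS path
        FROM butler_dataset_v2;
""")
conn.execute(
    "INSERT INTO butler_dataset_v2 VALUES (1, 1001, 42, 'raw/1001/42.fits')"
)

# Clients only ever name the view, never the underlying table.
rows = conn.execute(
    "SELECT file_id, path FROM dbb_file WHERE exposure = 1001"
).fetchall()
```

If the underlying table were renamed or restructured, only the view definition would need to follow; queries against `dbb_file` would stay the same.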
Need extract-transfer-and-ingest command for base-to-summit systems to go to DBB directly
  • But need that anyway to feed an OODS
Still need to deal with OODS deletion