Date

Attendees

Goals

  • Discuss Data Archive, with focus on SUI related aspects

Discussion items

JDBC-issues

  • problems talking to Qserv via JDBC
  • Tatiana posted a list of internal queries that the MySQL JDBC driver issues on its own ("show variables" and "set <various things>" are the culprits)
  • short term (today/tomorrow): will try to patch Qserv to redirect these queries to the mysqld used by mysql proxy (Jacek)
  • long term - need to properly support it. Action item: file a new issue (Jacek)
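
The short-term redirect could look roughly like the sketch below. It is written in Python purely for illustration and is not Qserv code: the function names, the backend labels "mysqld" and "qserv", and the two statement prefixes are assumptions based only on the statements named above, not on the full list Tatiana posted.

```python
import re

# Assumed prefixes of statements the JDBC driver sends on its own.
# The notes only name "show variables" and "set <various things>".
_HOUSEKEEPING = re.compile(r"^\s*(show\s+variables|set\s)", re.IGNORECASE)


def is_driver_housekeeping(query: str) -> bool:
    """Return True if the query looks like driver housekeeping."""
    return _HOUSEKEEPING.match(query) is not None


def route(query: str) -> str:
    """Pick a backend label for a query (labels are illustrative only)."""
    return "mysqld" if is_driver_housekeeping(query) else "qserv"
```

For example, route("SET autocommit=1") would pick "mysqld", while an ordinary SELECT would still go to "qserv".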

DataCat

  • Catalog originally developed by SLAC for the Fermi Gamma-ray Space Telescope
  • key features: metadata (blended mix of structured and unstructured), crawlers with pluggable project-specific components, REST API, efficient searching.
  • Yesterday we had a very productive discussion with the team that built and uses it
  • relevant overview doc: https://github.com/brianv0/datacat-doc/blob/master/LSST-Datacat-overview.md
  • unless we missed it, there is no special optimization for spatial searches, will need to look into that
  • plan: evaluate for a few weeks (mostly John, with help from Brian from the Fermi team)
  • need to consider L3 use cases
  • how to handle virtual files? Register and keep information in the catalog, generate data on the fly as needed
  • format of generated FITS headers?

Naming

  • "global catalog"? NO! The worst choice
  • "metadata store". Yes, it is ok, at least for now

List-based service

  • belongs to Data Access more than SUI, Data Access knows better how to optimize
  • tentative decision: don't keep as a standalone service, push into Qserv and Image Service
  • example for Qserv: user uploads a list of (ra, decl, distance) points to a table, then writes a join with that table
  • worry: the most obvious way for a user to write this query will be very slow. There are ways to optimize it under the hood, Serge knows how. Not obvious how much complexity it will bring. Will discuss at Qserv meeting tomorrow
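
To illustrate why the obvious form of this query is slow, here is a minimal Python sketch of what a naive join amounts to: every uploaded point is compared with every catalog row. It is not Qserv code and not the optimization Serge has in mind. The tuple layouts, the function names, and the assumption that distance is given in degrees are all hypothetical.

```python
import math


def angular_sep_deg(ra1, decl1, ra2, decl2):
    """Angular separation in degrees, using the haversine formula."""
    r1, d1, r2, d2 = map(math.radians, (ra1, decl1, ra2, decl2))
    h = (math.sin((d2 - d1) / 2) ** 2
         + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def naive_match(points, catalog):
    """Compare every point with every row: len(points) * len(catalog) checks.

    points:  iterable of (ra, decl, distance), distance assumed in degrees
    catalog: iterable of (obj_id, ra, decl), a hypothetical layout
    """
    matches = []
    for ra, decl, distance in points:
        for obj_id, obj_ra, obj_decl in catalog:
            if angular_sep_deg(ra, decl, obj_ra, obj_decl) <= distance:
                matches.append((obj_id, ra, decl))
    return matches
```

The cost grows with the product of the two table sizes, which is why some spatial restriction applied internally before the distance test would matter for large catalogs.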

Security

  • every science user should have mysql credentials
  • yes, we can still do single sign-on: the web API will wrap it, and upon successful authentication (through a cookie or whatever) it will find and use the mysql credentials associated with that user
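
A minimal sketch of that lookup, assuming a cookie-based session: the in-memory dictionaries, the class name, and the function name are hypothetical stand-ins, and a real service would keep credentials in secure storage rather than in code.

```python
from typing import Dict, NamedTuple, Optional


class MysqlCredentials(NamedTuple):
    user: str
    password: str


# Hypothetical stores, for illustration only.
SESSIONS: Dict[str, str] = {"cookie-abc": "alice"}  # session cookie -> web user
CREDENTIALS: Dict[str, MysqlCredentials] = {
    "alice": MysqlCredentials("alice_db", "example-only"),
}


def credentials_for_cookie(cookie: str) -> Optional[MysqlCredentials]:
    """Map an authenticated session cookie to that user's mysql credentials."""
    web_user = SESSIONS.get(cookie)
    if web_user is None:
        return None  # not authenticated
    return CREDENTIALS.get(web_user)
```

The user only ever presents the cookie; the web API holds the mapping and opens the database connection on the user's behalf.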

Location of metadata

  • central store better than kept near the data
  • file systems are bad with searching
  • we can always use distributed key-value store if needed to scale

Future meetings

  • weekly, Mondays at 11:00 am Pacific, starting this coming Monday, Nov 24, via standard database hangout (nwb)