  • To discuss DataCat

Discussion items

  • have a MySQL instance on a SLAC machine, with DataCat talking to it
  • install Tomcat locally on the db machine at SLAC (Brian/Tony need access)
  • new Python client is in the works
  • need a mechanism to load metadata about files residing at NCSA into our catalog
  • crawler can't run remotely yet (it is on the todo list)
    • run it at NCSA and transfer the results in via ssh
  • DataCat key features we like best and want to use: mix of structured and unstructured, crawler, REST API
  • use case to implement in the prototype: extract metadata from FITS headers using afw, load it into DataCat, then reproduce the FITS files using metadata from DataCat, without reading the FITS headers
  • would be useful to have a notion of hot and cold data in DataCat
  • Java packages that DataCat uses for its REST API: Jersey, Jackson
  • think more about "foreign tables"
    • foreign tables are tables defined via plugins and joined at query time; they are only touched when a query explicitly asks for them
    • foreign tables still have to be in the local db; the native features of Oracle, Postgres, or MySQL for accessing remote tables have not been explored yet (planning to)
    • unclear where the dividing line for us should be: exposures produced by pipelines, metadata in DataCat, etc.
    • will work on that over the next few weeks
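
The "foreign tables" idea above (joined only when a query explicitly asks for their columns) could be sketched roughly as follows. This is not DataCat's actual plugin mechanism, which is written in Java; it is only an illustration using Python's built-in sqlite3 with everything in one local database. The table names `datasets` and `exposure_meta`, the columns `band` and `exptime`, and the `FOREIGN_TABLES` registry are all hypothetical.

```python
import sqlite3

# Hypothetical registry: foreign-table name -> (join key, columns it provides).
# In the notes such tables would be defined by plugins; here it is hard-coded.
FOREIGN_TABLES = {
    "exposure_meta": ("dataset_id", ("band", "exptime")),
}


def build_query(columns):
    """Build a SELECT over `datasets`, joining a foreign table only when one
    of its columns is explicitly requested."""
    select, joins = [], []
    for col in columns:
        owner = next(
            (t for t, (_, cols) in FOREIGN_TABLES.items() if col in cols), None
        )
        if owner is None:
            # Column lives in the core table; no join needed.
            select.append(f"d.{col}")
            continue
        select.append(f"{owner}.{col}")
        key = FOREIGN_TABLES[owner][0]
        clause = f"LEFT JOIN {owner} ON {owner}.{key} = d.id"
        if clause not in joins:
            joins.append(clause)
    head = f"SELECT {', '.join(select)} FROM datasets d"
    return " ".join([head, *joins, "ORDER BY d.id"])


def demo():
    """Run one query that skips the foreign table and one that joins it."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE datasets (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE exposure_meta (dataset_id INTEGER, band TEXT, exptime REAL);
        INSERT INTO datasets VALUES (1, 'a.fits'), (2, 'b.fits');
        INSERT INTO exposure_meta VALUES (1, 'r', 15.0);
        """
    )
    plain = conn.execute(build_query(["name"])).fetchall()
    joined = conn.execute(build_query(["name", "band"])).fetchall()
    conn.close()
    return plain, joined
```

Calling `build_query(["name"])` produces a query with no join at all, while `build_query(["name", "band"])` adds a single `LEFT JOIN` on the hypothetical `exposure_meta` table, so datasets without a matching row still appear in the result.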