Date

Attendees

Notes from the previous meeting

Discussion items

Discussed | Item | Who | Notes
(tick) Project news

PCW 2022?

SLAC performance review:

(tick) Status of DP02 team

Fritz Mueller:

  • DP02 is going quite well. No user complaints so far about the missing truth tables

Fritz Mueller: status of the RefMatch table MatchesTruth?

  • the special partitioning tool sph-partition-matches was improved
  • the partitioned version of the table MatchesTruth was loaded into qserv-int
  • CSS was manually fixed
  • the RelationGraph looks fine, the rest of the code looks fine too
  • however, the query evaluation for the table is still not working
  • this requires further debugging using gdb 
  • Igor Gaponenko will ingest these tables into the small cluster at NCSA to help Fritz Mueller's effort
  • Andy Salnikov has proposed adding an integration test for the RelationGraph
  • Igor Gaponenko should add support for the RefMatch tables to the Replication/Ingest system
  • Fritz Mueller will extend the integration test to support RefMatch tables

Colin Slater: status of the input data of the ForcedSourceOnDiaObject table?

  • Colin Slater is about to run the final fixup stage
  • the data will be ready within 1 day
  • the number of Parquet files will be greatly reduced (no splitting of tracts into patches), which should help a lot with speeding up the Parquet-to-CSV translation

Igor Gaponenko: further improvements to the Replication/Ingest system based on the DP02 experience

Colin Slater: need to improve the monitoring of the Qserv usage in IDF

  • Fritz Mueller mentioned Fabrice Jammes's idea to use Prometheus monitoring by pulling metrics from Qserv
  • Colin Slater proposed to pull metrics from Qserv and ship them to Google for monitoring, aggregation, and visualization
  • Fritz Mueller: right now we are not logging anything from Qserv since it's too expensive, so we don't even have a searchable log
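The pull-based idea above can be illustrated with a minimal sketch of what a scrape endpoint might return. It only renders gauges in the Prometheus text exposition format; the metric names and the render_metrics helper are hypothetical and do not reflect any existing Qserv exporter.

```python
def render_metrics(metrics):
    """Render gauges in the Prometheus text exposition format.

    `metrics` maps a metric name to a (help_text, value) pair.
    All metric names used with this helper are hypothetical.
    """
    lines = []
    for name, (help_text, value) in sorted(metrics.items()):
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical sample values; a real exporter would read them from Qserv.
    print(render_metrics({
        "qserv_queries_in_flight": ("Queries currently executing", 3),
        "qserv_workers_online": ("Workers responding to the czar", 30),
    }), end="")
```

A Prometheus server (or a collector forwarding to Google's monitoring stack) would scrape text like this over HTTP at a fixed interval, which avoids shipping full logs.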
(tick) Development infrastructure for Qserv

Igor Gaponenko: need a replacement for the Qserv development platform that we're losing at NCSA in 1 month (the August 15th deadline)

  • still no progress on the temporary instance based on the loaner hardware at SLAC
  • IT is quite busy working on the USDF infrastructure

Fritz Mueller will work with SLAC IT and Richard to accelerate this. Also, in September we are supposed to have the 4 former master[1-4] machines as the temporary development platform. In the meantime, the IDF development cluster (qserv-dev?) is the only option.

(tick)

NOTE: the topic will continue to be discussed next week after Fabrice Jammes gets back from vacation.

qserv-ingest

Context:

  • Igor Gaponenko: extending the versioning mechanism in the REST API of the Replication/Ingest system and its implementation. This requires making changes to the JSON config files (to store the version numbers) and the ingest workflows (including qserv-ingest). There is a PR on the Git package qserv-ingest in the JIRA ticket mentioned below.
  • DM-35456
  • The CI is failing on the PR due to a Repl service version incompatibility. Two PRs exist in the scope of the same JIRA ticket. Both migrated to version 9 simultaneously.
  • Do we need a common Python API for the Replication/Ingest system? We presently have 2 separate implementations: one in the Qserv container's CLI (source path src/admin/python), and the other in the Git package qserv-ingest (source path rootfs/python). This looks like duplicated effort.
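As an illustration of the kind of check discussed above, here is a minimal sketch of a client comparing the version stored in its JSON config with the version reported by the service. The config key repl_api_version, the function name, and the exception class are all hypothetical; the actual Replication/Ingest REST API and config layout may differ.

```python
import json


class VersionMismatch(RuntimeError):
    """Raised when the workflow and the service disagree on the API version."""


def check_api_version(config_text: str, reported_version: int) -> int:
    """Return the expected version, or raise if the service reports another one.

    `config_text` is the JSON config of the workflow; the key
    "repl_api_version" is a hypothetical name used for this sketch.
    `reported_version` is whatever the service returned for its own version.
    """
    config = json.loads(config_text)
    expected = int(config["repl_api_version"])
    if expected != reported_version:
        raise VersionMismatch(
            f"workflow expects API version {expected}, "
            f"service reports {reported_version}"
        )
    return expected
```

Failing fast on a mismatch makes an incompatibility like the one seen in CI visible before any ingest request is sent. A single shared Python client would also keep this check in one place instead of two.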

Fritz Mueller proposed to:

  • give high priority to moving qserv-ingest into qserv
  • (after that) work on the common Python API to the R-I system, to be shared by the integration test and the workflow

On the Repl API versioning:

  • Fritz Mueller will prepare a write-up in Confluence to propose a solution
  • the topic will be discussed next week




Action items

  •