Date

Attendees

Discussion items

Time | Item | Who | Notes

Project news (Fritz Mueller)

Fritz Mueller:

  • DMLT: the construction schedule has been extended a bit
  • no final decision yet on the in-person format of PCW-2022

Qserv hardware, development machines at USDF:

  • Fritz Mueller made a presentation on Qserv at the USDF Arch meeting
  • A cluster of 12-15 nodes is expected to be purchased to start Qserv at USDF. The nodes will come with NVMe-based storage
    • the sizing model for the 1st year of operation is: 96 nodes, 46 TB per node
  • There is an idea of getting beefier nodes, where we would explore:
    • "blast radius"
    • parallel I/O
  • We may get a test set of NVMe-based nodes in order to:
    • experiment with higher worker-per-node density
    • experiment with not locking chunks in memory to see how it would affect the performance of the shared scan queries
  • 3 (or 4) existing master nodes may be moved to USDF to serve as the development platform for the DAX team
  • there was a proposal to experiment with different storage models at IDF
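
As a rough check of the first-year sizing model quoted above (96 nodes, 46 TB per node), the following sketch computes the aggregate capacity and a naive "blast radius" figure. It assumes decimal units, an even spread of data across nodes, and no replication, none of which is stated in the notes. The 48-node alternative is purely hypothetical and only illustrates the trade-off with beefier nodes.

```python
# Figures taken from the notes: first-year sizing model.
NODES = 96
TB_PER_NODE = 46


def total_capacity_tb(nodes: int, tb_per_node: int) -> int:
    """Aggregate raw capacity of the cluster in TB."""
    return nodes * tb_per_node


def blast_radius_fraction(nodes: int) -> float:
    """Fraction of the data affected by losing one node.

    Assumption (not from the notes): data is spread evenly
    across nodes and there are no replicas.
    """
    return 1 / nodes


total_tb = total_capacity_tb(NODES, TB_PER_NODE)
print(f"total capacity: {total_tb} TB ({total_tb / 1000:.3f} PB)")
print(f"blast radius with {NODES} nodes: {blast_radius_fraction(NODES):.2%}")

# Hypothetical alternative: half as many nodes, each twice as large.
# Same total capacity, but a single node failure affects twice the data.
print(f"blast radius with 48 nodes: {blast_radius_fraction(48):.2%}")
```

Under these assumptions the total is 4416 TB, and halving the node count doubles the share of data affected by a single node failure.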

Progress on topics from the previous meeting, Database Meeting 2022-04-20 (team)

Status of the parquet-to-csv translator:

  • Andy Hanushevsky has implemented an option to pull object identifiers from the data frame index
  • Igor Gaponenko will work with Hsin-Fang Chiang on integrating the latest version of the translator into the Argo-based ingest workflow

Extended geometry support in Qserv:

  • Fritz Mueller worked on performance studies using the geometry indexing scheme of the ObsCore table loaded directly into MySQL. The tests were based on a dataset of a few (18?) million rows prepared by Andy Salnikov. Based on the results of the study, a conclusion will be made on the feasibility of storing these data as a fully replicated table in Qserv.

npease: Reformatting Qserv code (C++):

  • Certain formatting (function calls, line wraps) doesn't look nice
  • Namespace flattening has not been tested yet. It will be done later.
  • Fritz Mueller gave the "green light" to reformat the code

Status of DP02 (team)

Igor Gaponenko will resume working on the pre-ingest stage (translating and partitioning) for the Object table's data using the new version of the pq2csv translator

Felis and schema status at this release:

A discussion on naming databases in Qserv, TAP, and Felis:

  • Igor Gaponenko: it's impossible to rename databases in Qserv after they're ingested
  • Fritz Mueller proposed to use the ticket number to name schemas in Felis and generate TAP schemas out of those
  • Colin Slater: it's complicated 
  • conclusion: the extra layer of indirection at TAP is hard to use for aliasing database names. The database needs to be ingested with its final name, such as dp02.

Progress on qserv-operator  and qserv-ingest 

Working on ingesting fully replicated tables into Qserv. Attempting to reuse one of the existing tables from the test data in the Qserv Git package.

The latest version of the qserv-operator (which adds the Replication system's Registry service) is waiting for review: DM-34379

The latest version of the qserv-ingest (ASYNC-based ingest) is waiting for review: DM-31464

Once both of the above developments are finished, the next step will be to integrate the latest improvement (DM-34085) into qserv-operator. Note that this improvement comes with a schema change.

Action items

  •