Date

Attendees

Notes from the previous meeting

Discussion items

Discussed | Topic | Notes
(tick) Project news

Fritz Mueller:

  • No news from DMLT
  • LSST All hands December 15th
  • January 26: the office move day

Colin Slater:

  • Ongoing discussion on DP1 in light of the recent replanning for ComCam.
  • Some are concerned about the risks of not basing DP1 on LSST data.
(tick) A bug in Qserv czar when handling failed chunk queries

Any progress from the previous meeting?

Fritz Mueller: a new Qserv release, 2022.12.1-rc2, has been built. It supports extended logging that might help with investigating the problem.

Igor Gaponenko: this version has been installed at SLAC (Qserv instance slac6).

John Gates: a problem in query processing for sub-chunks may have been found; it could be a race condition. The fix is not ready yet. The current version of the worker-side monitoring may be accessing stale query statistics.

(tick) qserv-operator and qserv-ingest

Any news? Any progress for the proposed Parquet-to-CSV translator?

Fritz Mueller:

  • built both and ran into problems with the latest REST API version 17 of the R-I system.
  • this broke CI.
  • a solution would be to move qserv-ingest into the Git package qserv. In that case, any incompatibilities between the REST API and the ingest workflow would be detected automatically by the CI once changes are pushed to GitHub.

Fabrice Jammes: will look at improving the CI to address an issue with dependencies between the images of qserv, qserv-operator and qserv-ingest.

Fabrice Jammes on the proposed Parquet-to-CSV translator:

  • There will be a meeting on this subject with Sabine tomorrow.

Fritz Mueller: we may need an external "tools" container for the translator.

Then there was a discussion on the streaming translate-partition-ingest model. Fabio (and others) have concerns about using intermediate storage (object store, etc.) to hold translated and/or partitioned products for the duration of the ingest campaign.

TBC...

(minus) Horizontal table partitioning in MariaDB vs MySQL

Context:

  • The current design/implementation of the Replication/Ingest system relies heavily on the horizontal partitioning capabilities of MariaDB as the key technology for addressing the performance and robustness requirements of the LSST-scale catalog ingests.
  • It turns out that support for this capability is not at the same level in MariaDB and MySQL. Specifically:
    • an option for ingesting data into the specified partition using LOAD DATA INFILE... is not available in MariaDB
    • MySQL documentation on the latest stable version 8 says that partitions are not (warning) supported for the MyISAM engine. Details in: https://dev.mysql.com/doc/refman/8.0/en/partitioning.html
  • Partitioning has proven to be a great technology for implementing super-transactions (a simplified version of "distributed transactions").
  • Unfortunately, while using MariaDB, we're not benefiting from the partitions in areas where we're severely I/O constrained. These include: building the director indexes at czar, and ingesting intermediate results into the result tables at czar when processing large result queries. The performance of the table ingest operations at workers could also be improved if MariaDB had the same capability as MySQL with regard to the partitions.
  • Altogether, this raises concerns regarding our plans for improving the performance of the Replication/Ingest system in a number of areas, especially in the long run.
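
For reference, the partition-targeted load mentioned above could look roughly like the sketch below. It only builds the statement text using the PARTITION clause that MySQL documents for LOAD DATA; nothing is executed. The helper name, file path, table name and partition name are hypothetical and are not taken from the Qserv code.

```python
def load_into_partition_sql(csv_path: str, table: str, partition: str) -> str:
    """Build a MySQL LOAD DATA statement that targets one named partition.

    Hypothetical helper for illustration only; it is not part of Qserv.
    """
    # Identifiers are interpolated into the statement, so accept only
    # plain alphanumeric/underscore names.
    for name in (table, partition):
        if not name or not name.replace("_", "").isalnum():
            raise ValueError(f"unsafe identifier: {name!r}")
    # Escape backslashes and single quotes in the quoted file name.
    escaped_path = csv_path.replace("\\", "\\\\").replace("'", "\\'")
    return (
        f"LOAD DATA INFILE '{escaped_path}' "
        f"INTO TABLE {table} PARTITION ({partition}) "
        "FIELDS TERMINATED BY ','"
    )


if __name__ == "__main__":
    # Assumed example values: one CSV file loaded into partition p12.
    print(load_into_partition_sql("/tmp/chunk_1.csv", "Object", "p12"))
```

In MySQL, the statement fails if any row in the file does not belong to the named partition, so the input would need to be split per partition beforehand.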

Igor Gaponenko: one possibility that might improve the situation at czar would be switching from MariaDB to MySQL there while retaining MariaDB at workers. This should help improve the performance of the director index builder and of the future version of the result builder (for user queries). Switching to MySQL at workers may be problematic, as we would need to reingest 50 TB of data into the MySQL-based database.

Action items