Database Meeting 2022-04-20

Date

20 Apr 2022

Attendees

Igor Gaponenko, Andy Hanushevsky, Kian-Tat Lim, Fritz Mueller, Unknown User (npease), Andy Salnikov

Discussion items

Item	Who	Notes
Project news	Fritz Mueller	(budget) The project is moving from the construction phase to operations; Qserv goals will need to be defined differently (from building to enhancing). (DM) Frossie will advocate cloud-ness to the NSF Large Facilities Office.
Progress on topics discussed at the previous meeting (Database Meeting 2022-04-13)	team	Igor Gaponenko: tables `Visit` and `CcdVisit` have been ingested into qserv-int at IDF. Fritz Mueller: the TAP schema on the INT cluster needs to be updated. Unknown User (npease): what is the status of DM-32600? Fabrice Jammes, on the `qserv-operator` upgrade: the Replication system's worker registration service has been integrated and tested; still waiting for the review of the PR for DM-34379; moving the Replication Controller to the `czar` node has solved the OOM (killing) problem. Fabrice Jammes, on `qserv_ingest`: ASYNC mode works well for ingesting the 40 TB catalog; support for ingesting regular tables has been added.
Status of the `parquet-to-csv` translator	team	Igor Gaponenko: poor performance of the `parquet-to-csv` conversion application has been observed. It was tested on a 7 GB Parquet file. The conversion produced a 14 GB CSV file. The write speed was 7 MB/s on average. The CPU utilization during the conversion was 100% (exactly 1 core, which is not a surprise considering this is a sequential application written in Python). What's actually worse is the memory consumption of the application: virtual memory 40 GB, physical memory 8 GB, which means it's impossible to utilize more than (say) 2 or 3 cores on a large-memory machine (128 GB). In reality, we might not have such ("large memory") machines at GKE. To illustrate what this means for translating the Parquet files of the DP02 catalog: 1 TB of input data expands to 2 TB of output data (14 GB / 7 GB = 2x), so translating it using 1 core would take 2 TB / 7 MB/s = 285714 seconds ≈ 80 hours. My impression is that this application may need to be rewritten in C++ using this API https://arrow.apache.org/docs/cpp/parquet.html ... unless someone finds a way to dramatically speed up the Python code. Igor Gaponenko: the second issue is that the translator can't find the values of the `objectId` column. Details on this subject can be found in the DAX team forum https://lsstc.slack.com/archives/G2JPZ3GC8/p1650422856573549 There are two options for framing Parquet files: (Fritz Mueller) data frames, which we would need to coordinate with the Pipeline team; (Kian-Tat Lim) splitting a file into a dataset (many files). Fritz Mueller: it would be convenient to pass the Felis file to the translator. Andy Hanushevsky: will evaluate this option. Fritz Mueller will provide a sample file to Andy Hanushevsky. On the Parquet "index": Andy Hanushevsky will add an option to enable it on demand. Igor Gaponenko: will work with Hsing-Fang on updating the Argo-based workflow to
Extended geometry support in Qserv	Fritz Mueller	Context: there is an ongoing discussion on this subject at https://lsstc.slack.com/archives/C8EEUGDSA/p1650411741492219 This is needed in the context of delivering image service support (OPS TAP service). We need to record regions for exposures. Qserv should support queries selecting objects in a given region, or queries on overlaps between regions. Qserv supports binary data columns (polygons) that may be used for that. We need (at least) to replicate OPS TAP (meta-)data in Qserv. In what table? How many rows are we talking about here? Fritz Mueller: the first option would be to put these into a fully-replicated table. This should work for DP02. We will look into partitioning these data for the larger-scale catalogs. The overlap radius could be a problem here as it's much larger for image metadata. That would cause a compatibility problem with other tables and would not let Qserv JOIN rows of these tables. Kian-Tat Lim: we may also investigate the possibility of using the HTM-based index as an alternative. Fritz Mueller: and finally, there is an option not to use Qserv and to merge at the TAP level(?). Andy Salnikov: was tasked to write the butler-based converter for extracting and exporting OPS TAP (meta-)data. The bottom line: the DAX team will investigate options; the solution will depend on the number of rows.
Improving error reporting and query processing statistics at `czar`	team	Context: DM-34464 (collecting ideas). John Gates will take over this ticket.
Using `qserv-ingest` to ingest the `Object` table of DP02	Igor Gaponenko Fabrice Jammes	Igor Gaponenko: trying to understand how to use `qserv-ingest` for ingesting partitioned products of the pre-ingest steps on IDF into Qserv in IDF. Andy Hanushevsky: ATLAS has an S3-to-HTTP gateway that could be used here. Kian-Tat Lim: GCS has a service that allows reading objects via HTTP. This can be configured in the Cloud Console permissions for a bucket by selecting "Allow public access". Igor Gaponenko: the Ingest system could be improved to read S3 buckets directly.
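
The translation-time estimate from the `parquet-to-csv` item can be re-derived with a few lines of Python. This is only a back-of-the-envelope sketch: the input figures (7 GB in, 14 GB out, 7 MB/s, 40 GB virtual memory, 128 GB machine) are the ones reported in the notes, while the function names, the use of decimal units, and the choice to budget memory by virtual size are assumptions made for illustration.

```python
# Figures reported in the meeting notes.
INPUT_GB = 7              # size of the tested Parquet file
OUTPUT_GB = 14            # size of the CSV file it produced
WRITE_MB_PER_S = 7        # average CSV write speed on one core
VM_GB_PER_PROCESS = 40    # virtual memory used by one translator process
MACHINE_MEMORY_GB = 128   # the "large memory" machine mentioned in the notes


def translation_hours(input_tb: float) -> float:
    """Hours needed to translate `input_tb` of Parquet input on one core."""
    expansion = OUTPUT_GB / INPUT_GB               # CSV is 2x the Parquet size
    output_mb = input_tb * expansion * 1_000_000   # decimal units: 1 TB = 10^6 MB
    return output_mb / WRITE_MB_PER_S / 3600


def max_parallel_processes() -> int:
    """Processes that fit in memory (assumption: budget by virtual size)."""
    return MACHINE_MEMORY_GB // VM_GB_PER_PROCESS


hours = translation_hours(1.0)
workers = max_parallel_processes()
print(f"1 core: {hours:.1f} h per TB of input")
print(f"{workers} processes: {hours / workers:.1f} h per TB of input")
```

Under these assumptions a single core needs roughly 80 hours per terabyte of input, and even a fully used 128 GB machine would only bring that down by about a factor of three, which is consistent with the "2 or 3 cores" remark in the notes.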

Action items
