|Progress on topics discussed at the previous meeting Database Meeting 2022-04-13
Fritz Mueller: TAP schema on the INT cluster needs to be updated
Unknown User (npease): what's the status of DM-32600?
Fabrice Jammes : on the
Fabrice Jammes : on
|Status of the
Igor Gaponenko: poor performance of the
...which means it's impossible to utilize more than (say) 2 or 3 cores on a large-memory machine (128 GB). In reality, we might not have such large-memory machines on GKE.
This is an illustration of what it means for translating the Parquet files of the DP02 catalog, where it would take 80 hours to translate 1 TB of input data (or to produce 2 TB of output data) using 1 core:
My impression is that this application may need to be rewritten in C++ using this API: https://arrow.apache.org/docs/cpp/parquet.html ... unless someone finds a way to dramatically speed up the Python code.
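To put the quoted figures in perspective, here is a rough back-of-the-envelope sketch. It uses only the numbers stated above (80 hours per 1 TB of input on 1 core, about 2 TB of output per 1 TB of input) and assumes decimal terabytes and perfect linear scaling over cores, which is optimistic given the memory limit mentioned above. The function names are illustrative and not part of the translator.

```python
# Figures quoted in the notes: 1 core needs 80 hours per 1 TB of input,
# producing roughly 2 TB of output.
HOURS_PER_TB_PER_CORE = 80.0
OUTPUT_TO_INPUT_RATIO = 2.0
BYTES_PER_TB = 10**12  # assumption: decimal terabytes


def input_throughput_mb_per_s(cores: int = 1) -> float:
    """Input throughput in MB/s, assuming perfect linear scaling over cores."""
    seconds_per_tb = HOURS_PER_TB_PER_CORE * 3600
    return cores * BYTES_PER_TB / seconds_per_tb / 10**6


def hours_to_translate(input_tb: float, cores: int = 1) -> float:
    """Wall-clock hours to translate input_tb of input, same scaling assumption."""
    return input_tb * HOURS_PER_TB_PER_CORE / cores


if __name__ == "__main__":
    for cores in (1, 2, 3):
        print(
            f"{cores} core(s): "
            f"{input_throughput_mb_per_s(cores):.2f} MB/s in, "
            f"{input_throughput_mb_per_s(cores) * OUTPUT_TO_INPUT_RATIO:.2f} MB/s out, "
            f"{hours_to_translate(1.0, cores):.1f} h per TB of input"
        )
```

On a single core this works out to roughly 3.5 MB/s of input, and even with 3 cores a terabyte of input would still take more than a day under these assumptions.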
Igor Gaponenko : the second issue is that the translator can't find the values of the
There are two options to frame Parquet files:
Fritz Mueller: it would be convenient to pass the Felis file to the translator
On the Parquet "index":
|Extended geometry support in Qserv
The bottom line:
|Improving error reporting and query processing statistics at
John Gates will take over this ticket
qserv-ingest to ingest the
Object table of DP02
Andy Hanushevsky : ATLAS has an S3-to-HTTP gateway that could be used here
Kian-Tat Lim : GCS has a service that allows reading objects via HTTP. This can be configured in the Cloud Console, in the permissions for a bucket, by selecting "Allow public access".
Igor Gaponenko : the Ingest system could be improved to read S3 buckets directly
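As a minimal sketch of the HTTP option mentioned by Kian-Tat Lim: once a GCS bucket allows public access, its objects can be fetched over plain HTTPS at `https://storage.googleapis.com/<bucket>/<object>`. The helper below only builds such a URL. The bucket and object names are hypothetical, and whether the Ingest system would consume these URLs in this form is an assumption, not something decided at the meeting.

```python
from urllib.parse import quote

# Public HTTP endpoint for GCS objects in buckets that allow public access.
GCS_HTTP_ENDPOINT = "https://storage.googleapis.com"


def public_object_url(bucket: str, object_name: str) -> str:
    """Build the HTTPS URL of a publicly readable GCS object.

    The object name is percent-encoded, keeping "/" so that
    folder-like prefixes stay readable in the URL.
    """
    return f"{GCS_HTTP_ENDPOINT}/{bucket}/{quote(object_name, safe='/')}"


if __name__ == "__main__":
    # Hypothetical bucket and object names, for illustration only.
    print(public_object_url("example-bucket", "dp02/object/chunk 1.csv"))
```

A URL built this way could be passed to any HTTP client, which would avoid teaching the ingest workflow about S3 or GCS credentials as long as the data may be public.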