Date
Attendees
Discussion items
Time | Item | Who | Notes |
---|---|---|---|
Project news |
| ||
Progress on topics discussed at the previous meeting (Database Meeting 2022-04-13) | team |
Fritz Mueller: TAP schema on the INT cluster needs to be updated. Unknown User (npease): what's the status of DM-32600? Fabrice Jammes: on the
Fabrice Jammes : on
| |
Status of the parquet-to-csv translator | team | Igor Gaponenko: poor performance of the
...which means it's impossible to utilize more than (say) 2 or 3 cores on the large-memory machine (128 GB). In reality, we might not have such "large memory" machines at GKE. To illustrate what this means for translating the Parquet files of the DP02 catalog: using 1 core, it would take about 80 hours to translate 1 TB of input data (that is, to produce 2 TB of output data): (2 TB / 14 GB) × (14 GB / 7 MB/s) = 2 TB / (7 MB/s) ≈ 285714 seconds ≈ 80 hours. My impression is that this application may need to be rewritten in C++ using this API https://arrow.apache.org/docs/cpp/parquet.html ... unless someone finds a way to dramatically speed up the Python code.
Igor Gaponenko: the second issue is that the translator can't find the values of the
There are two options to frame Parquet files:
Fritz Mueller: it would be convenient to pass the Felis file to the translator. Andy Hanushevsky: will evaluate this option. Fritz Mueller will provide a sample file to Andy Hanushevsky.
On the Parquet "index":
| |
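The single-core estimate quoted for the parquet-to-csv translator can be reproduced with a few lines of Python. This is only a back-of-the-envelope sketch: it takes the 2 TB output volume and the 7 MB/s rate from the notes, uses decimal units, and the `cores` parameter assumes ideal linear scaling, which the memory limit described above may not allow. The function name is hypothetical.

```python
# Rough estimate of the translation time from the figures in the notes.
# Assumptions: decimal units (1 TB = 10**12 bytes, 1 MB = 10**6 bytes)
# and ideal linear scaling with the number of cores.
TB = 10**12
MB = 10**6


def translation_time_seconds(output_bytes, rate_bytes_per_s, cores=1):
    """Return the time needed to produce output_bytes at the given per-core rate."""
    return output_bytes / (rate_bytes_per_s * cores)


seconds = translation_time_seconds(2 * TB, 7 * MB)
hours = seconds / 3600
print(f"1 core: {seconds:.0f} s, about {hours:.1f} h")

# With the 2 or 3 cores mentioned as the practical limit on a 128 GB machine:
for cores in (2, 3):
    h = translation_time_seconds(2 * TB, 7 * MB, cores) / 3600
    print(f"{cores} cores: about {h:.1f} h")
```

Even under the optimistic scaling assumption, the run stays in the range of tens of hours per TB of input, which is the concern raised in the discussion.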
Extended geometry support in Qserv | Context:
The bottom line:
| ||
Improving error reporting and query processing statistics at czar | team | Context: John Gates will take over this ticket | |
Using qserv-ingest to ingest the Object table of DP02 | Igor Gaponenko: trying to understand how to use Andy Hanushevsky: ATLAS has an S3-to-HTTP gateway that could be used here. Kian-Tat Lim: GCS has a service that allows reading objects via HTTP. This can be configured in the Cloud Console, in the permissions for a bucket, by selecting "Allow public access". Igor Gaponenko: the Ingest system could be improved to read S3 buckets directly. |
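As a small illustration of the GCS option mentioned above: once a bucket is publicly readable, its objects can be fetched with plain HTTP from `https://storage.googleapis.com/<bucket>/<object>`. The sketch below only builds such a URL. The bucket and object names are hypothetical placeholders, not the actual DP02 locations, and how the Ingest system would consume these URLs is not covered here.

```python
from urllib.parse import quote


def public_gcs_url(bucket, object_name):
    """Build the HTTP URL of an object in a publicly readable GCS bucket.

    Path separators in the object name are kept; other special
    characters are percent-encoded.
    """
    return f"https://storage.googleapis.com/{bucket}/{quote(object_name)}"


# Hypothetical names, for illustration only.
url = public_gcs_url("example-bucket", "dp02/Object/part 1.csv")
print(url)
```

A URL of this form can be tested with any HTTP client before it is handed to an ingest workflow.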