Next steps in DP02

The context:
- A new Qserv release will be built tonight to include the latest improvements made to the Web Dashboard.
- qserv-int may get more RAM (doubled from the current 64 GB) to improve the performance of the shared scan queries (especially those involving ForcedSource).
- Fabrice Jammes:
- what's the status of the Operator? We need to cut a new release to be deployed in IDF tomorrow during the usual "Thursday Patch Window" (3 pm, Pacific Time)
- locking tables in memory is not working in IDF. This badly affects the performance of the shared scans.
- Colin Slater, Igor Gaponenko: the status and further actions for ingesting the correct version of the truth match tables.
Discussed status of tables in DP02:
- Regarding the truth and truth match tables MatchesTruth and TruthSummary: Igor Gaponenko, Colin Slater, and Joanne Bogart agreed that a new column combining the values of the existing (id, truth_type) columns would have to be created. The new key will be unique and of the string type; its values will be built by concatenating the corresponding values of the existing columns. Otherwise, the input for the current (latest) version of both tables should be used. Igor Gaponenko will begin working on this today. The tables will first be loaded into qserv-int.
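The agreed key construction can be sketched in a few lines. The sample rows and the "_" separator below are illustrative assumptions, not the agreed convention:

```python
# Hypothetical sample rows mirroring the (id, truth_type) pairs discussed above.
rows = [
    {"id": 1042, "truth_type": 1},
    {"id": 1042, "truth_type": 2},
    {"id": 2077, "truth_type": 1},
]

# Build the new string-typed key by concatenating the two existing columns.
# The "_" separator is an assumption; any convention that keeps the two
# values distinguishable would do.
for row in rows:
    row["match_key"] = f"{row['id']}_{row['truth_type']}"

keys = [row["match_key"] for row in rows]
assert len(keys) == len(set(keys))  # the combined key is unique per row
```

The point of the string type is that no pair of distinct (id, truth_type) tuples can collide once both parts appear in the key.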
- A new version of the table ForcedSourceOnDiaObject is being made by the Pipeline; the ETA for seeing the input data is one week. Igor Gaponenko reported on performance and resource-utilization issues when translating the Parquet files into CSV. Each translation (of one file) requires up to 50 GB of RAM while utilizing a single CPU core and reading/writing just a few MB/s, which makes it difficult to use more than 4 CPU cores of a machine (translating 4 files in parallel). The translation time of one file is 40 minutes, and there are about 160 such files. With 4 translations in parallel, this results in about 1600 minutes (roughly 1.5 days) to translate the complete collection of files. Joanne Bogart suggested using the Parquet row groups; Colin Slater found that these files have only 2 such groups each. Colin Slater proposed reducing the file size by NOT merging the patches of each tract into a single Parquet file. Given 47 patches per tract, this will increase the number of input files by a factor of 47 while reducing the memory requirements of the translator by the same factor, allowing more translators to run in parallel. The machine where the translation is done has 28 cores, so the potential speedup of the translation phase after such a file-size reduction would be 28 / 4 ~= 7 (a few hours instead of 1.5 days). Colin Slater will work with the Pipeline team on this subject.
- An option of increasing the number of row groups in the Parquet files will be evaluated later and discussed with the Pipeline team; it will require additional coding.
- Regarding the Source table, which is problematic for the Ingest system because it has 1.6 million file contributions after partitioning all files of the table: Colin Slater made an interesting suggestion. The idea is to reduce the number of files during the partitioning phase, where the file "explosion" happens. Igor Gaponenko thinks this is possible; he will evaluate the implementation of sph-partition.
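The translation-time arithmetic for ForcedSourceOnDiaObject can be checked with a quick back-of-the-envelope calculation. The figures are those quoted in the discussion; the 4-way parallelism reflects the 50 GB-per-process RAM limit:

```python
import math

# Figures quoted in the discussion (all approximate).
n_files = 160          # Parquet files in the collection
minutes_per_file = 40  # observed translation time of one file
cores = 28             # cores on the translation machine
patches_per_tract = 47

def wall_clock_minutes(n_files, minutes_per_file, parallelism):
    """Wall-clock time when `parallelism` files are translated at once."""
    return math.ceil(n_files / parallelism) * minutes_per_file

# Today: RAM (up to 50 GB per translator) limits us to ~4 parallel processes.
now = wall_clock_minutes(n_files, minutes_per_file, 4)  # 1600 minutes

# After splitting each tract into per-patch files: 47x more files, each
# roughly 47x faster to translate, and small enough to use all 28 cores.
later = wall_clock_minutes(n_files * patches_per_tract,
                           minutes_per_file / patches_per_tract, cores)

print(f"now: {now} min, after split: {later:.0f} min "
      f"(speedup ~{now / later:.0f}x)")
```

This reproduces both quoted numbers: about 1600 minutes today, and a roughly 7x speedup (a few hours) once per-patch files let all 28 cores run.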
On the observed problem of locking files in memory in the Kubernetes environment:
- The original observation was made by Igor Gaponenko in IDF a few months ago.
- Fabrice Jammes investigated this; it turns out that locking requires the following conditions to be met and actions to be taken by Qserv:
  - The Qserv pod has to request two security capabilities (Fabrice Jammes will provide further details on this subject). This is already done by qserv-operator.
  - The pod should request the desired limit by invoking the shell command ulimit.
  - The Qserv code has to call the system functions mmap and mlock to read and lock the desired table files in memory. Qserv already does this.
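The limit/mmap interplay can be probed with a short script. This is a sketch only: the Python standard library exposes mmap but not mlock (a real check would reach mlock via ctypes and libc, and Qserv makes these calls from its own code), and it assumes a Linux host where RLIMIT_MEMLOCK is defined:

```python
import mmap
import resource
import tempfile

# The locking limit this process inherited; in an unprivileged container this
# is typically far smaller than a table file, which is the failure mode
# discussed above.
soft, hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
print(f"RLIMIT_MEMLOCK soft={soft} hard={hard}")

# Sketch of the mapping step. mlock() itself is not in the stdlib; it would
# be called right after mmap() to pin the mapped pages in RAM.
with tempfile.TemporaryFile() as f:
    f.write(b"\0" * 4096)
    f.flush()
    mapping = mmap.mmap(f.fileno(), 4096, prot=mmap.PROT_READ)
    head = mapping[:16]  # the mapped file reads like a byte buffer
    mapping.close()
```

Comparing the reported soft limit against the size of the table files is a quick way to tell whether a subsequent mlock() would be refused.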
- A problem discovered by Fabrice Jammes is that memory locking was lost after migrating to the "lite" containers. Further investigation has shown that this is because the Qserv containers are now run (in Kubernetes) under the unprivileged user qserv. This user is not allowed to raise the memory-locking limit beyond the OS default set via ulimit; the operation needs to be done by the user root. An alternative option would be to configure the desired hard limit at the level of the Kubernetes VM or host, which Fabrice Jammes considers inflexible.
- As a short-term solution to the problem, Fabrice Jammes will build a custom (branch-based) version of the Operator that launches pods under the user root, changes the locking limit, and then invokes the entry point command. (Jira: DM-35376)
- Fabrice Jammes will further discuss this issue with the Birwood/Google support team (Dan Speck) and Fritz Mueller.
- Colin Slater has expressed concern about deploying this version on qserv-prod before it is fully tested. This was noted; the test version will be deployed in qserv-int.
Other notes:
- Colin Slater may need to add an extra column to one of the existing DP02 tables in Qserv.
- Igor Gaponenko will use the existing REST API of the Replication system for that.