


Discussion items

(minus)Project news

(to be reported after 1:30 pm)

(tick)Next steps in DP02 team

The context:

    • a new Qserv release will be built tonight to include the latest improvements made to the Web Dashboard
    • qserv-int may get more RAM (doubled from the current 64 GB) to improve the performance of the shared scan queries (especially those involving ForcedSource)
  • Fabrice Jammes:
    • what's the status of the Operator? We need to cut a new release to be deployed in IDF tomorrow during the usual "Thursday Patch Window" (3 pm, Pacific Time)
    • locking tables in memory is not working in IDF. This badly affects the performance of the shared scans. 
  • Colin Slater, Igor Gaponenko: the status and further actions for ingesting the correct version of the truth match tables.

Discussed the status of tables in DP02:

  • Regarding the truth and truth match tables MatchesTruth and TruthSummary, Igor Gaponenko, Colin Slater, and Joanne Bogart agreed that a new column combining the values of the existing columns (id, truth_type) has to be created. The new key will be unique and will have the string type. Its values will be built by concatenating the corresponding values of the existing columns. Otherwise, the input for the current (latest) version of both tables should be used. Igor Gaponenko will begin working on it today. The tables will first be loaded into qserv-int.
  • A new version of the table ForcedSourceOnDiaObject is being made by the Pipeline. The ETA for seeing the input data is one week. Igor Gaponenko reported on performance and resource-utilization issues when translating the Parquet files into CSV. Translating one file requires up to 50 GB of RAM while using only 1 CPU core and reading/writing just a few MB/s. The memory footprint makes it difficult to use more than 4 CPU cores (that is, to translate more than 4 files in parallel) on a machine. Translating one file takes 40 minutes, and there are about 160 such files. With 4 files in parallel, this amounts to about 1600 minutes (roughly 27 hours) to translate the complete collection of files. Joanne Bogart suggested using the row groups. Colin Slater found that these files have 2 such groups each. Colin Slater proposed reducing the file size by NOT merging the patches of each tract into a single Parquet file. Given 47 patches per tract, this would increase the number of input files by a factor of 47 while reducing the memory requirements of the translator by the same factor, which would allow running more translators in parallel. The machine where the translation is done has 28 cores. The potential speedup of the translation phase after such a file size reduction would be 28 / 4 = 7 (a few hours instead of more than a day). Colin Slater will work with the Pipeline team on this subject.
  • An option of increasing the number of row groups in the Parquet files will be evaluated later and discussed with the Pipeline team. It will require additional coding.
  • Regarding the table Source, which is problematic for the Ingest system because it has 1.6 million contributions after partitioning all files of the table, Colin Slater made an interesting suggestion. The idea is to reduce the number of files during the partitioning phase, where the file "explosion" is happening. Igor Gaponenko thinks this is possible. He will evaluate the implementation of sph-partition.
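
The key construction agreed for MatchesTruth and TruthSummary can be sketched as follows. This is a minimal illustration only: the function names, the separator, and the sample rows are assumptions made for this sketch, not the format that will actually be used in the ingest.

```python
def make_truth_key(id_value, truth_type, sep="_"):
    """Build a string key from the two existing columns.

    The separator is an assumption of this sketch; the notes only say
    that the values of the two columns are concatenated.
    """
    return f"{id_value}{sep}{truth_type}"


def all_unique(keys):
    """Return True if no key occurs more than once."""
    keys = list(keys)
    return len(keys) == len(set(keys))


# Made-up rows: the same id appears with two different truth_type values.
rows = [("1001", 1), ("1001", 2), ("1002", 1)]
keys = [make_truth_key(i, t) for i, t in rows]
```

A separator is used here because plain concatenation can map different pairs to the same string, for example ("11", 2) and ("1", 12). Whether such collisions can occur depends on the actual value formats of the two columns, which are not stated in these notes.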
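
The translation-time estimate for ForcedSourceOnDiaObject can be reproduced with a few lines of arithmetic. The sketch below uses the numbers from the discussion (160 files, 40 minutes per file, 4 or 28 translators in parallel, 47 patches per tract). It assumes that the translation time of a file scales linearly with its size and that 28 translators of the smaller files fit into the RAM of the machine. Neither assumption has been verified.

```python
import math


def wall_clock_minutes(n_files, minutes_per_file, n_parallel):
    """Estimate the wall-clock time when files are translated in waves
    of n_parallel files, each wave taking minutes_per_file."""
    waves = math.ceil(n_files / n_parallel)
    return waves * minutes_per_file


# Current layout: 160 large files, limited to 4 translators by RAM.
current = wall_clock_minutes(160, 40, 4)

# Proposed layout (assumption: time per file shrinks by the same
# factor of 47 as the file size): 47 times more files, 28 translators.
split = wall_clock_minutes(160 * 47, 40 / 47, 28)
```

Under these assumptions, current evaluates to 1600 minutes and split to a little under 4 hours, consistent with the factor of about 7 quoted above.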

On the observed problem for locking files in memory in the Kubernetes environment:

  • the original observation was made by Igor Gaponenko in IDF a few months ago.
  • Fabrice Jammes investigated this. It turns out that locking requires the following conditions to be met and actions to be taken by Qserv:
    1. Qserv pod has to request two security capabilities (Fabrice Jammes will provide further details on this subject). (warning) This is already done by qserv-operator.
    2. The pod should request the desired limit by invoking the shell command ulimit.
    3. Qserv code has to call system functions mmap  and mlock  to read and lock the desired table files in memory. (warning) Qserv already does this.
  • A problem discovered by Fabrice Jammes is that we lost condition 2 after migrating to the "lite" containers. Further investigation has shown that this is because Qserv containers are now run (in Kubernetes) under the unprivileged user qserv. This user is not allowed to use ulimit to raise the memory locking limit beyond the default set by the OS. The operation needs to be done by the user root. An alternative option would be to configure the desired hard limit at the level of the Kubernetes VM or host, which Fabrice Jammes considers inflexible.
  • As a short-term solution to the problem, Fabrice Jammes will build a custom (branch-based) version of the Operator to launch pods under the user root, change the locking limit, and call the entry point command after that.
  • Fabrice Jammes will further discuss this issue with the Birwood/Google support team (Dan Speck) and Fritz Mueller.
  • Colin Slater has expressed concern regarding deploying this version to qserv-prod before fully testing it. This was noted. The test version will be deployed in qserv-int.
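
The memory locking limit discussed above can be inspected from inside a container with the Python standard library. The sketch below is only an illustration of the limit that ulimit controls (RLIMIT_MEMLOCK). It is not code from Qserv or from qserv-operator, and the function names are hypothetical.

```python
import resource


def memlock_limits():
    """Return the (soft, hard) RLIMIT_MEMLOCK values of this process, in bytes."""
    return resource.getrlimit(resource.RLIMIT_MEMLOCK)


def fits_under_limit(n_bytes, limit):
    """Check whether n_bytes could be locked under the given limit."""
    return limit == resource.RLIM_INFINITY or n_bytes <= limit


def raise_soft_to_hard():
    """Raise the soft limit up to the hard limit.

    An unprivileged user may do this. Raising the hard limit itself is
    what normally requires a privileged user such as root.
    """
    _, hard = memlock_limits()
    resource.setrlimit(resource.RLIMIT_MEMLOCK, (hard, hard))
    return memlock_limits()
```

Calling memlock_limits() in a pod running as the user qserv and comparing the result with the size of the table files to be locked would show whether the default limit is the bottleneck.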

Other notes:

  • Colin Slater may need to add an extra column to one of the existing DP02 tables in Qserv. Igor Gaponenko will use the existing REST API of the Replication system for that.


Progress on the new version of qserv-ingest and running the integration test within Kubernetes.

  • reusing integration data within qserv-ingest. This step also includes partitioning the test case data. It has been tested and works for Case Test 1.
  • added support for ingesting the regular (fully replicated) tables. The integration test passes for that. Tested in DP01. Need a larger table from DP02 for further testing.

Status of ingesting DP02 into Qserv at IN2P3:

  • The data of the partitioned (and regular) tables were copied to IN2P3. The data were reorganized to be compatible with the workflow's metadata definition model.
  • About to test the new version of qserv-ingest in IDF using qserv-dev. After that, the catalog will be ingested at IN2P3.

Action items