Fritz Mueller worked with the RSP team on the performance and optimization of Qserv:
generating a lot of load (hundreds of parallel queries) on qserv-int
using Object and Source tables only
considering switching the tests to qserv-dev, where the next/final version of the catalog is going to be ingested and retained for the purpose of the performance testing
was wondering about the possibility of using Kraken (as a whole or in part)
Fabrice Jammes: we need to improve qserv-ingest to allow incremental ingest of tables (one table after another, in separate runs of the workflow), as well as fix-ups to the catalog by replacing (deleting and re-ingesting) existing tables
qserv-dev has 10 workers. Is this okay for the load testing?
Fritz Mueller: that's fine, since it would allow testing the scalability of Qserv
Treating the table MatchesTruth as the RefMatch table:
Colin Slater: the test version of one tract (split into 49 patch Parquet files) is available for testing. The correct YAML schema is already in the main branch. Links to the Butler collection were posted in the group Slack channel.
Igor Gaponenko will work ASAP on ingesting this slice into the separate table ForcedSourceOnDiaObject_v1 and report results to unblock processing of the rest of the collection
Andy Salnikov: on the status of extracting data for the table ObsCore (in the long run, not specifically for DP02):
has investigated various options, weighing the performance and complexity of the solutions
is working on a tech note to summarize them
a "quick-and-dirty" solution is available, and it's relatively fast, taking only 1.5 hours to extract the required data for Qserv. However, it's not correct.
the correct approach for extracting the data has performance problems: it runs "forever" (takes an amount of time beyond any practical limit) compared with the quick-and-dirty one
because of that, there is no chance of implementing the extractor as a regular view in PostgreSQL
a potential option is a materialized view (with the drawback of having to refresh the view each time the underlying tables are updated; other issues with this idea may surface as well)
another option is to generate the table at ingest time, which would require extra steps to integrate this process with Butler; this has its own implications that need further study
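The materialized-view trade-off above can be illustrated with a toy sketch (this is not the actual ObsCore schema; the table and column names here are invented for illustration). A precomputed summary table answers queries quickly, but it goes stale whenever the underlying table changes and must be explicitly rebuilt:

```python
import sqlite3

# Toy base table standing in for the underlying exposure metadata.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE exposures (id INTEGER PRIMARY KEY, band TEXT)")
conn.executemany("INSERT INTO exposures (band) VALUES (?)",
                 [("g",), ("r",), ("r",)])

def refresh_summary(conn):
    """Rebuild the 'materialized' summary table from the base table."""
    conn.execute("DROP TABLE IF EXISTS obscore_summary")
    conn.execute("""CREATE TABLE obscore_summary AS
                    SELECT band, COUNT(*) AS n FROM exposures GROUP BY band""")

refresh_summary(conn)
counts = dict(conn.execute("SELECT band, n FROM obscore_summary"))

# An update to the base table leaves the summary stale until refreshed:
conn.execute("INSERT INTO exposures (band) VALUES ('g')")
stale = dict(conn.execute("SELECT band, n FROM obscore_summary"))
refresh_summary(conn)
fresh = dict(conn.execute("SELECT band, n FROM obscore_summary"))
print(counts, stale, fresh)
```

The refresh cost is the crux: every change to the base tables forces a rebuild (or an incremental-refresh mechanism), which is exactly the drawback noted for the materialized-view option.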
Fritz Mueller: the last remaining capability for DP02 will be to add ~12 magnitude columns to the table Object:
first, use the R-I system's REST API to run ALTER TABLE ...; this capability is already available
in the next step, populate the new columns by running UPDATE ..., using a mechanism that may need to be devised (via the R-I system's REST API or manually)
Igor Gaponenko will work on this after ingesting the new version of the table ForcedSourceOnDiaObject and after Fritz Mueller completes the ongoing investigation of the RefMatch option for the table TruthSummary.
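For context on the UPDATE step: the new magnitude values would presumably be derived from the catalog's existing flux columns. A minimal sketch, assuming the fluxes are calibrated in nanojanskys and the target is an AB magnitude (an assumption for illustration, not the confirmed DP02 recipe):

```python
import math

# AB magnitude zero point for fluxes in nanojanskys: 3631 Jy = 3631e9 nJy
# corresponds to AB magnitude 0, so m = -2.5 * log10(flux_nJy) + 31.4 (approx).
AB_ZP_NJY = 2.5 * math.log10(3631e9)

def nanojansky_to_ab_mag(flux_njy: float) -> float:
    """Convert a calibrated flux in nanojanskys to an AB magnitude."""
    return -2.5 * math.log10(flux_njy) + AB_ZP_NJY

# A source at the zero-point flux has magnitude 0; a source 100x fainter
# is 5 magnitudes fainter.
print(nanojansky_to_ab_mag(3631e9), nanojansky_to_ab_mag(3631e7))
```

Whether the computation runs inside the UPDATE statement itself or in an external script feeding values through the R-I REST API is exactly the open question noted above.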
Versioning of the Replication/Ingest system's REST API
Discussed obstacles in testing new features across the triplet of packages qserv, qserv-operator, and qserv-ingest:
it's a three-way problem that (at present) needs to be addressed sequentially, one package at a time
there is a recipe for tweaking the CI to test each package; however, it's complicated
the right solution would be to merge qserv-operator and qserv-ingest into qserv
Fritz Mueller will start prototyping this idea next week by merging/cloning qserv-operator into qserv; the updated CI will need to be implemented for that. After that, a similar effort will be made for qserv-ingest.