- Discuss Fritz Mueller's proposal for configuring Qserv containers.
|A proposal for configuring Qserv containers|
Ongoing tickets that are relevant in this context:
|Configuring and adding workers to Qserv cluster|
Fabrice Jammes raised the topic of updating the configuration of the Replication/Ingest system at run time. This is needed for two reasons:
Igor Gaponenko reported that there is an ongoing effort to improve the situation here. The first step is to make the worker services (specifically cmsd and the replication system's worker) self-configuring: they will learn their identities from the unique (UUID-generated) dataset identifiers stored in the corresponding Qserv worker databases. For further details and the current status of this development see:
The second step will be to change the Replication system's communication network so that workers log into a (yet to be implemented) redirector service. This will reverse the dependencies within the system and eliminate the need for explicit configuration of the workers. A preliminary plan for this development was discussed between Igor Gaponenko, Fritz Mueller and Andy Salnikov before the Winter break. This project is still at an early stage. The actual work on it will start after Fabrice Jammes finishes migrating Qserv to the lite containers and their entry points.
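The two steps described above can be illustrated with a minimal sketch. It is not the Replication system's actual design: the `worker_identity` table, the `Registry` class, the host names and the port numbers are all hypothetical, SQLite stands in for the Qserv worker database, and an in-process object stands in for the redirector service that has yet to be implemented. The sketch only shows the direction of the dependency: the worker reads its own identity and announces itself, so nothing has to list the workers in advance.

```python
import sqlite3
import uuid


def ensure_identity(conn):
    """Return the worker's UUID, generating and storing it on first use.

    The table name and layout are assumptions made for this sketch.
    """
    conn.execute("CREATE TABLE IF NOT EXISTS worker_identity (id TEXT NOT NULL)")
    row = conn.execute("SELECT id FROM worker_identity").fetchone()
    if row is None:
        worker_id = str(uuid.uuid4())
        conn.execute("INSERT INTO worker_identity (id) VALUES (?)", (worker_id,))
        conn.commit()
        return worker_id
    return row[0]


class Registry:
    """In-process stand-in for the (hypothetical) redirector service."""

    def __init__(self):
        self._workers = {}

    def register(self, worker_id, host, port):
        # Re-registration overwrites the previous address, so a restarted
        # or relocated worker needs no change to any central configuration.
        self._workers[worker_id] = (host, port)

    def lookup(self, worker_id):
        return self._workers[worker_id]


# Step 1: the worker learns its identity from its own database.
conn = sqlite3.connect(":memory:")
worker_id = ensure_identity(conn)

# Step 2: the worker logs into the registry instead of being configured.
registry = Registry()
registry.register(worker_id, "worker-0.example", 25000)
```

Because the identity lives with the data rather than in a central configuration, a worker that comes back on a different host only needs to register again under the same identifier.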
|Schema initialization and migration||The topic was only briefly mentioned in the context of the Qserv configuration discussion, as the two topics overlap. It was decided to postpone the discussion until the next meeting.|
Lockups are seen in the latest version of the branch when testing mixed query loads in the large Qserv cluster at NCSA. Two types of queries are launched simultaneously in this round of tests:
The lockup occurs within a few minutes of launching the queries. The problem is reproducible.
The direct link to the most relevant comment: https://jira.lsstcorp.org/browse/DM-31537?focusedCommentId=443619&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-443619
- John Gates will work with Igor Gaponenko (if needed) to investigate the lockups.
- Fritz Mueller will lead a discussion on initializing and upgrading Qserv schemas at the next meeting. This will be preceded by a discussion among interested members of the group in the team's Slack channel.
- Igor Gaponenko will look into migrating the configuration of the Replication/Ingest system from the database tables to a more conventional technique.
- Fabrice Jammes will finish migrating the operator-based Qserv deployment tools to the lite containers and the new configuration model.
- Unknown User (npease) will finish improving the parameter handling in the entry points as per Fritz Mueller's proposal.