1. The motivation

The document presents one option for the initial deployment and subsequent management of Qserv instances. The primary motivation for writing the document was the ongoing transition from deploying Qserv within predominantly stable, traditional host-based environments (including those based on pure Docker) to less deterministic ones (Kubernetes, Docker Compose, etc.) in the clouds. The transition has revealed certain aspects of the current Qserv implementation (including, to some extent, its Replication & Ingest system) which introduce a spectrum of instabilities during the initial deployment and subsequent operations of the software. Most problems arise from dependencies within the application: between services, and between services and databases.

The proposal outlined in the document is meant to present a solution that might address these issues and to answer the question of what needs to be further improved in the implementation to make Qserv suitable for the new environments.

2. The proposal

In a nutshell, the idea is to separate operations over the persistent state (databases and configurations) from the normal regime of Qserv. The persistent state should be managed the way a finite state automaton (FSA) manages its state: via a small set of well-defined states and explicit transitions between them.

Specifically, there should be 3 well-defined states in which Qserv could operate:

  1. The initial deployment
  2. The normal operation mode
  3. The maintenance mode

Qserv would be explicitly put into one of these states using the corresponding tooling. The rest of the document explains each state.

2.1. The initial deployment

This would be a special regime in which only the database services and the related management tools (schema initialization and upgrade) would be running.

Here is what's expected to be done at this stage:

  • data directories of the database instances (MySQL or MariaDB) of the Qserv czar, workers, and the Replication/Ingest system will be initialized
  • MySQL user accounts required by Qserv and the Replication/Ingest system will be set up
  • privileges will be granted to the accounts
  • table schemas will be loaded into the databases
  • (warning) unique identifiers of Qserv workers (SELECT id FROM qservw_worker.Id) will be harvested from all worker databases and registered in the Replication system's table qservReplica.worker_config (see the sketch after this list)
  • (optionally) the setup will be validated
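
For illustration, here is a minimal shell sketch of the harvesting step. The worker host names, the Replication database host, the user accounts, and the target column of worker_config are placeholders, not the actual Qserv conventions:

# Collect the unique identifier from each worker database and register it
# in the Replication system's configuration. All host names, accounts, and
# the column name below are hypothetical.
for host in worker-0 worker-1; do
    id=$(mysql --host="$host" --user=qsmaster --skip-column-names \
               --execute="SELECT id FROM qservw_worker.Id")
    mysql --host=repl-db --user=qsreplica \
          --execute="INSERT INTO qservReplica.worker_config (worker_id) VALUES ('$id')"
done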

Note that the actions performed at this stage will be done only once in the lifetime of a given Qserv instance. Any further changes (should they be needed) would have to be performed in the Qserv maintenance mode.

2.1.1. Docker-compose

There should be a special subcommand to be run once before starting up Qserv:

qserv install

(warning) The implementation of the subcommand should check whether the databases already exist and refuse to initialize the persistent state if they do.
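
A minimal sketch of such a guard, assuming the instance keeps its database files under a hypothetical /qserv/data directory:

# Refuse to (re-)initialize if any data directory is already populated.
# The directory layout below is an assumption, not the actual Qserv one.
for d in /qserv/data/czar /qserv/data/worker /qserv/data/repl; do
    if [ -n "$(ls -A "$d" 2>/dev/null)" ]; then
        echo "error: $d is not empty; refusing to initialize" >&2
        exit 1
    fi
done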

After that Qserv would be started/stopped normally as needed using:

qserv up
qserv down

A Qserv instance (including its persistent state) would be deleted using the following subcommand:

qserv delete

2.1.2. Kubernetes

It's not clear yet how this should be done. One option would be to create a Qserv instance configured with a limited set of pods (database and init containers), followed by running Kubernetes jobs to finalize the installation and verify the setup.
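
As an illustration of that option, the sequence might look like the following, where the manifest names, labels, and the job name are hypothetical:

# Bring up only the database pods, wait for them to become ready, then run
# a one-off installation job against them.
kubectl apply -f qserv-databases.yaml
kubectl wait --for=condition=Ready pod -l app=qserv-db --timeout=300s
kubectl apply -f qserv-install-job.yaml
kubectl wait --for=condition=complete job/qserv-install --timeout=600s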

2.1.3. Qserv instances at NCSA

Support for this regime is yet to be implemented if needed.

  • (warning) At the moment, we have no plans for installing new Qserv instances based on this deployment technology.

2.2. The normal operation mode

All databases are expected to be fully configured before switching Qserv into this mode. Note that the database services may not be up at that time. It should be the responsibility of the Qserv and Replication system code to check the status and availability of the services (and act accordingly) before starting normal operations. Improvements in the implementation of Qserv are needed for that.
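
The availability check could be as simple as the following shell sketch, where the host name and the account are placeholders:

# Block until the (already initialized) database service accepts connections.
until mysqladmin --host=czar-db --user=qsmaster ping >/dev/null 2>&1; do
    echo "waiting for czar-db ..." >&2
    sleep 5
done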

Note the following:

  • no schema upgrades or database setup changes are allowed at this stage
  • Qserv workers can't be removed from the setup
  • minor software upgrades (a definition of what "minor" means is yet to be refined) not affecting the persistent state would be allowed
  • no major upgrades affecting Protobuf or database schemas would be allowed here

2.2.1. Docker-compose

Qserv would be started/stopped as needed using:

qserv up
qserv down

(warning) Database containers 

2.2.2. Kubernetes

All required pods/containers will be started at this stage.

Note the following:

  • rolling updates should be allowed at this stage (provided these are the minor changes as defined earlier); see the sketch below
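
For instance, a minor container upgrade could be rolled out with the standard Kubernetes mechanism. The deployment name, container name, and image tag here are hypothetical:

# Trigger a rolling update of a single service and watch its progress.
kubectl set image deployment/qserv-czar czar=qserv/czar:NEW_TAG
kubectl rollout status deployment/qserv-czar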

2.2.3. Qserv instances at NCSA

If no services are running, then start them all using:

./start --all [--all-debug]

or restart:

./restart --all [--all-debug]

2.3. The maintenance mode

No Qserv services (except the database ones) would be running at this stage. The state is used for:

  • schema upgrades
  • major software upgrades as required by the Protobuf definition changes or database schema changes
  • reconfigurations:
    • config files of Qserv and the Replication system
    • LSST logger configurations
    • Qserv workers could be removed from or added to the Replication system's table qservReplica.config_worker as needed (see the sketch after this list)
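
For example, removing a worker might amount to a single statement like the one below, where the host, account, worker name, and column name are placeholders:

# Remove a decommissioned worker from the Replication system's configuration.
mysql --host=repl-db --user=root --password \
      --execute="DELETE FROM qservReplica.config_worker WHERE name='qserv-worker-7'"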

2.3.1. Docker-compose

Most changes to the container versions and configuration parameters/files would be done by editing the corresponding entries in the Docker-compose YAML file.
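
For example, a container version bump would typically be an edit of the image tag of the affected service. The service name and tags below are placeholders:

# Edit the image tag of the affected service in the Compose file, e.g.:
#   services:
#     czar:
#       image: qserv/czar:NEW_TAG    <- bump the tag here
vi docker-compose.yml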

(warning) Mechanisms and procedures for upgrading persistent data and table schemas would have to be developed. However, considering the limited use of this deployment model (mostly for running the integration tests), the maintenance mode may not be needed here.

2.3.2. Kubernetes

TBC...

2.3.3. Qserv instances at NCSA

This regime is turned on by the following operation:

./stop --all --all-debug

After that, the versions of Qserv, the Replication system, or the databases could be updated by:

vi config/env

Configuration files of the services would be updated by:

./config_service ...
./config_logger ...

The database data and schema upgrades require starting the database services first:

./start --czar-db --worker-db --repl-db

Note that no script-based automation exists for the data and schema upgrades at the time of writing this document. All operations would need to be run manually on the relevant databases (see the sketch after this list):

  • data upgrades due to changes in the MySQL/MariaDB version
  • table schemas upgrades
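
A hypothetical example of these manual steps, assuming the standard MySQL/MariaDB tools and a placeholder migration script:

# After a MySQL/MariaDB version change, run the server's own upgrade tool,
# then apply any table schema migrations by hand. The account and the
# script name are placeholders.
mysql_upgrade --user=root --password
mysql --user=root --password qservReplica < upgrade-schema.sql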


2.4. Why is the state/operation separation important?

The simple answer is that it eliminates the need to address, within the code, a complex matrix of possible failure modes imposed by the variety of infrastructures. The code would expect the underlying persistent state to be ready to use (rather than being programmed for the worst-case scenarios of running on "shaky ground"), so it would only need to be modified to address specific failure modes, notably the ordering problems in starting the dependent services.




