1. Motivation

The current API and the implementation (which is driven by the API) don't appear to be scalable. Even after some earlier refactoring of the service, adding new parameters still requires a substantial amount of work in the following classes (files):

Configuration.h       (for both general parameters, and parameters of workers, databases, etc.)
Configuration.cc
ConfigurationTypes.h  (for general parameters only)
ConfigurationTypes.cc (for general parameters only)
ConfigurationMySQL.h
ConfigurationMySQL.cc
ConfigurationStore.h
ConfigurationStore.cc
ConfigurationFile.h
ConfigurationFile.cc
ConfigApp.h
ConfigApp.cc
testConfiguration.cc
sql/replication.sql
sql/replication_config_ncsa.sql
HttpConfigModule.h
HttpConfigModule.cc

Besides, some of these files (notoriously, ConfigurationMySQL.cc) have quite a bit of code duplication.

All of this is a direct result of a design decision to implement a type-safe Configuration API that has the following benefits:

avoiding text-based parameter lookup
avoiding run-time type conversion

In this model, all "missing" parameters are caught by the compiler long before a missing (or wrongly typed) parameter could cause problems in production.

2. Possible solution

Unfortunately, this problem doesn't seem to have a simple solution that wouldn't require losing some (or all) benefits of the current API stated above. However, there is still room to reduce/eliminate code duplication and to refactor the code to minimize the amount of code per parameter.

3. Redesigned configuration service

The new design and implementation have been made in the context of the JIRA ticket https://jira.lsstcorp.org/browse/DM-28860.

This section explains the basic ideas behind the improved API, implementation, and database management procedures.

3.1. The general design

The following diagram outlines the components of the service and its clients.

[Figure: Replication configuration service]

As shown in the diagram, the centerpiece of the service is the class Configuration (along with two supplementary classes, ConfigurationSchema and ConfigurationExceptions). The transient state of the configuration is implemented as a JSON object within the class Configuration. The current schema definition (along with the parameter types, restrictions, default values, and documentation) is stored within the class ConfigurationSchema. The class ConfigurationSchema also holds the definition of the persistent (MySQL) schema, and it provides operations over the persistent database, such as database creation, loading the initial schema and the initial parameters, schema migration, and some other operations that are of no direct interest to the regular clients of the configuration service's API. The subsections below explain the roles and interfaces of each of these classes in more detail.

3.2. JSON as the transient state

While choosing a dynamic data structure to represent the state of a C++ class may seem questionable on the surface, it presents multiple benefits for the application. The benefits far outweigh the added complexity of working with the built-in type safety provided by the nlohmann JSON library. In practice, the only disadvantage of the library is the extra step a developer has to take to tell the library the specific data type when fetching the stored value of an item.

One may also argue that the performance of the library may be another potential drawback of the design. However, this is not an issue for this particular application due to the infrequent traffic to the configuration service.

Everything else is a benefit, namely:

The built-in type safety. Compared with the traditional approaches used for configuration purposes in the rest of the Qserv codebase, the JSON library is type-aware. It knows the actual type of each node in a JSON structure. This allows automatic type enforcement when fetching data from the JSON nodes, and it also allows enforcing the type when updating data in existing nodes. The type of a value stored in a node is set when the node is created. Moreover, the library provides type introspection, which closely resembles what Python provides in its dynamic data type system.
The hierarchical organization of the JSON structure. Here JSON looks and (in many ways) behaves exactly like the Python data structures made of nested dictionaries, collections, primitive data types, etc. This opens flexibility of grouping parameters in the JSON structure in ways that closely resemble the logical hierarchies of parameters within the applications. Moreover, the library provides convenient (STL-like) techniques for iterating over and traversing the nodes of the JSON structure. Along with the introspection mechanism mentioned in the previous item, this forms a solid foundation for building collections of parameters that can't be matched by the primitive key-value techniques.
The built-in serialization/deserialization. The library provides methods for reliably creating a JSON structure from a string representing a serialized JSON structure. It also supports the opposite operation: dumping an existing JSON structure into a string representing the structure in the serialized form. The serialized form could be returned to a client of some REST service, stored in a file, sent over a network protocol to another user, printed onto the standard output, etc. This feature is very important for the configuration service because it needs to transfer its full or partial state into other forms: the persistent store, REST clients, and potentially from the Master Replication Controller to the Replication workers, should the "over-the-wire" configuration model be implemented in the system. The latter is presently under consideration.
Very low maintenance burden. For example, the JSON structure representing the transient schema is very easy to maintain as it's observable. A reader can clearly see the hierarchy of entities in the schema, and can easily add or remove nodes and modify parameter descriptions and attributes. Most of the code in the configuration service would automatically pick up changes in the parameter definitions. The only exception would be the integration and unit tests. But that's quite normal, as those should be able to see more than regular users of the API.
A possibility of implementing a directory service for the parameters. The class Configuration provides a special method returning a hierarchy of parameter names grouped by their categories. This allows the service management code to automate operations over the parameters without explicitly mentioning each parameter in the code, as was the case in the older version of the configuration code. This alone allowed a dramatic simplification of the code in those management tools.

One may see a few uses of JSON as data members in classes Configuration and ConfigurationSchema. For example, the latter class has a static data member defining the transient schema as a JSON object. This structure has a cohesive definition of the parameters. Here is an excerpt from that schema for a few select groups of parameters:

json const ConfigurationSchema::_schemaJson = json::object({
    {"meta", {
        {"version", {
            {"description",
                "The current version of the configuration. Must be greater than 0."},
            {"read-only", 1},
            {"default", ConfigurationSchema::version}
        }}
    }},
    {"common", {
        {"request_buf_size_bytes", {
            {"description",
                "The default buffer size for network communications. Must be greater than 0."},
            {"default", 131072}
        }},
        {"request_retry_interval_sec", {
            {"description",
                "The default retry timeout for network communications. Must be greater than 0."},
            {"default", 1}
        }}
    }},
    // ... the remaining categories are omitted in this excerpt ...
});

Note how this is done. The schema defines:

A hierarchy of the parameters grouped into logical categories.
Human-readable descriptions of the parameters defined next to the corresponding parameter. The descriptions are used to document parameters in the command-line tool qserv-replica-config, which is designed to interact with the configuration database without dealing with the low-level details of the MySQL schema, etc. The very same description strings are used for viewing parameter definitions in the Qserv Web Dashboard application. They could also be used for diagnostic reporting in the log streams if needed.
The default value of the parameter. The value also defines the type of the parameter. The very same type will be enforced in all operations over this parameter in the configuration service's API and during the persistent-transient transformations.
Optional attributes of the parameters, such as read-only and security-context.

This is an example of how the Web Dashboard prints the documentation for each parameter:

[Figure: parameter documentation as shown in the Web Dashboard]

3.3. The API

The API has two roles:

serving the configuration information to the regular clients
managing the state (including both the persistent and the transient ones) of the configuration

The primary (and for most clients, the only) interface to the configuration for the regular clients is represented by the class Configuration. All public methods are synchronized (guarded by a lock on an internal mutex) to ensure that no race conditions occur while multiple clients from different threads are simultaneously reading and writing configuration parameters, or managing the global state of the configuration. Parameters served via the class's interface are of two kinds:

single value parameters (primitive data types)
object-type parameters of classes WorkerInfo, DatabaseFamilyInfo, or DatabaseInfo.

The single value parameters are located in a namespace by:

the name of the parameter's category
the name of the parameter within its category

Values of the parameters are retrieved or changed using the template methods of the class:

class Configuration {
public:
    template <typename T>
    T get(std::string const& category, std::string const& param) const;

    template <typename T>
    void set(std::string const& category, std::string const& param, T const& val);
};

The types of the parameter's values are fixed within the configuration, and they're enforced through the API methods and the internal type checking. This is possible because the nlohmann JSON library is type-aware. Any inconsistencies between the actual and attempted (by clients) types are reported via the C++ exception mechanism.
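
The type enforcement described above can be illustrated with a minimal sketch. It does not use the nlohmann JSON library: a std::variant stands in for a type-aware JSON node. The class name MiniConfig, the method define, and the choice of std::invalid_argument for reporting a type mismatch are assumptions made for this illustration only and are not taken from the service's code.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>
#include <variant>

// Hypothetical stand-in for the class Configuration (not the Qserv code).
class MiniConfig {
public:
    using Value = std::variant<int, std::string>;
    using Key = std::pair<std::string, std::string>;  // (category, param)

    // Register a parameter. The type of the default value fixes the parameter's type.
    void define(std::string const& category, std::string const& param, Value defaultValue) {
        _values[Key(category, param)] = std::move(defaultValue);
    }

    template <typename T>
    T get(std::string const& category, std::string const& param) const {
        Value const& v = _values.at(Key(category, param));  // std::out_of_range if unknown
        if (!std::holds_alternative<T>(v)) {
            throw std::invalid_argument("type mismatch for " + category + "." + param);
        }
        return std::get<T>(v);
    }

    template <typename T>
    void set(std::string const& category, std::string const& param, T const& val) {
        Value& v = _values.at(Key(category, param));
        if (!std::holds_alternative<T>(v)) {
            throw std::invalid_argument("type mismatch for " + category + "." + param);
        }
        v = val;
    }

private:
    std::map<Key, Value> _values;
};

// Usage: the stored type (int) is enforced on both reads and writes.
void miniConfigExample() {
    MiniConfig config;
    config.define("common", "request_buf_size_bytes", 131072);
    assert(config.get<int>("common", "request_buf_size_bytes") == 131072);
    config.set<int>("common", "request_buf_size_bytes", 65536);
    assert(config.get<int>("common", "request_buf_size_bytes") == 65536);
    bool mismatchReported = false;
    try {
        config.get<std::string>("common", "request_buf_size_bytes");
    } catch (std::invalid_argument const&) {
        mismatchReported = true;
    }
    assert(mismatchReported);
}
```

In this sketch a request for the wrong type fails at run time rather than at compile time, which mirrors the trade-off discussed in section 2.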

The primary source of the parameter definitions is within the class ConfigurationSchema. This will be explained later.

The method Configuration::get<T> always returns parameters from the transient state (the JSON object) of the class. The behavior of the opposite method Configuration::set<T> depends on the source the configuration object was loaded from. If that was a transient JSON object (see the diagram above) then only the transient state will get updated. If the configuration was read from the database then the change will also propagate to the persistent backend.

The class also provides the directory method for the parameters. The method returns the nested dictionary of the parameter names within their categories:

class Configuration {
public:
    std::map<std::string, std::set<std::string>> parameters() const;
};
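
As an illustration of how a management tool might use such a directory, the following sketch walks a structure of the same shape as the one returned by Configuration::parameters() and builds fully qualified parameter names. The helper qualifiedNames and the "category.param" naming convention are hypothetical and serve only to show the iteration pattern.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Same shape as the return type of Configuration::parameters().
using Directory = std::map<std::string, std::set<std::string>>;

// Visit every parameter without naming any of them explicitly in the code.
std::vector<std::string> qualifiedNames(Directory const& directory) {
    std::vector<std::string> names;
    for (auto const& entry : directory) {
        std::string const& category = entry.first;
        for (std::string const& param : entry.second) {
            names.push_back(category + "." + param);
        }
    }
    return names;
}

// Usage: categories and parameters come out in lexicographic order.
void directoryExample() {
    Directory directory;
    directory["common"].insert("request_buf_size_bytes");
    directory["common"].insert("request_retry_interval_sec");
    directory["meta"].insert("version");
    std::vector<std::string> const names = qualifiedNames(directory);
    assert(names.size() == 3);
    assert(names.front() == "common.request_buf_size_bytes");
    assert(names.back() == "meta.version");
}
```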

Clients may also inspect properties of the parameters by calling the following methods of the class ConfigurationSchema:

class ConfigurationSchema {
public:
    /// @return A documentation string for the specified parameter or the empty string
    ///   if none is available in the schema.
    static std::string description(std::string const& category, std::string const& param);

    /// @return A 'true' if the parameter can't be modified via the 'set' methods
    ///   of the Configuration class. This information is used by class Configuration
    ///   to validate the parameters.
    static bool readOnly(std::string const& category, std::string const& param);

    /// @return A 'true' if the parameter represents the security context (passwords,
    ///   authorization keys, etc.). Parameters possessing this attribute are supposed
    ///   to be used with care by the dependent automation tools to avoid exposing
    ///   sensitive information in log files, reports, etc.
    static bool securityContext(std::string const& category, std::string const& param);

    /// @return A 'true' if, depending on the actual type of the parameter, the empty
    ///   string (for strings) or zero value (for numeric parameters) is allowed.
    ///   This information is used by class Configuration to validate input values
    ///   of the parameters.
    static bool emptyAllowed(std::string const& category, std::string const& param);
};

The Doxygen-style comments shown above should be sufficient to explain the role of each method.
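
The following sketch shows one way the read-only and empty-allowed attributes could drive the validation of an update. It is an assumption about the control flow, not the service's actual code: the struct ParamAttributes, the function validateUpdate, and the use of std::invalid_argument are all hypothetical. In the real service the flags would come from ConfigurationSchema::readOnly and ConfigurationSchema::emptyAllowed.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical attribute record for a single parameter.
struct ParamAttributes {
    bool readOnly = false;
    bool emptyAllowed = false;
};

// "Empty" means the empty string for strings and zero for numeric values.
inline bool isEmpty(std::string const& val) { return val.empty(); }
inline bool isEmpty(long long val) { return val == 0; }

// Throws if the proposed value violates the attributes of the parameter.
template <typename T>
void validateUpdate(ParamAttributes const& attr, T const& val) {
    if (attr.readOnly) {
        throw std::invalid_argument("the parameter is read-only");
    }
    if (!attr.emptyAllowed && isEmpty(val)) {
        throw std::invalid_argument("the empty (or zero) value is not allowed");
    }
}

// Usage: a zero value is rejected unless the schema explicitly allows it.
void validateUpdateExample() {
    ParamAttributes plain;
    validateUpdate(plain, 131072);  // accepted
    bool rejected = false;
    try {
        validateUpdate(plain, 0);
    } catch (std::invalid_argument const&) {
        rejected = true;
    }
    assert(rejected);
}
```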

3.3.1. Static accessor/modifier methods

Besides the two groups of parameters explained in the previous section, there is a handful of specially named parameters defined via static methods in the class Configuration. Values of these parameters are meant to be set at the startup time of an application, before loading the content of the configuration. Most of these parameters are required to bootstrap the configuration loading, or they carry sensitive values (such as database passwords) that can't be kept in the database or a JSON file. Values of these parameters are normally passed as command-line options directly to the application.

Though the lifecycle (and the scope) of these parameters is quite different from that of the rest of the configuration, it's convenient to keep them within the same class Configuration. After all, the static parameters are also configuration parameters. This approach also reduces the number of classes in the module.

Each parameter has corresponding "getter" and "setter" methods, as illustrated in the following excerpt from the class's definition:

class Configuration {
public:
    /// @return The database password for accessing Qserv czar's database.
    static std::string const& qservMasterDatabasePassword() { return _qservMasterDatabasePassword; }

    /**
     * @param newPassword The new password to be set for accessing Qserv czar's database.
     * @return The previous value of the password.
     */
    static std::string setQservMasterDatabasePassword(std::string const& newPassword);
};
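
A setter that returns the previous value, as declared above, could be written along the following lines. This is a sketch under assumptions: the class StartupParams and its member names are hypothetical, and no locking is shown because these parameters are meant to be set at startup.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for the static part of the class Configuration.
class StartupParams {
public:
    static std::string const& databasePassword() { return _databasePassword; }

    // Store the new value and return the one it replaced.
    static std::string setDatabasePassword(std::string const& newPassword) {
        return std::exchange(_databasePassword, newPassword);
    }

private:
    static std::string _databasePassword;
};

std::string StartupParams::_databasePassword;

// Usage: each call hands back the value that was in effect before it.
void startupParamsExample() {
    std::string const initial = StartupParams::setDatabasePassword("first");
    std::string const previous = StartupParams::setDatabasePassword("second");
    assert(initial.empty());
    assert(previous == "first");
    assert(StartupParams::databasePassword() == "second");
}
```

Returning the previous value lets a caller (a unit test, for instance) restore the original setting once it's done.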

3.3.2. Instantiating objects of the class Configuration

The transient configuration object of class Configuration can be created from two sources. The first factory method constructs an object that is disconnected from any persistent backend. This way of constructing configurations is required for unit testing and (in the future) for configuring workers "over the wire" from the configuration data pushed to the workers by the Master Replication Controller. The interface is shown below:

class Configuration {
public:
    /**
     * The static factory method will create an object and initialize its transient
     * state from the given JSON object. The parameters will amend the default state
     * of the object.
     * @note Configuration objects created by this method won't have any persistent
     *   backend should any changes to the transient state be made.
     * @param obj The input configuration parameters. The object is optional.
     *   If it's not given (the default value), or if it's empty then no changes
     *   will be made to the transient state.
     * @throw std::runtime_error If the input configuration is not consistent
     *   with expectations of the application.
     */
    static Ptr load(nlohmann::json const& obj=nlohmann::json::object());
};

Note that parameters of the configurations constructed from the JSON objects can only be updated in the transient state of the corresponding process.

The second approach is to load the configuration from a database:

class Configuration {
public:
    /**
     * The static factory method will create a new object and initialize its content
     * from the following source:
     * @code
     *   mysql://[user][:password]@[host][:port][/database]
     * @endcode
     * @note Configuration objects initialized from MySQL would rely on MySQL as
     *   the persistent backend for any requests to update the state of the transient
     *   parameters. A connection object to the MySQL service will be initialized
     *   at the corresponding data member of the class.
     * @param configUrl The configuration source.
     * @param autoMigrateSchema The optional flag that if 'true' would result in an attempt
     *   to automatically migrate the database schema up to the level of the current
     *   configuration's should the persistent store's schema version be lower than
     *   the expected one. Note that no schema downgrade is supported in the current
     *   implementation. The flag is supposed to be set to 'true' when instantiating
     *   the configuration object in the Master Replication Controller.
     * @throw std::invalid_argument If the URL has unsupported scheme or it
     *   couldn't be parsed.                          
     * @throw std::runtime_error If the input configuration is not consistent
     *   with expectations of the application.
     */
    static Ptr load(std::string const& configUrl, bool autoMigrateSchema=false);
};

Any changes made to parameters of configurations loaded from a database will be saved to the very same database. The primary user of this kind of configuration is the Master Replication Controller.
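
To make the URL format concrete, here is a minimal sketch of a parser for connection strings of the form mysql://[user][:password]@[host][:port][/database]. It is not the parser used by the service. The struct MySqlUrl, the function parseConfigUrl, and the regular expression are assumptions made for illustration, and the sketch doesn't handle passwords containing the '@' character. The account, host, and database names in the usage function are made up.

```cpp
#include <cassert>
#include <regex>
#include <stdexcept>
#include <string>

// Hypothetical result type; the real service may represent the URL differently.
struct MySqlUrl {
    std::string user;
    std::string password;
    std::string host;
    int port = 0;  // 0 means "not specified"
    std::string database;
};

MySqlUrl parseConfigUrl(std::string const& url) {
    static std::string const scheme = "mysql://";
    if (url.compare(0, scheme.size(), scheme) != 0) {
        throw std::invalid_argument("unsupported scheme in the configuration URL");
    }
    // user, optional ":password", "@", host, optional ":port", optional "/database"
    static std::regex const pattern(
            R"(^mysql://([^:@/]*)(?::([^@]*))?@([^:/]*)(?::(\d{1,5}))?(?:/(.*))?$)");
    std::smatch match;
    if (!std::regex_match(url, match, pattern)) {
        throw std::invalid_argument("the configuration URL couldn't be parsed");
    }
    MySqlUrl result;
    result.user = match[1].str();
    result.password = match[2].str();
    result.host = match[3].str();
    if (match[4].matched) result.port = std::stoi(match[4].str());
    result.database = match[5].str();
    return result;
}

// Usage with made-up connection parameters.
void parseConfigUrlExample() {
    MySqlUrl const url = parseConfigUrl("mysql://qsreplica:secret@localhost:3306/qservReplica");
    assert(url.user == "qsreplica");
    assert(url.password == "secret");
    assert(url.host == "localhost");
    assert(url.port == 3306);
    assert(url.database == "qservReplica");
}
```

The error messages deliberately leave out the URL itself, since it may carry a password.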

3.3.2.1. Reloading/updating the configuration

The service has a mechanism that allows updating the state of the transient configuration object. The primary use case addressed by this technique is a scenario when the content of the persistent configuration is changed and the components of the replication/ingest system (workers and/or the Master Controller) need to be notified about this change. Some of the existing Replication system's algorithms (or so-called "jobs") rely upon and utilize this mechanism. The following three methods implement state update:

class Configuration {
public:
    /**
     * Reload non-static parameters of the Configuration from the same source
     * they were originally read before.
     * @note If the object was initialized from an in-memory object then
     *   the method will do nothing.
     */
    void reload();

    /**
     * Reload non-static parameters of the Configuration from the given JSON object.
     * @note If the previous state of the object was configured from a source having
     *   a persistent back-end (such as MySQL) then the association with the backend
     *   will be lost upon completion of the method.
     * @param obj The input configuration parameters.
     * @throw std::runtime_error If the input configuration is not consistent
     *   with expectations of the application.
     */
    void reload(nlohmann::json const& obj);

    /**
     * Reload non-static parameters of the Configuration from an external source.
     * @param configUrl The configuration source.
     * @throw std::invalid_argument If the URL has unsupported scheme or it couldn't
     *   be parsed.
     * @throw std::runtime_error If the input configuration is not consistent with
     *   expectations of the application.
     */
    void reload(std::string const& configUrl);
};

3.4. The schema

There are two kinds of schemas in the configuration service. They both are defined within the utility class ConfigurationSchema:

JSON schema (transient)
MySQL schema (persistent)

Both schemas are observable. It also helps that they're kept within the same class.

3.4.1. Creating/initializing the persistent configuration in MySQL

There are two ways to create a configuration database in MySQL and populate it with the default values of the parameters. The first (the traditional) mechanism (like the one in the original design of the service) would be to get the schema definition statements by calling the following method of the class:

class ConfigurationSchema {
public:
    /**
     * Return the current MySQL schema and (optionally) the initialization
     * statements for the minimum set of the default parameters required by
     * the Replication/Ingest system to operate.
     * @note This method is designed to work w/o having an open MySQL connection
     *   to generate proper quotes for SQL identifiers and values. Please consider
     *   the optional parameters 'idQuote' and 'valueQuote' allowing to explicitly
     *   specify the desired values of the quotes. The default values in the method's
     *   signature are set to be consistent with the present configuration of
     *   the MySQL/MariaDB services of Qserv and the Replication/Ingest system.
     * @param includeInitStatements If 'true' then add 'INSERT INTO ...' statements
     *   for initializing the default configuration parameters.
     * @param idQuote The quotation symbol for MySQL identifiers.
     * @param valueQuote The quotation string for MySQL values.
     * @return An ordered collection of the MySQL statements for creating schema and
     *   initializing configuration parameters. The order of statements takes into
     *   account dependencies between the tables (such as FK->PK relationships).
     */
    static std::vector<std::string> schema(bool includeInitStatements=true,
                                           std::string const& idQuote="`",
                                           std::string const& valueQuote="'");
};

The simplest way to get this info would be by invoking the following application:

qserv-replica-config MYSQL_SCHEMA_DUMP

The application will print out the schema definition (CREATE TABLE) and parameter initialization (INSERT INTO) statements that one would have to apply to the corresponding MySQL database. Though this low-level technique is still available, it's discouraged in the new implementation of the service. There is a safer and more efficient mechanism for doing the same:

qserv-replica-config MYSQL_CREATE --config=<config-url> [--reset]

This technique provides a few advantages over the older one:

it uses the same connection string (config-url) to specify the location of and the access parameters for the database
it hides the implementation details of the operation
it reduces the possibility of accidentally altering the schema definition, should the low-level process be more complex than just dumping the schema and loading it into MySQL

3.4.2. Verifying and enforcing the schema

When loading the configuration from MySQL the service would always cross-check the contents of the database against the transient schema definition. Any inconsistencies in the names, types, or values of the parameters would be reported as errors.

The primary "source of truth" for the schema is the transient JSON object.
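
A reduced sketch of such a cross-check, limited to parameter names, might look as follows. It is an illustration only: the function crossCheck and the wording of the error messages are hypothetical, and the checks of types and values mentioned above are left out. Both inputs use the same nested structure that Configuration::parameters() returns.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

using Directory = std::map<std::string, std::set<std::string>>;

// Report every persistent parameter that the transient schema doesn't define.
// An empty result means the names found in the database are consistent.
std::vector<std::string> crossCheck(Directory const& schema, Directory const& persistent) {
    std::vector<std::string> errors;
    for (auto const& entry : persistent) {
        std::string const& category = entry.first;
        auto const itr = schema.find(category);
        for (std::string const& param : entry.second) {
            if (itr == schema.end() || itr->second.count(param) == 0) {
                errors.push_back("unknown parameter: " + category + "." + param);
            }
        }
    }
    return errors;
}

// Usage: the transient schema is the reference, the database content is checked.
void crossCheckExample() {
    Directory schema;
    schema["common"].insert("request_buf_size_bytes");
    Directory persistent;
    persistent["common"].insert("request_buf_size_bytes");
    persistent["common"].insert("obsolete_param");
    std::vector<std::string> const errors = crossCheck(schema, persistent);
    assert(errors.size() == 1);
    assert(errors[0] == "unknown parameter: common.obsolete_param");
}
```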

3.4.3. Schema migration

In order to simplify the management of the persistent representation of the configuration, the new implementation of the service has provisions for the schema evolution. This is based on two parameters identifying the schema:

the transient schema version
the persistent schema version

The transient schema version number (the version required by the application) is retrieved as the value of the following attribute:

class ConfigurationSchema {
public:
    /// The current version number required by the application.
    static int const version;
};

The transient schema version number is also reported by the following application:

qserv-replica-config SCHEMA_VERSION

The persistent version is stored in one of the tables in the configuration database.

When the content of the corresponding MySQL database is read and analyzed by the service (see the method Configuration::load(std::string const& configUrl, ... )), the service also cross-checks the version read from MySQL against the one expected by the application. Any difference is reported as an error and results in throwing the following exception:

/// File: ConfigurationExceptions.h

/**
 * The class ConfigVersionMismatch represents exceptions thrown on the expected versus
 * actual version mismatch of the configuration found in the persistent store.
 */
class ConfigVersionMismatch: public ConfigError {
public:
    int const version;
    int const requiredVersion;
    explicit ConfigVersionMismatch(int version_, int requiredVersion_)
        :   ConfigError("Configuration version " + std::to_string(version_) + " found in"
                        + " the persistent state or a JSON object doesn't match the required"
                        + " version " + std::to_string(requiredVersion_) + "."),
            version(version_),
            requiredVersion(requiredVersion_) {
    }
};

Normally this exception is not handled by the configuration loading method, which results in terminating the application. However, the application may allow automatic schema migration. This is done by passing true as the value of the optional parameter bool autoMigrateSchema to the configuration loading method. If this parameter is set, the following method will be invoked:

class ConfigurationSchema {
public:
    /**
     * Upgrade persistent configuration schema up to the current one.
     * @throws std::logic_error If the persistent schema is not strictly less than
     *   the one that is expected by current implementation of the class.
     */
    static void upgrade(std::string const& configUrl);
};

The method would read the persistent configuration, compare the findings against expectations found in the transient schema definition, and perform the upgrade if needed and if possible (if the persistent schema version is strictly less than the transient one).
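
The decision logic described in this section can be summarized in a short sketch. It reflects one reading of the text rather than the actual implementation: the struct VersionMismatch is a hypothetical stand-in for ConfigVersionMismatch, the function needsUpgrade is invented for the illustration, and the sketch assumes that a persistent version newer than the required one is reported as a mismatch even when automatic migration is enabled (since no downgrade is supported).

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for ConfigVersionMismatch.
struct VersionMismatch : std::runtime_error {
    VersionMismatch(int version, int requiredVersion)
        : std::runtime_error("configuration version " + std::to_string(version) +
                             " doesn't match the required version " +
                             std::to_string(requiredVersion)) {}
};

// Returns false if the versions match, true if the persistent schema has to be
// upgraded first. Throws VersionMismatch if the versions differ and an upgrade
// is either not allowed or not possible.
bool needsUpgrade(int persistentVersion, int requiredVersion, bool autoMigrateSchema) {
    if (persistentVersion == requiredVersion) return false;
    if (autoMigrateSchema && persistentVersion < requiredVersion) return true;
    throw VersionMismatch(persistentVersion, requiredVersion);
}

// Usage: only an older persistent schema can be migrated automatically.
void needsUpgradeExample() {
    assert(!needsUpgrade(3, 3, false));
    assert(needsUpgrade(2, 3, true));
    bool mismatch = false;
    try {
        needsUpgrade(2, 3, false);
    } catch (VersionMismatch const&) {
        mismatch = true;
    }
    assert(mismatch);
}
```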

Normally, only one process at a time is expected to do the schema upgrade. Hence, it's recommended to pass the following optional command line flag only to the Master Replication Controllers:

qserv-replica-master-http --config=<config-url> --auto-migrate-schema

When the above-explained schema upgrade mechanism was being designed it was recognized that such automatic schema migration may not be an option in some data management scenarios. Sometimes the migration has to be an explicit process invoked by a data administrator at an appropriate moment of time. To address this scenario the following command-line tool has been made available:

qserv-replica-config MYSQL_SCHEMA_UPGRADE --config=<config-url>

If the schema is up to date then it's safe to invoke this operation as many times as needed as it has no side effects. Usually one may include this statement as one of the steps before starting the Master Replication Controller.

3.5. Testing

There are two tests in the new implementation of the service:

The traditional unit test, meant to catch any issues in the transient implementation or the API of the configuration service at compilation/build time. This test doesn't verify the functionality or correctness of the MySQL-based persistent back-end.
The newly added integration test, which requires making changes to some MySQL database. This test covers a subset of what the unit test checks, but its primary focus is on how changes to the parameters propagate to the MySQL database.

The integration test is invoked as the following application:

qserv-replica-config MYSQL_TEST [<scope>] --reset --config=<config-url>

Any problems are reported to the standard output. The overall completion status is returned to the Unix shell as usual: 0 for success, and any other number as an indication of an error.

3.6. Adding/removing/modifying parameters

The key question here (and one of the objectives of this refactoring) is how much effort is needed to make changes to the collection of configuration parameters. One of the main issues of the original design of the service was that adding even a single parameter required making changes to all classes mentioned at the beginning of the document. Most changes were pure boilerplate code, mostly meant to add another named method to multiple configuration files. The redesigned service has dramatically reduced that.

This is what needs to be done now to add a simple parameter:

Add an entry to the JSON (the transient) schema. The entry includes the name of the parameter in its category (new categories can be added as well), the documentation string for the parameter, its initial (the default) value, and optional attributes (if any required), such as the "read-only" flag, the "security-context" flag.
Add a line of the code in the MySQL reader for the parameter.
Add a few lines of code for testing the parameter in the unit test.
Add a few lines of code for testing the parameter in the integration test.

The need for the last three actions may be eliminated in the future if the code is further improved to auto-detect parameters based on the previously mentioned "directory" method Configuration::parameters(). After that, the configuration of the regular (single value) parameters will be completely driven by the transient JSON schema.

The amount of work for modifying the complex object-type parameters (classes WorkerInfo, DatabaseFamilyInfo, or DatabaseInfo) is presently a bit higher, though not significantly, as each of these classes has:

a constructor for constructing an object from a JSON object.
a serialization method returning a JSON representation of the object's state.

Therefore, most of the work would be in modifying these two methods.

Adding a new class of object-type parameters would be the only scenario where a moderate amount of work would be needed. As of today, though, there is no need to add new classes.
