
1. Introduction

In the original implementation of Qserv, catalog ingest requirements were driven mostly by the needs of development and integration testing. These requirements were successfully addressed by a simple tool qserv-data-loader.py backed by the Python-based service wmgr running at each worker node. This mechanism is still available since it works fairly well for the scenarios it was developed for. Unfortunately, it quickly falls short wherever large quantities of data need to be ingested, or where high performance or reliability of the ingests is mandatory. It's also clear that the simple mechanism won't be able to address the primary target - ingesting the LSST's Data Release Products. The number and variety of these cases started growing over the last year as the Project (LSST) was approaching its critical stages, and more complex use case scenarios were showing up for Qserv. Eventually, a need for a more sophisticated and versatile Ingest System emerged.

The current article documents the new system while keeping the main focus on the practical aspects of using it. More documents on the requirements and the low-level technical details of its implementation (unless needed for the purposes of this document) can be found elsewhere.

A final comment on the information flow in the document: it's recommended to read the document sequentially. Most ideas presented in the document are introduced in section An example of a simple workflow. That section is followed by a few more sections covering advanced topics. The API Reference section at the very end of the document should be used to find complete descriptions of the REST services and tools mentioned in the document.

2. What's new in the new Ingest System?

A fundamental difference of the new system from the older one is that it's not just a fixed "loader" tool (backed by a simple set of services). It's rather a set of open interfaces and tools that allow constructing a variety of ingest workflows suitable for specific deployments and use cases of Qserv. In the new paradigm, the older "loader" tool mentioned in the Introduction section would be just the simplest possible workflow.

At the very high level, the system is comprised of:

  • A REST server integrated into the Master Replication Controller. It provides a collection of services for managing metadata and states of the new catalogs to be ingested. The server also coordinates its own operations with Qserv and the Replication System to prevent interference with those and to minimize failures during catalog ingest activities.
  • The Data Ingest services run at each Qserv worker alongside the Replication System's worker services. The role of these services is to actually ingest the client's data into the corresponding MySQL tables. The services also do some additional (albeit minimal) preprocessing and data transformation (where or when needed) before ingesting the input data into MySQL.

Implementation-wise, the Ingest System heavily relies on the Replication System, reusing many functions of the latter, including the Replication System's Controller Framework, various services (including the Configuration service), and the worker-side server infrastructure of the Replication System.

Client workflows interact with the system's services via open interfaces, and use ready-to-use tools to fulfill their goals of ingesting catalogs.

Here is a brief summary of the features of the new system:

  • It introduces well-defined semantics into the ingest process. With that, the process of ingesting a new catalog now has to go through a sequence of specific steps, maintaining a progressive state of the catalog within Qserv while it's being ingested. The state transitions and the corresponding enforcements made by the system ensure that the catalog is always in a well-defined consistent state during each step of the process. Altogether, this model increases the robustness of the process and also makes it more efficient.
  • To facilitate and implement the above-mentioned state transitions the new system introduces a distributed checkpointing mechanism called super-transactions. These transactions allow for incremental updates of the overall state while allowing a safe rollback to a prior consistent state should any problem occur during data loading within such a transaction.
  • At its very foundation, the system has been designed for constructing high-performance and parallel ingest workflows without compromising the consistency of the ingested catalogs.
  • Unlike the original system, the new data loading protocol is binary, thus resulting in a more efficient data transfer. To interact with the Data Ingest services users may choose either the C++ API or ready-to-use file loading tools.

2.1. What's not done by the new system

As per its current implementation (which may change in the future) the system will not  automatically partition input files. This task is expected to be a responsibility of the ingest workflows.

Also, the system will not  (with a very small exception of adding an extra leading column qserv_trans_id required by the implementation of the new system) pre-process the input TSV / CSV files sent to the Ingest Data Servers for loading into tables.  It's up to the workflows  to sanitize the input data and to make them ready to be ingested into Qserv.

3. Configuring the system

Since the Ingest system is a functional extension of the Replication system, all aspects related to configuring both are covered in section 3.2 of Deploying, configuring and operating the Qserv Replication system. This section is here mainly to highlight parameters which are relevant for the Ingest system. Some of these parameters may be used for configuring ingest workflows.

The following two parameters are directly related to the Ingest System:

  • The number of workers in a setup.
  • The number of threads the Data Ingest server runs at each worker.

Knowing these two numbers allows configuring workflows in the most efficient way to achieve the highest possible bandwidth when ingesting table contributions into the system. More ideas on this subject are presented in Key configuration parameters of the Ingest System.
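
For illustration, here is a minimal sketch of the rule of thumb that the product of these two parameters roughly bounds how many table contributions can be loaded in parallel. The numbers below are hypothetical example values, not recommendations for any particular deployment:

# Hypothetical example values; substitute the actual numbers of a given setup.
num_workers = 30           # the number of workers in the setup
threads_per_worker = 16    # the number of threads of the Data Ingest server at each worker

# A rough upper bound on the number of table contributions (files) which could
# be pushed in parallel without queueing at the workers' ingest services.
max_parallel_contributions = num_workers * threads_per_worker
print("run at most", max_parallel_contributions, "file loaders in parallel")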

4. API version 4

Please note that the versioning mechanism presented here applies to both the metadata services (the REST API) and the loading tools. A change in either would result in a change of the version number reported by the service explained hereafter.

The current version of the API can be fetched using a special metadata service documented in section Obtaining the current version of the API. The current documentation expects the service to return (upon successful completion of a request, as explained in Error reporting in the REST API) a JSON object which has the following value of the attribute version:

{...
 "version":3
}
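
For illustration, a workflow could verify the version before doing anything else. The following is a hedged Python sketch; the endpoint path /meta/version and the base URL are assumptions made for this example (the actual URL is documented in section Obtaining the current version of the API):

import requests

EXPECTED_VERSION = 4

# The endpoint path below is an assumption for illustration only.
response = requests.get('http://localhost:25080/meta/version')
response.raise_for_status()   # any HTTP code other than 200 is an error
version = response.json()['version']
if version != EXPECTED_VERSION:
    raise RuntimeError(f"workflow expects API version {EXPECTED_VERSION}, "
                       f"but the server reports {version}")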

4.1. What's new in version 2

The new version adds the workers' REST API for directly ingesting files either from an object store via the http/https protocol, or by reading files from a locally mounted filesystem (NFS, GPFS, Lustre, Ceph, etc.). See details in Ingesting a single file.

Responses from services Allocate/locate chunks of the partitioned tables and Locate regular tables have been extended to return connection parameters of the workers' REST services. 

4.2. What's new in version 3

The REST service meant to Locate regular tables has been modified. In the previous version, the service required an identifier of a transaction to be provided in the path:

/ingest/regular/:id

The new version has been made compatible with the REST service Allocate/locate chunks of the partitioned tables.

4.3. What's new in version 4

4.3.1. Removed option "--columns-separator" in the command-line ingest tool

(warning) The changes may break existing applications.

The command-line syntax of the application  qserv-replica-file INGEST has changed:

  • The option --columns-separator  got removed.
  • Five new options were added: 

      --fields-enclosed-by=[<value>]
      --fields-escaped-by=[<value>]
      --fields-terminated-by=[<value>]
      --lines-terminated-by=[<value>]
      --record-size-bytes=[<value>]
  • (warning) The option --fields-terminated-by replaces the removed option --columns-separator.
  • (warning) The default value of the option  --fields-terminated-by is \t.

4.3.2. Changes in parameters of the table uploading service

The worker-side REST service meant for Ingesting a single file has been changed to support additional attributes:

"fields_enclosed_by":<string>
"fields_escaped_by":<string>
"fields_terminated_by":<string>
"lines_terminated_by":<string>

(warning) The option column_separator has been preserved for backward compatibility with existing Ingest workflows. This option is now considered deprecated, and workflows need to migrate to the option fields_terminated_by instead. If both options are found in a request then column_separator will be assumed.






5. Error reporting

The error reporting mechanism implemented in the System serves as a foundation for building reliable workflows .

5.1. Web services

All services explained in the document adhere to the usual conventions adopted by the Web community for designing and using REST APIs. In particular, HTTP code 200 is returned if a request is well formed and accepted by the corresponding service. Any other code shall be treated as an error. However, the implementation of the System further extends the error reporting mechanism by guaranteeing that all services do fine-grained error reporting in the response objects. All services of the API are guaranteed to return a JSON object if the HTTP code is 200. The objects have the following mandatory attributes (other attributes depend on a request):
{"success":<number>,
 "error":<string>,
 "error_ext":{...},
 ...
}

Therefore, even if a request is completed with HTTP code 200, a client (a workflow) must inspect the above-mentioned fields in the returned object. These are the rules for inspecting the status attributes:

  • A successful completion of a request is indicated by having success=1 in the response. In this case the other two fields should be ignored.
  • Otherwise, a human readable explanation of a problem would be found in the error field.
  • Request-specific extended information on errors is optionally provided in the error_ext field.

Note that, depending on a service, additional attributes may be present in the object.
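
For illustration, here is a minimal Python helper implementing the rules above (a sketch only; real workflows would likely add retries and logging):

import requests

def post_and_check(url, payload):
    """POST a JSON request to a REST service and apply the rules above."""
    response = requests.post(url, json=payload)
    if response.status_code != 200:
        # Any HTTP code other than 200 is treated as an error.
        raise RuntimeError(f"HTTP error {response.status_code} from {url}")
    result = response.json()
    if not result['success']:
        # 'error' is a human-readable explanation of the problem, and
        # 'error_ext' optionally carries request-specific details.
        raise RuntimeError(f"request to {url} failed: {result['error']}, "
                           f"details: {result['error_ext']}")
    return result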

5.2. Data loading tool(s)

All command line tools return 0 to indicate a successful completion of the requested operation. Other values shall be treated as errors.  
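
For illustration, a workflow driving the tools from Python could check the exit code like this (a sketch only; the command line shown here mirrors the batch-mode example given later in section Uploading chunk contributions):

import subprocess

# Launch the loader tool and treat any non-zero exit code as an error.
completed = subprocess.run(
    ["qserv-replica-file", "INGEST", "FILE-LIST", "Object_load.json", "--verbose"],
    check=False,
)
if completed.returncode != 0:
    raise RuntimeError(f"ingest tool failed with exit code {completed.returncode}")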

6. An example of a simple workflow

This section illustrates the ideas mentioned earlier with an example of a very simple workflow. All interactions with the System demonstrated hereafter are deliberately done using the core API of the System without any automation or wrapper scripts. This is a graphical representation of the workflow:

[Diagram: basic workflow]

TBC...

6.1. A setup (Qserv)

For the sake of simplicity all examples in this section assume a small instance of Qserv comprised of one master and two workers:


  node     host             port   service
  master   qserv-master01   25080  REST API for managing Qserv
  db01     qserv-db01       25002  Data Ingest Service
  db02     qserv-db02       25002  Data Ingest Service

In order to make operations secure, the system has been configured with the following value of the authorization key:

auth_key=SECURED

Let's also assume that all operations with the REST API of the system will be initiated from the same machine where the Qserv master runs. In this case the base URL of the services will look like this:

http://localhost:25080/

6.2. An input data set

In this test we're going to ingest a new database called test101. The catalog will have a single table called Object which, in Qserv terms, is also known as the so-called director table (a catalog with at least one table - the director one - is the minimum requirement for catalogs served by Qserv). The table will have just a few simple columns:

objectId BIGINT NOT NULL
ra DOUBLE NOT NULL
dec DOUBLE NOT NULL
property DOUBLE
chunkId INT UNSIGNED NOT NULL
subChunkId INT UNSIGNED NOT NULL

Note that, of these columns, only the column named property is not treated by Qserv as a special column. The others have special meaning in Qserv:

  • objectId is a unique identifier of entries in the director table. It's used for optimized (the so-called secondary index optimization) lookup of the objects if values of the identifiers are known a priori (as a result of other queries). The name of the column doesn't need to be the same as in this example; though, it's important to let Qserv know what that name is.
  • the two columns ra and dec represent the location of an object in the spherical coordinate system supported by Qserv. And, like in the case of the above-mentioned objectId, the names of these two columns are defined by a user.
  • the last two columns chunkId and subChunkId have predefined names.

The input data to be loaded into Qserv were already partitioned by the sph-partition tool using the following configuration:

# Partitioning parameters.
part = {
  num-stripes = 340
  num-sub-stripes = 3
  chunk = chunkId
  sub-chunk = subChunkId
  pos = 'ra, dec'
  overlap = 0.01667
}

# Input file format
in.csv = {
  null = ''
  delimiter = ','
  escape = '\\'
}

# Output CSV format.
out.csv = {
  null = '\\N'
  delimiter = ','
  escape = '\\'
  no-quote = true
}

# List of table column names, in order of occurrence.
in.csv = {
  field = [
    objectId
    ra
    dec
    property
  ]
}

Looking at that configuration, one may notice that the previously mentioned columns chunkId and subChunkId are not explicitly listed in section in.csv. This shouldn't be a concern since the partitioning tool will automatically extend each row of the output CSV files with these columns.

Let's also assume the partitioning stage resulted in 4 chunks: 187107, 187108, 187109, and 187110, as shown below:

% ls -1 input/
chunk_187107_overlap.txt
chunk_187107.txt
chunk_187108_overlap.txt
chunk_187108.txt
chunk_187109_overlap.txt
chunk_187109.txt
chunk_187110_overlap.txt
chunk_187110.txt

6.2.1. Notes on the partitioning parameters

There are 3 parameters passed into the partitioning tool which require special attention:

  • num-stripes: 340
  • num-sub-stripes: 3
  • overlap: 0.01667

Choosing the right values of these parameters is important, not only for optimizing the distribution of tables within Qserv (in respect of the total number of chunks, the sizes of those tables, and the sizes of the overlap areas of the tables), but also to allow JOINs in queries involving other similar catalogs. Databases which are configured exactly in the same way are said to belong to the same database family. The Replication system gives special treatment to databases which are members of the same family, to ensure that chunks of different databases are placed at the same workers. This allows Qserv to correctly implement queries involving two or more members of the same database family.

Deciding which family the new catalog should belong to is always the right thing to do before ingesting a new catalog into Qserv. If a mistake is made here, the input data of the catalog would have to be repartitioned and another attempt to ingest the catalog would have to be made.

This topic is covered in more details in Section 3.2.3 of Deploying, configuring and operating the Qserv Replication system.

Also read section Databases and database families of the current document to get help on retrieving info on which databases and database families exist in a setup.

6.3. Registering a new database in Qserv

Once all input data are in place and ready to be ingested, registering the new database will be the very first direct interaction with the Ingest System. To do so a workflow would need to send a request to the Database registration service, passing a description of the new database:

{"database":"test101",
 "num_stripes":340,
 "num_sub_stripes":3,
 "overlap":0.01667,
 "auto_build_secondary_index":1,
 "auth_key":"SECURED"
}

Note the optional attribute auto_build_secondary_index which tells the service to automatically build the secondary index. Please read section Building the secondary index for further details on this subject.

This can be done using the command-line tool curl as shown below:

curl http://localhost:25080/ingest/database \
  -X POST -H "Content-Type: application/json" \
  -d '{"database":"test101","auto_build_secondary_index":1,"num_stripes":340,"num_sub_stripes":3,"overlap":0.01667,"auth_key":"SECURED"}'

If everything goes well (see Error reporting in the REST API) then the service is expected to return a description of the database as explained in the documentation for the service. What's important for the purposes of this small example is to watch for the value of the key success, which is expected to be set to 1 in case of successful completion of the operation.

After registering the new database in the system, Qserv still doesn't know anything about the new database. The database will stay "invisible" all the way through the whole ingest process, up until the very last step when the database is published in Qserv. Only after that will users be able to launch queries against the new database. Any attempts to submit queries (via Qserv) before that will result in errors.

6.4. Registering a table

If a catalog has more than one table, one should be aware that the new system has no requirement to register all tables at once. The only requirement is that a table needs to be registered before loading any data into that table. If there are many tables in a catalog, then operations on the tables are completely independent of each other. Hence table registrations (and subsequent data loading into the tables) can also be done in parallel on different tables.

And these are some notes on the naming conventions (restrictions) for the tables:

Tables are registered using a service documented in section Register a new table.

The service requires a table description in the form of a JSON object sent to the service. Here is how this description would look for table Object of the current example:

{"database":"test101",
 "table":"Object",
 "is_partitioned":1,
 "chunk_id_key":"chunkId",
 "sub_chunk_id_key":"subChunkId",
 "is_director":1,
 "director_key":"objectId",
 "latitude_key":"dec",
 "longitude_key":"ra",
 "schema":[
  {"name":"objectId","type":"BIGINT NOT NULL"},
  {"name":"ra","type":"DOUBLE NOT NULL"},
  {"name":"dec","type":"DOUBLE NOT NULL"},
  {"name":"property","type":"DOUBLE"},
  {"name":"chunkId","type":"INT UNSIGNED NOT NULL"},
  {"name":"subChunkId","type":"INT UNSIGNED NOT NULL"}
 ],
 "auth_key":"SECURED"
}

If this description was put into a text file called Object.json then one could use the command-line tool curl  to initiate the request:

cat Object.json | curl http://localhost:25080/ingest/table -X POST -H "Content-Type: application/json" -d@-

See a description of the service Register a new table for further details on a result object of the service.
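
Alternatively, the same request could be scripted in Python (a sketch following the same pattern as the other examples in this document):

import json
import requests

# Read the table description prepared above and POST it to the
# table registration service.
with open('Object.json') as f:
    request = json.load(f)

response = requests.post('http://localhost:25080/ingest/table', json=request)
responseJson = response.json()
if not responseJson['success']:
    raise RuntimeError(f"table registration failed: {responseJson['error']}")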

6.5. Starting a transaction

One distinctive feature of the new System is the mechanism of so-called super-transactions. Strictly speaking, these aren't per se the same transactions as the ones implemented in traditional RDBMS (like MySQL, Oracle, etc.). Though, the idea is basically the same - to provide well-defined boundaries for modifications made to the database's content. Unlike normal transactions, the super-transactions are limited in what they can be used for. Specifically, their only use is to allow ingesting rows into tables (of both the regular and partitioned kinds) in layers, where each layer is represented by a MySQL partition (see Evaluate MySQL table partitioning for the reliable ingest with the rollback option for further details). When one such transaction is started, all contributions (rows) ingested into tables within the transaction are put into the corresponding MySQL partition of the tables. After the transaction is committed the rows will stay in the tables. Otherwise (if it's aborted) the whole partition will be eliminated from all relevant tables.

Tables created during an aborted transaction won't be deleted, even if this was the only transaction open for the corresponding database. There are many reasons why it's implemented this way. The main reason is that the new System allows implementing complex and highly-parallel workflows which may have multiple open transactions at a time. Therefore deleting tables due to a single failed transaction, while data ingest activities are going on in the context of other transactions, may (and most certainly will) cause disruptions for the latter.

This mechanism requires the table ingest operations to be associated with some transaction by providing a unique identifier of the transaction. This section of the document illustrates the coordination between the transaction management and data ingest operations.

A few more notes on the transactions:

  • transactions are open in the scope of a database
  • it's allowed to have more than one transaction open at a time
  • contributions made into the same table via different transactions are independent, and as such they will be committed or aborted independently of each other.
  • all transactions must be committed or aborted before publishing a database
  • when a transaction is being committed there is an option (explained later) to add contributions to the secondary index as well (though, this is not presently recommended).

Transactions are started using a service explained in section Start a transaction.

The request expects a JSON object describing the scope of the new transaction. For the purposes of the ongoing example the request would be initiated like this:

curl http://localhost:25080/ingest/trans \
  -X POST -H "Content-Type: application/json" \
  -d '{"database":"test101","auth_key":"SECURED"}'

In case of successful completion (see Error reporting in the REST API) the service is expected to return a JSON object with a description of the new transaction:

{...
 "databases":{
   "test101":{
     "transactions":[{
       "id":123,
       "state":"STARTED",
       ...
     }],
     ...
   },
   ...
 }
}

The most important attribute in the object is the unique identifier of the transaction represented by the key id. The path to the key is illustrated by the following Python snippet:

import requests
import sys

url = 'http://localhost:25080/ingest/trans'
database = 'test101'
request = {'database': database, 'auth_key': 'SECURED'}
response = requests.post(url, json=request)
responseJson = response.json()
if not responseJson['success']:
    print("failed:", responseJson['error'], responseJson['error_ext'])
    sys.exit(1)
# The path to the transaction identifier within the returned object:
id = responseJson['databases'][database]['transactions'][0]['id']
print(id)

It's important to capture the number (which would be 123 in this example) since the identifier will be required for subsequent data ingest operations.

6.6. Allocating chunks

This step is only required for loading chunk contributions into the partitioned tables. The section can be skipped when loading the regular tables. For loading contributions into the regular tables, one would have to follow the instructions explained in section Notes on loading regular tables.

At this stage a client workflow would ask the System to advise on the placement of a chunk. The system will respond with the DNS name or IP address of the corresponding worker host and the port number of the Data Ingest Service run on that worker. The workflow is supposed to use this information to connect to the worker's service in order to upload a contribution into the requested chunked table.

Here is some additional information on the chunk allocators:

  • It's safe to request chunk placement for the same chunk multiple times. In the current implementation of the System it's guaranteed that the System will always respond with the same set of connection parameters for the chunk.
  • It's safe to request chunk placement for the same chunk from multiple processes or within multiple transactions. The System will take care of proper synchronization when processing such concurrent requests.
  • Chunk placements can be obtained either for a single chunk or for a collection of chunks. The latter is more efficient if the number of chunks is on a scale of thousands.

There are two services in this group (one for allocating a single chunk, and the other one for the batch allocation of multiple chunks). The services are documented in section Allocate/locate chunks of the partitioned tables. Due to the small number of chunks to be allocated in the current example, the single-chunk allocation technique would be sufficient. For example, if a request was made for chunk 187107 in the context of the current example, then the request object would look like:

{"transaction_id":123,
 "chunk":187107,
 "auth_key":"SECURED"
}

The result could be reported as:

{...
 "location":{
   "worker":"db01",
   "host":"qserv-db01",
   "port":25002
  }
}
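
For illustration, here is a hedged Python sketch submitting the request shown above. The endpoint path /ingest/chunk is an assumption made for this example; the exact URL is given in the documentation for the service Allocate/locate chunks of the partitioned tables:

import requests

request = {'transaction_id': 123, 'chunk': 187107, 'auth_key': 'SECURED'}
# The endpoint path below is an assumption for illustration only.
response = requests.post('http://localhost:25080/ingest/chunk', json=request)
responseJson = response.json()
if not responseJson['success']:
    raise RuntimeError(f"chunk allocation failed: {responseJson['error']}")

location = responseJson['location']
print('ingest chunk 187107 at', location['host'], 'port', location['port'])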

To learn about a more advanced technique for the batch-mode chunk allocation one should read section Batch mode for allocating chunks.

6.7. Uploading chunk contributions

This example is based on the assumption that chunk contributions are located on a filesystem that is directly accessible by the ingest client. Please be aware of alternative options for uploading data into Qserv tables from other sources, including worker filesystems and remote object stores. These are explained in section Ingesting files directly from workers.

For uploading data into tables, one would have to use the binary tool qserv-replica-file INGEST which is packaged into the Docker image qserv/replica:tools (other tags based on this name are also possible). The application could be launched like this:

docker run --rm -t --network host -u 1000:1000 \
  -v /etc/passwd:/etc/passwd:ro \
  -v /etc/group:/etc/group:ro \
  -v $PWD/input:/input/:ro \
  qserv/replica:tools \
  qserv-replica-file INGEST <parameters> <option> <flags>

Note that there is some variability in the actual values of the parameters of Docker's run command. One has to adjust them to suit a specific environment of a workflow. In particular:

  • the mapping of the input data directory doesn't have to be the same as shown above. The only requirement here is to allow the uploader tool to access files through some path.
  • the above-shown mapping of the host's files /etc/passwd and /etc/group may not work to resolve the -u uid:gid=1000:1000 mapping presented in the example. This example is used mainly to demonstrate one option for allowing the tool to read the input files. Other options are also possible.

Depending on requirements to a workflow, the application can be used in two ways:

  • For ingesting a single TSV / CSV file
  • Or, for ingesting many files in the batch mode.

A potential downside of the single-file approach is the overhead of launching the container. If an input dataset has many small files, then the amount of time spent launching the container may far exceed the time spent actually reading a file and sending it to the System for upload. To deal with this issue one may consider loading files in the batch mode. The idea here is that the application is launched just once, and it's given a list of files to be read and sent to the corresponding workers.

Though, one should be aware that uploading contributions in the batch mode has its own (potential) problems:

  • Files are loaded sequentially in the same order they're specified in the input list. It's always a good idea to find a good balance between the overhead of loading a single file (contribution) versus the number of parallel loaders. This topic is covered later in a separate section Key configuration parameters of the Ingest System.
  • Any failure when loading a single file should be treated as a failure of any of the transactions mentioned in the input list. Hence it's better not to use this mechanism to load files (contributions) in the context of different transactions. (NOTE: it's possible that the command-line syntax of this application will be restricted in the future to disallow using different transactions in the same list.)

Here is an example of how the file list could look like in the context of the current example:

[{"worker-host":"qserv-db01","worker-port":25002,"transaction-id":123,"Object","P","input/chunk_187107_overlap.txt"},
 {"worker-host":"qserv-db01","worker-port":25002,"transaction-id":123,"Object","P","input/chunk_187107.txt"},
 {"worker-host":"qserv-db02","worker-port":25002,"transaction-id":123,"Object","P","input/chunk_187108_overlap.txt"},
 {"worker-host":"qserv-db02","worker-port":25002,"transaction-id":123,"Object","P","input/chunk_187108.txt"},
 {"worker-host":"qserv-db01","worker-port":25002,"transaction-id":123,"Object","P","input/chunk_187109_overlap.txt"},
 {"worker-host":"qserv-db01","worker-port":25002,"transaction-id":123,"Object","P","input/chunk_187109.txt"},
 {"worker-host":"qserv-db02","worker-port":25002,"transaction-id":123,"Object","P","input/chunk_187110_overlap.txt"},
 {"worker-host":"qserv-db02","worker-port":25002,"transaction-id":123,"Object","P","input/chunk_187110.txt"}
]

If the above loading plan was stored in the file Object_load.json then the loading command could look like this:

qserv-replica-file INGEST FILE-LIST Object_load.json --verbose

It's recommended to always use the flag --verbose to get more info on the progress of the operation. With that flag the performance of loading each file in the plan will be reported to the standard output stream as the table contributions (files) are being loaded into Qserv.

Please, read section Error reporting in the command-line tools to learn how to interpret error codes reported by the tool.

6.7.1. What should be done if a file upload fails?

The answer is simple. First of all, one has to abort the super-transaction as explained in the next section. Note that this will scratch off all uploads made within the transaction (which means one has to implement proper bookkeeping in a workflow in order to account for possible losses caused by this scenario).

The next step is to figure out what went wrong. Most likely the problem could be caused by the content of the input data, the values of parameters of the container's run command, or parameter values passed into the application. If the problem is caused by the System then one would have to contact an expert in charge of the corresponding Qserv deployment. Once the problem is resolved a new transaction would need to be started and all uploads planned for that transaction would need to be repeated.

6.8. Committing/aborting the transaction

Be advised that both operations are irreversible.

This step is needed for checkpointing or rolling back the results of the partial table upload operations made in the context of a transaction since its beginning. Note that it's a responsibility of a workflow to do proper bookkeeping of which table contribution uploads were made during a particular transaction. This would be needed to prevent ingesting the same contribution more than once, or missing contributions.

Transactions are aborted or committed using a service explained in section Commit or abort a transaction. In a context of the current example, the current transaction 123 would have to be committed using:

curl http://localhost:25080/ingest/trans/123?abort=0 \
  -X PUT -H "Content-Type: application/json" \
  -d '{"auth_key":"SECURED"}'
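
The same operation scripted in Python would look like this (a sketch; setting abort=1 instead of abort=0 would roll the transaction back):

import requests

transaction_id = 123
abort = 0    # 0 to commit the transaction, 1 to abort it
url = f'http://localhost:25080/ingest/trans/{transaction_id}?abort={abort}'
response = requests.put(url, json={'auth_key': 'SECURED'})
responseJson = response.json()
if not responseJson['success']:
    raise RuntimeError(f"transaction commit/abort failed: {responseJson['error']}")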

6.9. Publishing the database

A database is allowed to be published only when no outstanding transactions exist in a scope of the database. Otherwise the System will refuse the publishing requests.

This would be the last step in ingesting a catalog. Here is what happens to the database at this stage:

  1. All data tables at all workers get flattened by eliminating MySQL partitions from the tables.
  2. The empty chunks list  (a special file or a database table at Qserv master node) is built.
  3. (optionally, if requested) The secondary index  table is consolidated at Qserv master node.
  4. Proper authorizations are set to allow the Qserv MySQL account to access the new database and relevant tables at all MySQL servers used by Qserv (including the master and the workers).
  5. CSS is configured to allow Qserv to process user queries mentioning the new database.

Be advised that some of these steps (especially 1 and 3) may take a significant amount of time since they would (depending on the amount of data ingested into the database) require processing many tables.

Databases are published by calling a service explained in section REST service for publishing databases.

In the context of the current example, the database test101 would be published using:

curl http://localhost:25080/ingest/database/test101 \
  -X PUT -H "Content-Type: application/json" \
  -d '{"auth_key":"SECURED"}'

A result of the operation will be returned in a JSON  object having the same schema as explained in section REST service for registering new databases. The object will carry an updated state of the database.


In the current implementation of Qserv and the Replication/Ingest system there is a delay of a few minutes before a new database becomes fully available for user queries. The origin of this delay is still unknown. Please take this into account before proceeding to further steps explained in this section.

6.9.1. Testing

The first action to be taken once the database has been published is to run a simple query to count rows in the Object table. This is done by connecting to Qserv (either directly via the mysql command-line utility, or via the corresponding TAP service) and executing the following queries:

SELECT COUNT(*) FROM `test101`.`Object`;
SELECT * FROM `test101`.`Object` LIMIT 1;
SELECT * FROM `test101`.`Object` WHERE `objectId`=<known-id-reported-by-previous-query>;

Compare a number reported by the first query with the one expected in the input data set. The numbers should match.

6.9.2. Index management

The next action would be to add indexes to the ingested tables as explained in a separate document Managing indexes. Note that indexes are not created automatically.



7. Notes on ingesting regular (fully-replicated) tables

Regular (also known in Qserv as fully-replicated) tables do not have any chunks. The general assumption here is that each of these tables is small enough to afford having a complete copy of the whole table at every worker of a Qserv deployment. And because of that, it's the responsibility of the workflows to ingest instances of such tables into each worker. This chapter explains how this could be done in the new system.

7.1. Registering regular tables in Qserv

The table registration process for this type of table slightly differs from the one for the partitioned tables. Let's assume we need to ingest a simple table Filter into database test101, and that the table has the following schema:

filterId TINYINT NOT NULL
filterName CHAR(3) NOT NULL
photClam FLOAT NOT NULL
photBW FLOAT NOT NULL

Then the corresponding table description object should look like this:

{"database":"test101",
 "table":"Filter",
 "is_partitioned":0,
 "is_director":0,
 "schema":[
  {"name":"filterId","type":"TINYINT NOT NULL"},
  {"name":"filterName","type":"CHAR(3) NOT NULL"},
  {"name":"photClam","type":"FLOAT NOT NULL"},
  {"name":"photBW","type":"FLOAT NOT NULL"}
 ],
 "auth_key":"SECURED"
}

Note that some of the attributes which were mandatory when registering the partitioned  table Object aren't present in the description of table Filter. Also, the table description does not  have special columns chunkId and subChunkId which are mandatory for the partitioned  tables.

7.2. Loading data

As in the case of the partitioned (chunked) tables, contributions into the tables could be loaded through a single or many super-transactions, as needed by a particular workflow. Though, it's probably more natural to do this via a dedicated transaction. What's very IMPORTANT here is to make sure that exactly the same set of rows is ingested into all workers.

One should also know that the Master Replication System has a special compensatory mechanism to ensure that all tables have a consistent state across all workers. It's implemented by running a periodic check verifying that all workers have the same copy of each table and reporting any discrepancies. The algorithm runs on all types of tables of all databases which are found in the published state. Though, the current implementation of this algorithm will not fix problems should it find them. It will just report them.

In order to know which workers exist in Qserv and what the connection parameters of the Data Ingest Service of the workers are, one has to use a service explained in Locate regular tables.

For example, if an identifier of the super-transaction is equal to 123 (the number is used in An example of a simple workflow) the request could be initiated like this:

curl http://localhost:25080/ingest/regular -X GET -H "Content-Type: application/json" -d'{"transaction_id":123}'

In this case, the service would return the following object:

{"success":1,
 "error":"",
 "error_ext":{},
 "locations":[
  {"worker":"db01","host":"qserv-db01","port":25002},
  {"worker":"db02","host":"qserv-db02","port":25002}
 ]
}

With that info in hand, and assuming all rows of the table are stored in a single CSV file input/Filter.txt, one would have to run these two commands to upload a copy of the file into each worker:

docker run --rm -t --network host -u 1000:1000 \
  -v /etc/passwd:/etc/passwd:ro \
  -v $PWD/input:/input/:ro \
  qserv/replica:tools \
  qserv-replica-file INGEST FILE qserv-db01 25002 123 Filter R /input/Filter.txt --auth-key=SECURED 

docker run --rm -t --network host -u 1000:1000 \
  -v /etc/passwd:/etc/passwd:ro \
  -v $PWD/input:/input/:ro \
  qserv/replica:tools \
  qserv-replica-file INGEST FILE qserv-db02 25002 123 Filter R /input/Filter.txt --auth-key=SECURED 

Pay attention to the mandatory parameters of the FILE sub-command:

  • 123 is a sample identifier of a super-transaction that is supposed to be STARTED before the operation
  • the letter R tells the service that this is the regular  table Filter 
  • and /input/Filter.txt is an internal  path to the CSV  file inside  the container (note that internal path /input maps to the host's path $PWD/input as specified in the container's run command)
  • the corresponding authorization key is specified via option --auth-key= 

More info on the file ingest tool can be found in section qserv-replica-file INGEST.
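
For workflows scripting this step, here is a hedged Python sketch which asks the service for the worker locations and launches the loader once per worker. The Docker invocation mirrors the example above; the mounts and the uid:gid mapping are assumptions to be adjusted for a specific environment:

import os
import subprocess
import requests

# Ask the Master Replication Controller where the regular table should go.
response = requests.get('http://localhost:25080/ingest/regular',
                        json={'transaction_id': 123})
responseJson = response.json()
if not responseJson['success']:
    raise RuntimeError(f"failed to locate workers: {responseJson['error']}")

# Upload the same file into every worker reported by the service.
for location in responseJson['locations']:
    subprocess.run([
        'docker', 'run', '--rm', '-t', '--network', 'host', '-u', '1000:1000',
        '-v', '/etc/passwd:/etc/passwd:ro',
        '-v', f'{os.getcwd()}/input:/input/:ro',
        'qserv/replica:tools',
        'qserv-replica-file', 'INGEST', 'FILE',
        location['host'], str(location['port']), '123',
        'Filter', 'R', '/input/Filter.txt', '--auth-key=SECURED',
    ], check=True)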

8. Deleting incomplete databases

There will always be cases when one would want to stop ingesting a database and delete what's been ingested so far. A service documented in section Delete a database which is not fully ingested (not published) could come in handy in this case.

Here is an example based on the simple workflow presented in section An example of a simple workflow:

curl http://localhost:25080/ingest/database/test101 \
  -X DELETE -H "Content-Type: application/json" \
  -d '{"auth_key":"SECURED"}'

9. Advanced use cases

These are a few items to be covered:

  • Parallel ingests into tables (including contributions  into the same chunk) within a super-transaction. Mention limitations (the number of ingest services at the worker nodes). Mention the Replication System's API for pulling a configuration information for the system in order to determine how many workers and how many Ingest services are available.
  • Multiple super-transactions. Their benefits. Address questions regarding making contributions to chunks.
  • Building the secondary index. Why should this be done as the last step.

9.1. Multiple transactions

Limits on the number of transactions:

  • The total number of transactions per instance of Qserv is limited by the range of an unsigned 32-bit number. Transaction identifier 0 is reserved by the Ingest System. Hence the total number of transactions won't exceed 4294967295.
  • The total number of transactions per database is limited to 8192. This limit is imposed by the corresponding limit on the number of MySQL partitions per table. In practice, opening more than 100 transactions per database wouldn't be a good idea due to various overheads imposed by the MySQL partitioning.

Increasing the overall stability (with respect to failures) of workflows was one of the areas explicitly addressed in the design of the new system. The mechanism of so-called super-transactions is the key technique here. The transactions have already been illustrated in section An example of a simple workflow. The current section takes it a step further by explaining what other benefits one may gain with the technique.

Super-transactions were designed to encapsulate contributions (rows) into tables, regardless of whether these are the partitioned or regular tables. After a transaction has been committed, all relevant contributions stay in Qserv. Otherwise (if the transaction has been aborted), they are removed.

The transaction abort won't revert all modifications made to tables. It will only remove rows ingested within the corresponding transaction. For instance, any tables created during transactions will stay in Qserv. Any chunk allocations made during transactions will also stay. Leaving some tables empty after this operation won't confuse Qserv even if the tables will remain empty after publishing the database.

When designing a workflow for a specific catalog or designing a general-purpose workflow, one should always take into account possible failures which might arise during ingests. A good estimate of the probability of running into trouble would always help in making the right decision on how to structure the ingest. The general rules should be obvious:

  • If the chances of running into failures are low then it's better to divide an input dataset into larger portions and let each portion be ingested in a separate transaction.
  • Otherwise, having many smaller transactions may be the right way to go.

One may also consider building a self-tuned workflow which would make decisions on transactions dynamically based on a feedback from launching previous transactions. For example, a workflow may start with a few small transactions treated as probes and then gradually increase (or decrease) the amount of contributions per transaction.

By the way, the very same technique could be used to dynamically optimize a workflow to achieve higher performance.

Obviously, there are other factors which may drive the decision making process, such as:

  • Availability of the input data. For example, contributions into the catalogs may arrive in portions over an extended period of time.
  • The amount of temporary disk space for keeping intermediate products (partitioned CSV / TSV files) may be limited.
  • A configuration of Qserv in terms of the number of worker  nodes.

Transactions are also important in designing high-performance workflows relying on the Parallel ingests option.

9.1.1. What's a reasonable number of transactions per catalog ingest?

As it was already mentioned in the parent section the absolute cap for the number of transactions per Qserv instance is equal to 4294967295.

There is another limit to consider. Super-transactions directly map to MySQL table partitions. Each such partition is represented by 2 files in the filesystem of the worker where the corresponding table resides. In the extreme case the number of files representing chunked tables would be roughly equal to twice the total number of chunks multiplied by the number of transactions open per catalog. For example, if there are 150,000 chunks in a catalog and 10 transactions were open during the catalog ingest, then the total number of files spread across all workers would be on the order of 3,000,000. If the number of workers is equal to 30 then there would be 100,000 files per worker's filesystem, all in a single folder. In reality things may not look that bad because the chunks-to-transactions "matrix" would be rather sparse, and the actual number of files per directory would be roughly 10 times smaller. Besides, all MySQL table partitions will be eliminated during the catalog publishing phase. After that each table will be represented (assuming the MyISAM engine) with the usual 3 files (<table-name>.frm, <table-name>.MYD, and <table-name>.MYI). Nevertheless, it would be a good idea to avoid opening (say) thousands of transactions per catalog ingest, even though the hard limit on the number of MySQL partitions per table is 8192.
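
For reference, here is the arithmetic behind the estimate above (values taken directly from the paragraph):

# Back-of-the-envelope estimate of the number of files before publishing.
num_chunks = 150_000
num_transactions = 10
files_per_partition = 2    # each MySQL partition is represented by 2 files

total_files = num_chunks * num_transactions * files_per_partition
num_workers = 30
files_per_worker = total_files // num_workers

print(total_files)         # 3,000,000 files across all workers
print(files_per_worker)    # 100,000 files per worker, in a single folder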

9.1.2. Multiple contributions into the same table

When ingesting rows into a table (either a partitioned or a regular one) a workflow doesn't need to do this in one step from exactly one input file. The new ingest system allows adding rows as contributions from many files, where each such file is a contribution. Moreover, the contributions could be made from different transactions. Alternatively, multiple contributions could be ingested within the same transaction. It's up to a workflow to do proper bookkeeping of what's been ingested into each table and within which transaction, and which transactions failed (if any).

9.1.3. Transaction identifiers in result sets of queries

When launching queries against tables ingested via the new system one may see a special column qserv_trans_id. For example:

SELECT `qserv_trans_id`,`objectId`,`ra`,`dec`,`chunkId`,`subChunkId` FROM test101.Object LIMIT 2;
+----------------+-------------------+-------------+------------+---------+------------+
| qserv_trans_id | objectId          | ra          | dec        | chunkId | subChunkId |
+----------------+-------------------+-------------+------------+---------+------------+
|            113 | 42648956529830898 | 337.0639945 | -0.0095519 |  115555 |         26 |
|            113 | 42648956529830904 | 337.0187488 | -0.0087719 |  115555 |         25 |
+----------------+-------------------+-------------+------------+---------+------------+

This information may potentially be used for the post-ingest QA of catalogs.

9.2. Parallel ingests

This section covers some parallel ingest scenarios which may increase the overall performance of a workflow.

9.2.1. Ingesting ALL chunks independently (single transaction)

Workflow organization:

  • All chunk contributions are ingested within the same common  transaction.
  • Each chunk is allocated and ingested independently of the others.
  • The common  transaction is committed when all contributions have been successfully uploaded.

Here is a diagram illustrating the idea:

[Diagram: one_transaction_parallel_allocation_parallel_chunks]

Things to consider (keep in mind):

  • The chunk allocation operations are serialized in the current version of the system. This may introduce indirect synchronization between parallel chunk-specific ingests. The total latency incurred by such synchronization is the latency of allocating one chunk multiplied by the number of chunks.
  • The proposed scheme may not be very efficient if the number of chunks is large (heuristically, many thousands) while chunk contributions are small. In this case the latency of the chunk allocation requests, along with the non-negligible startup time of the Docker container (with the file ingest tool), would be the most significant factors limiting the performance of a workflow.

Best use:

  • When the number of chunks is small and the amount of data to be ingested into each chunk is large, or if the number of files to be ingested into each chunk is large.

9.2.2. Ingesting ALL chunks independently (multiple transactions)

This is an extension of the previous use case Ingesting ALL chunks independently (single transaction).

Workflow organization:

  • All chunk contributions are ingested within multiple independent  transactions.
  • All contributions of a particular chunk are ingested via the same transaction.
  • Each chunk is allocated and ingested independently of the others.
  • The transactions are committed independently when all contributions to the corresponding chunks have been successfully uploaded.

Here is a diagram illustrating the idea:

[Diagram: many_chunks_many_transactions]

Things to consider (keep in mind):

  • Even though this scheme assumes that each chunk is "assigned" to some transaction, this is not strictly required. The system allows allocating the same chunk and ingesting contributions into that chunk from any (or many) transactions. Just make sure not to ingest the same set of rows (the same set of contributions) within more than one transaction. Though, the very same rule applies to any workflow anyway.

Best use:

  • When a workflow ingests a large amount of data which can be separated into independently ingested groups based on chunks. Remember that transactions provide a mechanism for mitigating failures.

9.2.3. Scattered ingest of chunk contributions within multiple transactions

This is an extension of the previous use case Ingesting ALL chunks independently (multiple transactions).

Workflow organization:

  • Contributions into chunks are scattered between multiple independent transactions.
  • Chunks are allocated within each transaction as needed if there are contributions assigned to the transaction.
  • The  transactions are committed independently when all contributions assigned to the corresponding transactions have been successfully uploaded.

Here is a diagram illustrating the idea:

[Diagram: scattered_chunks_many_transactions]

Best use:

  • When input data fed into a workflow have spatial overlaps which need to be processed separately, or if it's more convenient for the workflow to process its input as separate streams of data.

9.3. Optimizations in using the REST services

As a general rule, when designing a workflow one has to avoid overloading the REST services with repeated or inefficient requests. It would always be a good idea (when possible) to make certain requests once and reuse their results. This is especially true for workflows designed for parallel ingests. In that case results of some requests could be shared among parallel activities (processes, etc.) of the workflows. It's beyond the scope of this section to cover all possible optimizations which may be employed when interacting with the services. It's up to a developer of a workflow to decide what can and what can't be cached or shared based on a progressive state of an ingested catalog and on the organization of the workflow. Though, some of the most useful techniques are presented below.

9.3.1. Batch mode for allocating chunks 

In the example of the simple workflow presented earlier in the document, chunk allocations were made on a per-chunk basis (see section Allocating chunks for details). Even though this works well for scenarios where the number of chunks is small, employing the very same technique may potentially slow down the overall performance of workflows ingesting large numbers of chunks or making many requests to the chunk allocation service. This is because the chunk allocation operations may be potentially expensive, especially for Qserv setups where a large number of chunks has already been deployed. In the latter case chunk allocation requests may take a non-negligible amount of time. To mitigate this problem the system offers an additional service for allocating batches of chunks as explained in the documentation for service Allocate/locate chunks of the partitioned tables.

In a context of the earlier presented An example of a simple workflow the request object would look like:

{"transaction_id":123,
 "chunks":[187107,187108,187109,187110],
 "auth_key":"SECURED"
}

The result could be reported as:

{"success":1,
 "error":"",
 "error_ext":{}
 "location":[
  {"chunk":187107,"worker":"db01","host":"qserv-db01","port":25002},
  {"chunk":187108,"worker":"db02","host":"qserv-db02","port":25002},
  {"chunk":187109,"worker":"db01","host":"qserv-db01","port":25002},
  {"chunk":187110,"worker":"db02","host":"qserv-db02","port":25002}
 ]
}

The request could be made once, and its results could be distributed between parallel activities of the workflow to ingest the corresponding chunk contributions.
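
For illustration, here is a hedged Python sketch of such a single batch request. The endpoint path /ingest/chunks is an assumption made for this example; the exact URL is given in the documentation for the service Allocate/locate chunks of the partitioned tables:

import requests

request = {'transaction_id': 123,
           'chunks': [187107, 187108, 187109, 187110],
           'auth_key': 'SECURED'}
# The endpoint path below is an assumption for illustration only.
response = requests.post('http://localhost:25080/ingest/chunks', json=request)
responseJson = response.json()
if not responseJson['success']:
    raise RuntimeError(f"batch chunk allocation failed: {responseJson['error']}")

# Build a map which parallel loaders could share: chunk -> (host, port).
chunk2location = {entry['chunk']: (entry['host'], entry['port'])
                  for entry in responseJson['location']}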

When designing a workflow one may consider partitioning input data as a separate stage. This stage could itself be made parallel. A benefit of this approach would be that all possible chunk numbers would be known at the end of the stage. After that a single request to the batch chunk allocator could be made. Results of the request could be used for planning the next stage - the actual ingesting of chunk contributions into Qserv.


9.4. Building the "secondary index"

There are two approaches to building the secondary index during catalog ingests:

  • let the ingest system do it automatically
  • do it manually via a dedicated REST service

By default (unless instructed otherwise as explained later in this section) the System will build the index automatically. Here is what happens in this case during a typical workflow (assuming the current InnoDB-based table implementation of the index):

While the automated build process is handy for most workflows, some users may still benefit from deferring this operation till some later stage. This may have the following benefits:

  • The commit time of the super-transactions may become noticeably shorter.
  • The transaction commits may become more reliable, since there are potentially many more failure modes associated with building the secondary indexes. These include:
    • Failures when conducting additional operations for harvesting input data for the index from multiple (or all) chunks of a catalog spread across multiple (or all) workers of a setup.
    • Running out of space on the underlying filesystem where the database service used to store the indexes is located.
    • Incorrect values of the object identifiers (such as duplicated values) ingested as a chunk contribution during a super-transaction.

9.4.1. Building the index manually

There are two basic steps that need to be taken when choosing the manual build. First of all, one has to tell the Ingest system that the index would be built manually. This is done at the database registration step by setting the following parameter in the database description:

{...
 "auto_build_secondary_index":0,
 ...
}

For example:

curl http://localhost:25080/ingest/database \
  -X POST -H "Content-Type: application/json" \
  -d '{"database":"test101","auto_build_secondary_index":0,"num_stripes":340,"num_sub_stripes":3,"overlap":0.01667,"auth_key":"SECURED"}'

That would be enough to prevent any further actions which would otherwise be undertaken by the System with respect to the index.

The second step would be to make a request to the index build service. This request could be made at any time of a catalog's lifecycle (normally when no super-transactions are open). Also, the service allows re-building the index at any stage if needed. Just keep in mind that the content of the index will be based on the objects existing in the catalog at the time such a request was made. Here is an example:

curl http://localhost:25080/ingest/index/secondary \
  -X POST -H "Content-Type: application/json" \
  -d '{"database":"test101","allow_for_published":0,"rebuild":0,"local":0,"auth_key":"SECURED"}'

There are three optional attributes (allow_for_published, rebuild and local) in the request. The attributes are explained in the documentation for the index build service. Please read that document for further information on this subject. These parameters are important for most workflows.



9.5. Ingesting files directly from workers

There are two alternatives to ingesting files from a client's local filesystem using the previously explained binary tool qserv-replica-file INGEST. The ingest workers can ingest files directly from the local filesystems mounted at the corresponding worker nodes, or pull files from a remote object store via the HTTP / HTTPS protocol. In both cases, a client doesn't need to send the file content to the ingest service. Each ingest worker has a built-in REST server providing the Ingesting a single file service that addresses both the above-mentioned scenarios.

Worker-specific locations (the DNS hostnames and the port numbers) of these REST services are reported by requests to the metadata services Allocate/locate chunks of the partitioned tables and Locate regular tables. JSON objects returned by each of the services have two additional attributes:

attrdescription
http_host
The DNS name or IP address of the worker's REST server
http_port
The port number of the worker's REST service

The connection parameters are meant to be used to formulate ingest requests which could look, for example, like this:

curl http://qserv-worker12:25004/ingest/file -X POST -H 'Content-Type: application/json' -d'{...}'

The two subsections below illustrate each of the methods for ingesting files directly by the worker ingest services.

9.5.1. Ingesting from a local filesystem of a worker

Please make sure the filesystem is visible from inside the Replication System worker's Docker container.

Here is the schematic diagram of this ingest option:

Ingest from local file at worker

Let's suppose there are two  CSV  files located at the following paths as seen by the worker qserv-worker12:

/datasets/catalogs/test101/tables/Object/partitioned/chunk_123.txt
/datasets/catalogs/test101/tables/Object/partitioned/chunk_123_overlap.txt

These files have names typical of files made by the partitioning tool. The first one represents a contribution to chunk 123 of the partitioned table Object, and the second one a contribution to the corresponding overlap table of the same chunk. Here is an example illustrating the use of the out-of-the-box tool curl for ingesting these files into Qserv (the transaction identifier, the name of the database, the name of the table, as well as the connection parameters of the worker's ingest service have been made up for this example):

curl http://qserv-worker12:25004/ingest/file -X POST -H 'Content-Type: application/json' \
  -d '{"transaction_id":45, "table":"Object", "column_separator":",", "chunk":123, "overlap":0,
       "url":"file:///datasets/catalogs/test101/tables/Object/partitioned/chunk_123.txt",
       "auth_key":"CHANGEME"}'

curl http://qserv-worker12:25004/ingest/file -X POST -H 'Content-Type: application/json' \
  -d '{"transaction_id":45, "table":"Object", "column_separator":",", "chunk":123, "overlap":1,
       "url":"file:///datasets/catalogs/test101/tables/Object/partitioned/chunk_123_overlap.txt",
       "auth_key":"CHANGEME"}'

9.5.2. Ingesting from an object store via the HTTP/HTTPS protocol

Please be advised that for some remote sources served over the encrypted https protocol, extra steps for configuring the Ingest system for a database may be required. REST services are provided for that purpose by the Master Replication Controller; one of them is illustrated in the example at the end of this subsection.

In this scenario, the files to be ingested are located on a Web server (an object store, etc.) which is accessible from the worker nodes. Here is the schematic diagram of this ingest option:

Ingest from an object-store

Here is an example illustrating the use of the out-of-the-box tool curl for ingesting two files into Qserv using this method:

curl http://qserv-worker12:25004/ingest/file -X POST -H 'Content-Type: application/json' \
  -d '{"transaction_id":45, "table":"Object", "column_separator":",", "chunk":123, "overlap":0,
       "url":"http://data-host:8080/datasets/catalogs/test101/tables/Object/partitioned/chunk_123.txt",
       "http_method":"GET",
       "auth_key":"CHANGEME"}'

curl http://qserv-worker12:25004/ingest/file -X POST -H 'Content-Type: application/json' \
  -d '{"transaction_id":45, "table":"Object", "column_separator":",", "chunk":123, "overlap":1,
       "url":"http://data-host:8080/datasets/catalogs/test101/tables/Object/partitioned/chunk_123_overlap.txt",
       "http_method":"GET",
       "auth_key":"CHANGEME"}'

Another example is ingesting a file pulled over the SSL-encrypted https protocol from a host that doesn't have a valid peer certificate. In this case one may want to turn off certificate validation by configuring the Ingest system for the given database prior to attempting the file ingests:

curl http://qserv-master01:25080/ingest/config -X PUT -H 'Content-Type: application/json' \
  -d'{"auth_key":"CHANGEME", "database":"test101", "SSL_VERIFYPEER":0}'

curl http://qserv-worker12:25004/ingest/file -X POST -H 'Content-Type: application/json' \
  -d '{"transaction_id":45, "table":"Object", "column_separator":",", "chunk":123, "overlap":0,
       "url":"https://data-host:8080/datasets/catalogs/test101/tables/Object/partitioned/chunk_123.txt",
       "http_method":"GET",
       "auth_key":"CHANGEME"}'
...

Note that these ingest configuration parameters are set (if required) globally for a catalog. Any subsequent operations pulling remote files will immediately pick up the most recent settings. Changes made to the configuration won't affect on-going file transfers, although they will affect queued file-uploading requests. To avoid any ambiguity, ensure all parameters are configured in the system before attempting the file ingests.

Normally, pulling any data from untrusted hosts is not a good idea. One should always exercise caution and make sure the data source doesn't have any harmful data.

9.5.3. Using extended error reporting to optimize workflows

It's not uncommon to see intermittent problems while attempting to pull data from remote sources (Web servers, or distributed filesystems). Sometimes these problems are caused by network glitches, overloaded servers, or other issues that would normally go away on the next attempt to pull the same data. A standard approach to handling this kind of problem is to abort the corresponding transaction and re-ingest the data. Though this may work perfectly well for small amounts of data, the very same approach may result in a significant loss of efficiency for large-scale ingests. However, restarting a transaction is not the only option here. A failure to pull data from a remote source is harmless since it leaves no "side effects" (modifications to the target table). Hence a more efficient approach is to simply repeat the same ingest w/o restarting the transaction. To help workflow developers make the best decision, the Ingest system offers extended error reporting on such events. What's reported by the system may be used in determining the best course of action after seeing a specific failure mode. This section explains what to expect in case of failures and how to treat the extended error objects returned by the worker Ingest service.

The general error reporting for the REST services is explained in the section Error reporting. That section mentions the extended dictionary error_ext which has a request-specific payload clarifying failures. In case of failures of the worker ingest service this dictionary may have the following fields:

{"success":0,
 "error":<string>,
 "error_ext":{
   "http_error":<number>,
   "system_error":<number>,
   "retry_allowed":<number>
 }
}

(warning) These 3 extra attributes may not be present in all failure modes. If the dictionary is empty, or if the attribute retry_allowed is not present in it, then one has to assume that the last request resulted in changes made to the target table.

The table below explains these attributes:

attributedescription
http_error
The HTTP response code (where applicable). The default value of the parameter is 0, which means that the operation didn't have any specific response. A code of 200 means successful completion. Any value of 400 or higher means an error.
system_error
A value of the UNIX errno captured at a point where a problem occurred. 
retry_allowed
Values of the attribute are treated as boolean flags. A value of 0 (or the absence of the attribute from the dictionary) means a catastrophic failure of the corresponding operation which requires the transaction to be aborted. Any other value means that it's safe to retry ingesting the same table contribution w/o restarting the transaction.
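For example, a workflow written in Python could use the retry_allowed flag to decide between retrying a contribution and aborting the whole transaction. This is a minimal sketch; the number of attempts, the retry delay and the request payload are example values:

import time
import requests

def ingest_with_retries(worker_url, request, max_attempts=3, delay_sec=30):
    """Post a file-ingest request, retrying only when the service says it's safe."""
    for attempt in range(max_attempts):
        result = requests.post(worker_url, json=request).json()
        if result["success"]:
            return result
        error_ext = result.get("error_ext", {})
        # If retry_allowed is absent or 0, the target table may have been modified,
        # and the only safe option is to abort the transaction and start over.
        if not error_ext.get("retry_allowed", 0):
            raise RuntimeError("unrecoverable ingest error: " + result["error"])
        time.sleep(delay_sec)
    raise RuntimeError("giving up after {} attempts".format(max_attempts))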






10. Information requests

This section covers additional queries to the System which can be used to improve the robustness and efficiency of the workflows.

10.1. Status of transactions

The status of a transaction (or many transactions) can be obtained via services documented in the section Get info on transactions. There are two services in this group. The first one will return the information on many transactions in the scope of databases selected via optional filters passed via the request's query. The other service is meant to return information on a single transaction given a unique identifier of the transaction.

These are just a few ideas where the services may be found useful:

  • Figuring out the status of a transaction in case recovery from a failure is needed. This may be handy when an ingest workflow starts multiple transactions and needs to abort all on-going transactions before re-attempting the data uploading operations associated with some of them (see the sketch after this list).
  • Monitoring the progress of transactions in workflows opening multiple transactions.
  • Building a Web dashboard.
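Here is a minimal Python sketch (assuming the requests package and the query parameters documented in the section Get info on transactions) for finding transactions of the example database test101 that are still open:

import requests

result = requests.get(
    "http://localhost:25080/ingest/trans",
    params={"database": "test101"},
).json()
if not result["success"]:
    raise RuntimeError(result["error"])

open_transactions = [
    t["id"]
    for t in result["databases"]["test101"]["transactions"]
    if t["state"] == "STARTED"
]
print("transactions still open:", open_transactions)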

10.1.1.  Transaction contexts

When starting (or finishing) a transaction using the corresponding services (see Starting transactions or Committing/Aborting transactions) a workflow may also attach a piece of arbitrary information (a JSON object) to the transaction. The object will be stored within the Ingest system's internal database and associated with the transaction. The object can be as large as 16 MB. Normally the context object is passed along with the transaction start request. However, the object may also be set/updated when aborting or committing a transaction.

The workflow may use the contexts for the following reasons:

  • for performance analysis
  • to improve/automate the recovery process in case of failures
  • for the purpose of the internal bookkeeping that would be independent of the user workflow's infrastructure or environment.
  • to be used as a source of metadata for data provenance systems.

Obviously, the workflow implementation may have its own mechanism for that, and it probably should. However, having the internal storage for the transaction contexts has a few important benefits. First, it guarantees consistency between transactions and their contexts. Second, it provides precise timing for the ingest operations (the start and finish times are measured by the Ingest system at the right moments). Third, the information can be seen from the general-purpose Web Dashboard application of Qserv, and it could also be used by database support teams for building various metrics on the performance of the Qserv Ingest system.
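For illustration, a Python sketch of starting a transaction with an attached context could look like this. The content of the context object is purely illustrative, and the host, database name and key are example values:

import requests

# A purely illustrative context object; its content is entirely up to the workflow.
context = {
    "workflow": "nightly-ingest",
    "input_manifest": "/datasets/catalogs/test101/manifest.json",
    "operator": "jsmith",
}

result = requests.post(
    "http://localhost:25080/ingest/trans",
    json={"database": "test101", "context": context, "auth_key": "SECURED"},
).json()
if not result["success"]:
    raise RuntimeError(result["error"])

# The identifier of the new transaction, needed for subsequent ingest requests.
transaction_id = result["databases"]["test101"]["transactions"][0]["id"]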

10.1.2. Transaction contributions

The worker Ingest Services also record each attempted table contribution (whether it was successful or not) in their internal database. These records are associated with the corresponding transactions. A workflow may request these records when requesting info on the transactions. Please read the documentation on the transaction information services in the section Get info on transactions for further details.

All use cases that were mentioned earlier for the transaction contexts also apply to the transaction contributions.

10.2. Databases and database families

The objects returned by the service are quite complex. It's recommended  to use Python-based automation tools for launching requests of this kind and exploring results of the operations.

The configuration retrieval services provided by the Replication System and partially documented at Configuration Retrieval Service for Databases and database families are meant to return (among other things) the status of catalogs (databases) and of so-called database families managed by the system. Knowing what already exists in Qserv may affect certain aspects of the ingest workflows, such as the partitioning parameters of the catalogs to be ingested. There are 3 parameters passed into the partitioning tool which require special attention:

  • num-stripes
  • num-sub-stripes
  • overlap

Choosing the right values of these parameters is important, not only for optimizing the distribution of tables within Qserv (with respect to the total number of chunks, the sizes of those tables, and the sizes of the overlap areas), but also to allow JOINs in queries involving other, similarly partitioned catalogs. Databases configured exactly the same way are said to belong to the same database family. The Replication system treats members of the same family specially, ensuring that chunks of different databases are placed at the same workers. This allows Qserv to correctly implement queries involving two or more members of the same database family.

Before ingesting a new catalog into Qserv, always decide which family the new catalog should belong to. If a mistake is made here, the input data of the catalog would have to be repartitioned and another attempt to ingest the catalog would have to be made.

Other possible uses for the service would be:

  • Quality control checks of the on-going/completed ingests.
  • Building information dashboards, reporting tools, etc.

This topic is covered in more detail in Section 3.2 of Deploying, configuring and operating the Qserv Replication system.

Here is an example of using the service (Configuration Retrieval Service for Databases and database families) to pull info on a new catalog after it's ingested as explained in sections An example of a simple workflow and Notes on ingesting regular (fully-replicated) tables:

curl http://localhost:25080/replication/config -X GET -H "Content-Type: application/json"

This would return the following result:

{"success":1,
 "error":"",
 "error_ext":{},
 "config":{
   "general":{..},
   "workers":{..},
   "families":[
     {"name":"production",
      "min_replication_level":1,
      "num_stripes":340,
      "num_sub_stripes":3,
      "overlap":0.00167,
      "databases":[
        {"name":"test101",
         "family":"production",
         "is_published":1,
         "director_table":"Object",
         "director_table_key":"objectId",
         "chunk_id_key":"chunkId",
         "sub_chunk_id_key":"subChunkId",
         "tables":[
           {"name":"Object",
            "is_partitioned":1,
            "latitude_key":"ra",
            "longitude_key":"dec"
           },
           {"name":"Filter",
            "is_partitioned":0,
            "latitude_key":"",
            "longitude_key":""
           }
         ],
    ...
}

10.3. Key configuration parameters of the Ingest System

This section piggybacks on the techniques presented in the previous section Databases and database families and on the REST service for pulling configuration records on databases and database families documented elsewhere in the current document. Please read that section before reading the rest of the current one.

Among other parameters returned by the previously explained service, there are two parameters which could be found handy for configuring workflows:

  • The number (and status) of workers
  • The number of the loading threads at the Data Ingest service run at each worker.

This information is found in the following fields of a JSON object returned by the service (only a subset of the relevant fields is shown):

{...
 "config":{
   "general":[
     {"parameter":<string>,
      "value":<string>,
      "description":<string>,
      ...
     },
     ...
   ],
   "workers":[
     {"name":<string>,
      "is_enabled":<number>,
      "is_read_only":<number>,
      ..
     },
    ],
   ...
 }
}

The number of workers available for ingests is derived by counting worker descriptions in the "workers" collection, where each worker eligible for ingest has is_enabled=1 and is_read_only=0.

And the number of per-worker threads can be obtained from the field "value" of the object in the "general" collection where parameter="WORKER_LOADER_NUM_PROC_THREADS" (see the Python sketch after the example below).

For example, the object could look like this:

{"success":1,
 "error":"",
 "error_ext":{},
 "config":{
   "general":[
     {"parameter":"WORKER_LOADER_NUM_PROC_THREADS",
      "value":"16",
      "description":"The number of request processing threads in each worker's catalog ingest server.",
      ...
     },
     ...
   ],
   "workers":[
     {"name":"worker01",
      "is_enabled":1,
      "is_read_only":0,
      ..
     },
     {"name":"worker02",
      "is_enabled":1,
      "is_read_only":0,
      ..
     },
   ],
   ...
 }
}
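A minimal Python sketch for deriving these two numbers (assuming the requests package, the /replication/config service and the array layout of the "general" and "workers" collections shown above) could be:

import requests

config = requests.get("http://localhost:25080/replication/config").json()["config"]

# Workers eligible for ingest must be enabled and not in the read-only mode.
num_ingest_workers = sum(
    1 for worker in config["workers"]
    if worker["is_enabled"] and not worker["is_read_only"]
)

# The number of request processing threads of each worker's catalog ingest server.
num_threads_per_worker = int(next(
    param["value"] for param in config["general"]
    if param["parameter"] == "WORKER_LOADER_NUM_PROC_THREADS"
))

print(num_ingest_workers, "workers x", num_threads_per_worker, "threads")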

TBC...


11. The API Reference

11.1. The REST services

11.1.1. General guidelines

All services require the following HTTP header to be sent with requests if  a service expects a non-empty JSON  object in the request's body:

HTTP header of a request
Content-Type: application/json 

When requests are sent using the command line application curl then the following option must be used:

-H "Content-Type: application/json"

In this case a JSON object can be specified using one of the following methods:

echo '{...}' | curl <url> -X <method> -H <header> -d@-
curl <url> -X <method> -H <header> -d '{...}'

Where '{...}' represents a JSON object with details of the request. The object may not be required for some requests. Specific requirements for this will be mentioned for each service. If the object is not required for a particular request then the body is allowed to be empty, or it could be an empty JSON object '{}'.

All services (without exception) return results and errors as JSON objects, as explained in the next subsection.

11.1.1.1. Error reporting

All services explained in the document adhere to the usual conventions adopted by the Web community for designing and using REST APIs. In particular, HTTP code 200 is returned if a request is well formed and accepted by the corresponding service. Any other code shall be treated as an error. However, the implementation of the System further extends the error reporting mechanism by guaranteeing that all services do fine-grained error reporting in the response objects. All services of the API are guaranteed to return a JSON object if the HTTP code is 200. The objects have the following mandatory attributes (other attributes depend on a request):
{"success":<number>,
 "error":<string>,
 "error_ext":{...},
 ...
}

Therefore, even if a request is completed with HTTP code 200, a client (a workflow) must inspect the above-mentioned fields in the returned object. These are the rules for inspecting the status attributes:

  • A successful completion of a request is indicated by having success=1 in the response. In this case the other two fields should be ignored.
  • Otherwise, a human readable explanation of a problem would be found in the error field.
  • Request-specific extended information on errors is optionally provided in the error_ext field.

Note that, depending on a service, additional attributes may be present in the object.
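In practice, a client written in Python could wrap this convention into a small helper. This is just a sketch; the exception type and the message format are arbitrary choices:

import requests

def check(response):
    """Return the parsed JSON object, or raise on HTTP or service-level errors."""
    response.raise_for_status()   # raises for HTTP error codes (4xx/5xx)
    result = response.json()
    if not result["success"]:
        raise RuntimeError("request failed: {} (details: {})".format(
            result["error"], result.get("error_ext", {})))
    return result

# Example: config = check(requests.get("http://localhost:25080/replication/config"))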

11.1.1.2. Obtaining the current version of the API


Please note that the versioning mechanism presented here applies to both the metadata services (the REST API) and the loading tools. Changes in either would result in a change in the version number reported by the service explained hereafter.

To ensure a workflow implementation matches a version of the Ingest API the REST server provides a special metadata  service for retrieving the version: 

methodresource name
GET/meta/version 

The request object for this request is not required, or it could be an empty JSON object {}. In case of successful completion (see Error reporting in the REST API) the service will return a JSON object with the attribute version:

{...
 "version":<number>
}

Where the value of the attribute version represents the current version of the API.
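A workflow could check the version at startup before doing anything else. Here is a sketch; the expected version number is a hypothetical value the workflow was written against:

import requests

EXPECTED_VERSION = 9   # a hypothetical version number the workflow was written against

result = requests.get("http://localhost:25080/meta/version").json()
if not result["success"]:
    raise RuntimeError(result["error"])
if result["version"] != EXPECTED_VERSION:
    raise RuntimeError("Ingest API version {} doesn't match the expected version {}".format(
        result["version"], EXPECTED_VERSION))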

11.1.2. Database and table management

11.1.2.1. Register a new database

New databases are registered in the system for ingests using the following service:
methodresource name
POST/ingest/database 

The service requires a JSON  object of the following schema:

{"database":<string>,
 "num_stripes":<number>,
 "num_sub_stripes":<number>,
 "overlap":<number>,
 "auto_build_secondary_index":<number>,
 "local_load_secondary_index":<number>,
 "auth_key":<string>
}

Where the attributes are (some are optional, as noted below):

attributedescription
database
The name of the database to be created.
num_stripes
The number of stripes.
num_sub_stripes
The number of sub-stripes.
overlap
The overlap radius.
auto_build_secondary_index
The optional attribute indicating the desired mode for building the secondary index of a catalog. The default value of the attribute is 1 which would tell the System to build the index automatically. If a value of 0  is passed into the service the index won't be built, and it will be up to a user implementing a workflow to build it using the secondary index building service.
local_load_secondary_index

The optional attribute affecting a mode of building the secondary index. The attribute has numeric values treated as the boolean flag.

Please refer to the reference documentation for the REST service (Re-)build the index for a detailed explanation of this parameter. The parameter is named local in the index builder documentation.

The default value of the option is 0.

auth_key
An authorization key as it's configured in the Qserv Replication/Ingest system.  

Values of the attributes num_stripes, num_sub_stripes and overlap are expected to match the corresponding partitioning parameters used when partitioning all partitioned tables of the new database. Note that the current implementation of the system will not validate contributions to the partitioned tables to enforce this requirement. Only the structural correctness may be checked. It's up to a workflow to ensure the data ingested into the tables are correct.

If the operation successfully finished (see Error reporting in the REST API) a JSON object returned by the service will have the following attribute:

{...
 "database":{...}
}

The database key in that object will report various configuration info on the database. We're not going to discuss this in detail here. 

11.1.2.2. Publish a database that has been ingested

Databases are published by calling this service:
methodresource namequery
PUT/ingest/database/<database-name>
consolidate_secondary_index={0|1}

Where the name of the database is provided as the parameter <database-name> of the resource. There is an optional parameter consolidate_secondary_index to be provided in the request's query. If a value of the parameter is not 0 then the System will also flatten the secondary index table by eliminating MySQL partitions from the table. By default this won't happen, and there are few reasons to request the index consolidation, so it's safe to ignore the parameter.

An object to be sent in the request's body has a single attribute:

{"auth_key":<string>
}
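For illustration, publishing the example database with the Python requests package might look like this (a sketch; the host, port, database name and key are example values, and consolidate_secondary_index is left at its default of 0):

import requests

result = requests.put(
    "http://localhost:25080/ingest/database/test101",
    params={"consolidate_secondary_index": 0},
    json={"auth_key": "SECURED"},
).json()
if not result["success"]:
    raise RuntimeError("failed to publish the database: " + result["error"])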

11.1.2.3. Delete a database or a table

These services can be used for deleting non-published (the ones that are still being ingested) as well as published databases or tables, including deleting all relevant persistent structures from Qserv (such as CSS records, etc.):
methodresource name
DELETE
/ingest/database/<database-name> 
DELETE
/ingest/database/<database-name>/<table-name>

To delete a non-published database (or a table from such a database) a client has to provide the normal-level authentication key auth_key in a request to the service:

{"auth_key":<string>
}

The name of the databases affected by the operation is specified at the resource's path.

Deleting databases (or tables from those databases) that have already been published requires a user to have elevated administrator-level privileges. These privileges are associated with the authentication key admin_auth_key to be sent with a request instead of auth_key:

{"admin_auth_key":<string>
}

Upon successful completion of the request (for both above-mentioned states of the database), the service will return the standard response as explained in section Error reporting in the REST API. After that, the database (or the table, depending on the scope of the request) name can be reused for further ingests if needed.
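A Python sketch of both deletion scenarios (database/table names and keys are example values) might be:

import requests

# Deleting a table of a database that has not been published yet.
result = requests.delete(
    "http://localhost:25080/ingest/database/test101/Object",
    json={"auth_key": "SECURED"},
).json()

# Deleting a database that has already been published requires the
# administrator-level key instead of the normal one.
result = requests.delete(
    "http://localhost:25080/ingest/database/test101",
    json={"admin_auth_key": "SECURED-ADMIN"},
).json()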

11.1.2.4. Get info on databases and database families

The following service allows pulling all configuration information on the system, which would also include databases and database families:
methodresource name
GET/replication/config 

The request object for this request is not required, or it could be an empty JSON object {}.

Upon successful completion of the request (see Error reporting in the REST API) the service will return an object with the following schema (of which only the database- and database family-related fields are shown):

{...
 "config":{
   "general":{..},
   "workers":{..},
   "families":[
     {"name":<string>,
      "min_replication_level":<number>,
      "num_stripes":<number>,
      "num_sub_stripes":<number>,
      "overlap":<number>,
      "databases":[
        {"name":<string>,
         "family":<string>,
         "is_published":<number>,
         "director_table":<string>,
         "director_table_key":<string>,
         "chunk_id_key":<string>,
         "sub_chunk_id_key":<string>,
         "tables":[
           {"name":<string>,
            "is_partitioned":<number>,
            "latitude_key":<string>,
            "longitude_key":<string>
           },
           ...
         ],
         "columns":{...}
        },
        ...
      ]
     },
     ...
   ]
 }
}


11.1.2.5. Register a new table

Tables (regardless of their type - partitioned  or regular) are registered using the following service:
methodresource name
POST/ingest/table 

Where a JSON object sent to the service with the request describes the table. This is the schema of the object for the partitioned tables:

{"database":<string>,
 "table":<string>,
 "is_partitioned":<number>,
 "chunk_id_key":<string>,
 "sub_chunk_id_key":<string>,
 "is_director":<number>,
 "director_key":<string>,
 "latitude_key":<string>,
 "longitude_key":<string>,
 "schema":[
   {"name":<string>,"type":<string>},
   ...
 ],
 "auth_key":<string>
}

A description of the regular tables has fewer attributes (attributes which are only required for the partitioned tables are missing):

{"database":<string>,
 "table":<string>,
 "is_partitioned":<number>,
 "is_director":<number>,
 "schema":[
   {"name":<string>,"type":<string>},
   ...
 ],
 "auth_key":<string>
}

Also note that the table schema does not include definitions of indexes. Those are managed separately after the catalog is published. The index management interface is documented in a dedicated document Managing indexes.

This table explains attributes:

attributerequireddescription
databasebothThe name of an existing database.
tablebothThe name of a new table to be registered.
is_partitionedbothThe type of a table. Allowed values: 1  for partitioned tables, 0  for the regular  tables.
chunk_id_keypartitionedThe name of a column representing chunk numbers. Usually it's chunkId.
sub_chunk_id_keypartitionedThe name of a column representing sub-chunk numbers. Usually it's subChunkId.
is_directorboth

The flag indicating if this is the so-called director table. There is only one such table in a catalog, and it is one of the partitioned tables. Allowed values: 1 for the director table, 0 for any other tables.

There is a bug in the current implementation of the service which requires this flag to be present in descriptions of the regular tables. The requirement will be lifted after the bug is fixed. In the meantime, a value of the flag is ignored for the regular tables (though the flag itself must be present in the table description).

director_keypartitionedThe name of a column representing an object identifier in the director  table. 
latitude_keypartitionedThe name of a column in a partitioned  table representing latitude.
longitude_keypartitionedThe name of a column in a partitioned  table representing longitude.
schemaboth

An object representing table schema.

  • The partitioned tables are required to have the columns specified in the above-stated parameters chunk_id_key, sub_chunk_id_key, director_key, latitude_key and longitude_key.
  • Do not  use quotes around the names or type specifications.

Where:

attributedescription
nameThe name of a column.
type

The type of a column. For example: 

INT UNSIGNED NOT NULL
auth_keybothA value of an authorization key.

In case of successful completion (see Error reporting in the REST API) the service is expected to return a JSON object with an updated description of the database:

{...
 "database":{...}
}

Where the database key in that object will report various info on the database and its tables.
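For illustration, a Python sketch for registering a simple partitioned director table could look like the following. The database, table, column names, types and the authorization key are example values consistent with the earlier examples:

import requests

result = requests.post(
    "http://localhost:25080/ingest/table",
    json={
        "database": "test101",
        "table": "Object",
        "is_partitioned": 1,
        "chunk_id_key": "chunkId",
        "sub_chunk_id_key": "subChunkId",
        "is_director": 1,
        "director_key": "objectId",
        "latitude_key": "decl",
        "longitude_key": "ra",
        "schema": [
            {"name": "objectId",   "type": "BIGINT UNSIGNED NOT NULL"},
            {"name": "ra",         "type": "DOUBLE NOT NULL"},
            {"name": "decl",       "type": "DOUBLE NOT NULL"},
            {"name": "chunkId",    "type": "INT UNSIGNED NOT NULL"},
            {"name": "subChunkId", "type": "INT UNSIGNED NOT NULL"},
        ],
        "auth_key": "SECURED",
    },
).json()
if not result["success"]:
    raise RuntimeError("failed to register the table: " + result["error"])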

11.1.2.5.1. Notes on the table names

And these are some notes on the naming conventions (restrictions) for the tables:




11.1.3. Transaction management

11.1.3.1. Start a transaction

Transactions are started by this service:
methodresource name
POST/ingest/trans 

The request expects a JSON  object describing a scope of the new transaction:

{"database":<string>,
 "context":<object>,
 "auth_key":<string>
}

Where:

attributerequireddescription
database
yesThe name of a database associated with the transaction
context
optional

The optional data to be stored within the Ingest system for the transaction. This information could be used later for recovering from errors during the ingest, as well as for general bookkeeping, data provenance, visualization purposes, etc. A value of this attribute, if provided, must be a JSON object. The object could be empty. The default value assumed by the server would be the empty object: 

{ }

(warning) The current implementation of the Ingest system limits the size of the context object by 16 MB.

auth_key
yesThe normal authorization key.

In case of successful completion (see Error reporting in the REST API) the service is expected to return a JSON object with a description of the new transaction:

{...
 "databases":{
   <database-name>:{
     "transactions":[{
       "id":<number>,
       "database":<string>,
       "state":<string>,
       "begin_time":<number>,
       "end_time":<number>,
       "context":<object>
     }],
     ...
   },
   ...
 }
}

Where the most important attribute in the object is the unique identifier of the transaction represented by the key id. The state of the new transaction will be set to STARTED, and end_time will be 0. A value of the attribute context will be the same one that was provided on input to the service, or the default value if none was provided.


11.1.3.2. Commit or abort a transaction

In order to abort a transaction one has to use the following service:
methodresource namequery
PUT/ingest/trans/<transaction-id>
abort=1

Transactions are committed using the following service:

methodresource namequery
PUT/ingest/trans/<transaction-id>
abort=0

A unique identifier of the transaction is passed into the service in the resource's path parameter <transaction-id>. The only parameter of the request query will be:

parameterrequireddescription
abort
yesThe mandatory parameter telling the service to abort (the value equal to 1), or commit (the value equal to 0) a transaction.

Other parameters defining a request are  passed via the request's body:

{"context":<object>,
 "auth_key":<string>
}

Where:

attributerequireddescription
context
optional

The optional data to be stored within the Ingest system for the transaction. This information could be used later for recovering from errors during the ingest, as well as for general bookkeeping, data provenance, visualization purposes, etc. A value of this attribute, if provided, must be a JSON object.

(warning) A value provided in the attribute will replace the previously stored (if any) context. Do NOT use this parameter if you want to keep the context that was put into the System at the start time of the transaction. The object is allowed to be empty:

{ }

(warning) The current implementation of the Ingest system limits the size of the context object by 16 MB.

auth_key
yesThe normal authorization key.

Upon successful completion of either request (see section Error reporting in the REST API) the service returns an updated status of the transaction in a JSON object as explained in section Starting a transaction. If a request to abort a transaction was made, the transaction will be found in the ABORTED state. Otherwise, the transaction's state will be set to FINISHED. It's also safe to repeat either of those operations, although the System will complain if it finds the transaction already in the ABORTED or FINISHED state.
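A Python sketch wrapping both operations could look like this (the host, transaction identifier and authorization key are example values):

import requests

def end_transaction(transaction_id, abort, context=None, auth_key="SECURED"):
    """Commit (abort=0) or abort (abort=1) a super-transaction."""
    request = {"auth_key": auth_key}
    if context is not None:
        # Replaces any context stored when the transaction was started.
        request["context"] = context
    result = requests.put(
        "http://localhost:25080/ingest/trans/{}".format(transaction_id),
        params={"abort": abort},
        json=request,
    ).json()
    if not result["success"]:
        raise RuntimeError(result["error"])
    return result

# end_transaction(45, abort=0)   # commit
# end_transaction(45, abort=1)   # abort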

11.1.3.3. Get info on transactions

There are two services in this group. The first one will return the information on many transactions in the scope of databases selected via optional filters passed via the request's query:
methodresource namequery (optional)
GET/ingest/trans 
database=<name>
family=<name>
all_databases={0|1}
is_published={0|1}
include_context={0|1}
contrib={0|1}
contrib_long={0|1}

The request object for this request is not required, or it could be an empty JSON object {}. These are the optional filters accepted in the request's query by the service:

parametermeaning
database 
The name of a database for which the transactions were open. If the parameter is present and if it's not empty then the other three parameters will be ignored, even if they were provided.
family 
The optional name of the database family. If the parameter is missing or if it's empty then all databases will be conditionally included in the search. Otherwise, only a subset of databases - members of the specified family will be considered. Note, that in either case two additional filters (optional parameters all_databases and is_published) may limit the final list of databases to be searched for transactions.
all_databases 

The optional flag which is used for further filtering of databases selected by the parameter family. The default value of the parameter is set to 0 telling the server that the parameter is_published should be used to further filter database selection to the desired subset. Any other value would mean no additional filters (hence ignoring is_published), hence including all databases selected by the parameter family.

is_published 
The optional flag is used only if enabled by setting the previous parameter all_databases=0. The default value of the parameter is 0, which narrows the database selection to those which have not been published. Any other value would select the published databases.
include_context

The optional flag telling the server to include the transaction context object into the report for each transaction. See the documentation on services Starting transactions or Committing/Aborting transactions for further details. The default value of the flag is 0  (meaning - don't include the context).

(warning) Be aware that potentially each context object could be as large as 16 MB. So, enable this option only if you really need to see contexts for all transactions. Otherwise use an alternative (single transaction) request to pull the info for one transaction at a time.

contrib
The optional flag telling the server to include the transaction contribution object contrib in the report. See details on this flag in the dedicated section below.
contrib_long
This optional flag is considered only if flag contrib was provided. Setting a value of the flag to any value other than 0 will result in returning detailed info on the contributions. Otherwise (if a value of the parameter is set to 0) only the summary report on contributions will be returned. 

In case of successful completion (see Error reporting in the REST API) the service returns its results as the JSON object which has the following schema:

{...
 "databases":{
   <database-name>:{
	 "is_published":<number>,
     "num_chunks":<number>,
     "transactions":[
       {"id":<number>,
        "database":<name>,
        "state":<string>,
        "begin_time":<number>,
        "end_time":<number>
        "context":<object>,
        "contrib":<object>
       },
       ...
     ]
   },
   ...
 }
}

Where the most significant attributes would be:

attributemeaning
databases

A dictionary of databases, where the name of a database is a key, and database description is a value. Database descriptions are JSON objects:

attributemeaning
is_published
The flag indicating the publishing status of the database, where a value of 0 means it's not published, and any other number if it's published.
num_chunks 
The total number of chunks in the database.
transactions

A collection of transactions associated with the database. Each object in the collection has:

attributemeaning
id 

A unique identifier of the transaction.

database
The name of a database the transaction was open with.
state

The current status of the transaction, where:

  • STARTED if it's open
  • ABORTED  if it was aborted
  • FINISHED  if it was successfully finished
begin_time
The start time of the transaction. It's a 64-bit unsigned integer number representing the number of milliseconds since the UNIX EPOCH . It's guaranteed to be larger than 0.
end_time
The end time of the transaction if applicable. It has to be 0  if the transaction is in the STARTED state. Otherwise, the number represents a time when the transaction was put into the ABORTED or FINISHED states.
context
The optional context object returned only if the query parameter include_context=1  was provided.
contrib
The optional info on transaction contributions returned only if the query parameter contrib=1  was provided.

The second service allows exploring the status of a specific transaction:

methodresource name
GET/ingest/trans/<id>

The request object for this request is not required, or it could be an empty JSON  object {}. And the resource path of the request includes a unique identifier <id>  of the transaction in question.

The service returns a JSON object which has a similar schema to the one described above. Though, in case of successful completion of the request (if the given transaction exists) the "databases" collection will have a single entry, and that database will have a single entry in its "transactions" collection, representing the requested transaction.

11.1.3.3.1. The contribution object

As explained in the description of the transaction information services, this optional object is returned if a client provided the optional query parameter contrib=1 in a request to the services. The structure of the object also depends on the value of the optional parameter contrib_long. Here is the general layout of the object if the parameter contrib_long=1 was specified:

{...
 "contrib":{
   "summary":{
     "num_workers":<number>,
     "num_failed_files":<number>,
     "num_regular_files":<number>,
     "num_chunk_files":<number>,
     "num_chunk_overlap_files":<number>,
     "data_size_gb":<number>,
     "num_rows":<number>,
     "first_contrib_begin":<number>,
     "last_contrib_end":<number>,
     "table":{
       <table-name>:{
         "data_size_gb":<number>,
         "num_rows":<number>,
         "num_files":<number>,
         "overlap":{
           "data_size_gb":<number>,
           "num_rows":<number>,
           "num_files":<number>
         }
       },
       ...
     },
     "worker":{
       <worker-name>:{
         "data_size_gb":<number>,
         "num_rows":<number>,
         "num_chunk_files":<number>,
         "num_chunk_overlap_files":<number>
         "num_regular_files":<number>
       },
       ...
     }
   },
   "files":[
     {"id":<number>,
      "transaction_id":<number>,
      "worker":<string>,
      "database":<string>,
      "table":<string>,
      "chunk":<number>,
      "overlap":<number>,
      "url":<string>,
      "begin_time":<number>,
      "end_time":<number>,
      "num_bytes":<number>,
      "num_rows":<number>,
      "success":<number>
     },
     ...
   ]
 }
}

Where:

attrdescription
summary

A dictionary that is always present:

attrdescription
num_workers
The total number of workers at which the contributions were made during the transaction.
num_failed_files
The total number of failed (or incomplete, depending on the status of the transaction) contributions across all workers made during the transaction.
num_regular_files
The total number of successful contributions made into the regular tables across all workers during the transaction.
num_chunk_files
The total number of successful contributions made into the chunked tables across all workers during the transaction.
num_chunk_overlap_files
The total number of successful contributions made into the chunk overlap tables across all workers during the transaction.
data_size_gb
The total amount of data (GiB) read from the file contributions across all workers during the transaction.
num_rows
The total number of rows read from the file contributions across all workers during the transaction.
first_contrib_begin

The time when the first successfully completed contribution started (milliseconds).

(warning) A value of 0 is returned if no contributions have been made (or finished) yet.

last_contrib_end

The time when the last successfully completed contribution finished (milliseconds).

(warning) A value of 0 is returned if no contributions have been made (or finished) yet.

table

An object representing per-table statistics. The key name in the dictionary will be the name of a table (that would be the base name for the partitioned tables). A value corresponding to the key would be an object having the following attributes:

attrdescription
data_size_gb
The total amount of data (GiB) read from the file contributions into the regular or the chunk (excluding table overlaps) table across all workers during the transaction.
num_rows
The total number of rows read from the file contributions into the regular or the chunk (excluding table overlaps) table across all workers during the transaction.
num_files
The total number of successful contributions uploaded into the regular or the chunk (excluding table overlaps) table across all workers during the transaction.
overlap

The optional object for the partitioned tables. This object captures statistics for total contributions into the table overlaps.

attrdescription
data_size_gb
See above.
num_rows
See above.
num_files
See above.
worker

An object representing per-worker statistics. The key name in the dictionary will be the name (a unique identifier) of a worker. A value corresponding to the key would be an object having the following attributes:

attrdescription
data_size_gb
The total amount of data (GiB) read from the file contributions into the regular or the chunk (excluding table overlaps) tables across all tables at the given worker during the transaction.
num_rows
The total number of rows read from the file contributions into the regular or the chunk (excluding table overlaps) table across all tables at the given worker during the transaction.
num_files
The total number of successful contributions uploaded into the regular or the chunk (excluding table overlaps) table across all tables at the given worker during the transaction.
overlap

The optional object for the partitioned tables. This object captures statistics for total contributions into the table overlaps.

attrdescription
data_size_gb
See above.
num_rows
See above.
num_files
See above.
files

(warning) The optional array of file contributions will be present only if the query parameter contrib_long=1 was provided in a request.

Each entry in the array is an object:

attrdescription
id
A unique identifier of a contribution.
transaction_id
A unique identifier of a transaction in which the contribution was made or attempted.
worker
The name (a unique identifier) of a worker where the contribution was made.
database

The name of a database the contribution was made into.

(warning) This is the same database that was associated with the transaction.

table

The basename of a table the contribution was made into.

(warning) The final name of the table depends on the table type (whether it's partitioned or regular, and whether the table represents the overlap in the case of a partitioned table).

chunk
The chunk number (only for the partitioned tables).
overlap
The table overlap flag (a value of 0 is treated as boolean false).
url

The data source for the contribution. Allowed schemes:

  • http://host:port/path  - for files pulled from a remote object store
  • file:///path  - for files read from a locally mounted filesystem at the worker's node where the Ingest Service was run.
  • BINARY  - the content of the file was received over the proprietary binary protocol between a file uploader application and the worker Ingest Service.
begin_time

The time when the ingest processing request started (milliseconds).

(warning) This time may be later than the time when a client sent a request. This is because the Ingest Service has a limited processing capacity. Requests that exceed the capacity get queued.

end_time

The time when the file uploading finished (milliseconds).

(warning) The parameter will be returned 0 if the operation is still in progress or it has failed. See a description of attribute status for further details.

num_bytes

The total number of bytes (including newlines) in the input file.

(warning) The parameter will be returned 0 if the operation is still in progress or it has failed.  See a description of attribute status for further details.

num_rows

The total number of rows found in the input file.

(warning) The parameter will be returned 0 if the operation is still in progress or it has failed.  See a description of attribute status for further details.

success

The completion status of the operation. Any value that differs from 0 means success.

(warning) The status may be returned 0  in two scenarios:

  • if the request failed, or
  • if the request is still being processed (it may take time to upload large contributions).

One has to check the completion status of the transaction to disambiguate between these scenarios. If the transaction is over then a value of 0 always means a failure. If the transaction is still open then it would be impossible to say which of the above-mentioned scenarios has been realized.





11.1.4. Table location services

11.1.4.1. Allocate/locate chunks of the partitioned tables

The current implementation of the system offers two services for allocating (or determining locations of existing) chunks:
  • Single chunk allocation.
  • Batch mode (multiple chunks) allocation. 

The choice of a technique depends on the requirements of a workflow, though it's recommended to use the second service due to its efficiency when allocating large numbers of chunks.

Also note that once a chunk is assigned (allocated) to a particular worker node, all subsequent requests for the chunk are guaranteed to return the same worker name as the location of the chunk. Making multiple requests for the same chunk is safe. Chunk allocation requests require a valid super-transaction in the STARTED state.

The following service is meant to be used for a single chunk allocation/location:

methodresource name
POST/ingest/chunk

Where the request object has the following schema, in which a client would have to provide the name of a database:

{"database":<string>,
 "chunk":<number>,
 "auth_key":<string>
}

The service also supports an alternative method accepting a transaction identifier (transactions are always associated with the corresponding databases):

{"transaction_id":<number>,
 "chunk":<number>,
 "auth_key":<string>
}

If the operation succeeded (see Error reporting in the REST API), the System would respond with the following JSON object:

{...
 "location":{
   "worker":<string>,
   "host":<string>,
   "port":<number>,
   "http_host":<string>,
   "http_port":<number>
 }
}

For allocating multiple chunks one would have to use the following service:

methodresource name
POST/ingest/chunks

Where the request object has the following schema, in which a client would have to provide the name of a database:

{"database":<string>,
 "chunks":[<number>,<number>,...<number>],
 "auth_key":<string>
}

Like the  above-explained case of the single chunk allocation service, this one also supports an alternative method accepting a transaction identifier (transactions are always associated with the corresponding databases):

{"transaction_id":<number>,
 "chunks":[<number>,<number>,...<number>],
 "auth_key":<string>
}

(warning) Note the difference in the object schema: unlike the single-chunk allocator, this one expects an array of chunk numbers.

The resulting object  has the following schema:

{...
 "location":[
   {"chunk":<chunk>,
    "worker":<string>,
    "host":<string>,
    "port":<number>,
    "http_host":<string>,
    "http_port":<number>},
   ...
 ]
}

The table below explains the connection parameters returned by the services:

attrdescription
host
port
Connection parameters of the ingest service based on the proprietary binary protocol. This service requires the content of an input file to be sent directly to the service by the client. The Replication/Ingest system provides a ready-to-use application qserv-replica-file-ingest that is based on this protocol.
http_host
http_port
Connection parameters of the REST server (HTTP-based protocol) that is built into the system's workers. The server provides the Ingesting a single file service, allowing the workers to ingest files from a variety of external sources, such as a filesystem that is locally mounted at the workers, or a remote object store.
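A Python sketch of the batch allocation (the transaction identifier and chunk numbers are example values) could be:

import requests

chunks = [122, 123, 3456]   # example chunk numbers produced by the partitioning tool

result = requests.post(
    "http://localhost:25080/ingest/chunks",
    json={"transaction_id": 45, "chunks": chunks, "auth_key": "SECURED"},
).json()
if not result["success"]:
    raise RuntimeError(result["error"])

# Map each chunk to the REST endpoint of the worker it was assigned to.
endpoints = {
    entry["chunk"]: "http://{}:{}/ingest/file".format(entry["http_host"], entry["http_port"])
    for entry in result["location"]
}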





11.1.4.2. Locate regular tables

The following service returns connection parameters of the Data Ingest Service of workers which are available for ingesting regular (fully replicated) tables:
methodresource name
GET/ingest/regular 

Where the request object passed in a request's body has the following schema, in which a client would have to provide the name of a database:

{"database":<string>
}

The database must not be published at the time the request is made.

The service also supports an alternative method accepting a transaction identifier (transactions are always associated with the corresponding databases):

{"transaction_id":<number>
}

If the transaction identifier was provided then the transaction is required to be in the STARTED  state at a time of a request.

In case of successful completion (see Error reporting in the REST API) the service returns the following object:

{..
 "locations":[
  {"worker":<string>,
   "host":<string>,
   "port":<number>,
   "http_host":<string>,
   "http_port":<number>
  },
  ...
 ]
}

Where each worker entry has:

attrdescription
worker
The name (a unique identifier) of a worker.
host
port
Connection parameters of the ingest service based on the proprietary binary protocol. This service requires the content of an input file to be sent directly to the service by the client. The Replication/Ingest system provides a ready-to-use application qserv-replica-file-ingest that is based on this protocol.
http_host
http_port
Connection parameters of the REST server (HTTP-based protocol) that is built into the system's workers. The server provides the Ingesting a single file service, allowing the workers to ingest files from a variety of external sources, such as a filesystem that is locally mounted at the workers, or a remote object store.

11.1.5. Information services

11.1.5.1. Chunk disposition


Do not  use this service for the chunk placement decisions during catalog ingest!

The service returns information on the chunk disposition at workers in the scope of a database:

methodresource namerequest query parameters
GET/ingest/chunks
database=<name>

Where the name of a database is passed via a required parameter database in the request's query. The request body is not required by the service.

If the operation succeeded (see Error reporting in the REST API), the System would respond with the following JSON object:

{...
 "replica":[
   {"chunk":<number>,
    "worker":<string>,
    "table":{
      <partitioned-table-name>:{
        "overlap_rows":<number>,
        "overlap_data_size":<number>,
        "overlap_index_size":<number>,
        "rows":<number>,
        "data_size":<number>,
        "index_size":<number>
      },
      ...
    }
   },
   ...
 ]
}

Where:

attributedescription
chunk
The number of a chunk.
worker
The name of a worker where the chunk replica is located.
table

A dictionary of the partitioned tables, where each entry represents statistics on a table:

attributedescription
overlap_rows

The number of rows in the chunk's overlap  table.

This number will always be set to 0 in the current implementation of the service. The attribute was added for the future compatibility of the API.

overlap_data_size

The number of bytes in the chunk's overlap table (measured by a size of the corresponding file).

This number will be set to 0 for databases which are not published.

overlap_index_size

The number of bytes in the index of the chunk's overlap table (measured by a size of the corresponding file).

This number will be set to 0 for databases which are not published.

rows

The number of rows in the chunk table.

This number will always be set to 0 in the current implementation of the service. The attribute was added for the future compatibility of the API.

data_size

The number of bytes in the chunk table (measured by a size of the corresponding file).

This number will be set to 0 for databases which are not published.

index_size

The number of bytes in the index of the chunk table (measured by a size of the corresponding file).

This number will be set to 0 for databases which are not published.

Usage notes:

  • The service can be used on databases of any status (published or not). Though, its output may vary depending on the database status. See details in the table above.
  • The numbers of rows (attributes overlap_rows and rows) are not computed in the current implementation of the service. This will be implemented in some future version of the system.
  • Attributes reporting sizes of tables will have 0 values for databases which are not  yet published. This may also change in some future version of the system.
  • Sizes of tables and indexes are the sizes (in bytes) of the corresponding files of the MyISAM  MySQL engine. 
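For example, a quality-control script in Python could count chunk replicas per worker. This is a sketch assuming the example database name used throughout this document:

import requests
from collections import Counter

result = requests.get(
    "http://localhost:25080/ingest/chunks",
    params={"database": "test101"},
).json()
if not result["success"]:
    raise RuntimeError(result["error"])

# The number of chunk replicas hosted by each worker.
replicas_per_worker = Counter(replica["worker"] for replica in result["replica"])
for worker, num_chunks in sorted(replicas_per_worker.items()):
    print(worker, num_chunks)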


11.1.6. Secondary index management

11.1.6.1. (Re-)build the index


Be advised that the amount of time needed to build an index of a large-scale catalog may be significant. The current implementation of the secondary index is based on MySQL's InnoDB table engine. Insertion into this B-Tree table has logarithmic performance. It may take many hours to build the index for catalogs of billions of objects. In some earlier tests, the build time was 20 hours for a catalog of 20 billion objects.

The service is used to build or rebuild (if needed) the secondary index table of a database. The database may or may not be published.

methodresource name
POST/ingest/index/secondary

Where the request object has the following schema:

{"database":<string>,
 "allow_for_published":<number>,
 "rebuild":<number>,
 "local":<number>
}

Where:

attributedescription
database
The required parameter which has the name of a database affected by the operation.
allow_for_published
The optional (boolean) flag, which if set to 0  won't allow modifying the index of the published databases. Any other numeric value would lift that restriction. The default value of the parameter is 0.
rebuild
The optional (boolean) flag, which if set to 0  won't allow the service to recreate the index. Any other value would tell the service to drop (if exists) the secondary index table before re-creating and re-populating it with entries. The default value of the parameter is 0.
local

Choices made for values of this optional parameter would depend on specific configurations of Qserv and its Replication/Ingest system. Please, read carefully the rest of this section should you decide to change the default value of the parameter in order to understand the possible consequences of each value.

The parameter is required to have a numeric value treated as explained in this table:

valuedescription
0

In this case, index contributions are placed by the Replication/Ingest system at a location that is directly accessible by the MySQL server hosting the index table. This could be either some local folder of the host where the service is being run, or a folder located on a network filesystem mounted on the host. Once a file is in place it would be ingested into the destination table using this protocol:

LOAD DATA INFILE ...

Be aware that this option may not be always possible (or cause complications) in Kubernetes-based deployments of Qserv.

1 or any other number

In this case, index contributions would be loaded into the table using this protocol:

LOAD DATA LOCAL INFILE ...

Files would first be copied by MySQL into a temporary folder owned by the MySQL service before being ingested into the table. This option has the following caveats:

  • the protocol must be enabled in the MySQL server configuration by setting the following system variable: local_infile=1.
  • the temporary folder of the MySQL server is required to have sufficient space to temporarily accommodate index contribution files before they're loaded into the table. In the worst-case scenario, there should be enough space to accommodate all contributions of a given catalog. One could make a reasonable estimate for the latter by knowing the total number of rows in the director table of the catalog, the size of the primary key (the so-called "object identifier") of the table, as well as the types of the "chunk" and "sub-chunk" columns (which are usually 32-bit integers in Qserv).
  • choosing this option would also lower the overall performance of the operation due to the additional data transfers required for copying file contributions from a location owned by the Master Replication Controller to the temporary folder of the MySQL server.

The default value of the parameter is 0.
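
For illustration, here is a minimal sketch of a request rebuilding the index of a published database with the client-side loading option (local=1). The controller's host name and port number (qserv-master:25081) and the database name test101 are hypothetical and should be replaced with values specific to a given deployment:

# Rebuild the secondary index of the (hypothetical) published database test101
curl http://qserv-master:25081/ingest/index/secondary \
  -X POST -H "Content-Type: application/json" \
  -d '{"database":"test101","allow_for_published":1,"rebuild":1,"local":1}'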

If the operation succeeded (see Error reporting in the REST API), the system would respond with the default JSON object which doesn't carry any additional attributes on top of what's mandated by the previously mentioned error reporting requirements.

In case of errors, and if an actual attempt to build the index was made, the object may have a non-trivial value of the error_ext  attribute as shown below: 

{...
 "error_ext":{
   <worker>:{
    <chunk>:<string>,
    ...
   },
 }
}

Where:

attribute | description
<worker>
The placeholder for a unique identifier of a Qserv worker defining the scope of an error.
<chunk>

The placeholder for the chunk number defining the scope of an error.

A value of the attribute would be an actual explanation of an error reported by the service in case of a problem with harvesting input data for the index or loading these data into the index table.

Here is an example of what this object might look like:

{...
 "error_ext":{
   "qserv-db01":{
     122:"Failed to connect to the worker service",
     3456:"error: Table 'tes96__Object' already exists, errno: 1050"
   },
   "qserv-db23":{
     123:"Failed to connect to the worker service"
   }
 }
}

11.1.7. Worker ingest services

Services explained in this section are provided by the worker servers, not by the main REST server of the Master Replication Controller. DNS names (IP addresses) of the corresponding hosts and the relevant port numbers of the services are provided either by the requests for the configuration of the system or by the metadata requests Allocate/locate chunks of the partitioned tables and Locate regular tables.

11.1.7.1. Ingesting a single file

Single file contributions are ingested by this service:
method | resource name
POST | /ingest/file

The request expects a JSON object which has the following schema:

{"transaction_id":<number>,
 "table":<string>,
 "column_separator":<string>,
 "chunk":<number>,
 "overlap":<number>,
 "url":<string>,
 "http_method":<string>,
 "http_data":<string>,
 "http_headers":<string>,
 "auth_key":<string>
}

Where:

parameter | meaning
transaction_id 
A unique identifier of a transaction which is required to be in the STARTED state.
table 
The base name of a table affected by the operation. Note that for regular tables the base names are the same as the actual names of the tables in the database. For the partitioned (chunked) tables the final names are composed of the base name, the chunk number, and the overlap attribute of the table.
column_separator 

A single-character string that stores a separator between columns in the input file. Usually, it's either a comma or the tab character.

chunk
The chunk number for the partitioned tables. Though this is a mandatory parameter, it's ignored for the regular tables.
overlap 
The numeric flag indicating a kind of the partitioned table (0 if this is not the overlap table, or any other number if this is the overlap table). Though this is a mandatory parameter, it's ignored for the regular tables.
url

A location of a file to be ingested. The current implementation supports the following schemas:

schema | description
file:///<path>
A file on a filesystem that is mounted locally on the corresponding worker. Note that the file path must be absolute. See details on this subject here: https://en.wikipedia.org/wiki/File_URI_scheme.
http://<resource>
https://<resource>
A file at a web server. For either of these schemas additional attributes required for pulling a file over the specified protocol could be provided.
http_method
The optional parameter for the HTTP-based file locations. The default value of the parameter is GET .
http_data
The optional parameter for the HTTP-based file locations. The default value of the parameter is the empty string (no data).
http_headers
The optional parameter for the HTTP-based file locations. The default value of the parameter is the empty string (no data).
auth_key
The optional authorization key.
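
For illustration, here is a minimal sketch of a request pulling a contribution file from a web server into a chunk of the partitioned table Object. The worker's host name and port number (qserv-db01:25004), the transaction identifier, the chunk number, and the file URL are all hypothetical and should be substituted with values obtained from the transaction and chunk allocation services mentioned earlier:

# Ingest one CSV contribution into chunk 187107 of the (hypothetical) table Object
curl http://qserv-db01:25004/ingest/file \
  -X POST -H "Content-Type: application/json" \
  -d '{"transaction_id":285,
       "table":"Object",
       "column_separator":",",
       "chunk":187107,
       "overlap":0,
       "url":"http://datastore.example.com/chunk_187107.csv"}'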

In case of successful completion (see Error reporting in the REST API) the service is expected to return a JSON object with the statistics and performance info on the operation:

{...
 "stats":{
   "num_bytes":<number>,
   "num_rows":<number>
 },
 "perf":{
   "begin_file_read_ms":<number>,
   "end_file_read_ms":<number>,
   "begin_file_ingest_ms":<number>,
   "end_file_ingest_ms":<number>
 }
 ...
}

Where the stats object reports the number of bytes and the number of rows ingested from the input file, and the perf object carries millisecond timestamps marking the beginning and the end of the file-reading and file-ingest phases of the operation.

11.1.8. Ingest configuration

11.1.8.1. Setting configuration parameters

Parameters are set for a database (regardless of its published  status) using the following service:
method | resource name
PUT | /ingest/config

The service requires a JSON  object of the following schema:

{"auth_key":<string>,
 "database":<string>, 
 "SSL_VERIFYHOST":<number>,
 "SSL_VERIFYPEER":<number>,
 "CAPATH":<string>,
 "CAINFO":<string>,
 "CAINFO_VAL":<string>,
 "PROXY_SSL_VERIFYHOST":<number>,
 "PROXY_SSL_VERIFYPEER":<number>,
 "PROXY_CAPATH":<string>,
 "PROXY_CAINFO":<string>,
 "PROXY_CAINFO_VAL":<string>
}

Where attributes are explained below:

attribute | mandatory | description
auth_key
YES | The parameter representing an authorization key as it's configured in the Qserv Replication/Ingest system.
database
YES | The parameter specifying the name of the database affected by the operation.
SSL_VERIFYHOST

This attribute directly maps to https://curl.se/libcurl/c/CURLOPT_SSL_VERIFYHOST.html.

Numeric values of the parameter are treated as boolean variables, where 0 represents false and any other values represent true.

SSL_VERIFYPEER

This attribute directly maps to https://curl.se/libcurl/c/CURLOPT_SSL_VERIFYPEER.html.

Numeric values of the parameter are treated as boolean variables, where 0 represents false and any other values represent true.

CAPATH

This attribute directly maps to https://curl.se/libcurl/c/CURLOPT_CAPATH.html.

Putting the empty string as a value of the parameter will effectively turn this option off as if it has never been configured for the database.

CAINFO

This attribute directly maps to https://curl.se/libcurl/c/CURLOPT_CAINFO.html.

Putting the empty string as a value of the parameter will effectively turn this option off as if it has never been configured for the database.

CAINFO_VAL

A value of a certificate bundle for a peer. This parameter is used in those cases when it's impossible to inject the bundle directly into the Ingest workers' environments. If a non-empty value of the parameter is provided then ingest servers will use it instead of the one mentioned (if any) in the above-described attribute CAINFO.

(warning) Values of the attribute are the actual certificates, not file paths like in the case of CAINFO.

PROXY_SSL_VERIFYHOST

This attribute directly maps to https://curl.se/libcurl/c/CURLOPT_PROXY_SSL_VERIFYHOST.html.

Numeric values of the parameter are treated as boolean variables, where 0 represents false and any other values represent true.

PROXY_SSL_VERIFYPEER

This attribute directly maps to https://curl.se/libcurl/c/CURLOPT_PROXY_SSL_VERIFYPEER.html.

Numeric values of the parameter are treated as boolean variables, where 0 represents false and any other values represent true.

PROXY_CAPATH

This attribute directly maps to https://curl.se/libcurl/c/CURLOPT_PROXY_CAPATH.html.

Putting the empty string as a value of the parameter will effectively turn this option off as if it has never been configured for the database.

PROXY_CAINFO

This attribute directly maps to https://curl.se/libcurl/c/CURLOPT_PROXY_CAINFO.html.

Putting the empty string as a value of the parameter will effectively turn this option off as if it has never been configured for the database.

PROXY_CAINFO_VAL

A value of a certificate bundle for a proxy. This parameter is used in those cases when it's impossible to inject the bundle directly into the Ingest workers' environments. If a non-empty value of the parameter is provided then ingest servers will use it instead of the one mentioned (if any) in the above-described attribute PROXY_CAINFO.

(warning) Values of the attribute are the actual certificates, not file paths like in the case of PROXY_CAINFO.

If the operation finished successfully, the service returns the standard JSON object without any additional data beyond the completion status of the request (see Error reporting in the REST API).
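
For example, a sketch of a request disabling host and peer certificate verification for a (hypothetical) database test101 might look like the one shown below. The controller's host name, port number, and the authorization key are also hypothetical:

# Relax TLS verification for file sources of the (hypothetical) database test101
curl http://qserv-master:25081/ingest/config \
  -X PUT -H "Content-Type: application/json" \
  -d '{"auth_key":"CHANGEME",
       "database":"test101",
       "SSL_VERIFYHOST":0,
       "SSL_VERIFYPEER":0}'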

11.1.8.2. Retrieving configuration parameters

Parameters are retrieved for a database (regardless of its published  status) using the following service:
method | resource name
GET | /ingest/config

The service requires a JSON object of the following schema:

{"database":<string>
}

Where attributes are explained below:

attribute | mandatory | description
database
YES | The parameter specifying the name of the database affected by the operation.

If the operation finished successfully, the service returns an extended JSON object that has the following schema (in addition to the standard status and error reporting attributes explained in Error reporting in the REST API):

{...,
 "database":<string>, 
 "SSL_VERIFYHOST":<number>,
 "SSL_VERIFYPEER":<number>,
 "CAPATH":<string>,
 "CAINFO":<string>,
 "CAINFO_VAL":<string>,
 "PROXY_SSL_VERIFYHOST":<number>,
 "PROXY_SSL_VERIFYPEER":<number>,
 "PROXY_CAPATH":<string>,
 "PROXY_CAINFO":<string>,
 "PROXY_CAINFO_VAL":<string>
}

In this object, the only mandatory attribute is the name of a database. Other parameters in the result may or may not be present depending on prior use of the configuration-setting service Setting configuration parameters. See further descriptions of these parameters in the documentation for that service.
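
A sketch of a request pulling the current configuration of the same (hypothetical) database test101 from the (hypothetical) controller endpoint is shown below. Note that the database name is passed in the request body, as required by the service:

# Retrieve the ingest configuration of the (hypothetical) database test101
curl http://qserv-master:25081/ingest/config \
  -X GET -H "Content-Type: application/json" \
  -d '{"database":"test101"}'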



11.2. Tools

11.2.1. Error reporting

All command line tools return 0 to indicate a successful completion of the requested operation. Other values shall be treated as errors.  
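
For instance, in a shell-based workflow the completion status could be checked in the usual way (the tool's arguments here are purely illustrative):

# Abort the workflow if the batch ingest reported a non-zero status
if ! qserv-replica-file INGEST FILE-LIST files.json; then
  echo "ingest failed" >&2
  exit 1
fi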

11.2.2. qserv-replica-file

This is the client-side application providing a number of tools for operations with files. The general syntax:

Usage:
  <application> [parameters] [options] [flags]
  <application> --help

Supported applications:
  EXPORT
  INGEST
  READ
  SERVER

Each tool/application is explained in a dedicated section below.

11.2.2.1. EXPORT

TBC...

11.2.2.2. INGEST

This is a tool meant for ingesting table contributions into Qserv over a proprietary binary protocol.

The general syntax:

qserv-replica-file INGEST {FILE|FILE-LIST|FILE-LIST-TRANS|PARSE} [<parameter> [<parameter> ...]] [--<option>=<value> ...] [--<flag> ...]

Commands:

command | description and syntax
FILE 

The single-file ingest option. A destination of the ingest and a path to the file to ingest are specified via a group of mandatory parameters:

<worker-host> <worker-port> <transaction-id> <table> <type> <path> 

Where:

parameter | description
worker-host 

The name of a worker host where the Ingest service is running.

worker-port 

The port number of the worker's Ingest service.

transaction-id 

A unique identifier (number) of a super-transaction that must be already started. Note that the identifiers are associated with databases.

table 

The name of a table to be ingested.

type 

The type of table to be ingested. Allowed options:

  • P - for contributions into partitioned tables
  • R - for contributions into the regular tables
path 

A path to an input file to be sent to the worker.

FILE-LIST

The batch ingest of multiple files as contributions into (potentially) various tables and databases (as determined by transaction identifiers). See the example sketch after this table.

Parameters:

<file-list>

Where:

parameter | description
file-list 

The name of a file with specifications of files to be ingested. If the file name is set to '-' then the specifications will be read from the standard input stream.

The input file is expected to contain a serialized JSON / YAML object which has the following schema:

[{"worker-host":<string>,
  "worker-port":<number>,
  "transaction-id":<number>,
  "table":<string>,
  "type":<string>,
  "path":<string>
 },
 ...
]

Where the attributes of each contribution are the same as mandatory parameters of the previously explained single file ingest command FILE.

FILE-LIST-TRANS

A variant of the FILE-LIST command reduced for ingesting contributions within the same context, which is specified by the required parameters.

Parameters:

<transaction-id> <table> <type> <file-list>
The input file is expected to contain the serialized JSON/YAML object which has the following schema:
[{"worker-host":<string>,
  "worker-port":<number>,
  "path":<string>
 },
 ...
]

(warning) The schema represents a reduced version of the one required by the command FILE-LIST, from which the common parameters were removed.

PARSE

This is a tool meant for testing the correctness of the parser for the TSV/CSV files.

Parameters:

<infile> <outfile>

Where:

parameter | description
infile
A path to the input file to be parsed.
outfile
A path to the output file for writing the output of the parser. If the file exists it will be truncated.

The tool will also report statistics. For example:

% qserv-replica-file INGEST PARSE Science_Ccd_Exposure.tsv Science_Ccd_Exposure.tsv.parsed
read: 2126186 bytes, wrote: 2126186 bytes, lines: 3304

When the application finishes one may compare the files using:

diff Science_Ccd_Exposure.tsv Science_Ccd_Exposure.tsv.parsed

The numbers of bytes read from the input file and written into the output file are supposed to be equal.
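
Here is a minimal sketch of how a batch ingest could be organized with the FILE-LIST command. The worker host names, port numbers, the transaction identifier, the table name, and the file paths in the specification are all hypothetical:

# Prepare a (hypothetical) specification of two chunk contributions
cat > files.json <<'EOF'
[{"worker-host":"qserv-db01", "worker-port":25002, "transaction-id":285,
  "table":"Object", "type":"P", "path":"/data/chunk_187107.txt"},
 {"worker-host":"qserv-db02", "worker-port":25002, "transaction-id":285,
  "table":"Object", "type":"P", "path":"/data/chunk_187108.txt"}
]
EOF
# Push both contributions to the corresponding workers
qserv-replica-file INGEST FILE-LIST files.json --verbose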


Options and flags that are common for all commands:

option/flag | description
--fields-enclosed-by=<value>

A character for quoting fields in the TSV/CSV files. Please check the content of the TSV/CSV files to see whether the fields are actually quoted. Normally, quotes are not used. The default value: '\0'. This is a special character indicating that quotes aren't present.

--fields-escaped-by=<value>
An escape character for configuring the MySQL LOAD DATA INFILE statement when ingesting table contributions. The default value: '\\'.
--fields-terminated-by=<value>

A character that separates fields within a row. The default value: '\t'.

--lines-terminated-by=[<value>]

A character that is used to terminate lines. The default value: '\n'.

--record-size-bytes=[<value>]

The parameter specifying the record size for reading from the input file and for sending data to a server. The default value: 1048576.

--verbose

Print various stats upon completion of the ingest.
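
As an illustration of these options, a single comma-separated file could be pushed into a chunk of a partitioned table with a command similar to the one below (the worker host name, port number, transaction identifier, table name, and file path are hypothetical):

# Ingest one CSV contribution; override the default tab separator with a comma
qserv-replica-file INGEST FILE qserv-db01 25002 285 Object P /data/chunk_187107.csv \
  --fields-terminated-by=',' --verbose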


11.2.2.3. READ

TBC...

11.2.2.4. SERVER

TBC...






