Services presented in this document adhere to the latest version of the REST API explained in: The version history of the Ingest API. The ingest workflow is supposed to retrieve and evaluate the current number of the API using a method explained in that section or use an advanced version control technique that was made available in the section Optional version numbers in calls to the REST API

1. Introduction

This document explains how to use the index management API of the Qserv Replication System.

IMPORTANT: all operations on indexes are only allowed on databases that are in the published state. The system will refuse all requests on databases that are in a process of being ingested. This is done for three reasons: 1) to avoid interfering with the catalog ingest operations, 2) to prevent returning an inconsistent (or wrong) state of the indexes, and 3) to prevent transient errors due to potential race conditions since the overall state of the distributed catalogs might be being changed by the Ingest system.

Services provided by the system are related to the following index management operations of the MySQL family of DBMS:

SHOW INDEX FROM <table> ...
CREATE ... INDEX <index> ON <table> ...
DROP INDEX ON <table> ...

However, unlike the single table operations listed above, the services allow managing groups of related tables distributed across Qserv worker nodes. Hence, each request for any service refers to a single such group. In particular:

  • if a request refers to a regular  (not partitioned) table that is supposed to be fully replicated across all workers, then all instances of the table will be affected by the request.
  • otherwise, (in the case of the partitioned tables), each replica of every chunk of the table will be included in an operation. Note that the index management services differentiate between the chunk tables per se and the corresponding full overlap tables. When submitting a request, a user will have to choose which of those tables will be included in the operation.

The latter would also be reflected in the result and error reports made by the services. The JSON  objects returned by the services would return the names of the final tables affected by the operations, not the names of the corresponding group as per the original request to the services. This is done to facilitate investigating problems with Qserv should they occur.

1.1. Things to consider before creating indexes

The recommendations mentioned in this section are not complete. Please, get yourself familiarized with the MySQL documentation on indexes, and specifically with the CREATE INDEX syntax, before attempting the operation. Always keep in mind that there is the MySQL (or alike) machinery behind Qserv and that Qserv presently uses the MyISAM engine to store data in tables. Here are a few points to consider.

General:

  • Indexes are created on tables, and not on views.
  • Indexes are created in the scope of the table, and not databases.
  • In the case of the partitioned tables, the chunk and full overlap may not need to have the same set of indexes. In some cases, it may not be even possible, for instance, due to different UNIQUE  constraints requirements.
  • Please, provide a reasonable comment for an index, even though, comments aren't mandatory. Your comments could be useful for bookkeeping purposes to know why and what the index was created for.
  • Be aware that indexes take additional space on the Qserv workers' filesystems where the database tables are residing. Potentially, if too many indexes were created, MySQL may run out of disk space and stop working. The rule of thumb for estimating the amount of space to be taken by an index is based on the sizes of columns mentioned in an index specification (explained in the next section) multiplied by the total number of rows in a table (on which the index is being created). There are other factors here, such as the type of a table (regular or partitioned) as well as the number of replicas for the partitioned tables. The index management service which is used for creating indexes will also make the best attempt to prevent creating indexes if it will detect that the amount of available disk space is falling too low. In this case, the service will refuse to create an index and an error will be reported back.

When choosing a name for an index:

  • The name should be unique (don't confuse this with the UNIQUE  keyword used in the index specifications) in the scope of the table. It means it should not be already taken by some other index. Always check which indexes already exist in a table before creating a new one.
  • Generally, the name must adhere to the MySQL requirements for identifiers as explained in: https://dev.mysql.com/doc/refman/8.0/en/identifier-qualifiers.html
  • Keep in mind that names of identifiers (including indexes) in Qserv are case-insensitive. This is not the general requirement in MySQL, where the case sensitivity of identifiers is configurable one way or another. It's because of a design decision the original Qserv developers made to configure the underlying MySQL machinery.
  • To make things simple, restrain from using special characters (all but the underscore one _).
  • The length of the name should not exceed 64 characters: https://dev.mysql.com/doc/refman/8.0/en/identifier-length.html

1.2. Error reporting

All services would normally return the HTTP response code 200 regardless if a service succeeded or failed with the request. Other error codes may indicate, as usually, an incorrect URL of a request, the REST  service being down, network communication problems, etc. Requests completed with a status code 200 will also return a JSON object. For a successfully completed request, the object would have a slightly different schema depending on the request. However, all failures are reported with JSON objects of the following schema:

{"success":<number>,
 "error":<string>,
 "error_ext":{
   "job_state":<string>,
   "workers":{
     <worker>:{
       <table>:{
         "request_status":<string>,
         "request_error":<string>
       },
       ..
     },
     ..
   }
 }
}

Where:

attributemeaning
success 

The completion status of the request: 1  - for success, 0  - for failures.

NOTE: other numbers are reserved for possible extensions of the error reporting protocol.
errorA non-empty string with a general explanation of a problem should the one occur. It will be an empty string if success=1.
error_ext

A detailed description of an error should the one occur. It must be an empty dictionary object {} if success=1. If the object is not empty then it must have two fields:

attributemeaning
job_state

A completion status (state) of the corresponding job (classes SqlGetIndexesJob, SqlCreateIndexesJob, or SqlDropIndexesJob, depending on q request) a  serialized into a string. The explanation of the statuses could be found in the following header:

Look for this enum type:

enum ExtendedCompletionStatus {
    EXT_STATUS_NONE,
    EXT_STATUS_INVALID_PARAM,
    ..
};

In many cases (should some problem happen) expect the status value of ExtendedCompletionStatus::EXT_STATUS_MULTIPLE.  Then look for worker/table-specific errors in a JSON object explained below. 

workers

JSON object which has unique  identifiers of workers as the keys, where the corresponding value  for the worker is another JSON object which has names of worker-side tables as the next-level keys for descriptions of problems with managing indexes for the corresponding tables.

NOTE: only the problematic tables (if any) would be mentioned in the report. If no problems were seen during the index management operations then a collection of workers and tables will be empty.

Each table entry would have:

attributemeaning
request_status

A completion status (state) of the index creation request (classes SqlGetIndexesRequest, SqlCreateIndexesRequest, or SqlDropIndexesRequest) serialized into a string. The explanation of the statuses could be found in the following header:

 Look for this enum type:

enum ExtendedCompletionStatus {
    EXT_STATUS_NONE,
    EXT_STATUS_INVALID_PARAM,
    ..
};
request_errorThis string provides an expanded explanation of an error reported by the Replication system's worker (in case if the request failed on the worker's side and is reported to the service).

1.2.1. Reporting partial successes or failures

Since the index management requests may (will) involve multiple tables, the corresponding operations may be potentially partially successful and partially not successful. All failures for specific indexes which couldn't be managed (created, queried, or deleted) would be reported as explained in the previous section. For example, that would be a case if a request was made to drop a known to-exist index, and if no such index existed for some final tables. There may be various reasons why this might happen. An explanation of the reasons is beyond a scope of this document. The best way a user should treat this situation is to expect that the service would do the "best effort" of removing the index. It's also allowed to run the index removal request multiple times. This won't make any harm. All subsequent requests will report failures for all final tables in the specified group of tables.

2. Creating a new index

methodresource nameHTTP header of a requestrequest body typeresult type
POST/replication/sql/index Content-Type: application/json JSON 

JSON 

2.1. Request

The service expects a JSON object with a description of a new index to be sent in a request's body. Here is the general structure of this object:

{"database":<string>,
 "table":<string>,
 "overlap":<number>,
 "index":<string>,
 "spec":<string>,
 "comment":<string>,
 "columns":[
   {"column":<string>,
    "length":<number>,
    "ascending":<number>
   },
   ..
 ],
 "auth_key":<string>
}

Where:

attributemeaning
database 
The name of a database that is published.
table 
The base name  of a table, like LSST's ObjectSource, etc. Note that for the partitioned tables this name should not include chunk numbers or the FullOverlap  extension. The REST service will take care of locating all instances of the corresponding final tables residing at Qserv workers and it will send separate index creation requests for each of those.
overlap
An optional flag that only applies to the partitioned tables. If a value of the flag is 0 then only the chunk tables will be affected by a request. Otherwise (in the case of any other number), only the so-called full overlap tables will be affected. The default value of the flag is 0.
index
The name of an index to be created.
spec

One of the following keywords: DEFAULTUNIQUE, FULLTEXT, or SPATIAL. All but the first one (DEFAULT) are mentioned in the MySQL documentation for CREATE INDEX

CREATE [UNIQUE | FULLTEXT | SPATIAL] INDEX index_name ...

The REST service expects DEFAULT in those cases when none of the other three specifications are provided.

comment

An arbitrary comment for the index which will be passed via the COMMENT parameter into the MySQL index creation statement: 

CREATE ... INDEX ... COMMENT 'string' ...
columns

A non-empty array of JSON  objects keys mentioned in the key_part  of the index creation statements: 

CREATE ... INDEX index_name ON tbl_name (key_part,...) ...

NOTE: the current implementation of the service doesn't support the extended-expression syntax of key_part introduced in MySQL version 8.

These are the mandatory attributes of each key-specific object:

keymeaning
column
The name of an existing column.
length
The length of a substring used for the index. It only has a meaning for columns of types: TEXTCHAR[N]VARCHAR[N], BLOB , etc. And it must be always 0 for other types (numeric, etc.). Otherwise, an index creation request will fail. 
ascending
This translates into ASC  or DESC  option in the key definition in key_part. Allowed values are 0 for DESC or any other positive number to represent ASC.
auth_key
A value of the authorization key set for specific deployment of the Replication System.

2.2. Result

See the section Error reporting for an explanation of errors.

In case of successful completion of the request the JSON object will have the following schema:

{"success":1,
 "error":"",
 "error_ext":{}
}

2.3. Examples

Suppose we have the regular or fully replicated (using the Qserv terminology) table which has the following schema:

CREATE TABLE `sdss_stripe82_01`.`Science_Ccd_Exposure_NoFile` (
  `scienceCcdExposureId` bigint(20) NOT NULL,
  `run` int(11) NOT NULL,
  `filterId` tinyint(4) NOT NULL,
  `camcol` tinyint(4) NOT NULL,
  `field` int(11) NOT NULL,
  `path` varchar(255) NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

And, suppose we are going to create the PRIMARY  key index based on the very first column scienceCcdExposureId. In this case the request object might look like this:

{"database":"sdss_stripe82_01",
 "table":"Science_Ccd_Exposure_NoFile",
 "index":"PRIMARY",
 "spec":"UNIQUE",
 "comment":"This is the primary key index",
 "columns":[
   {"column":"scienceCcdExposureId",
    "length":0,
    "ascending":1
   }
 ],
 "auth_key":""
}

(warning) The request deliberately misses the overlap attribute since it won't apply to the regular tables.

The request could be tested using the command line tool curl which is (usually) readily available at all UNIX installations. Assuming a request is launched from the same host (or Kubernetes pod) where Master Replication Controller is running, and the service is configured on port 25080, the request could be launched as:

curl 'http://localhost:25080/replication/sql/index' \
  -X POST -H "Content-Type: application/json" \
  -d '{"database":"sdss_stripe82_01","table":"Science_Ccd_Exposure_NoFile","index":"PRIMARY","spec":"UNIQUE","comment":"This is the primary key index","columns":[{"column":"scienceCcdExposureId","length":0,"ascending":1}],"auth_key":""}'

3. Dropping an index

DELETE/replication/sql/index Content-Type: application/json JSON JSON

3.1. Request

The service expects a JSON object with a scope of the request to be sent in a request's body. Here is the general structure of this object:

{"database":<string>,
 "table":<string>,
 "overlap":<number>,
 "index":<string>,
 "auth_key":<string>
}

Where:

attributemeaning
database 
The name of a database. The database must be in the published state.
table 
The base name  of a table, like LSST's ObjectSource, etc. Note that for the partitioned tables this name should not include chunk numbers or the FullOverlap  extension. The REST service will take care of locating all instances of the corresponding final tables residing at Qserv workers and it will send separate index deletion requests for each of those.
overlap
An optional flag that only applies to the partitioned tables. If a value of the flag is 0 then only the chunk tables will be affected by a request. Otherwise (in the case of any other number), only the so-called full overlap tables will be affected. The default value of the flag is 0.
index
The name of an existing index (key) to be dropped.
auth_key
A value of the authorization key set for the specific deployment of the Replication System.

3.2. Result

See the section Error reporting for an explanation of errors.

In case of successful completion of the request the JSON object will have the following schema:

{"success":1,
 "error":"",
 "error_ext":{}
}

3.3. Examples

Suppose we have the partitioned table sdss_stripe82_01.RunDeepSource (an equivalent of the LSST's Object  table), and there is index  IDX_id_coadd_id which was previously created in the table. Then the index deletion request object would have the following payload:

{"database":"sdss_stripe82_01",
 "table":"RunDeepSource",
 "overlap":0,
 "index":"IDX_id_coadd_id",
 "auth_key":""
}

The request could be tested using the command line tool curl which is (usually) readily available at all UNIX installations. Assuming a request is launched from the same host (or Kubernetes pod) where Master Replication Controller is running, and the service is configured on port 25080, the request could be launched as:

curl 'http://localhost:25080/replication/sql/index' \
  -X DELETE -H "Content-Type: application/json" \
  -d '{"database":"sdss_stripe82_01","table":"RunDeepSource","overlap":0,"index":"IDX_id_coadd_id","auth_key":""}'

4. Inspecting which indexes already exist

This service was completely redesigned in version 17  of the API.

methodresource nameresult type
GET/replication/sql/index/:database/:table[?overlap={0|1}] JSON

4.1. Request

All parameters that are required by the service should be provided in the resource path or as the optional parameters of the query.

attributemeaning
database 
The name of a database. The database must be in the published state.
table 
The base name  of a table, like LSST's ObjectSource, etc. Note that for the partitioned tables this name should not include chunk numbers or the FullOverlap  extension. The REST service will take care of locating all instances of the corresponding final tables residing at Qserv workers and it will send separate index query requests for each of those.
overlap
An optional flag that only applies to the partitioned tables. If a value of the flag is 0 then only the chunk tables will be affected by a request. Otherwise (in the case of any other number), only the so-called full overlap tables will be affected. The default value of the flag is 0.

4.2. Result

All failures are reported as explained in the section Error reporting.

In case of successful completion of the request the JSON object will have the following schema:

{"status": {
    "database": <string>,
    "table":    <string>,
    "overlap":  <number>,
    "indexes": [
       {"name":               <string>,
        "unique":             <number>,
        "type":               <string>,
        "comment":            <string>,
        "status":             <string>,
        "num_replicas_total": <number>,
        "num_replicas":       <number>, 
        "columns": [
           {"name":      <string>,
            "seq":       <number>,
            "sub_part":  <number>,
            "collation": <string>
           }
           ...
        ]
       }
       ...
    ]
 }
}

Where, in addition to the standard attributes representing the completion status of the request (successerror, error_ext, etc.), the object will have a collection of the indexes in the dictionary status. The first attributes of the dictionary will mirror the corresponding input parameters of the service that were explained earlier: the name of the database, the name of the datable, and the overlap flag. The last attribute indexes will contain descriptions of the indexes found in the table:

attributedescription
name
The name of the index (the key)
unique
The numeric flag indicates of the index's keys are unique, where a value of 0 means they're unique. Any other value would mean the opposite.
type
The type of index, such as BTREE , SPATIAL , PRIMARY, FULLTEXT, etc.
comment

An optional explanation for the index passed to the index creation statement:

CREATE ... INDEX ... COMMENT 'string' ...
status

The status of the index. This parameter considers the aggregate status of the index across all replicas of the table. Possible values here are:

  • COMPLETE: the same index (same type, columns) is present in all replicas of the table (or its chunks in the case of the partitioned table)
  • INCOMPLETE: the same index is present in a subset of replicas of the table, where all indexes have the same definition.
  • INCONSISTENT: instances of the index that have the same name have different definitions in some replicas

(warning) The result object reported by the service will not provide any further details on the last status INCONSISTENT apart from indicating the inconsistency. It will be up to the data administrator to investigate which replicas have unexpected index definitions.

num_replicas_total
The total number of replicas that exist for the table. This is the target number of replicas where the index is expected to be present.
num_replicas
The number of replicas where the index was found to be present. If this number is not the same as the one reported in the attribute num_replicas_total then the index will be INCOMPLETE.
columns

The collection of columns that were included in the index definition. Each entry of the collection has:

attributedescription
nameThe name of the column
seqThe 1-based position of the column in the index.
sub_partThe index prefix. That is, the number of indexed characters if the column is only partly indexed 0 if the entire column is indexed.
collationHow the column is sorted in the index. This can have values ASC, DESC, or NOT_SORTED.

4.3. Examples

The following request will report indexes from the fully-replicated table ivoa.ObsCore:

curl http://localhost:25081/replication/sql/index/ivoa/ObsCore -X GET

The (truncated and formatted for readability) result of an operation performed in a Qserv deployment with 6-workers may look like this:

{"error":"",
 "error_ext":{},
 "status":{
  "database":"ivoa",
  "indexes":[
   {"name":"idx_dataproduct_subtype",
    "type":"BTREE",
    "unique":0,
    "status":"COMPLETE",
    "num_replicas":6,
    "num_replicas_total":6,
    "comment":"The regular index on the column dataproduct_subtype",
    "columns":[
     {"collation":"ASC","name":"dataproduct_subtype","seq":1,"sub_part":0}
    ]
   },
   {"name":"idx_s_region_bounds",
    "type":"SPATIAL",
    "unique":0,
    "status":"COMPLETE",
    "num_replicas":6,
    "num_replicas_total":6,
    "comment":"The spatial index on the geometric region s_region_bounds",
    "columns":[
     {"collation":"ASC","name":"s_region_bounds","seq":1,"sub_part":32}
    ]
   },
   {"name":"idx_lsst_tract_patch",
    "type":"BTREE",
    "unique":0,
    "status":"COMPLETE",
    "num_replicas":6,
    "num_replicas_total":6,
    "comment":"The composite index on the columns lsst_tract and lsst_patch",
    "columns":[
     {"collation":"ASC","name":"lsst_tract","seq":1,"sub_part":0},
     {"collation":"ASC","name":"lsst_patch","seq":2,"sub_part":0}
    ]
   },
   ...
}

(warning) The second index "idx_s_region_bounds" is spatial. It's based on the binary column of which only the first 32 bytes are indexed.

(warning) The third index "idx_lsst_tract_patch" is defined over two columns.


  • No labels