1. Scope
This is a summary of Qserv's HTTP-based frontend for submitting queries as developed in:
- - DM-42617Getting issue details... STATUS
- - DM-43465Getting issue details... STATUS
- - DM-43497Getting issue details... STATUS
The frontend complements the existing one based on mysql-proxy
.
Key features of the new API:
- It's based on the HTTP protocol.
- Result sets (data and schema) are returned in the JSON objects.
- Both synchronous and asynchronous query submission operations are supported.
- It provides a mechanism for tracking the progress of the asynchronously submitted queries.
- It allows query cancellation.
- It provides the protocol versioning to ensure the integrity of the distributed applications.
2. General guidelines
This section introduces the general guidelines for using the REST services of the frontend.
2.1. JSON
All services regardless of the HTTP method (GET
, POST
, etc.) report results in the JSON objects. The schema of the objects varies depending on a service called. A detailed explanation of the schema can be found in the corresponding sections where the services are explained.
2.2. Error reporting
If a service was able to receive a request it would always return a response with the HTTP code 200. The actual completion status of a request is returned in the JSON
object. The following attributes are related to the completion status:
attr | type | description |
---|---|---|
success | number | The binary flag indicating if the request succeeded:
|
error | string | An explanation of the error (if any). |
error_ext | object | The optional information on the error (if any). The content of the object is request-specific. Details can be found in the reference section of the corresponding request. |
warning | string | The optional warning for on-critical conditions. Note that warnings may be posted for both completed or failed requests. |
Other HTTP codes (3xx, 404, 5xx, etc.) could be also returned by the frontend's HTTP server or intermediate proxy servers.
2.3. Protocol Versioning
Each implementation of the API has a specific version. The version number will change if any changes to the implementation or the API that might affect users will be made. The current document will be also kept updated to reflect the latest available version of the API.
There are two ways the client applications can use the version numbers:
- "pull mode": Ask the frontend explicitly what version it implements and cross-check that number with the one expected by the application.
- "push mode": Pass the expected version number as a parameter when calling the services and let the services verify if that version matches one of the frontend implementations.
Application developers are free to use neither, either of two, or both methods of reinforcing their applications.
2.3.1. Pull mode
To support the first scenario, the API provides a special metadata service that will return the version number (along with some other information on the frontend):
method | resource name |
---|---|
GET | /meta/version |
The request object for this request is not required, or it could be an empty JSON
object {
}. In case of its successful completion, the service will return a JSON
object that will include the following attributes:
{ "kind":<string>, "name":<string>, "id":<number>, "instance_id":<string>, "version":<number>, ... }
Where:
attr | type | description |
---|---|---|
kind | string | The name of the service. The following name will be always reported: qserv-czar-query-frontend |
name | string | The unique name of the frontend within a given Qserv. The current implementation will always return: http |
id | number | The numeric identifier of the frontend within a given Qserv. The number returned here may vary. |
instance_id | string | An identifier of the Qserv. A value of the attribute depends on a particular deployment of Qserv. |
version | number | The current version number of the API. |
Example:
{"kind":"qserv-czar-query-frontend", "name":"http", "id":8, "instance_id":"qserv-prod", "version":33, "success":1, ... }
2.3.2. Push mode
In the case of the second scenario, an application will pass the desired version number as a request parameter. The number would be a part of the request's query for the method. For example, the following request for checking the status of the ongoing query might look like this:
curl 'http://localhost:4041/query-async/status/1234?version=33' -X GET
For other HTTP methods used by the API, the number is required to be provided within the body of a request as shown below:
curl 'http://localhost:4041/query-async' -X POST -H 'Content-Type: application/json' -d'{"version":33,"query":"SELECT ..."}'
If the number does not match expectations, such a request will fail and the service return the following response. Here is an example of what will happen if the wrong version number 29 is specified instead of 33 (as per the current version of the API):
{"success":0, "error":"The requested version 29 of the API is not in the range supported by the service.", "error_ext":{ "max_version":32, "min_version":32 }, "warning":"" }
3. Submitting queries
3.1. Synchronous interface
The following REST service implements the synchronous interface:
method | resource name |
---|---|
POST | /query |
A client is required to pass the JSON
object in the request body to specify what query needs to be executed. The object has the following schema:
{"query":<string>, "database":<string>, "binary_encoding":<string> }
Where:
attr | description | default |
---|---|---|
query | The query text in which the default database may be missing. In the latter case, a client needs to provide the name in a separate parameter. | |
database | The optional name of the default database for queries where no such name was provided explicitly. | "" |
binary_encoding | The optional format for encoding the binary data into JSON, where the following options for the values of the parameter are allowed:
The missing parameter will be treated as if the hexadecimal encoding was requested. Here is an example of the same sequence of 4-bytes encoded into the hexadecimal format:
The array representation of the same binary sequence would look like this:
| "hex" |
A call to this service will block a client application until one of the following events will happen:
- The query will be successfully processed and its result is returned to a caller.
- The query will fail and the corresponding error will be returned to the caller.
- The frontend will become unavailable (due to a crash, restart, or networking problem, etc.) and a network connection will be lost.
In case of the successful completion of the request, the service will return a result set in the JSON object explained in the section Result sets.
3.2. Asynchronous interface
The following REST service implements the asynchronous interface:
method | resource name |
---|---|
POST | /query-async |
A client is required to pass the JSON
object in the request body to specify what query needs to be executed. The object has the following schema:
{"query":<string>, "database":<string> }
Where:
attr | description | default |
---|---|---|
query | The query text in which the default database may be missing. In the latter case, a client needs to provide the name in a separate parameter. | |
database | The optional name of the default database for queries where no such name was provided explicitly. | "" |
A call to this service will normally block a client application for a short period until one of the following events will happen:
- The query will be successfully analyzed and queued for asynchronous processing by Qserv and a response object with the unique identifier of the query returned to a caller.
- The query will fail and the corresponding error will be returned to the caller.
- The frontend will become unavailable (due to a crash, restart, or networking problem, etc.) and a network connection will be lost.
In case of the successful completion of the request, the service will return the following JSON object:
{"queryId":<number>}
The number reported in the object should be further used for making the following requests explained in the dedicated subsections below:
- checking the status of the query to see when it's finished
- requesting a result set of the query
- or, canceling the query if needed
3.2.1. Checking the status of the ongoing query
This service can be also used for checking the status of queries submitted via the synchronous interface, provided the unique identifier of such query is known to a user.
The status of the query can be checked using:
method | resource name |
---|---|
GET | /query-async/status/<queryId> |
Where:
attr | type | description |
---|---|---|
queryId | number | The unique identifier of the previously submitted query. |
If the identifier of the query is not valid then the service will report an error in the response object. For example, consider the following request:
curl 'http://localhost:4041/query-async/status/123?version=32' -X GET
It might result in the following response:
{"success":0, "error":"failed to obtain info for queryId=123, ex: Czar::getQueryInfo Unknown user query, err=, sql=SELECT status,messageTable,resultQuery FROM QInfo WHERE queryId=123", "error_ext":{}, ... }
If the query identifier is valid then the following object will be returned:
{"success":1, .. "status":{ "queryId":<number>, "status":<string>, "totalChunks":<number>, "completedChunks":<number>, "queryBeginEpoch":<number>, "lastUpdateEpoch":<number> } }
Where the "status" dictionary of the response will have the following attributes:
attr | type | description | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
queryId | number | The unique identifier of the previously submitted query. | ||||||||||
status | string | The ongoing status of the query can have one of the following values:
| ||||||||||
totalChunks | number | The total number of so-called "chunks" (spatial shards used for splitting the large tables in Qserv into smaller sub-tables to be distributed across worker nodes of Qserv). | ||||||||||
completedChunks | number | The number of chunks that have been processed by Qserv so far. The value of this parameter varies from 0 to the maximum number reported in the above-mentioned attribute | ||||||||||
queryBeginEpoch | number | The 32-bit number representing the start time of the query expressed in seconds since the UNIX Epoch | ||||||||||
lastUpdateEpoch | number | The 32-bit number represents the last time when the query status was recorded/updated by the Qserv internal monitoring system. The timestamp is the number of seconds since the UNIX Epoch. The number will be 0 if the query processing hasn't started or the query hasn't been inspected by the monitoring system. |
Here is an example of the status inquiry request that succeeded:
{"success":1, "status":{ "queryId":310554, "status":"EXECUTING", "totalChunks":1477, "completedChunks":112, "queryBeginEpoch":1708141345, "lastUpdateEpoch":1708141359 } }
Users could use the status service to estimate when the query will finish. Normally, the client applications are encouraged to wait before the query reaches the status "COMPLETED" and fetch a result set by calling another service explained next.
3.2.2. Requesting result sets
The resultsets could be retrieved by calling the following service:
method | resource name |
---|---|
GET | /query-async/result/<queryId>[?binary_encoding=<encoding>] |
Where:
attr | type | description | default |
---|---|---|---|
queryId | number | The unique identifier of the previously submitted query. | |
binary_encoding | string | The optional format for encoding the binary data into JSON, where the following options for the values of the parameter are allowed:
The missing parameter will be treated as if the hexadecimal encoding was requested. Here is an example of the same sequence of 4-bytes encoded into the hexadecimal format:
The array representation of the same binary sequence would look like this:
| "hex" |
Like in the case of the status inquiry request, if the query identifier is not valid then the service will report an error in the response object. Otherwise, a JSON object explained in the section Result sets will be returned.
3.3. Result sets
Both flavors of the query submission services will return the following JSON object in case of the successful completion of the queries:
{"schema":[ [{"table":<string>, "column":<string>, "type":<string>, "is_binary":<number>], // Col 0 [{"table":<string>, "column":<string>, "type":<string>, "is_binary":<number>], // Col 1 ... [{"table":<string>, "column":<string>, "type":<string>, "is_binary":<number>], // Col (NUM_COLUMNS - 1) ], "rows":[ // Col 0 Col 1 Col (NUM_COLUMNS - 1) // ------ -------- -------- [<string>, <string>, ... <string>], // Result row 0 [<string>, <string>, ... <string>], // Result row 1 ... [<string>, <string>, ... <string>], // Result row (NUM_ROWS - 1) ], ... }
Where:
attr | type | description | |||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
schema | array | A collection of rows, in which each row is a dictionary representing a definition of the corresponding column of the result set:
The number of columns will match the number of columns in the collection of result rows. | |||||||||||||||
rows | array | A collection of the result rows, where each row is a row of strings representing values at positions of the corresponding columns (see schema attribute above). |
For example, consider the following query submission request:
curl 'http://localhost:4041/query' -X POST-H 'Content-Type: application/json' \ -d'{"version":32,"query":"SELECT objectId,coord_ra,coord_dec FROM dp02_dc2_catalogs.Object LIMIT 5"}'
The query could return:
{"schema":[ {"column":"objectId","table":"","type":"BIGINT(20)"}, {"column":"coord_ra","table":"","type":"DOUBLE"}, {"column":"coord_dec","table":"","type":"DOUBLE"}], "rows":[ ["1248640588874548987","51.5508603","-44.5061095"], ["1248640588874548975","51.5626104","-44.5061529"], ["1248640588874548976","51.5625138","-44.5052961"], ["1248640588874548977","51.3780995","-44.5072101"], ["1248640588874548978","51.374245","-44.5071616"] ], "success":1, "warning":"", "error":"", "error_ext":{} }
4. Canceling queries
This service can be used for terminating queries submitted via the synchronous interface, provided the unique identifier of such query is known to a user.
The status of the query can be checked using:
method | resource name |
---|---|
DELETE | /query-async/<queryId> |
Where:
attr | type | description |
---|---|---|
queryId | number | The unique identifier of the previously submitted query. |
If the query identifier is not valid then the service will report an error in the response object. For example, consider the following request:
curl 'http://localhost:4041/query-async/123?version=32' -X DELETE
It might result in the following response:
{"success":0, "error":"failed to obtain info for queryId=123, ex: Czar::getQueryInfo Unknown user query, err=, sql=SELECT status,messageTable,resultQuery FROM QInfo WHERE queryId=123", "error_ext":{}, ... }
If the query identifier is valid then the following object will be returned:
{"success":1, ... }