1. Scope

This is a summary of Qserv's HTTP-based frontend for submitting queries as developed in:

The frontend complements the existing one based on  mysql-proxy.

Key features of the new API:

  • It's based on the HTTP protocol.
  • Result sets (data and schema) are returned in the JSON objects.
  • Both synchronous and asynchronous query submission operations are supported.
  • It provides a mechanism for tracking the progress of the asynchronously submitted queries.
  • It allows query cancellation.
  • It provides the protocol versioning to ensure the integrity of the distributed applications.

2. General guidelines

This section introduces the general guidelines for using the REST services of the frontend. 

2.1. JSON

All services regardless of the HTTP method (GET , POST , etc.) report results in the JSON objects. The schema of the objects varies depending on a service called. A detailed explanation of the schema can be found in the corresponding sections where the services are explained. 

2.2. Error reporting

If a service was able to receive a request it would always return a response with the HTTP code 200. The actual completion status of a request is returned in the JSON object. The following attributes are related to the completion status:

attrtypedescription
success
number

The binary flag indicating if the request succeeded:

  • 0 if a request failed (then see attributes warning, error, and error_ext)
  • any other number if a request succeeded (also inspect the attribute warning for non-critical notifications)
error
string

An explanation of the error (if any).

error_ext
object

The optional information on the error (if any). The content of the object is request-specific. Details can be found in the reference section of the corresponding request.

warning
string

The optional warning for on-critical conditions. Note that warnings may be posted for both completed or failed requests.

Other HTTP codes  (3xx, 404, 5xx, etc.) could be also returned by the frontend's HTTP server or intermediate proxy servers. 

2.3. Protocol Versioning

The API adheres to the optional version control mechanism introduced in:

DM-35456 - Getting issue details... STATUS

Application developers are encouraged to use the mechanism to reinforce the integrity of the applications. The document corresponds to version number 33 of the API.

Each implementation of the API has a specific version. The version number will change if any changes to the implementation or the API that might affect users will be made. The current document will be also kept updated to reflect the latest available version of the API.

There are two ways the client applications can use the version numbers:

  • "pull mode": Ask the frontend explicitly what version it implements and cross-check that number with the one expected by the application.
  • "push mode": Pass the expected version number as a parameter when calling the services and let the services verify if that version matches one of the frontend implementations.

Application developers are free to use neither, either of two, or both methods of reinforcing their applications.

2.3.1. Pull mode

To support the first scenario, the API provides a special metadata service that will return the version number (along with some other information on the frontend):

methodresource name
GET/meta/version 

The request object for this request is not required, or it could be an empty JSON  object {}. In case of its successful completion, the service will return a JSON object that will include the following attributes:

{
  "kind":<string>,
  "name":<string>,
  "id":<number>,
  "instance_id":<string>,
  "version":<number>,
  ...
}

Where:

attrtypedescription
kind
string

The name of the service. The following name will be always reported:

qserv-czar-query-frontend
name
string

The unique name of the frontend within a given Qserv. The current implementation will always return:

http
id
number

The numeric identifier of the frontend within a given Qserv. The number returned here may vary.

instance_id
string
An identifier of the Qserv. A value of the attribute depends on a particular deployment of Qserv.
version
number
The current version number of the API.

Example:

{"kind":"qserv-czar-query-frontend",
 "name":"http",
 "id":8,
 "instance_id":"qserv-prod",
 "version":33,
 "success":1,
 ...
}

2.3.2. Push mode

In the case of the second scenario, an application will pass the desired version number as a request parameter. The number would be a part of the request's query for the method. For example, the following request for checking the status of the ongoing query might look like this:

curl 'http://localhost:4041/query-async/status/1234?version=33' -X GET

For other HTTP methods used by the API, the number is required to be provided within the body of a request as shown below:

curl 'http://localhost:4041/query-async' -X POST -H 'Content-Type: application/json' -d'{"version":33,"query":"SELECT ..."}'

If the number does not match expectations, such a request will fail and the service return the following response. Here is an example of what will happen if the wrong version number 29 is specified instead of 33  (as per the current version of the API):

{"success":0,
 "error":"The requested version 29 of the API is not in the range supported by the service.",
 "error_ext":{
  "max_version":32,
  "min_version":32
 },
 "warning":""
}

3. Submitting queries

3.1. Synchronous interface

The following REST service implements the synchronous interface:

methodresource name
POST/query

A client is required to pass the JSON  object in the request body to specify what query needs to be executed. The object has the following schema:

{"query":<string>,
 "database":<string>,
 "binary_encoding":<string>
}

Where:

attrdescriptiondefault
query
The query text in which the default database may be missing. In the latter case, a client needs to provide the name in a separate parameter. 
database
The optional name of the default database for queries where no such name was provided explicitly.""
binary_encoding

The optional format for encoding the binary data into JSON, where the following options for the values of the parameter are allowed:

  • hex - for serializing each byte into the hexadecimal format of 2 ASCII characters per each byte of the binary data, where the encoded characters will be in a range of ‘0' .. ‘F’ . In this case, the encoded value will be packaged into the JSON string.

  • array - for serializing bytes into the JSON array of numbers in a range of 0 … 255.

The missing parameter will be treated as if the hexadecimal encoding was requested.

Here is an example of the same sequence of 4-bytes encoded into the hexadecimal format:

"0A11FFD2"

The array representation of the same binary sequence would look like this:

[10,17,255,210]

"hex"

A call to this service will block a client application until one of the following events will happen:

  • The query will be successfully processed and its result is returned to a caller.
  • The query will fail and the corresponding error will be returned to the caller.
  • The frontend will become unavailable (due to a crash, restart, or networking problem, etc.) and a network connection will be lost.

In case of the successful completion of the request, the service will return a result set in the JSON object explained in the section Result sets.

3.2. Asynchronous interface

The following REST service implements the asynchronous interface:

methodresource name
POST/query-async

A client is required to pass the JSON  object in the request body to specify what query needs to be executed. The object has the following schema:

{"query":<string>,
 "database":<string>
}

Where:

attrdescriptiondefault
query
The query text in which the default database may be missing. In the latter case, a client needs to provide the name in a separate parameter. 
database
The optional name of the default database for queries where no such name was provided explicitly.""

A call to this service will normally block a client application for a short period until one of the following events will happen:

  • The query will be successfully analyzed and queued for asynchronous processing by Qserv and a response object with the unique identifier of the query returned to a caller.
  • The query will fail and the corresponding error will be returned to the caller.
  • The frontend will become unavailable (due to a crash, restart, or networking problem, etc.) and a network connection will be lost.

In case of the successful completion of the request, the service will return the following JSON object:

{"queryId":<number>}

The number reported in the object should be further used for making the following requests explained in the dedicated subsections below:

  • checking the status of the query to see when it's finished
  • requesting a result set of the query
  • or, canceling the query if needed

3.2.1. Checking the status of the ongoing query

(warning) This service can be also used for checking the status of queries submitted via the synchronous interface, provided the unique identifier of such query is known to a user.

The status of the query can be checked using:

methodresource name
GET/query-async/status/<queryId> 

Where:

attrtypedescription
queryId
number

The unique identifier of the previously submitted query.

If the identifier of the query is not valid then the service will report an error in the response object. For example, consider the following request:

curl 'http://localhost:4041/query-async/status/123?version=32' -X GET

It might result in the following response:

{"success":0,
 "error":"failed to obtain info for queryId=123, ex: Czar::getQueryInfo Unknown user query, err=, sql=SELECT status,messageTable,resultQuery FROM QInfo WHERE queryId=123",
 "error_ext":{},
 ...
}

If the query identifier is valid then the following object will be returned:

{"success":1,
 ..
 "status":{
   "queryId":<number>,
   "status":<string>,
   "totalChunks":<number>,
   "completedChunks":<number>,
   "queryBeginEpoch":<number>,
   "lastUpdateEpoch":<number>
 }
}

Where the "status" dictionary of the response will have the following attributes:

attrtypedescription
queryId
number

The unique identifier of the previously submitted query.

status
string

The ongoing status of the query can have one of the following values:

valuemeaning
EXECUTING
The query processing is still in progress. 
COMPLETED
The query has been completed.
FAILED

The query failed.

(warning) Unfortunately, the current implementation doesn't retain reasons for query failures. This problem will be addressed in the future release of Qserv.

ABORTED

The query was aborted:

  • explicitly by a user using the query cancellation REST service explained in the document.
  • or, implicitly by Qserv if the intermediate result set of the query exceeds the large result limit (which is configured by the Qserv administrators). 
  • or, implicitly when the query processing service was restarted due to some failure or by Qserv administrators.
totalChunks
number

The total number of so-called "chunks" (spatial shards used for splitting the large tables in Qserv into smaller sub-tables to be distributed across worker nodes of Qserv).

completedChunks
number

The number of chunks that have been processed by Qserv so far. The value of this parameter varies from 0  to the maximum number reported in the above-mentioned attribute totalChunks.

queryBeginEpoch
number

The 32-bit number representing the start time of the query expressed in seconds since the UNIX Epoch

lastUpdateEpoch
number

The 32-bit number represents the last time when the query status was recorded/updated by the Qserv internal monitoring system. The timestamp is the number of seconds since the UNIX Epoch. The number will be 0  if the query processing hasn't started or the query hasn't been inspected by the monitoring system.

Here is an example of the status inquiry request that succeeded:

{"success":1,
 "status":{
   "queryId":310554,
   "status":"EXECUTING",
   "totalChunks":1477,
   "completedChunks":112,
   "queryBeginEpoch":1708141345,
   "lastUpdateEpoch":1708141359
  }
}

Users could use the status service to estimate when the query will finish. Normally, the client applications are encouraged to wait before the query reaches the status "COMPLETED" and fetch a result set by calling another service explained next.

3.2.2. Requesting result sets

The resultsets could be retrieved by calling the following service:

methodresource name
GET/query-async/result/<queryId>[?binary_encoding=<encoding>] 

Where:

attrtypedescriptiondefault
queryId
number

The unique identifier of the previously submitted query.


binary_encoding
string

The optional format for encoding the binary data into JSON, where the following options for the values of the parameter are allowed:

  • hex - for serializing each byte into the hexadecimal format of 2 ASCII characters per each byte of the binary data, where the encoded characters will be in a range of ‘0' .. ‘F’ . In this case, the encoded value will be packaged into the JSON string.

  • array - for serializing bytes into the JSON array of numbers in a range of 0 … 255.

The missing parameter will be treated as if the hexadecimal encoding was requested.

Here is an example of the same sequence of 4-bytes encoded into the hexadecimal format:

"0A11FFD2"

The array representation of the same binary sequence would look like this:

[10,17,255,210]

"hex"

Like in the case of the status inquiry request, if the query identifier is not valid then the service will report an error in the response object. Otherwise, a JSON object explained in the section Result sets will be returned.

3.3. Result sets

Both flavors of the query submission services will return the following JSON object in case of the successful completion of the queries:

{"schema":[
  [{"table":<string>, "column":<string>, "type":<string>, "is_binary":<number>],   // Col 0
  [{"table":<string>, "column":<string>, "type":<string>, "is_binary":<number>],   // Col 1
  ...
  [{"table":<string>, "column":<string>, "type":<string>, "is_binary":<number>],   // Col (NUM_COLUMNS - 1)
 ],
 "rows":[
   // Col 0   Col 1         Col (NUM_COLUMNS - 1)
   // ------  --------      --------
   [<string>, <string>, ... <string>],                       // Result row 0
   [<string>, <string>, ... <string>],                       // Result row 1
   ...
   [<string>, <string>, ... <string>],                       // Result row (NUM_ROWS - 1)
  ],
 ...
}

Where:

attrtypedescription
schema
array

A collection of rows, in which each row is a dictionary representing a definition of the corresponding column of the result set:

attrtypedescription
table
string

The name of the result table. The string will be empty in the current version of the frontend.

column
string

The name of a column.

type
string

The MySQL type of the column as in the MySQL CREATE TABLE ... statement.

is_binary
number

The numeric flag indicates if the column type represents the binary type. The MySQL binary types are listed below:

A value that is not 0  indicates the binary type.

(warning) Binary values need to be processed according to a format specified in the optional attribute "binary_encoding" mentioned in the descriptions of the query submission and result retrieval services:

(warning) The number of columns will match the number of columns in the collection of result rows.

rows
array

A collection of the result rows, where each row is a row of strings representing values at positions of the corresponding columns (see schema attribute above).

For example, consider the following query submission request:

curl 'http://localhost:4041/query' -X POST-H 'Content-Type: application/json' \
    -d'{"version":32,"query":"SELECT objectId,coord_ra,coord_dec FROM dp02_dc2_catalogs.Object LIMIT 5"}'

The query could return:

{"schema":[
  {"column":"objectId","table":"","type":"BIGINT(20)"},
  {"column":"coord_ra","table":"","type":"DOUBLE"},
  {"column":"coord_dec","table":"","type":"DOUBLE"}],
 "rows":[
  ["1248640588874548987","51.5508603","-44.5061095"],
  ["1248640588874548975","51.5626104","-44.5061529"],
  ["1248640588874548976","51.5625138","-44.5052961"],
  ["1248640588874548977","51.3780995","-44.5072101"],
  ["1248640588874548978","51.374245","-44.5071616"]
 ],
 "success":1,
 "warning":"",
 "error":"",
 "error_ext":{}
}

4. Canceling queries

(warning) This service can be used for terminating queries submitted via the synchronous interface, provided the unique identifier of such query is known to a user.

The status of the query can be checked using:

methodresource name
DELETE/query-async/<queryId> 

Where:

attrtypedescription
queryId
number

The unique identifier of the previously submitted query.

If the query identifier is not valid then the service will report an error in the response object. For example, consider the following request:

curl 'http://localhost:4041/query-async/123?version=32' -X DELETE

It might result in the following response:

{"success":0,
 "error":"failed to obtain info for queryId=123, ex: Czar::getQueryInfo Unknown user query, err=, sql=SELECT status,messageTable,resultQuery FROM QInfo WHERE queryId=123",
 "error_ext":{},
 ...
}

If the query identifier is valid then the following object will be returned:

{"success":1,
 ...
}





  • No labels