...

  1. When "DROP TABLE XYZ" is requested through one of our czars, the czar sets the value of /DBS/<dbName>/TABLES/XYZ in CSS to "PRE_DELETE_<date>" and returns the uuid for that table. The <date> indicates when the pre-delete was initiated. Note that query execution has to pay attention to the values of the keys /DBS/<dbName>/TABLES/<tableName>: it cannot schedule a query against any table unless that table is in the "READY" state.
  2. The value of /DBS/<dbName>/TABLES/XYZ is watched by a deletion watcher. There is only one such watcher (i.e., it is not per worker). The watcher is considered best-effort: it can fail, it can miss deletes, it is unreliable. When the watcher wakes up on "PRE_DELETE", it sleeps for a short amount of time (30 sec) to ensure that all czars have had enough time to refresh their state and learn about the pending delete. If, for some reason, some czars fail to refresh their state during that time and start scheduling queries against the XYZ table, these queries will most likely die.
  3. The watcher then scans the list of long-running queries and looks for queries that involve the XYZ table. Note that this means the CSS metadata keeping track of active queries needs to keep track of the tables involved in each query. If there are queries on that list involving the XYZ table, the watcher waits and periodically re-checks, and proceeds only when all such queries have completed.
  4. When there are no more active queries on that table, the watcher removes the entry /DBS/<dbName>/TABLES/XYZ and enters /DELETING/DBS/DELETING/<dbName>/TABLES/XYZ_<uuid>. Note that when this happens, queries on the XYZ table that is being deleted will fail. Also note that at this point "CREATE TABLE XYZ" will be accepted; however, individual workers can reject it if the chunks for the XYZ table that is being deleted have not been removed yet.
  5. The watcher then sends a message to the Data Distribution System: "DROP TABLE XYZ" (or maybe it does so for each chunk of XYZ; TBD, depending on how much the Data Distribution System will know).
  6. The Data Distribution System is responsible for deleting all replicas of a given chunk.
  7. A separate process that watches the overall health of the system will periodically clean up entries in /DELETING/DBS/DELETING.
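
The steps above can be sketched as a small in-memory simulation. This is only an illustration under stated assumptions, not the actual implementation: CSS is modeled as a plain dict, the table uuids are looked up in a hypothetical `uuids` mapping, the active-query list is a hypothetical dict from query id to the set of (db, table) pairs it touches, and `notify_dds` stands in for whatever message is eventually sent to the Data Distribution System. All function names are made up for the example.

```python
from datetime import datetime, timezone


def table_key(db, table):
    # CSS key holding the table state, as named in the steps above.
    return f"/DBS/{db}/TABLES/{table}"


def deleting_key(db, table, table_uuid):
    # CSS key entered once the table is actually being deleted.
    return f"/DELETING/DBS/DELETING/{db}/TABLES/{table}_{table_uuid}"


def request_drop(css, uuids, db, table, now=None):
    """Step 1 (czar): mark the table as PRE_DELETE and return its uuid.

    `uuids` is a hypothetical (db, table) -> uuid mapping.
    """
    now = now or datetime.now(timezone.utc)
    css[table_key(db, table)] = f"PRE_DELETE_{now.isoformat()}"
    return uuids[(db, table)]


def can_schedule(css, db, table):
    """Query execution may only schedule against tables in the READY state."""
    return css.get(table_key(db, table)) == "READY"


def watcher_try_delete(css, uuids, active_queries, db, table, notify_dds):
    """Steps 3-5 (watcher): hand the table off for deletion if it is idle.

    Returns True if the table was handed off, False if the watcher has to
    wait and re-check later. The 30 sec grace period from step 2 is left
    to the caller.
    """
    key = table_key(db, table)
    if not css.get(key, "").startswith("PRE_DELETE"):
        return False  # nothing to do: the table is not pending deletion
    if any((db, table) in tables for tables in active_queries.values()):
        return False  # step 3: queries still running on this table
    # Step 4: swap the table entry for a /DELETING entry.
    del css[key]
    css[deleting_key(db, table, uuids[(db, table)])] = ""
    # Step 5: tell the (hypothetical) Data Distribution System hook.
    notify_dds(db, table)
    return True
```

A watcher loop would call `watcher_try_delete` repeatedly, sleeping between attempts, until it returns True. Since the watcher is best-effort, the periodic cleanup process in step 7 remains necessary in any case.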

Note that the above is missing steps needed for provenance tracking. I assume that will be dealt with under a separate ticket.

...

  • /DBS/<dbName>/TABLES/<tableName>
    • PENDING: the table is currently being created
    • READY: indicates the table is ready to be used / queried
    • PRE_DELETE <date>: indicates that the table is about to be deleted. The <date> indicates the date/time when the deletion was requested
  • /DELETING/DBS/DELETING/<dbName>/TABLES/<tableName>_<uuid>
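
As an illustration of how a client might interpret the values stored under /DBS/<dbName>/TABLES/<tableName>, here is a minimal parsing sketch. The names `TableState` and `parse_table_state` are hypothetical. The page writes the pre-delete value both as "PRE_DELETE_<date>" and as "PRE_DELETE <date>", so the sketch accepts either separator; that leniency is an assumption, not a documented format.

```python
from enum import Enum


class TableState(Enum):
    PENDING = "PENDING"        # table is currently being created
    READY = "READY"            # table can be used / queried
    PRE_DELETE = "PRE_DELETE"  # deletion has been requested


def parse_table_state(value):
    """Split a CSS table-state value into (state, date string or None).

    Raises ValueError for values that are not one of the known states.
    """
    if value.startswith("PRE_DELETE"):
        # Accept "_" or " " between the state name and the date (assumption).
        date = value[len("PRE_DELETE"):].lstrip("_ ")
        return TableState.PRE_DELETE, (date or None)
    return TableState(value), None
```

With this helper, the scheduling rule reduces to checking that the parsed state is `TableState.READY`.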

...