This use case extends the previous one, "Ingesting ALL chunks independently (single transaction)".

Workflow organization:

  • Chunk contributions are ingested across multiple independent transactions.
  • All contributions of a particular chunk are ingested via the same transaction.
  • Each chunk is allocated and ingested independently of the others.
  • Each transaction is committed independently, once all contributions to its chunks have been uploaded successfully.
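
The organization above can be sketched in a few lines of Python. This is a minimal in-memory illustration, not the ingest system's API: the `Transaction` class, the `assign_chunks` and `ingest` helpers, the `upload` callback, and the state names are all hypothetical. Round-robin assignment is just one possible way to map chunks to transactions, and aborting a transaction on an upload failure is an assumption of the sketch. The transactions are processed one after another for brevity, although nothing in the scheme prevents running them in parallel.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Transaction:
    """Hypothetical in-memory stand-in for an ingest transaction."""
    tid: int
    state: str = "STARTED"  # illustrative states: STARTED -> FINISHED or ABORTED
    uploaded: list = field(default_factory=list)


def assign_chunks(chunks, num_transactions):
    """Assign every chunk to exactly one transaction (round-robin)."""
    groups = defaultdict(list)
    for i, chunk in enumerate(sorted(chunks)):
        groups[i % num_transactions].append(chunk)
    return dict(groups)


def ingest(contributions, num_transactions, upload):
    """Ingest {chunk: [contribution, ...]} using independent transactions.

    `upload(tid, chunk, contribution)` is a caller-supplied (hypothetical)
    function that raises IOError when an upload fails.
    """
    results = []
    for tid, chunks in assign_chunks(contributions, num_transactions).items():
        trans = Transaction(tid)
        try:
            for chunk in chunks:
                # All contributions of a chunk go through the same transaction.
                for contribution in contributions[chunk]:
                    upload(tid, chunk, contribution)
                    trans.uploaded.append((chunk, contribution))
            trans.state = "FINISHED"  # commit: every upload succeeded
        except IOError:
            trans.state = "ABORTED"   # only this transaction's chunks are affected
        results.append(trans)
    return results


def flaky_upload(tid, chunk, contribution):
    """Simulated upload that fails for one particular contribution."""
    if contribution == "c.csv":
        raise IOError("simulated upload failure")


contributions = {10: ["a.csv", "b.csv"], 11: ["c.csv"], 12: ["d.csv"]}
results = ingest(contributions, 2, flaky_upload)
for trans in results:
    print(trans.tid, trans.state, trans.uploaded)
```

In this run chunks 10 and 12 land in one transaction, which is committed, while the transaction holding chunk 11 is aborted without affecting the other.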

Here is a diagram illustrating the idea:

[Diagram: many_chunks_many_transactions]

Things to keep in mind:

  • Although this scheme assumes that each chunk is "assigned" to one transaction, this is not strictly required. The system allows allocating the same chunk and ingesting contributions into it from any transaction, or from many transactions. Just make sure not to ingest the same set of rows (the same set of contributions) within more than one transaction. The very same rule applies to any other workflow.
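
One way to guard against this is to check the ingest plan before uploading anything. The sketch below is an illustration only: the `find_duplicate_contributions` helper and the plan layout (a dictionary mapping a transaction identifier to a list of `(chunk, contribution)` pairs) are hypothetical and not part of the ingest system.

```python
from collections import defaultdict


def find_duplicate_contributions(plan):
    """Return contributions that are scheduled in more than one transaction.

    `plan` maps a transaction id to a list of (chunk, contribution) pairs.
    The result maps each offending pair to the sorted list of transaction ids.
    """
    seen = defaultdict(set)
    for tid, items in plan.items():
        for item in items:
            seen[item].add(tid)
    return {item: sorted(tids) for item, tids in seen.items() if len(tids) > 1}


plan = {
    1: [(10, "a.csv"), (11, "c.csv")],
    2: [(10, "b.csv"), (11, "c.csv")],  # (11, "c.csv") is also in transaction 1
}
duplicates = find_duplicate_contributions(plan)
print(duplicates)
```

Here chunk 10 receives different contributions from both transactions, which is allowed, while the contribution `c.csv` of chunk 11 is reported because it appears in both.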

Best use:

  • When a workflow ingests a large amount of data that can be separated into independently ingested groups of chunks. Remember that transactions provide a mechanism for mitigating failures, so a failure in one group need not affect the others.