CSS v2

Related epic: DM-2882

Related stories: DM-2966

Problem Description

The existing version of CSS reads all information from Zookeeper (in Python) when czar starts, makes a snapshot, and makes it available to the Qserv C++ layer. All the information is then statically cached throughout the lifetime of czar. If any information in Zookeeper changes, czar will not know. With the introduction of features such as creating/deleting databases, or keeping track of long-running queries, this model of static caching is no longer sufficient.

Information about databases and tables has to be delivered very quickly. Querying CSS for each query in real time is generally undesirable, both for (a) performance and (b) the risk of overloading CSS – there can be O(100-200) calls for various CSS-based information per query.

Information about all databases and tables involved in a single query has to be consistent. The current system reads individual keys one by one. This is OK while the information is all static, but once we switch to a more dynamic system we need to pay attention, because something might be updated or deleted between key fetches. So we need to ensure that all information fetched from CSS for a given query is consistent.

Information does not have to be completely up-to-date, e.g., it is OK to use a few-second-old cache.

Current Implementation 

When a query arrives, Qserv parses it, extracts the list of databases and tables involved, and then asks for various pieces of information pertaining to each of those databases and tables. The CSS information is cached through the Facade, which relies on an underlying memory-based "KVInterface" owned by UserQueryFactory. The cache is initialized as follows:

UserQueryFactory::Impl::initFacade()       at ccontrol/UserQueryFactory.cc:213
UserQueryFactory::Impl::readConfigFacade() at ccontrol/UserQueryFactory.cc:192
UserQueryFactory::UserQueryFactory()       at ccontrol/UserQueryFactory.cc:87
Context::_initFactory()                    at czar/python/app.py:278
Context::_init()                           at czar/python/app.py:259
AppInterface::submitQuery()                at czar/python/appInterface.py:109

_initFactory() is called only once ("if not initialized then initialize"). It creates a Facade object, which is then cached in UserQueryFactory::Impl for the lifetime of czar. As new queries arrive, we create a new QuerySession for each query. QuerySession triggers code in qana (QservRestrictorPlugin.cc, TableInfoPool.cc, MatchTablePlugin.cc, RelationGraph.cc) and query (ChunkMapping.cc); that code requests information from the Facade. Details about these interactions are provided below in Appendix A: Qserv Interactions with CSS. QuerySession then creates a UserQuery object, which does not hold a pointer to the Facade. After creating the UserQuery, the QuerySession goes out of scope. That means that for each query we hold a pointer to the Facade only for a very short time.
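The lifetime described above can be sketched with shared_ptr: the factory owns the Facade for the lifetime of czar, and each QuerySession borrows it only while the query is being analyzed. All class and method names here are simplified stand-ins for the real Qserv classes, not the actual interfaces.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for the real CSS Facade.
struct Facade {
    bool containsDb(std::string const& db) const { return db == "LSST"; }
};

// The session shares the Facade pointer only for its own (short) lifetime.
struct QuerySession {
    explicit QuerySession(std::shared_ptr<Facade> f) : _facade(std::move(f)) {}
    bool check(std::string const& db) const { return _facade->containsDb(db); }
private:
    std::shared_ptr<Facade> _facade;
};

struct UserQueryFactory {
    UserQueryFactory() : _facade(std::make_shared<Facade>()) {}  // cached for czar's lifetime
    long facadeUseCount() const { return _facade.use_count(); }
    bool runQuery(std::string const& db) {
        QuerySession session(_facade);  // Facade shared during query analysis only
        return session.check(db);       // session goes out of scope on return
    }
private:
    std::shared_ptr<Facade> _facade;
};
```

Because the UserQuery created by the session does not keep the pointer, the use count drops back to one as soon as analysis finishes.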

Proposed Design

  • Define how often we want to refresh the cache. Configurable, with default = 15 sec.
  • Keep track of when the last refresh was done (in Python land, because only Python land currently knows how to talk to Zookeeper), i.e., in app.py.
  • Instead of one Facade, keep a vector of Facades in UserQueryFactory.
  • When a new query comes in and CSS is due for a refresh, make a new snapshot of the CSS information (fetch everything from Zookeeper) and add a new Facade to UserQueryFactory. All new queries will then use the latest Facade.
  • When older Facades are no longer needed (no QuerySessions hold pointers to them), remove them.

Note that in this design it is unlikely that we will ever have more than two active Facades at any given time, because each query, even a very long one, holds a pointer to a Facade only for a very short time.
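A minimal sketch of the scheme, assuming shared_ptr ownership as in the current code (the FacadePool name and its methods are hypothetical): new queries always get the newest snapshot, and snapshots referenced only by the pool itself are pruned.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Stand-in for a full CSS snapshot.
struct Facade {
    explicit Facade(int v) : version(v) {}
    int version;
};

class FacadePool {
public:
    // A refresh takes a fresh CSS snapshot and makes it the current Facade.
    void refresh() { _facades.push_back(std::make_shared<Facade>(++_version)); }

    // New queries always use the latest snapshot.
    std::shared_ptr<Facade> current() const { return _facades.back(); }

    // Drop snapshots held only by the pool (i.e., no QuerySession references).
    void prune() {
        auto newest = _facades.back();  // never drop the newest snapshot
        _facades.erase(
            std::remove_if(_facades.begin(), _facades.end(),
                           [&](std::shared_ptr<Facade> const& f) {
                               return f != newest && f.use_count() == 1;
                           }),
            _facades.end());
    }

    std::size_t size() const { return _facades.size(); }

private:
    int _version = 0;
    std::vector<std::shared_ptr<Facade>> _facades;
};
```

With this ownership model, "removing older Facades when no longer needed" reduces to checking the shared_ptr use count, which matches the observation that at most two snapshots are typically alive.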

1. Optional optimization: avoid blocking a new query while refreshing CSS

In the above design, when the Facade is due for a refresh, one unlucky query will be blocked until the refresh completes. Based on some limited observations, a refresh takes ~60 milliseconds (local Zookeeper, the small data set from integration test case 01). It might get worse once the cache refresh starts obeying reader/writer locks, which is currently not implemented.

If that becomes a problem, we could trigger the refresh asynchronously rather than when a new query comes in.
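The asynchronous variant could look like the following sketch (hypothetical names): a refresher, e.g. a timer thread, rebuilds the snapshot off the query path, and queries only take a brief lock to grab the current pointer, so no query ever waits for the slow CSS fetch.

```cpp
#include <cassert>
#include <memory>
#include <mutex>

// Stand-in for a full CSS snapshot.
struct Snapshot { int version; };

class AsyncFacade {
public:
    AsyncFacade() : _current(std::make_shared<Snapshot>(Snapshot{0})) {}

    // Called by a background refresher. Queries keep using the old snapshot
    // while the new one is built; they never block on the fetch itself.
    void refresh() {
        int next;
        {
            std::lock_guard<std::mutex> lock(_mutex);
            next = _current->version + 1;
        }
        // ...the expensive CSS fetch would happen here, outside the lock...
        auto fresh = std::make_shared<Snapshot>(Snapshot{next});
        std::lock_guard<std::mutex> lock(_mutex);
        _current = fresh;  // cheap pointer swap under the lock
    }

    // Queries call this: only a brief lock to copy the shared_ptr.
    std::shared_ptr<Snapshot> current() {
        std::lock_guard<std::mutex> lock(_mutex);
        return _current;
    }

private:
    std::mutex _mutex;
    std::shared_ptr<Snapshot> _current;
};
```

In-flight queries that already copied the old shared_ptr keep a consistent view until they finish, which preserves the per-query consistency requirement.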

2. Optional optimization: fetch only when something changed

Instead of blindly re-fetching every few seconds, first check whether there were any updates in Zookeeper; if nothing changed, don't re-fetch. There is one complication here: Zookeeper keeps a last-modified timestamp for each zk node, but it is not recursive, so there is no single last-updated value we could check. To determine whether anything changed we would have to either scan all nodes and check their timestamps, or introduce a new node that serves as a change flag.
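The flag-node idea can be sketched as follows (all names hypothetical, with a trivial in-memory map standing in for Zookeeper): every writer bumps a single version key, and the cache re-fetches only when that one value differs from what it saw last.

```cpp
#include <cassert>
#include <map>
#include <string>

// Stand-in for Zookeeper with a single change-flag node.
struct KvStore {
    std::map<std::string, std::string> data;
    int version = 0;  // value of the hypothetical flag node
    void update(std::string const& key, std::string const& value) {
        data[key] = value;
        ++version;  // every writer bumps the flag
    }
};

class CssCache {
public:
    // Returns true if a full re-fetch was actually performed.
    bool maybeRefresh(KvStore const& store) {
        if (store.version == _seenVersion) return false;  // nothing changed: skip
        _snapshot = store.data;  // full snapshot fetch
        _seenVersion = store.version;
        ++_fetchCount;
        return true;
    }
    int fetchCount() const { return _fetchCount; }
private:
    std::map<std::string, std::string> _snapshot;
    int _seenVersion = -1;
    int _fetchCount = 0;
};
```

The cost of the check is one read of one node per refresh interval, versus a full recursive scan today.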

3. Optional optimization: fetch only what changed

Instead of blindly re-fetching everything, re-fetch only the parts that changed. It is not clear whether the extra complexity is worth the savings.

4. Optional optimization: fetch only what is needed

Instead of making a full snapshot, take a snapshot of the metadata for the most commonly used databases/tables. When a new query comes in and needs metadata about a table or database that the snapshot does not have, fetch it and append it to the existing snapshot.
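A sketch of this on-demand variant (class and callback names are made up for illustration): the snapshot is seeded with the hot set, and anything else is fetched from CSS on first use and appended.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

class LazySnapshot {
public:
    // Callback that fetches one table's metadata from CSS on a cache miss.
    using Fetcher = std::function<std::string(std::string const&)>;

    LazySnapshot(std::map<std::string, std::string> hotSet, Fetcher fetch)
        : _cache(std::move(hotSet)), _fetch(std::move(fetch)) {}

    std::string const& tableMeta(std::string const& table) {
        auto it = _cache.find(table);
        if (it == _cache.end()) {  // miss: fetch from CSS and append to snapshot
            it = _cache.emplace(table, _fetch(table)).first;
        }
        return it->second;
    }

private:
    std::map<std::string, std::string> _cache;
    Fetcher _fetch;
};
```

Note that appending to a snapshot after it was taken weakens the consistency guarantee discussed earlier, so the lazily fetched entries would still need some versioning or locking.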

Make empty-chunk info per-database

I believe information about empty chunks should not be global; it should probably be part of the per-database partitioning information. But that is a separate story that we should handle separately.

The New Design

Per the Database Meeting of 2015-07-15, we will get rid of Zookeeper and use MySQL instead. We will implement a MySQL-based KVInterface in C++ and expose it to the Python layer (thus keeping just one implementation instead of maintaining two). We will extend the KVInterface to support updates. And, finally, we will synchronize CSS updates with the new Query Metadata through locks, to make sure the CSS seen by Qserv is always consistent.
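A rough sketch of what a writable KVInterface could look like; the method names here are approximations, not the actual Qserv interface, and a trivial in-memory implementation stands in for the MySQL-backed one.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Abstract key-value interface; the MySQL-backed implementation would
// derive from this, just like the in-memory stand-in below.
class KvInterface {
public:
    virtual ~KvInterface() = default;
    virtual void create(std::string const& key, std::string const& value) = 0;
    virtual void set(std::string const& key, std::string const& value) = 0;  // update support
    virtual std::string get(std::string const& key) const = 0;
    virtual bool exists(std::string const& key) const = 0;
};

class MemKvInterface : public KvInterface {
public:
    void create(std::string const& key, std::string const& value) override {
        if (_data.count(key)) throw std::runtime_error("key exists: " + key);
        _data[key] = value;
    }
    void set(std::string const& key, std::string const& value) override {
        _data[key] = value;  // create-or-overwrite, unlike create()
    }
    std::string get(std::string const& key) const override { return _data.at(key); }
    bool exists(std::string const& key) const override { return _data.count(key) != 0; }
private:
    std::map<std::string, std::string> _data;
};
```

Exposing one C++ implementation to Python (e.g. via SWIG, which Qserv already uses elsewhere) is what lets us retire the separate Python Zookeeper code path.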

Appendix A: Qserv Interactions with CSS

qana/QservRestrictorPlugin.cc

  • in lookupSecIndex():
    • containsDb()
    • containsTable()
    • getSecIndexColNames()
  • in operator()():
    • containsDb()
    • containsTable()
    • tableIsChunked()
    • getPartitionCols()
  • in _convertObjectId():
    • containsDb()
    • containsTable()
    • getDirColName()

qana/TableInfoPool.cc

  • in get()
    • getChunkLevel()
    • isMatchTable()
    • getMatchTableParams()
    • getDirTable()
    • getPartitionCols()
    • getDbStriping()
    • getDirTable()
    • getDirColName()

qana/MatchTablePlugin.cc

  • in applyLogical()
    • isMatchTable()
    • getMatchTableParams()

qana/RelationGraph.cc

  • getOverlap()

query/ChunkMapping.cc

  • getChunkLevel() called in a loop

We are not calling:

  • tableIsSubChunked()
  • getAllowedDbs()
  • getChunkedTables()
  • getSubChunkedTables()

in any qserv/core file, with the exception of some test programs.


6 Comments

  1. I think that in other places (table deletion) we expect to use CSS for some sort of synchronization between czars and other entities (e.g. watcher). I'm afraid that this kind of periodical snapshotting is not going to play well with that synchronization. Just thinking about all possible inconsistencies between different snapshots makes my head explode, I'm not sure how well we could reason about that or how fragile situation is going to become.

    Do we know why we need 100s of CSS accesses per query? Maybe we need better granularity for the data in CSS?

  2. We pretty much need all basic metadata about all tables involved in a query. If you worry about things getting out of sync, then how about we take a consistent snapshot per query, and that snapshot will contain just the metadata of the tables involved in the given query?

    To see why we need 100s of CSS accesses, just turn on logging for the css module and you will see it all. I'll attach something as an example. It looks like it is ~60 for this particular query. We are doing repeated checks that could be removed, so we could optimize it down to ~10 calls × the number of tables in a query.

  3. I see a lot of repeated calls to containsDb/containsTable in that log; if you eliminate all of those, then there are probably ~10 meaningful requests (isMatchTable, getPartitionCols, getOverlap, etc.). I think with proper granularity we may need just a few CSS accesses per table. My worry would be that if we needed per-chunk info from CSS, it could quickly jump to some insane numbers, but I believe we don't need that for czar. Should we worry about 10 Zookeeper queries per user query — do we have numbers to say how much that is going to cost us?

  4. I think fetching individual bits of metadata directly from Zookeeper is not going to work, not because of the load or volume, but because things might get changed in between. Locking the entire metadata for a given database and fetching everything consistently/atomically in one shot is the only option, I think. (Whether we still call it "taking a snapshot" or not does not matter.) Once we do that and we have the information handy in memory for whichever piece of Qserv needs it, the extra 50 calls are not really changing much. But of course it wouldn't hurt to optimize it.

  5. Hmm, I looked into reducing the number of calls such as containsDb/containsTable. Currently, if we call a function like Facade::tableIsChunked(db, table) and the db or table does not exist, we throw an exception. If we don't do the checking, the function will simply return "false", which feels very misleading. And if I start doing extra checks in Zookeeper, we are not really optimizing much compared to what we have right now. Suggestions?

  6. I think the Facade needs a serious redesign. Instead of querying individual pieces of partitioning parameters, it should return a complete set of the parameters for one table or a bunch of tables. This would also help with consistency (but I do not think we can reasonably handle updates to partitioning parameters at all, so I would not worry about locking).