



  • Please register topics below

Discussion items


Project news

Progress since the previous meeting (Database meeting 2022-03-09)

Grand Unified Repo:

  • The PR for merging qserv_testdata is still under review (DM-33618)

Database init:

  • TBC

Subchunk sizes and overlap:

  • the problem is pertinent to the high-density catalogs
  • a goal is to decrease the number of rows in each subchunk to improve the performance of cross-joins in the N-N queries
  • do the query profiling first on the existing catalog (Fritz Mueller)
  • Igor Gaponenko proposed using the existing Data Exportation service of the Replication/Ingest system to extract a subset of chunks from the existing kpm50 catalog, where the problem is seen, and reingest these data into the same
  • Fritz Mueller will further investigate it and schedule it based on existing priorities.
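
As a rough illustration of why fewer rows per subchunk should help, the sketch below counts the row pairs a self-join would examine when it is evaluated one subchunk at a time. It is a back-of-the-envelope model only: the function name and the row counts are hypothetical, rows are assumed to be spread evenly across subchunks, and overlap rows are ignored, so it does not describe how Qserv actually costs a query.

```python
def join_pairs(total_rows: int, num_subchunks: int) -> int:
    """Row pairs examined by a self-join run independently in each subchunk.

    Assumes rows are split evenly across subchunks and ignores overlap rows.
    """
    rows_per_subchunk = total_rows // num_subchunks
    # Each subchunk compares every one of its rows against every other one.
    return num_subchunks * rows_per_subchunk ** 2


# Hypothetical chunk of one million rows, split two different ways.
coarse = join_pairs(1_000_000, 100)  # 10,000 rows per subchunk
fine = join_pairs(1_000_000, 400)    # 2,500 rows per subchunk
print(coarse, fine, coarse // fine)
```

In this model, quadrupling the number of subchunks cuts the pair count by a factor of four. In practice the gain would be smaller, since smaller subchunks carry proportionally more overlap rows, which is why profiling on the existing catalog comes first.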

Optimizations in processing results of the N-N queries

The context:

Update on worker load imbalance problem (Fritz Mueller)

The context was set in the previous meeting (see the link Database meeting 2022-03-09):

  • seems to be XRootD version dependent (Andy Hanushevsky's help is needed here)
  • Andy Hanushevsky still needs to see the logs from the redirector and from one of the workers to understand what is going on. John Gates will provide them.

Andy Hanushevsky:

  • affinity works fine until an overload happens. After that, XRootD begins shifting chunk requests to further workers. This explains the linear behavior.


IDF worker crash this morning (Fritz Mueller)


How do we investigate this?

  • Andy Hanushevsky will inspect the log files to see which service has the wrong address
  • Andy Hanushevsky's theory is that some "rogue" service in Qserv may be using the wrong IP address

Possible short-term solutions:

  • coordinate GKE upgrades with complete restarts of Qserv 

There is a (potentially?) related issue exhibiting itself in the worker logs as follows:

lsst.qserv.wdb.ChunkResource WARN: memLockStatus unexpected results, assuming LOCKED_OTHER. err=Error 0: Expecting one row, found no rows
lsst.qserv.wdb.ChunkResource WARN: Memory tables were not released cleanly! LockStatus=1

Further investigation shows that these harmless messages are posted by:

  • wdb/SQLBackend
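
To check how often these warnings occur on a given worker, a small script along the following lines could be used. This is a minimal sketch: the pattern strings are copied from the two messages quoted above, while the function name, the labels, and the sample input are hypothetical. Real worker log lines may carry timestamps or other prefixes, which the unanchored patterns tolerate.

```python
import re
from collections import Counter

# Patterns taken from the two warnings quoted in these notes.
PATTERNS = {
    "memLockStatus": re.compile(
        r"lsst\.qserv\.wdb\.ChunkResource WARN: memLockStatus unexpected results"
    ),
    "not_released": re.compile(
        r"lsst\.qserv\.wdb\.ChunkResource WARN: Memory tables were not released cleanly"
    ),
}


def count_warnings(lines):
    """Count how many lines match each known warning pattern."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


# Hypothetical sample input; in practice, pass an open log file instead.
sample = [
    "lsst.qserv.wdb.ChunkResource WARN: memLockStatus unexpected results, "
    "assuming LOCKED_OTHER. err=Error 0: Expecting one row, found no rows",
    "lsst.qserv.wdb.ChunkResource WARN: Memory tables were not released cleanly! LockStatus=1",
    "an unrelated log line",
]
counts = count_warnings(sample)
print(dict(counts))
```

Comparing the counts across workers, or before and after a restart, would show whether the messages correlate with the crash or are simply background noise.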

Refactoring qserv-ingest

The work on modifying the workflow to begin using the ASYNC ingest service is still in progress.

The SYNC mode worked successfully for ingesting a 50 TB catalog.

Refactoring qserv-operator 


Fabrice Jammes is still working on the Operator to integrate the change.

Action items