Date

Attendees

Goals

Discussion items

Time | Item | Who | Notes
 NCSA | (everyone is at SC or busy with a service issue)

 

 Data Access | Fritz Mueller
  • Working on deploying a new version of imgserv that will support coadds; was built by Brian Van Klaveren last night. Includes fixes from Jim Bosch and John Gates.
    • (Note added following meeting) Last night's build did not contain all necessary fixes to exposure ID decoding / handling. Rebuilding today.
  • Deployed a basic Nagios service.
    • Currently monitoring status of SQL services (i.e., Qserv czar).
    • Ready to monitor state of HTTP service endpoints (i.e., DAX webserv) as soon as NCSA changes the puppet configurations to permit it.
      • Will be able to accommodate monitoring of the SUIT endpoints as well. SUIT will look at making sure that there are useful "no-op" endpoints that support this.
    • Will also do machine-level monitoring (CPU, memory, disk space).
    • NCSA has its own production Nagios monitoring. Will look into merging what we need into that.
  • Following discussions with Mario Juric et al., re-prioritizing work to get the WISE data loaded sooner.
    • Gregory Dubois-Felsmann has already been working with Igor Gaponenko to test the transfer rates achievable between IPAC and NCSA for the ~3.5 TB of WISE ForcedSource-like ("MEP") data. Naive HTTP transfers look adequate if simply left running in the background for a few days. The Object table is very small and not a problem. For now, will not explore any alternative data transfer approaches (fast parallel TCP tools or physical media).
    • Gregory Dubois-Felsmann will look into the readiness of the Source-like tables for bulk distribution.
    • Thinking about whether PDAC should / needs to host the WISE single-epoch or coadded image data. Using the IRSA image services remotely from PDAC is an option and might lower the workload on the Data Access group, freeing up some effort for QA. Discussion of a hybrid model where the IRSA services are wrapped by DAX imgserv. This could enable the DAX cutout and stitching services to be relied upon by the UI. Discussion of the creation of the camera mapper code that would be needed to support WISE data through the Butler.
      • This might be an interesting startup project for a new Data Access team member.
      • Gregory Dubois-Felsmann noted that WISE image data support in the Butler would also be a first step toward enabling pixel processing of the WISE data someday. This is relevant at least because it is an LSST SUIT goal to enable ad-hoc interactive pixel-level analysis.
      • PDAC will need the WISE image metadata tables in any event. Need to figure out how to get that data.
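The HTTP endpoint monitoring mentioned under the Nagios item above could be driven by a small check script. The following is a minimal sketch, not the configuration actually deployed: the function names, the URL passed in, and the 2-second warning threshold are all hypothetical. The only convention relied on is the standard Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
import time
import urllib.error
import urllib.request

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def classify(status, elapsed, warn_seconds=2.0):
    """Map an HTTP status and response time to (exit code, message).

    The warn_seconds threshold is an illustrative assumption.
    """
    if status is None:
        return CRITICAL, "CRITICAL - no response"
    if status != 200:
        return CRITICAL, f"CRITICAL - HTTP {status}"
    if elapsed > warn_seconds:
        return WARNING, f"WARNING - HTTP 200 in {elapsed:.2f}s"
    return OK, f"OK - HTTP 200 in {elapsed:.2f}s"


def check_endpoint(url, timeout=10.0, warn_seconds=2.0):
    """Fetch a "no-op" endpoint once and classify the result."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            status = resp.status
    except urllib.error.HTTPError as exc:
        status = exc.code  # server answered, but with an error status
    except (urllib.error.URLError, OSError):
        status = None  # connection refused, DNS failure, timeout, ...
    return classify(status, time.monotonic() - start, warn_seconds)


def main(argv):
    """Entry point: argv[1] is the (hypothetical) endpoint URL to probe."""
    if len(argv) != 2:
        print("UNKNOWN - usage: check_noop.py URL")
        return UNKNOWN
    code, message = check_endpoint(argv[1])
    print(message)
    return code
```

As a Nagios plugin this would be invoked with `sys.exit(main(sys.argv))`, so that the exit code carries the service state. Keeping `classify` separate from the network call makes the thresholds testable without a live DAX or SUIT endpoint.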
 SUIT | Gregory Dubois-Felsmann, Unknown User (xiuqin), Trey Roby
  • Basic PDAC portal is coming together. Can search Object (RunDeepSource) and ForcedSource (RunDeepForcedSource), beginning to be able to search single-epoch and coadded image metadata tables, can display single-epoch images. Working on:
    • Polishing of the UI, to make the available data clearer to the user
    • Expanding the range of image searches available, following patterns found to be useful in the existing WISE archive service
    • Light-curve viewer
  • Main issues against the back ends are:
    • Need coadded image retrievals to work
    • Ability of Qserv to survive syntax errors in queries would greatly speed progress (and reduce the service load on the DAX team)
    • Some concern that imgserv performance is slower than expected. Will recheck using entirely internal-to-PDAC network connections.
      • Some discussion of making sure that imgserv logs permit the characterization of its performance. IPAC should be able to provide some data from the performance of the equivalent in-house services for WISE image data.
  • Lengthy discussion of the migration from scisql_ functions to qserv_ functions for optimization of queries against sharded tables.
    • Calling sequences are different, because the qserv_ functions depend implicitly on the spatial columns that were used to drive the sharding of tables, whereas these columns must be specified explicitly in scisql_ functions.
    • Large improvements in back-end performance will result from the correct use of the functions, though these may not always be visible to the user issuing the query.
    • Explicit dependency on whether a database is in Qserv is seen as a problem from the SUIT perspective, as this then inhibits testing the SUIT against pure MySQL/MariaDB databases. (Not mentioned in the meeting: among other things, this complicates the establishment of "runtime CI" for the SUIT and DAX, if every test of client code requires a running instance of Qserv.)
    • Lack of certainty about the behavior of queries against non-sharded (i.e., replicated) tables in a Qserv system. How are such queries executed? Do the qserv_ functions work on them, and if so, what would the implicit columns be?
    • Substantial practical problems inhibit standardizing on the scisql_ functions, as their invocations would have to be accurately parsed to determine whether they actually used the shard-driving columns before rewriting them as qserv_ calls. It may be simpler to allow the opposite direction of rewrite; this could be a dbserv-level feature.
    • Trey Roby expressed great concern about whether users in the Python environment, who are being encouraged to think they can use the DAX services directly, can be trained to use the correct functions in the correct situation.
      • Note added after meeting: this is an especial concern when using the wrong function may have no immediately visible effect on performance for the user issuing the query.
    • Discussion of the need for both documentation and online services that help users understand the services, which tables are sharded and how, etc.; see in part .
  • Good chance that a minimum operational capability can be demonstrated by the end of November / cycle F16.
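The "opposite direction of rewrite" raised in the scisql_/qserv_ discussion above (turning a qserv_ call into an explicit scisql_ call so the same query can run on a plain MySQL/MariaDB database) could look roughly like the sketch below. It is an illustration under stated assumptions, not an agreed design: the function names `qserv_areaspec_circle` and `scisql_s2PtInCircle` and their argument orders are assumed here, the column names `ra` and `decl` are hypothetical, and the regular expression only handles simple literal arguments without nested parentheses.

```python
import re

# Assumed call shape: qserv_areaspec_circle(centerLon, centerLat, radius).
# Only plain arguments (no nested calls or parentheses) are matched.
_CIRCLE = re.compile(
    r"qserv_areaspec_circle\s*\(\s*([^(),]+?)\s*,\s*([^(),]+?)\s*,\s*([^(),]+?)\s*\)",
    re.IGNORECASE,
)


def rewrite_for_plain_sql(query, lon_col, lat_col):
    """Replace implicit-column qserv_ circle restrictions with explicit
    scisql_ predicates naming the spatial columns.

    lon_col and lat_col must be supplied by the caller, since the qserv_
    form does not name them (they are hypothetical in the example below).
    """
    def repl(match):
        c_lon, c_lat, radius = match.groups()
        # Assumed call shape: scisql_s2PtInCircle(lon, lat, cLon, cLat, r).
        return (
            f"scisql_s2PtInCircle({lon_col}, {lat_col}, "
            f"{c_lon}, {c_lat}, {radius}) = 1"
        )

    return _CIRCLE.sub(repl, query)


example = "SELECT * FROM Object WHERE qserv_areaspec_circle(10.5, -1.2, 0.1)"
rewritten = rewrite_for_plain_sql(example, "ra", "decl")
```

This direction only needs the spatial column names for each table to be known to the rewriting layer. The reverse direction would require parsing each scisql_ invocation to confirm that it actually uses the shard-driving columns, which is the practical difficulty noted in the discussion.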

Action items