Time

8 am PT

Attendees

Wen Guan, Wei Yang, Edward Karavakis, Peter Love, James Chiang, Tim Jenness, Jen Adelman-Mccarthy, Michelle Gower, Mikolaj Kowalik, Fabio Hernandez

Regrets

Richard Dubois 

Agenda:

  1. CM news
  2. Panda update:
    1. changes made over the break: k8s containers in Alma 9, m-core, DB partition 

Notes:

  1. CM news:
    1. With regard to DM-35114 (passing a long list of QuantumNode UUIDs to Panda via a config file, for clustering), we all agreed that this capability is highly desirable.
    2. The CM team ran step 1 and was on step 2a before the winter break. They will start again next week. No blockers. They will use clustering.
  2. USDF Panda news:
    1. Improved the memory-boosting function. It needs to be defined in a YAML file before bps submission.
    2. Implemented partitioning of the iDDS DB tables "contents" and "contents-xt". This will improve performance. Older partitions will eventually go to the archive database (the Panda DB is already doing this).
    3. Chasing occasional zombie processes that cause the pilot to hang. No clear answer yet.
    4. M-core
      1. Switched to m-core jobs (currently m=8), with 1 pilot per core. Stress test at USDF and functional tests at UKDF and FrDF.
      2. Stress test at USDF: pull mode uses m-core and push mode (usually merging jobs with large memory requests) uses single core. Reached 20k concurrent pilots at USDF.
      3. "m" is tunable for different DFs. There is a concern that 1-core jobs will always win the scheduling competition against m-core jobs. Each DF will decide its optimal "m".
    5. CRIC config file scaling issue: each job reads this same file. Will put it in CVMFS.
    6. All iDDS/Panda/Harvester containers in k8s have been upgraded to Alma 9-based images. Running OK.
    7. With all these changes, do we want to rerun step 1, etc.? Jen will talk to Brian (on vacation until the week of Jan 15).
  3. Panda meets Rucio
    1. Fabio: development of registering Panda job output to Rucio ran into an issue: the Python versions used by the LSST software and the Rucio client are different. Is it efficient to use the Rucio command-line tool?