Time

8 am PT

Attendees

Tim Jenness, Michelle Gower, James Chiang, Zhaoyu Yang, Brian Yanny, Richard Dubois

Regrets

Jen Adelman-Mccarthy, Wen Guan, Edward Karavakis

Agenda:

  1. news from CM team
  2. Panda installation at USDF
  3. Multi-site testing
    1. Response from S3DF about the SLURM memory limit (RSS vs. VSZ): "we have ConstrainRAMSpace=yes in our slurm cgroup.conf which means “constrain the job’s RAM usage by setting the memory soft limit to the allocated memory and the hard limit to the allocated memory.”"

Notes:

  1. HSC PDR2 processing
    1. Completed step 1 for wide and deep (and ultra-deep). Will use CM tools to group and monitor step 2 (step 2 is expected to take a week).
    2. Step 1 ran mostly smoothly. Fewer than 0.1% of pilots hung (also seen in other runs at USDF; not seen at FrDF or UKDF at smaller scales). Retrying fixes the issue but causes a long tail before the overall task finishes.
  2. Panda installation at USDF
    1. Panda prod and DB are deployed and working. IAM works with CILogon but has an issue with SLAC Dex; debugging is in progress.
  3. Multi-site testing
    1. Most resources are being given to the CM team, so only small-scale testing is being done.
    2. The "File distribution" environment for Panda jobs is site-specific. It is defined in the prod accounts at the DFs but not for users.