Time

8 am PT

Attendees

Mikolaj Kowalik, Wei Yang, Brian Yanny, Zhaoyu Yang, James Chiang, Peter Love, Fabio Hernandez, Wen Guan, Richard Dubois, Edward Karavakis, Jen Adelman-Mccarthy

Regrets


Agenda:

  1. Update on step 1 (and 2) at all DFs
  2. USDF Panda server installation status
  3. Scaling test at FrDF and UKDF: Multi-site_Scaling-test.pdf

Notes:

  1. Short summary of step 1 and scaling test at USDF, FrDF and UKDF:
    • Ran ~3000 cores at USDF and FrDF, and ~100 cores at UKDF (5000 cores are available now). Runs were smoother at FrDF and UKDF compared to USDF, indicating limitations of the USDF Squid and NAT.
    • A Panda log rotation issue is being looked at (log rotation happens at midnight EU time; running jobs outside that time also reduces the number of failures/retries).
    • Panda DB issues (for job state updates): will ask the pilot to retry after a short random delay.
    • BPS clustering clearly helps
  2. Next steps:
    • More testing with clustering (so far only tried at FrDF). Try the next step (step 3) and eventually all steps.
  3. Monitoring:
    1. DF-level monitoring (probably implemented by each DF, internal to the DF)
    2. Panda's built-in monitoring: currently mixed with other experiments at the Panda-DOMA instance (CERN). The USDF Panda will be dedicated to Rubin. 
    3. What does the CM team want to see in monitoring?
  4. Working with CM team
    • Once the patch for bps submit (to generate the qgraph at the remote site) is validated, can we start working with the CM team so that they can start getting familiar with using Panda?
    • Can we dedicate one Panda meeting, after CHEP, to discussing the CM team's needs and feedback?
  5. Panda server installation
    • Requesting a K8s cluster for production instance
    • Will test Loki
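
Illustration for the Panda DB note in item 1: the idea of retrying a job state update after a short random delay could look roughly like the sketch below. This is not the pilot's actual code. The function name `retry_with_random_delay`, its parameters, and the delay bounds are all hypothetical, chosen only to show the pattern. The random delay is there so that many jobs hitting the same DB problem do not all retry at the same instant.

```python
import random
import time


def retry_with_random_delay(update, max_attempts=5, min_delay=1.0, max_delay=5.0,
                            sleep=time.sleep):
    """Call update() until it succeeds, pausing a random time between attempts.

    All names and default values here are hypothetical, for illustration only.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return update()
        except Exception as exc:  # a real client would catch narrower errors
            last_error = exc
            if attempt == max_attempts:
                break
            # Random delay spreads retries from many jobs over time.
            sleep(random.uniform(min_delay, max_delay))
    # Every attempt failed: surface the last error to the caller.
    raise last_error
```

A caller would pass the state-update call as `update`, for example `retry_with_random_delay(lambda: send_state(job))`, where `send_state` stands in for whatever function actually reports the job state.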