
Time

8 am PT

Attendees

Brian Yanny, Wei Yang, Richard Dubois, James Chiang, Wen Guan, Mikolaj Kowalik, Edward Karavakis, Michelle Gower, Jen Adelman-Mccarthy, Fabio Hernandez, Tim Jenness

Regrets

Agenda:

  1. CM news
  2. Panda news
    1. uniform PQ name and batch job name?
    2. news on using Lancaster monitoring for HammerCloud? can it submit Panda jobs?
  3. Panda in IDF: should we delete them? How?
  4. FYI: Update on Panda meets Rucio: discussion in Rucio data replication meeting on deterministic vs non-deterministic RSEs


Notes:

  1. CM News
    1. CM tools can now submit to multi-DF (previously we had to use bare bps commands). However, some features are still missing in CM tools, e.g. scripts to run the chain-collection command.
    2. Working on accessing remote sites directly for debugging purposes.
    3. Saw a few jobs with long wall time >> CPU time.
    4. More stress tests after the Panda config changes during winter break? Sierra will try later this week or next week.
    5. Heartbeat info from payload?
      1. The pipeline team implemented a logging infrastructure, and it should be turned on.
      2. The pilot should monitor this to avoid killing tasks.
      3. Wen will check and send a message to the pipeline team that we want to configure the logging infrastructure to emit a heartbeat every 2h (ideally ~30m).
    6. Where to set the bps retry? In the bps submission YAML. Currently per task? Wen will work on finer granularity.
  2. Panda News:
    1. Fixed a problem in bps report.
    2. Event service is not ready yet. Need to fix a panda monitoring problem (Postgres specific).
    3. Readiness probe in k8s (suggested during Panda K8s deployment review) implemented.
    4. Panda/iDDS accept ping tests. The probe currently checks the web frontend, not the backend agents.
  3. HammerCloud
    1. Peter is looking into it, based on Lancaster CE monitoring.
    2. Will include pipeline check jobs submitted via Panda.
  4. Panda meets Rucio
    1. Steve is working on registering output to Rucio. His script does in-place registration (avoids copying/uploading). It works with non-deterministic RSEs, but the main RSEs at all DFs are deterministic.
    2. Concern about the extra info stored in the Rucio DB if non-deterministic RSEs are used widely.
    3. To use deterministic RSEs, one question is how to handle the Rucio scope.
    4. To be discussed in Rucio data replication meeting.
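
Illustration for the deterministic RSE discussion above: on a deterministic RSE the physical path is computed from the scope and file name, which is why the scope choice matters for in-place registration. The sketch below is modeled on Rucio's hash-based path convention (scope, then two directory levels taken from the MD5 of "scope:name"). The function name and the example scope and file names are hypothetical, and the actual algorithm configured at each DF may differ.

```python
import hashlib


def deterministic_path(scope: str, name: str) -> str:
    """Sketch of a hash-style deterministic path: scope/xx/yy/name.

    Assumption: modeled on Rucio's hash-based convention; a site may
    configure a different lfn2pfn algorithm.
    """
    # The hash input combines scope and name, so both affect the path.
    digest = hashlib.md5(f"{scope}:{name}".encode("utf-8")).hexdigest()
    return f"{scope}/{digest[0:2]}/{digest[2:4]}/{name}"


if __name__ == "__main__":
    # Hypothetical scopes and file name, for illustration only.
    for scope in ("scope_a", "scope_b"):
        print(deterministic_path(scope, "output_file.fits"))
```

Because the path is fully determined by scope and name, a file already on disk can only be registered in place if its existing location matches the computed path. Non-deterministic RSEs avoid this by storing the path explicitly, which is the extra Rucio DB info mentioned above.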