Time

8 am PT

Attendees

Zhaoyu Yang, Wei Yang, Michelle Gower, Tim Jenness, Mikolaj Kowalik, Jen Adelman-Mccarthy, Wen Guan, Fabio Hernandez, Edward Karavakis, Brian Yanny, Peter Love

Regrets

Richard Dubois 

Agenda:

  1. Update
    1. bps panda and clustering 
    2. other bps panda submission
    3. site issues
    4. panda installation at USDF
  2. Next steps
    1. If we use a tract in DP0.2 for the next test, how do we prepare the data/Butler at all DFs, and how do we select a tract?

Notes:

  1. Updates:
    1. Long discussion about clustering. Vertical clustering (vertical in the q-graph) is relatively easy. Horizontal clustering (clustering the same pipetask over multiple images) is possible but requires more work. Michelle: working on this part (enabling clustering in bps-panda in general). Richard: Do we have an estimated date?
    2. Generally agreed that clustering can also reduce the load on the Panda system. The Panda team wants to estimate how long a cluster of pipetasks will run. Tim: it depends on the pipetasks; things like isr will still be short. Tried to get an estimate from the FrDF DP0.2 plots but was not able to (for example, for isr, only the total time is known, not how many of them ran).
    3. Peter: working on turning cs_hsi_gen3 into a routine test function for all DFs. First attempt is here (https://lsst.lancs.ac.uk/fabric/). Does this run via Panda or via a single job (through ARC CE)?
    4. Harvester and iDDS optimization seems to show positive results. From Zhaoyu:
      HSC_RC2 used to take 3+ days: https://panda-doma.cern.ch/tasks/?reqid=3105&days=100. Now 5h: https://panda-doma.cern.ch/tasks/?reqid=3596&days=100
    5. Wen: still wants more scaling tests at USDF via ARC CE: 1) remove the restriction of only running on rome nodes; 2) spin up another ARC CE. Wei: 1) is done; 2) is possible since ARC CE runs in a container at USDF. Will look into this, and the Panda team will need to learn how to balance between two CEs.
    6. Scaling issue when logging to Google.
    7. Eddy still needs to test Panda DB at USDF for a few issues
    8. Fabio observed that many of the DP0.2 (?) jobs repeatedly access the same file (or a few files) (O^5) times. Tim suspects that these are geometry files. Can enable a local storage cache on batch nodes.
  2. Next steps
    1. Prepare to process one or a few tracts, similar to DP0.2.
    2. Total raw image data: ~50 TB. Available at FrDF (and should be available at USDF). UKDF has space to copy them. FrDF will provide the data source. Will not use Rucio to copy (no need to mix issues at this point).
    3. Will then need to ingest to Butler
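
Illustration of the clustering discussion (Updates, items 1 and 2): a minimal sketch of the two clustering modes on a toy quantum graph, assuming a three-step per-image chain. It does not use the bps or Panda APIs. The task list, image IDs, function names, and per-task runtimes are all hypothetical placeholders, not measured values, and the runtime estimate assumes quanta in a cluster run one after another.

```python
from collections import defaultdict

# Hypothetical toy quantum graph: each quantum is a (task, image_id) pair.
TASKS = ["isr", "characterizeImage", "calibrate"]
IMAGES = [1, 2, 3, 4]
QUANTA = [(task, image) for image in IMAGES for task in TASKS]

# Placeholder per-quantum runtimes in seconds (made up for illustration).
SECONDS_PER_TASK = {"isr": 30, "characterizeImage": 120, "calibrate": 90}


def vertical_clusters(quanta):
    """Group the chain of tasks that run on the same image into one job."""
    clusters = defaultdict(list)
    for task, image in quanta:
        clusters[image].append((task, image))
    return list(clusters.values())


def horizontal_clusters(quanta, task, size):
    """Group quanta of one task across several images, `size` per job."""
    same_task = [q for q in quanta if q[0] == task]
    return [same_task[i:i + size] for i in range(0, len(same_task), size)]


def estimated_runtime(cluster, seconds_per_task):
    """Sum per-quantum runtimes, assuming the cluster runs serially."""
    return sum(seconds_per_task[task] for task, _ in cluster)


if __name__ == "__main__":
    for cluster in vertical_clusters(QUANTA):
        print("vertical:", cluster, estimated_runtime(cluster, SECONDS_PER_TASK))
    for cluster in horizontal_clusters(QUANTA, "isr", size=2):
        print("horizontal:", cluster, estimated_runtime(cluster, SECONDS_PER_TASK))
```

In this sketch, either mode turns 12 single-quantum jobs into fewer, longer jobs, which is the load reduction discussed above. A real estimate would need per-task counts as well as total times, which is the piece that was missing from the FrDF plots.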