Date
Zoom
https://stanford.zoom.us/j/91889629258?pwd=KzNleVdmSnA1dkN6VkRVUTNtMHBPZz09
Attendees
- Richard Dubois
- Brandon White
- Kian-Tat Lim
- Brian Yanny
- Yuyi Guo
- Steve Pietrowicz
- Wei Yang
- George Beckett
- Greg Daues
- Andy Hanushevsky
- Matt Doidge
- Wen Guan
- Peter Love
- Timothy Noble
- Lionel Schwarz
- Fabio Hernandez
- Peter Clark
- Michelle Gower
Apologies
Agenda — Data Replication
- Status of each facility's storage element:
- USDF
- UKDF
- FrDF
- Status of configuration of those storage elements in pre-production Rucio instance
- Status of replication exercises among facilities
- Status of Rucio & FTS monitoring
- Status of butler & Rucio integration
Data Replication JIRA Issues
- Status of Rucio evaluation platform at USDF and replication exercises
- Status of Rucio production platform at USDF
- Status of data replication exercises
- Status of data replication monitoring tools and logging platform
- Status of integration of Rucio and butler for automated ingestion
- Collective writing of a technote where we collect details on what we need to replicate and when
- JIRA tickets relevant for data replication
Note: the JIRA issues related to data replication have the label "data-replication" (among others)
Notes
Data Replication
- Fabio surveyed status of Storage Elements with regard to replication tests over next three months on evaluation platform
- Wei confirmed that US Storage Element for USDF is in a near-final configuration (potential to expand with more servers, though aim to keep Rucio configuration as is).
- Decision to use Object Storage (S3) rather than Posix storage may require a change.
- UK Rucio endpoint at Lancaster has been running for a while, supporting Rubin.
- Various minor bugs are in progress (tickets raised on LSST Stack and Xrootd groups)
- UK should be ready for replication tests, with reasonable bandwidth expected and an initial capacity of a few hundred TB
- Peter noted RAL storage end point also available.
- French DF has instance devoted to Rubin, with good connectivity to network and finalised hardware configuration
- This end point has been tested internally and is connected to Rubin Rucio service, so should be ready for replication testing
- Currently have 4–5 PB of capacity, with plans to expand in coming year (without change to network configuration).
- Rucio Production platform at USDF (Brandon)
- IN2P3 endpoint now integrated
- Seeing issue with proxy certificates, due to hashing of secrets, which are being resolved
- Test dataset has been ingested and registered into Rucio (in place, no duplication) (as of Friday 3rd)
- Next step is to kick-off rule-based transfers (750k files have been ingested, ready for this), aiming to progress in w/b 6th Feb.
- Option for data facilities to upload files into user space, to check this works, though Brandon previously confirmed this for each Rucio end point (except Summit, which is currently broken).
- Wei asked about ingest rate (e.g., file per second)
- Brandon noted Yuyi saw issues with ingest rate
- Data split into sets of 100k files, but Rucio server not able to handle this many files (errors pointed to headers being too large for Apache, and an API error about the list being too big)
- Reducing sets to 1,000 files per set worked around that.
- Greg noted 1,000 looked to be upper limit in API.
- Wei noted Rucio might be able to break larger collections of files into blocks of 1,000 files per block
- Greg observed, previously, ingesting 1,000 files would take around 5 seconds
- Yuyi noted it is worth checking whether Rucio could stream in the file list rather than requiring a complete list a priori.
- Yuyi noted ingestion of 100 × 1,000 files took around 20 minutes, though saw some issues with connection during ingestion.
- Ingest was run from local terminal (not from SLAC)
- Had switched to different Rucio API (Python API) to work around problem with duplicate entries.
- Yuyi does not believe the command-line interface is good enough for this case (the command line requires a Rucio upload, which copies the file, which is not what is intended)
- Brandon noted he is aware of other communities who want to be able to ingest data (without copying) via the Rucio command-line client. He proposes to raise this again with Rucio developers.
- Fabio suggested Python scripts might be a suitable starting point for a future CLI improvement.
- Yuyi noted potential bug in Rucio API, for list-dids (http://rucio.cern.ch/documentation/using_the_client). Depending on when called, API produced different results, which initially led Yuyi to believe there was an ingestion issue. List-DID looks to be delayed in reporting when files ingested.
- Brandon noted seeing issues (missing file links) as long as two days after the upload was completed.
- Yuyi noted this was with the list-dids command-line client
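The chunking workaround discussed above could be sketched as follows. This is a minimal illustration, assuming the Rucio Python client's `add_files_to_dataset` method; the scope, dataset, and client names are placeholders, not the configuration used in the tests.

```python
# Sketch: register files into a Rucio dataset in chunks of at most 1,000,
# the practical upper limit observed in the ingest tests. Names and the
# exact client call are assumptions based on the Rucio Python client API.

def chunked(items, size=1000):
    """Yield successive chunks of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

def register_files(did_client, scope, dataset, files, chunk_size=1000):
    """Attach `files` (a list of DID dicts) to `dataset`, one chunk per call."""
    for batch in chunked(files, chunk_size):
        # Each call stays under the server-side list-size limit.
        did_client.add_files_to_dataset(scope=scope, name=dataset, files=batch)
```

With 100k files this yields 100 calls of 1,000 files each, matching the batch sizes Yuyi reported.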
- Monitoring (Tim N)
- Tim noted was unable to progress with monitoring work as still waiting for his account to be reactivated.
- Most recent comms from SLAC support on Friday 3rd: hope to have account reset in w/b 6th Feb.
- Rucio-butler integration (Steve P)
- Steve discussing side-car issue with K-T and documenting status of problem
- Steve discussing set up of Kafka and Mirrormaker at each DF (could be containers)
- Steve noted that move to S3 would require changes to Butler ingestion: proposes that team make a decision on this now. Currently relies on Posix path being used as prefix for file identifier.
- K-T thought swapping the Posix path for an Object URL would be okay: Butler recognises Object URLs
- Steve proposes to test software as soon as possible, within the next month.
- First step would be to get Kafka running and tested.
- K-T noted we need names for FrDF and UKDF to set up Kafka cluster (and MirrorMaker) endpoints, plus credentials.
- Peter L noted UK site does not currently offer Kubernetes. Bare metal potentially possible but might involve additional overhead.
- Peter L noted UK would normally provide service (in this case Kafka Bootstrap Endpoint)
- Peter L noted need to confirm details of versioning and configuration.
- Wei suggested running Docker would simplify things as could use existing image.
- Fabio committed to report back on how FrDF could run Kafka cluster and MirrorMaker.
- Each DF would run a Kafka cluster (min three nodes) and a MirrorMaker2 instance.
- This is to avoid remote sites contacting US directly for Kafka messages. MirrorMaker 2 would provide this functionality behind the scenes.
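The per-DF Kafka plus MirrorMaker 2 arrangement could be sketched as a configuration fragment like the one below. This is purely illustrative: the cluster aliases, hostnames, and topic pattern are assumptions, not the agreed names.

```properties
# Hypothetical MirrorMaker 2 sketch: each DF runs its own Kafka cluster
# (min three nodes) and mirrors USDF topics locally, so remote sites do
# not contact the US brokers directly.
clusters = usdf, frdf
usdf.bootstrap.servers = kafka.usdf.example:9092
frdf.bootstrap.servers = kafka.frdf.example:9092

# Mirror USDF -> FrDF only
usdf->frdf.enabled = true
usdf->frdf.topics = .*
frdf->usdf.enabled = false

# Replication factor for mirrored topics on a three-node cluster
replication.factor = 3
```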
- Brandon noted need for ActiveMQ instance to help detect when transfers have completed.
- Steve noted Butler used Kafka to determine when transfers have completed.
- Steve has modified Hermes so it sends messages both to ActiveMQ and to Kafka.
- Wei noted two sources of confirmation that a transfer is complete (FTS and ActiveMQ)
- Brandon noted currently rely on polling FTS for when a transfer is completed. With new Rucio, can be notified (by ActiveMQ) when a transfer has completed
- Fabio asked about plan for replicating ComCam data from USDF to FrDF.
- K-T noted would use manual version of Steve P's code. Raw files contain all of the information, so could use butler-ingest-raw with appropriate options.
- Would package files into datasets and could check when a dataset was complete, and manually trigger replication once that happened.
- No code at USDF end to register files into Rucio, though would expect to use approach similar to what Yuyi has done for replication tests.
- Fabio asks if there is existing data that could be used to test the transfer mechanism.
- K-T expects a delay in the implementation of Embargo-to-main, which registers files in the auto-ingester when they arrive at Embargo storage (this would not be permitted in production)
- Wei noted he had questioned whether the right thing to do was to use hashed directories (for Rubin Camera and for DRP)
- Some sites (running NFS server) are limited to 32k files per directory, which is what motivated use of hashed directories
- K-T believes Rubin has the opposite problem of too many directories (assuming we keep the Butler directory breakdown).
- Wei noted default Rucio configuration is a two-level hash tree. Need to update configuration of Rucio, if wish to do something different.
- Fabio reminded people the initial plan was to not modify the Rucio configuration until replication tests had been completed. However, Fabio expects Rucio will need reconfiguring for the ComCam data transfer.
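For reference, the two-level hash tree mentioned above can be sketched as below. This follows Rucio's documented default deterministic path algorithm (md5 of "scope:name", with the first four hex characters split into two directory levels); the scope and filename are illustrative.

```python
# Sketch of Rucio's default deterministic (hashed) path layout: two
# directory levels derived from md5("scope:name"), under the RSE prefix.
import hashlib

def hashed_path(scope: str, name: str) -> str:
    """Return the relative path Rucio's default hash algorithm produces."""
    digest = hashlib.md5(f"{scope}:{name}".encode()).hexdigest()
    return f"{scope}/{digest[0:2]}/{digest[2:4]}/{name}"
```

Changing this layout (e.g. to mirror the Butler directory breakdown) would mean overriding the RSE's naming convention in the Rucio configuration, which is the reconfiguration Fabio anticipates.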
Multi-site PanDA Support and Operation
- Wei noted progress with 4,000-character limit issue:
- Michelle noted the prepare portion of Control-BPS-PanDA had been updated, but a bug was identified that needs to be fixed
- Next step is to configure Control-BPS-PanDA to work with job runner
- This should resolve the 4,000-character limit.
- Wen started submitting jobs from the PanDA instance in USDF to FrDF and UKDF (configuring NAT for outbound connections from USDF).
- Test data works from SLAC test instance of PanDA service (waiting for production DB to be available)
- Next step is to test normal PanDA job and BPS job to UK DF and Fr DF.
Date of next meeting
Monday March 6th, 8 am PT (initially scheduled for February 20th but canceled since that day is a holiday in the US)
Discussions on Slack #dm-rucio-testing in interim