Time, Date & Place

14:00 Pacific 2019-05-30 on Amazon Chime https://chime.aws/1930107527

Attendees

Kian-Tat Lim

Michelle Gower

mbutler

Greg Daues

Steve Pietrowicz

Hsin-Fang Chiang

ChrisM

Aaron

GregT

Miron

Discussion Items

  • Sanjay, GregD, and Hsin-Fang had a good time at HTCondor Week, thanks to the HTCondor team. 
    • GregT showed Hsin-Fang & GregD how to start an HTCondor pool on AWS; notes here
  • Using HTCondor Annex & Spot 
    • Data on EFS; ran a larger (mock) workflow than ci_hsc; not yet on Spot 
    • Copying 4 GB into EFS took 20 minutes. Question to the AWS folks: what is the best way to load large datasets into EC2? 
    • That sounds slower than it should be; regardless, we should try to move off EFS as soon as possible. 
    • If we really have to stay with EFS, there are utilities to move data in parallel
    • Should start loading a larger dataset into Amazon storage; the hope is to put it directly into S3 as a repo (see the upload sketch below this item).  (action) Hsin-Fang will ask Dino
    • The next dataset is ~50,000 files of ~20 MB each (roughly 1 TB total)
    • (action) Hsin-Fang will try GregT's steps for the annex run
    • Next step: a demo running on Spot
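
As a rough illustration of loading a large dataset into S3 in parallel (not the team's actual tooling), the sketch below uses boto3 with a thread pool; the bucket name and local path are hypothetical placeholders. For a one-off load, aws s3 sync or aws s3 cp --recursive would be simpler alternatives.

    # Hedged sketch: parallel upload of a local repo into S3 with boto3.
    # BUCKET and SRC_ROOT are placeholders, not real project names.
    import pathlib
    from concurrent.futures import ThreadPoolExecutor

    import boto3

    BUCKET = "example-test-repo"                 # hypothetical bucket
    SRC_ROOT = pathlib.Path("/data/mock_repo")   # hypothetical dataset root

    s3 = boto3.client("s3")

    def upload_one(path: pathlib.Path) -> str:
        # Preserve the repo's relative layout as the S3 object key.
        key = path.relative_to(SRC_ROOT).as_posix()
        s3.upload_file(str(path), BUCKET, key)
        return key

    files = [p for p in SRC_ROOT.rglob("*") if p.is_file()]
    # Many ~20 MB files move much faster with concurrent uploads than serially.
    with ThreadPoolExecutor(max_workers=16) as pool:
        for key in pool.map(upload_one, files):
            print("uploaded", key)
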
  • Using S3 as the data storage 
    1. Butler S3 datastore 
    2. HTCondor S3 plug-in for data transfer, shared-nothing
      • Likely need new utilities in Butler, such as a local SQLite registry, to do shared-nothing. Do we need to provide URLs? Other prerequisites?
      • HTCondor needs to know the S3 URLs of each job's inputs and outputs
      • Need to create a local Butler repo; however, this is not in the LSST DM schedule for the next few months
      • (stopgap) One shared registry. Need three new utilities: URL generation, S3-to-POSIX datastore copy, and POSIX-to-S3 copy (see the sketch below this item)
        • Run as a POSIX datastore locally on the workers
        • LSST is unlikely to have the resources in the near future to do this work in BPS, though
      • Hsin-Fang still owes GregT an example without S3 and will provide one
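
To make the stopgap concrete, here is a rough, illustrative sketch (not Butler or BPS code) of the three utilities discussed above: S3 URL generation, S3-to-POSIX copy, and POSIX-to-S3 copy. Bucket and path names are hypothetical.

    # Hedged sketch of the three stopgap utilities; illustrative only.
    import os

    import boto3

    s3 = boto3.client("s3")

    def make_s3_url(bucket: str, dataset_path: str) -> str:
        # Generate the S3 URL that a job (or HTCondor's file transfer) is given.
        return f"s3://{bucket}/{dataset_path}"

    def s3_to_posix(bucket: str, key: str, local_root: str) -> str:
        # Fetch an input dataset from S3 into a local POSIX datastore layout.
        dest = os.path.join(local_root, key)
        os.makedirs(os.path.dirname(dest), exist_ok=True)
        s3.download_file(bucket, key, dest)
        return dest

    def posix_to_s3(local_path: str, bucket: str, key: str) -> None:
        # Push an output dataset from the local POSIX datastore back to S3.
        s3.upload_file(local_path, bucket, key)

In this picture a worker job would pull its inputs with s3_to_posix, run against a local POSIX datastore, and push outputs back with posix_to_s3, while the single shared registry remains the source of truth.
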
  • Q: operationally, how do we avoid destroying our data on S3 while running tests? 
    • Bucket sync (keep a synced backup copy of the bucket)
    • Read-only / delete protection on the bucket (see the sketch below)
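
One hedged way to get the delete protection mentioned above: enable object versioning and attach a bucket policy that denies object deletion. The sketch below uses boto3; the bucket name is a placeholder, and the policy would need refinement (e.g. exempting an admin role) before real use.

    # Hedged sketch: protect a test bucket against accidental data loss.
    import json

    import boto3

    BUCKET = "example-test-repo"  # hypothetical bucket
    s3 = boto3.client("s3")

    # Versioning keeps old object versions, so an accidental overwrite or
    # delete during a test run is recoverable.
    s3.put_bucket_versioning(
        Bucket=BUCKET,
        VersioningConfiguration={"Status": "Enabled"},
    )

    # Deny DeleteObject for all principals: test jobs can read and write
    # objects but cannot remove them.
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "NoDeletes",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:DeleteObject",
                "Resource": f"arn:aws:s3:::{BUCKET}/*",
            }
        ],
    }
    s3.put_bucket_policy(Bucket=BUCKET, Policy=json.dumps(policy))
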