Time, Date & Place
12:00 Pacific 2019-05-10 on Amazon Chime https://chime.aws/1930107527
Attendees
Lorena
Sanjay
Aaron
Chris
Greg Thain
Miron
Discussion Items
- The AWS team (Aaron, Chris, Lorena) gave a helpful AMI demo last Friday, 5/3 (joined by Steve, GregD, GregT, K-T, Hsin-Fang). Topics included how to create an AMI, how to create an EC2 instance from the AMI, how to share the AMI, and advice on managing accounts and users.
- Updates on AWS account/user
- Credit has been applied, and some users have been created; contact Hsin-Fang Chiang if you need a user but haven't got one.
- Please use this user for POC-related work only.
- AWS services covered by the POC credit: list-of-services.txt
- It would be better to move it under Wil's account. The migration can be done; Chris/Aaron can help.
- (action) Hsin-Fang will move this to Wil's account. Wil can add Hsin-Fang to his organization so extra bills go there.
- Updates on running LSST pipelines on AWS instances
- AMI building with the LSST software stack w_2019_14 and the HTCondor software
- First ran on a single node; straightforward.
- Greg can run this with a two-node HTCondor pool. Data in an EFS filesystem.
- Why EFS? Because a shared file system is currently needed so that worker nodes can access the sqlite3 file and the butler datastore.
- LSST-DM code is not ready to work with a larger dataset yet. It may be ready in a few weeks.
- Future: want a larger workflow, with annex & spot fleet
- Updates on S3 Butler Repos
- Going well. The first PR on using the S3 datastore is almost ready. The RDS registry part will be next.
- Dino is concerned about a test that takes 3-5 minutes.
- (action) Hsin-Fang will try the demo after Dino merges his S3 work.
- (action) KT and Dino will talk more about RDS and design, and will consult the AWS team about performance.
- Others?
- (action) Hsin-Fang will mock up a bigger workflow using the same small dataset, as the DM code isn't ready to handle other datasets yet.
- CondorWeek is in 2 weeks. We'll have an additional in-person meeting there. Will talk about data movement.
- Potential AWS + Condor Annex demo
- Sanjay mentioned we should not need instances larger than T3 so far. Keep the credits for large tests later. For more memory, maybe C4 or C3.
- We talked about file sizes: scaling up means a larger number of files, while individual file sizes don't increase much.