60min | Debrief from LSST 2017 | | - Not clear how AHM is organized (the overall goals, the theme). This remains an issue.
- How do we strike a good balance of sessions we want (bottom-up) vs. sessions we need (top-down organization)?
- Mario Juric: make sure to fill out the LSST2017 surveys!
- Was at the DESC commissioning simulations session, found it useful.
- Gregory Dubois-Felsmann
- Dialing in from JupyterCon
- Summarized the LSP session
- Lots of talk of specifics of using the LSP, large result sets, reconstitution of data into its Python form from queries
- The only fundamentally new thing: somebody asking about our ability to provide encryption of users' notebooks
- We need to continue reaching out to the SCs and make sure the tools we’re building meet the needs of the science they want to do. Gregory is interested in becoming more active in that outreach: talk to the community about how they plan to use our tools, steer them in the right direction, and update our plans when needed. We all need to understand that better.
- Spent most of time on Butler/SuperTask; more as a JTM than meeting with scientists
- Was in PSF estimation & deblending sessions, but didn't learn anything fundamentally new (because we already talk to these folks)
- Level 3 batch questions:
- It would be nice to have some clarity from the LSP design on what kind of compute and storage we’re going to offer to the users
- What is the batch system going to look like?
- What is the storage system going to look like?
- Example:
- If someone wants to rerun a SuperTask in a different way on a bunch of objects they've selected, is that something they can expect to do within the science platform by launching that in batch, or are they limited by the CPUs their notebook is running on and all computation will happen locally, in serial? Or is there a way they can launch jobs?
- There was a lot of discussion on the gaps in our understanding of how the platform users will interact with the batch system. I.e., will people be asked to do the moral equivalent of `qsub`, or will we have a friendlier interface to the system?
- Mario Juric thought developing this friendly layer was part of SQuaRE's remit (remembers discussing it in ~fall/winter 2016).
- Simon Krughoff reports it has not made it into SQuaRE's final plan.
- Gregory Dubois-Felsmann: Would be good to make sure the workflow system we use internally is usable by our users as well; otherwise we won't be able to efficiently share capacity between L2 and L3 resources if we ever wanted to. Cautionary tales from BaBar. Worries that what we're seeing so far from NCSA is a very production-oriented design.
- Jim Bosch: if we are planning to have the users use our L2 workflow, we haven't had those considerations included in the workflow system design yet.
- Even if we didn't want to offer the internal workflow system to the users, since our developers will use the LSP for development, they should have the same workflow system/interfaces available at least to them.
- This will also generate additional requirements on the Butler/SuperTask and we should take it into account.
- It's not clear how all these things fit into the system that Michelle is designing
- Unknown User (mjuric-admin) mentioned there was independent discussion of this with Don; they're currently planning to offer a pretty "vanilla" batch system to the users. We would "sprinkle" something on top of that (Python API) to make it easier to submit jobs from the notebooks, but the batch system will be fairly classical (HTCondor).
- Robert Gruendl: the real worry is how long you allow user jobs to persist and how big they are.
- Gregory Dubois-Felsmann: I think we worry about one level higher than that – what if the user writes a SuperTask and wants to run it (in batch) on some subset of objects/images/data? How do they do that? Do they write scripts of their own? Or do they re-use the workflow system we have internally?
- Robert Gruendl thought this guidance/helpdesk for the users would come from SQuaRE.
- Unknown User (mjuric-admin): Bottom line: we need to clarify the LSP <-> batch interface(s) and who's responsible for what.
- Set up meetings to follow up on this
- Simon Krughoff
- Calibration products, talking to Merlin et al.
- How will we take them all (many images to take)
- Simon reports he's been named the DM liaison to the Commissioning Scientist (by Chuck)
- Worries about the SNR for various calibration products; if we don't know that, we don't know how many we have to take.
- Anecdotal potential issues with bias frames
- Bias not stable in the test stand as one would hope (the electronics float around)
- Biases may have to be taken during the night.
- We were surprised by this. Simon says this is anecdotal, based on the test stand.
- Commissioning rehearsals were useful and fun
- How will we do releases in commissioning?
- Simon Krughoff expects we'll make releases potentially even twice per day early in commissioning
- How do we (DM) support operations (third shift)?
- Michael Wood-Vasey concurs these were interesting, JTM will be even better.
- Remote access was not good; frustration; bad wifi/connection, bad audio
- Make sure it works when it's offered, or don't do it at all
- AuxTel status has him worried
- Lots of TBDs even for hardware on how it’s going to come together
- Designs not solid even for things that are due to be built soon
- Worried we may make wrong decisions (or repeat things to get it right) because we have to do them and don't have time to do analysis
- There was some discussion on remote access
- Robert Gruendl: if you'll provide remote access, make sure it works. Or focus on providing good remote access only some of the time (i.e., not for all sessions). That way people can attend only a portion of the meeting, but usefully. As long as we're using hotels for this, he's worried we won't be able to do remote access well.
- Gregory Dubois-Felsmann: it's not realistic to expect high-quality connection to many sessions with the effort spent & the venue chosen; depends on too many variables.
- Mario Juric: I see this work well elsewhere; confused how we experience the same problems (at least since Bremerton). Don't think it's a fundamental issue with these kinds of meetings, more with our organization. Strange that we wouldn't want to pay for a wired connection in each room, or have speaker laptops/speakerphones/webcams ready. It's peanuts relative to what it costs to organize this meeting.
- Simon Krughoff: Thought Victor made it clear in a plenary he doesn't want to make it easy for people to attend remotely, to increase participation? Mario Juric got the opposite feedback (incl. promise after last Feb JTM that the project is buying "meeting in a box" telecon equipment to make things just work). Never seen that box in action. It's frustrating and causes trust issues. Feel like we're being told whatever will make us believe the problem will be fixed & go away.
- Michael Wood-Vasey: feels that the project has an implicit if not explicit policy on this that we disagree with. For a small investment of effort they could do noticeably better. At least stream the plenaries. But in parallel sessions it's hard; you can't do it because it takes more technical support.
- Melissa Graham
- SAC meeting was interesting
- Went to the science sessions
- Contributed to information on data processing
- Unknown User (mjuric-admin): Note that 10% for SPs is not the same thing as 10% for Level 3; the "10%" is the same number just by chance. The 10% for SPs is allocated within the normal "Level 2" budget (or should be).
- Melissa Graham: Need to confirm that w. KTL et al.
- Colin Slater
- Most useful, beyond the sessions, was the lunch we had with Fritz Mueller and Donald Petravick on various map-reduce type options for data storage & processing, and how those relate to user experience.
- We don't have a very good story on how we'll do next-to-the-database processing; we're working to understand this better.
- Trying to better understand our database requirements from the PoV of the user
- Current requirements are too simplistic. Trying to give more info to Fritz on what the real requirements are.
- Zeljko Ivezic
- Mario Juric
| - Gregory Dubois-Felsmann will convene meeting(s) to understand the state of and clarify the LSP ↔ batch interface (maybe a WG?). There's a concern that a) we're designing the workflow system in a way that may make it difficult for users to reuse, and b) it's not clear who (if anyone) will write the user-friendly LSP ↔ batch interaction layer. (See the Notes for details.)
(Note 2019-03-06: this task was re-tasked at the 2018-05-21 DM SST F2F meeting to create a ticket for this work.)
- Leanne Guy to create a ticket for Robert Lupton to follow up with Steve Ritz or Chris Stubbs on potential issues with biases floating around (see Simon's notes). Also follow up on issues with the lens that were mentioned off-hand in the plenary (there are no CCB records of it). 2019-03-07: Following discussion with Robert Lupton, the issue of the lens was not followed up; Robert does not remember the exact issue.
- Mario Juric should report back to the organizing committee the difficulties with remote access (see Notes for the details)
- Melissa Graham to follow up with Kian-Tat Lim to verify that the sizing model includes the allocation of compute and storage capacity for special programs.
- Mario Juric to ask the communications team for an updated community survey.
- Mario Juric will start the process to name the PoCs for all SCs (done: see here)
- Mario Juric to send e-mail to Gregory and everyone with recommended actions on LCRs
- Mario Juric to discuss with Beth Willman data access peculiarities for Science Collaboration participants w/o DAC rights (UK being the primary example): if an SC decides to build common data products in the US DAC, their non-DAC-rights members won't be able to access them. To confirm that this is the policy.
- Everyone: please fill out LSST2017 exit survey
- Everyone: please upload your presentations to the LSST2017 website
- Mario Juric to schedule a Doodle poll for a new meeting time during this semester.
|