
Logistics

Date

  2020-11-17 – 2020-11-19

Location

This meeting will be virtual; details of the teleconferencing system are to be determined.

Join Zoom Meeting
https://illinois.zoom.us/j/8154193943?pwd=N0JPZTV1RFZsY1dwUWMxa0E4N0RjQT09

Meeting ID: 815 419 3943
Password: 131840

One tap mobile
+13126266799,,8154193943# US (Chicago)
+13017158592,,8154193943# US (Washington D.C)


Attendees

Agenda


Day 1: 2020-11-17

Time (Project) | Topic | Coordinator | Pre-meeting notes | Running notes

Moderator:  Yusra AlSayyad

Notetaker: Kian-Tat Lim

09:00 Welcome – Wil O'Mullane
  • Introductory remarks
  • Review agenda and code of conduct

Past actions:

  • KTL still needs to update document
  • GPDF has a session at IVOA about non-database catalogs; will become a document
  • RHL/KTL deciding whether to repurpose existing risk or add a new one
9:15 Project news and updates – Wil O'Mullane
  • FY20 accounting still not closed
    • Adding supplements for COVID, security to remove Huawei (Brazil routers on main 100Gbit connection but not backup)
    • RobertG: How do we monitor that no one introduces additional suspect equipment? May need to put this in agreements with providers
  • Victor desires for DMLT:
    • How do we get done with verification?
    • Document whether operations or maintenance
    • COVID justified separately
    • Assume we will be in lockdown for 12 months, assume reduced efficiency for replan
    • 10% increment should be OK for replan
  • Chuck asks how many 1a/1b requirements need mountain data:
    • Expected that almost all can be verified with simulated data, but need real data for re-verification and validation
  • Leanne Guy to determine which 1a/1b requirements need "mountain" or on-sky data
  • Electronic logging:
    • LOVE has a very simple (hack) front-end but real solution is coming
  • Scope options: may need to be rewritten in terms of risks
  • Helpers needed for AAS booth; can get a discount code for registration; names needed by mid-Dec
    • CET will be helping to staff
  • Panels going up rapidly on dome, with real louvers, with work by local Chilean subcontractor
    • Spanish team to work on TMA supposed to start in Jan
  • ComCam might be back by the end of the week, now at the summit
  • Monthly report due Thu
9:45 Plan to end of project: Part I – Wil O'Mullane



Victor will want a plan to the end of construction
  • Identify parts due to COVID and parts that are not
Look at burn rates: how far do we get with the money we have?
  • Assume we need to go to end of FY23 (Oct 2023)
  • Frossie: might as well show sheets if we have them
Transition of personnel to Operations needs to be checked
  • Can adjust ramps a bit, but delaying people doesn't get any more money into pre-Ops
  • Leanne: Zeljko mentioned that DM was funded through Construction, so is there extra money when people go to Ops? Answer: there was always a ramp, so no extra money
Do remaining milestones make sense? Do we have effort to achieve them?
When are things operational?
  • Baseline: When all 1b requirements are verified
  • DRP, reliability would then be pushed to Ops
  • Want to declare pieces operational earlier; they would still be in maintenance
  • Frossie: Handing over to Ops doesn't mean cessation of development
    • But pre-Ops money is problematic to use for development
  • Leanne working with Jeff on tying milestones to requirements; currently in spreadsheet, then will generate test plans and add more milestones if needed
    • DMTN-158 could have a list of requirements in the YAML file for each milestone
    • GPDF: A lot of milestones will have many requirements; test plans may be better places for requirements?
    • But need to be able to tell people when requirements will be met
    • Leanne: looking to extract from Jira so we have one source of truth
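The idea of listing requirements against each milestone, and then being able to tell people when a given requirement will be met, can be sketched in a few lines of Python. This is only an illustration: the record layout, the milestone and requirement IDs, and the dates are all invented placeholders, and it is not the DMTN-158 YAML schema or a Jira export.

```python
from datetime import date

# Hypothetical milestone records. IDs, dates and the pairing of
# requirements to milestones are placeholders for illustration only.
MILESTONES = [
    {"id": "MILESTONE-A", "due": date(2021, 6, 30),
     "requirements": ["REQ-001", "REQ-002"]},
    {"id": "MILESTONE-B", "due": date(2022, 3, 31),
     "requirements": ["REQ-002", "REQ-003"]},
]


def requirement_dates(milestones):
    """Map each requirement to the earliest milestone that lists it."""
    met = {}
    for ms in milestones:
        for req in ms["requirements"]:
            # Keep the earliest due date seen so far for this requirement.
            if req not in met or ms["due"] < met[req][1]:
                met[req] = (ms["id"], ms["due"])
    return met


for req, (ms_id, due) in sorted(requirement_dates(MILESTONES).items()):
    print(f"{req}: met by {ms_id} on {due.isoformat()}")
```

Keeping the requirement lists in one place and deriving the per-requirement view from it is in the spirit of the "one source of truth" point above.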
Justifying COVID expense:
  • Delayed prerequisite milestones
    • E.g. where we've been delayed due to not getting data
  • Variances in P6
  • Frossie: standing army waiting for things is not included?
  • EVM says we should deliver DM in Oct 2022 on budget

Presented spreadsheets showing burn rates and needs through end of FY23


USDF starts as Ops when Ops starts
  • No plan to get from here to start of Ops
  • Need such a plan and money for it in Ops proposal


Plan for tomorrow:
  • Go through milestones
    • Effort to achieve
    • Dates, whether correct or not
    • Missing milestones
    • Do any of them mark operational handovers?
  • Last cycle/next cycle as well
10:30 Break

Moderator: Kian-Tat Lim

Notetaker: Yusra AlSayyad
11:00 Plans for DM serving photometric redshifts – Leanne Guy

Leanne Guy will give an overview of the current status of the plans and then open the floor for feedback, input and discussion

Slides

RHL: How does proposed validation from algorithm candidates map to phase 3 when we do validation?
LG: I expect the authors of algorithms to be involved. If they don't provide validation, those algorithms will probably not be selected.
RHL: Retraining is also the responsibility of the authors.  JFB: +1
CS: Nothing said to the proposers gets us off the hook for anything. We can say that they have to validate and train, but we're still responsible if they don't.
KT: Do you see letters of rec from people other than their authors? LG: It's possible USERS of algorithms will recommend.
JFB: Would "statement of interest" be more consistent? LG: That sounds vague too!
RHL: If they don't do more than write a letter, then it won't be useful.
CS: We're not being explicit here about the project's responsibility.
RL: We never said we were going to deliver photo-zs. If no one in the community steps up, we'll have to put 0.1 for everything! 
JFB: I agree with Robert that if no one does, this should be the first thing to get descoped. It's the thing that the community is better at. Operations.
YA: The backup plan isn't as scary as 0.1s. One of the first projects that DESC's new pipeline scientists have started is a photo-z estimator (Schmidt, Malz, Charles et al.). I bet they'll have a sufficient backup going.
CS: In that case, we should move the timeline earlier.
JFB: Agree. Clarifying the lines between the two groups early is good. We should get something written down so we don't get into an "I thought you were going to do that!" situation.

KT: If an author team proposes a photo-z and we don't select it, is it still an in-kind contribution?

11:30 Gen3

We are in a transitional period. Gen3 has just been released to Science Pipelines, and we are awaiting feedback from users before deciding where to focus new development. In the meantime, potential discussion topics are:

  •  Jim Bosch's proposal for repo organization of precursor data (RFC-741).  Resolving open questions about proposed changes to filesystem locations and access controls should probably take precedence at DMLT.
  •  Tim Jenness's expectations on pipelines use and feedback. 
  •  Robert Gruendl's test plan?
  •  Yusra AlSayyad's plans for pipeline conversion and expansion?

Test Plan Discussion:

RG: We don't have a test plan: part my fault, part testing framework. Some tests are running now. Monika is running RC2. There was a request that she run all tracts together, but if you give it multiple things to run, the batch system will behave the same terrible way that the DES had: If one exposure has a problem, the job halts, and you have to work out what happened and restart.  

Test plans: Where do you balance modularity in the test plan with just getting done and being done with it? "I can ingest a comcam exposure", "I can ingest an auxtel exposure", etc. Do I write a test plan for each one? In the ops rehearsal, I wrote out each step of the test serially. You can't rerun it ever again.

WO: Re mechanics of testing, worth having a chat with Jeff Carlin and Leanne. This might be one-off, though.

RG: It'd be worth writing down: this is how you ingest a raw comcam exposure. "Go get a raw exp, and do an ingest. check No/Yes" 

This is not the right way to do this. I'm authoring it, executing it, and arguing why what I did was fine.

GPDF: What RG says about "self dealing" has been the exact same for the science platform. It'd be nice if we had someone with an independent point of view.

TJ: I was expecting this to be more collaborative with Science Pipelines. 

RG: No one else has time for that either. 

This is more important for telling external users that it is fit for use.

RHL: We can help with my integration work. The "does it ingest" is coming from the outside.  

JFB: One problem is that we wrote a big-bang milestone for a gradual process. It's inching ahead. We just now declared that the schema is stable, and it is worth switching dev to daily work. The important part of the milestone is schema stability, and the rest is the box-ticking exercise.

RG: This is one step more formal than the boxes on the confluence page. I'm not saying that I shouldn't be doing what I'm doing. 

TJ: We can declare that we're adopting DM-DAX-12 whenever we want. If it's a 503 we can't.

  • All: Be prepared to provide feedback on whether the test plan (LVV-P77) provides the tests and rigor necessary to declare Gen3 open for DM use/development.
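
The modular, rerunnable style of check RG describes ("Go get a raw exp, and do an ingest. check No/Yes") could be sketched roughly as follows. This is a minimal Python illustration, not the LVV test plan or any real ingest tooling: `fake_ingest` is a hypothetical stand-in for an actual ingest call, and the instrument names are just labels.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class CheckResult:
    instrument: str
    passed: bool
    detail: str


def run_ingest_checks(instruments: Iterable[str],
                      ingest: Callable[[str], int]) -> List[CheckResult]:
    """Run the same ingest check for each instrument and record Yes/No.

    A failure for one instrument is recorded and the loop carries on,
    so the whole set can be rerun at any time.
    """
    results = []
    for name in instruments:
        try:
            count = ingest(name)
            results.append(
                CheckResult(name, count > 0, f"ingested {count} raw exposure(s)"))
        except Exception as exc:  # record the failure instead of halting
            results.append(CheckResult(name, False, f"error: {exc}"))
    return results


def fake_ingest(instrument: str) -> int:
    """Hypothetical stand-in for a real ingest call."""
    if instrument == "broken":
        raise RuntimeError("no raw files found")
    return 1


for r in run_ingest_checks(["comcam", "auxtel", "broken"], fake_ingest):
    print(f"{r.instrument}: {'Yes' if r.passed else 'No'} ({r.detail})")
```

One parametrized check covers "ingest a comcam exposure" and "ingest an auxtel exposure" without a separate plan for each, and one bad input does not halt the rest.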

-------

Jim Bosch's next steps (slides)

Questions:

TJ: When we say pain points, we mean command lines being weird. That means usability. Happy to make improvements.

RHL: Where do we stand on remote butler access or butler exports?
TJ: My client/server work includes this. You can do a local ingest into a local sqlite and the URIs are a remote archive that downloads on demand. Have to make that all work for RSP support.
FE: We don't give people infrastructure accounts, everyone is a user of the services.
WO: I support not using user accounts inside the DB. Yes, we did this with SkyServer, and CADC does it. You need to know who the user is, but you don't need database accounts.

CS: Are there other worries you haven't enumerated here re PP?
TJ: We have the problem of not knowing when the visit is finished.
JFB: We're not running AP through BPS. We need some execution environment.
K-T: One worry I have is whether we have multiple pipeline starts or a blocking operator that waits for data to show up. We should discuss.

RHL: We can't special-case Alert processing. We should be using generic mechanisms.
JFB: On the point of AP vs. not-AP: the question isn't whether we are going to have these other things. Are they ENOUGH like AP, and is it hard enough to write one of these custom executors? I predict it's not hard to write one more.

Task: Discuss. Someone has to write a Pipeline.yaml. If it's writing a new pipetask for this, then we have to talk about it.

RHL: There's a layer of controlling the processing, that we're not paying attention to. e.g. maybe we define the visits outside.
KT: Jim had "Specifying an external set of dataIds" on his list and a slide on it.
[We all behold Jim's slide]
TJ: How does that get passed to the next task?
JFB: We can define a summary dataset.
TJ: I think there's more; let's talk offline

KSK:
1) Can you defer loading? Yes. Look at coaddition and FGCM.
2) How do you get the provenance so you can see what data has actually been used? Hasn't been written yet.
3) Custom datastore that writes metrics to the database and then gives the registry access to those columns.

RHL: How do we manage those lists? Who owns this job? Frossie is going to say that we can use OWL, but what DB does it talk to? I'm willing to define it to not be middleware, but someone has to own it. There are a few tables Jim put in the butler for good and bad exposures, but that doesn't cover it.
KT: You have access to a wide variety of databases where you can query for a variety of exposures.
RHL: and I want it to track WHY I did it.
KT: That's a lab notebook.
RHL: that's a Rubin deliverable.
KT: Let's figure out how.
Frossie: OWL will go towards this. But it's not a processing control system.
TJ: there is absolutely a gap.

[We ran out of time for Jim's presentation of the RFC for user collections]

12:30 Break

Moderator: Wil O'Mullane

Notetaker:
13:00 HiPS requirements

(Contingent on Gregory's not being needed at an IVOA session)

Slides

Have had requirements for some time; an LCR added them to the DMSR (DMS-REQ-329, 379-385)

HiPS visualization was already added to Firefly (funded by LSST)
MOC visualization mostly done (part LSST, part IRSA), but weakness around discovering MOCs
New versions of Aladin Lite will take HiPS maps

Producing HiPS:
  • Will resample existing coadds, not generating new ones
  • Can do ourselves, with hipsgen, or with Montage (RHL: or with HSC?)
    • RHL: want to visualize the data we have processed, not something with other processing (goes against Montage)
    • Pretty much have to do ourselves
    • Coadd patches or cutout service have potential issues for generating object context; high-resolution HiPS tiles could be an answer
    • HSCMap doesn't use HiPS (RHL: but it might have an option to use it?)
  • Tiles: 1.7 arcminutes on a side, 0.2 arcsecond pixels, 512x512; 25M tiles; HPX WCS (already in AST)
  • Resolve patch overlaps
  • Wil: build from bottom up as you're doing coadds?
    • Doesn't handle overlaps
  • Generate lower-resolution:
    • Need to figure out how to bin pixels
    • Summing flux is not always best for visualization
  • Generate static 3-color
    • Wil: Aladin Lite v3 will recolor on the fly
  • Need to generate defined file hierarchy
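
The tile numbers quoted above can be cross-checked with a little HEALPix arithmetic. The sketch below assumes the standard HEALPix relation of 12 × 4^order equal-area tiles over the full sky and treats each tile as roughly square. The choice of order 11 is an inference from the quoted numbers, not something stated in the meeting.

```python
import math

# Area of the full sky in square degrees (about 41253).
FULL_SKY_SQ_DEG = 4 * math.pi * (180 / math.pi) ** 2


def hips_order_stats(order, tile_pixels=512):
    """Return (number of tiles, tile side in arcmin, pixel size in arcsec).

    Assumes 12 * 4**order equal-area tiles on the full sky and
    approximates each tile as a square of tile_pixels x tile_pixels.
    """
    nside = 2 ** order
    n_tiles = 12 * nside ** 2
    tile_area_sq_deg = FULL_SKY_SQ_DEG / n_tiles
    tile_side_arcmin = math.sqrt(tile_area_sq_deg) * 60
    pixel_arcsec = tile_side_arcmin * 60 / tile_pixels
    return n_tiles, tile_side_arcmin, pixel_arcsec


n_tiles, side, pixel = hips_order_stats(11)
print(f"order 11: {n_tiles} tiles, {side:.2f} arcmin per side, {pixel:.3f} arcsec/pixel")
```

At order 11 this gives about 50M tiles over the whole sky, each about 1.7 arcminutes on a side with pixels of about 0.2 arcseconds. The quoted 25M tiles is roughly half of that, which would fit a footprint covering about half the sky; that reading is an assumption.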

Annotate HiPS with ids of underlying coadd patch images
  • Allow drill-down
  • HiPS Progenitors extension metadata
  • Tim: Should be trivial if sphgeom had HEALPix, but we can also do it via provenance
  • Want authoritative answer, not just coordinate-based query
    • Wil: why can't we do coordinates which are mostly good enough?
    • So that people can use standard HiPS maps to find data even without our services

Highest-resolution is data rights controlled
  • Need A&A support from clients; will be solved
  • Do we have the same service with A&A applied only at appropriate levels, or two different maps?

Discovery is iffy
  • CDS maintains a global directory
  • They can clone lower-resolution
  • In Firefly can see locally-curated list (served by DAC) as well as global directory

We need to write DRP tickets for generating this or using hipsgen
  • Montage is really not in the running

Blended color version could be lower priority

RobertG: PTIF has recoloring (but might not be hue-preserving)

Jim: Interactions with healsparse
  • Not really germane to HiPS, more for MOC
  • Not sure it scales to entire survey

MOC:
  • We're required to generate them
  • Would be nice if we could generate a nightly MOC of what we looked at

Yusra: when do you need them from a DRP?
  • By the time we're looking at non-trivial chunks of sky in Science Validation
  • Formally end of Commissioning
  • RHL: Worried that we will need HiPS sooner for figuring out how we handle HSC data in V&V

  • Gregory Dubois-Felsmann will run hipsgen on our HSC outputs as a test just to see how it works (moved to DM-29330)
Jim: Get someone to try doing ourselves for a couple of days

Server should be pretty trivial

Could partner with SPHEREx for some development, as SPHEREx will be generating HEALPix all-sky maps
13:30 <Newly Open>


14:30 Close

Day 2: 2020-11-18

Moderator: Wil O'Mullane

Notetaker: Ian Sullivan
9:00 AP Integration – Eric Bellm

The AP team is now testing precursor datasets large enough to require real databases. Do we push forward with Postgres at NCSA? Try to integrate Cassandra?

More broadly, can we discuss the path towards the DM-AP-16 (Full integration of the Alert Production system within the operational environment) milestone, with a view towards commissioning and pre-operations activities?

Slides

SQLite - Postgres - Cassandra

  • K-T
    • User account and rerun structure needs to be solved regardless of database system
  • Fritz
    • Is the DB access abstracted through the AP API built by Andy S.?
    • The access is through the API, but later analysis is not
  • Frossie
    • How/why do we use Cassandra?
  • Fritz
    • There are some technotes
    • High concurrency
    • Spatially restricted queries with low latency were not otherwise available
  • Simon
    • Cassandra apparently supports RBAC
  • K-T
    • The API question is a good one, though; we could do a "friendly user" setup with everyone sharing a single account and use the API to keep people separated.
  • Fritz
    • Where DAX is:
      • Still need to prove out to full year of simulated AP
      • Have targeted Google cloud for next round of experiments
      • Andy S will run next round of tests this coming year
      • Will have to be a productization phase
      • Do we run it at NCSA or in the cloud?
  • Eric
    • It’s not clear that we will actually need Cassandra even in commissioning
  • Wil
    • I thought it was after 6 months of data that we needed it
  • Fritz
    • Is there a shim to make Postgres work right now?
      • Eric: That’s what we’re doing
    • Reading the technote, it’s not easy to set up Cassandra, would need Andy S.
    • Should try to bridge the gap with Postgres now
  • Fritz
    • We need to know what environment we will run in
    • NCSA or Google cloud, or USDF
  • K-T
    • The fastest solution might be to take an environment variable and tack it on to the user name
  • Colin
    • To clarify, Eric is describing ad-hoc usage where there are many things going on
    • For that, Postgres sounds sufficient once we solve managerial problems
  • Wil
    • There are probably only a handful of calls AP uses
  • Fritz
    • How much of that is inside the API and how much is outside?
  • Eric
    • Ian and I need to look at the API
  • Colin
    • In the ad-hoc realm, I don’t want to constrain people
    • Wil worried about the pipeline code, Colin: that’s all in the API
  • Fritz
    • The perceived complication of using Cassandra is the complicated configuration needed to tune it
    • Once it is set up for production, it should be easier to set up a new instance
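
K-T's suggestion of taking an environment variable and tacking it on to the user name could look something like the sketch below. It is a hedged illustration only: the function name and the `AP_RUN_SUFFIX` variable are hypothetical, not part of the AP API or any existing configuration.

```python
import os
import re


def apdb_account_name(base_user: str, env_var: str = "AP_RUN_SUFFIX") -> str:
    """Build a per-run account name from a shared base user.

    The suffix comes from an environment variable (hypothetical name).
    If it is unset or empty, the base user name is returned unchanged.
    """
    suffix = os.environ.get(env_var, "").strip()
    if not suffix:
        return base_user
    # Keep only characters that are safe in a typical database identifier.
    safe = re.sub(r"[^A-Za-z0-9_]", "_", suffix)
    return f"{base_user}_{safe}"


os.environ["AP_RUN_SUFFIX"] = "rerun-42"
print(apdb_account_name("apuser"))  # apuser_rerun_42
```

This keeps ad-hoc runs separated by name while everyone still shares one underlying setup, which is close to the "friendly user" arrangement mentioned earlier in the discussion.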

All the other pieces for integrating AP

  • Wil
    • Want this to be ready when the whole camera is on sky
  • K-T
    • After OCPS is working (little more than a month)
    • Then want to take one AP pipeline and plug it in and run it
    • Gives us minimum functionality
    • We already have DAX simulators on the test stand, can use that
  • Robert
    • Won’t we use OCPS first? Yes
  • Wil
    • Full integration could be what K-T is saying, and could be enough for the first year
    • No requirement on timing of alerts during commissioning, just that we do them
    • Could add a second milestone for later for full working system
  • Fritz
    • How do we manage the gap and color of money issues between NCSA and the USDF or Google?
  • Wil
    • Talk to Richard D to get hardware at SLAC
    • Could potentially complete milestone a couple months into operations
    • There is no point in testing integration before we are ready
  • K-T
    • All of this can be tested at NCSA
  • Wil
    • Could run at NCSA or Google
  • Frossie
    • What is the Alert Pipeline scale when ComCam goes on the sky?
  • Wil
    • ComCam is only 10% of focal plane, 5% data at best
  • Wil
    • Yes, we have enough machines at NCSA, because we do not have to generate alerts for every exposure, and not in real time. Will be best effort basis
  • Fritz
    • First I heard the latency requirement is relaxed in commissioning
  • Leanne
    • We told the community we would package up all the alerts, but with no expectation on latency
  • Colin
    • During commissioning we have to prove we can do the real thing
  • Frossie
    • Nice to prove we have a working thing, even if it is not at full scale
  • Eric
    • We have to demonstrate that we can meet the requirement, but we don’t have to do that with full focal plane and at a sustained rate
  • Fritz
    • Running one CCD in one database isn’t very different than running many in many databases
  • Colin
    • Is OCPS all I need to run AP?
  • K-T
    • It is not the designed component to execute it, but we will see if it can do it
    • The prompt processing system is the designed component, doesn’t exist yet
  • Colin
    • Is somebody building this?
  • K-T
    • Nominally yes, but no one right now
    • Thought it was part of NCSA WBS, but is probably in a grey area
  • RHL
    • K-T is experimenting with a more flexible system
    • Even if it doesn’t work out, it’s still progress towards a functional system
  • K-T
    • That’s the problem right now, there’s no backup
  • RHL
    • Possibility of using Auxtel to test AP
    • Early next year, could do end-to-end test
    • There are filters in place, can use it as a camera
  • Wil
    • Technically we have almost a year into Operations before we need to distribute alerts
  • Colin
    • There is a big push from the agencies and Zeljko to have a working system on Day 1 of Operations


9:30 New framework for metric computation

Motivation and current status

Slides

New framework for metric computation – Leanne and Simon
Called Fast (or Flexible) Analysis of Rubin Observatory performance (FARO)

  • Frossie
    • This looks phenomenal, the SQUASH system has been empty
  • Leanne
    • Simon and Keith have done a lot of work on this
  • K-T
    • Is the validate_drp used in this comparison running all the same things the one in Gen 2 was? Yes
    • Impressive that it is faster
  • RHL
    • What is your plan for scaling out to handle large quantities of data?
  • Leanne
    • That is a high priority
    • We want to first complete our validation on RC2 and then move to analysing PDR2 when available
    • This is an afterburner; the science pipelines need to have run first
  • RHL
    • Where do we discuss whether the metrics are good enough?
  • Leanne
    • We have a Slack channel we’ve been using to develop this (#dm-svv)
    • Should turn that into a wider channel for all DM, for everyone doing QA
  • Wil
    • This will integrate nicely with pipelines, it could be added to the end of any pipeline? Yes
  • Simon
    • You can even interleave them
    • Run it after one step of the pipeline, then go on to make coadds (for example) and later run more metrics
  • Yusra
    • Science Pipelines are happy with this
  • RHL
    • Metrics are great, but we will use this to discover problems we didn’t expect
    • We need to pay attention to how this becomes useful to Pipelines, without overwhelming the service
  • Eric
    • On integration, we have stood up monthly QA meetings with the commissioning team and SVV
    • Does it need to interface with the APDB as an afterburner, or can you do more ad-hoc analysis?
  • Simon
    • We understand there are some metrics people will want to run in-situ
    • That can live together with FARO tasks
    • Everything should still use the Butler
  • RHL
    • On the boundary, if we find discrepancies in the output, whose job is it to drill down and find the cause?
  • Yusra
    • We have found it useful to have a very senior person like Lauren in place to triage, and know who to direct it to first
  • Wil
    • In Operations, we’ve tried to put that group all together with Leanne. They can analyze who to send it to. There is no clear answer ahead of time whose job it is
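
Simon's point about interleaving metric steps with processing steps (run metrics after one step, go on to make coadds, then run more metrics) can be illustrated with a toy runner. This is a generic Python sketch with made-up step names and stand-in functions; it is not the FARO or pipeline task API.

```python
def run_pipeline(steps, data):
    """Run steps in order.

    "process" steps transform the data; "metric" steps only read it and
    store a measurement under their name.
    """
    metrics = {}
    for kind, name, func in steps:
        if kind == "process":
            data = func(data)
        else:  # "metric"
            metrics[name] = func(data)
    return data, metrics


# Hypothetical steps: each processing step just appends a product name,
# and each metric step counts the products present so far.
steps = [
    ("process", "single_frame", lambda d: d + ["calexp"]),
    ("metric", "n_after_single_frame", len),
    ("process", "coadd", lambda d: d + ["coadd"]),
    ("metric", "n_after_coadd", len),
]

products, metrics = run_pipeline(steps, ["raw"])
print(products)
print(metrics)
```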
10:00 Plan to end of project:  Part II

Milestones – Wil

  • High level milestones
    • DLP-526 states the archive center is complete at NCSA, can never be completed as written
  • LDM-503-14: Do we need to split these into 1a and 1b milestones?
    • Leanne
      • Action: Add intermediate milestones (on Wil and Leanne)
  • Pipelines
    • Yusra
      • I don’t want anything holding up our releases
    • Leanne
      • We do the test to check for any major regressions, not whether specific milestones have been met
  • Infrastructure/Integration
    • Michelle
      • We should move the LSSTCam Ops out until after the camera is on the mountain
    • Wil
      • It is tied to that, but the camera team hasn’t updated their milestones

DAX Plan to end of construction – Fritz

  • Don’t have many DAX milestones; need to add more that are appropriate
  • DM-DAX-5
    • Have a workflow set up with Hsin-Fan
  • DLP-802 Alert Production Database design
    • Need some time from Andy S, should complete in January
    • Need milestones that drive development and APDB integration
    • Features in the TAP service
  • Wil
    • Correct thing is that you have a milestone that you will deliver X, which is blocking Frossie
  • Fritz
    • Worried about past milestones that were not related to concrete design decisions
  • Wil
    • Need milestones for tracking when decisions must be made
    • We have two sets of milestones, for construction and for operations
    • It is hard to link milestones from outside the project
  • Simon
    • Can milestones be attached to multiple WBSs or do we need duplicate milestones for decisions that affect more than one WBS?
  • Wil
    • The level of the milestone reflects the breadth of subsystems it covers.
  • Simon
    • If there is something that needs to be decided by Arch for TAP, that would be a level 2 milestone
  • Fritz Mueller: Get together with Colin and Frossie, and define milestones for RSP dependencies on Qserv (Notes: this meeting was held, resulting in tickets DM-29682 and DM-29683 as first steps.)

DAX estimate of effort to complete

  • Roughly 114 FTE months of effort left to complete, have 30 on construction and 120 in ops
  • Can some activities be shifted to ops, or do we need to revisit the ramp into ops?
  • Wil:
    • To finish, DAX needs construction funding through FY 22
    • The conversation about Ops needs to be with Phil
    • Maintenance is acceptable under ops
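
The effort figures above can be worked through explicitly. The sketch below reads "have 30 on construction and 120 in ops" as FTE-months of available effort; that interpretation is an assumption, as is the simple model in which any construction shortfall would have to come out of the ops allocation.

```python
def effort_gap(remaining, construction, ops):
    """Return (construction shortfall, total available, fraction of ops needed).

    All quantities are in FTE-months.
    """
    shortfall = max(remaining - construction, 0)
    total = construction + ops
    ops_fraction = shortfall / ops if ops else float("inf")
    return shortfall, total, ops_fraction


shortfall, total, ops_fraction = effort_gap(remaining=114, construction=30, ops=120)
print(shortfall, total, round(ops_fraction, 2))  # 84 150 0.7
```

Under that reading, 84 of the 114 FTE-months do not fit in construction, and covering them would take about 70% of the ops effort. That is the tension behind the question of shifting activities to ops or revisiting the ramp.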
10:30 Break

Moderator: Leanne Guy

Notetaker: Fritz Mueller

11:00 Plan to end of project: Part III

We can use Thursday as managers day if you want more time to go through in more detail.

SQuaRE (Frossie)

  • slides
  • Frossie Economou: insert milestones TL;DR in meeting notes here, since Fritz was not asked to take notes until after your segment

ARCH (K-T)

  • slides
  • BG3 milestones on-track, OCPS coming up, others longer-term
  • Most ARCH effort is LOE, and milestones on the books look good
  • Among stated goals: eups-independence
  • Some staffing reduction
  • Wil: Some additional MW milestones needed?

DRP/AP (Yusra)

  • slides
  • The "milestone cliff" apparent on the milestone graph is coming up.  A big part of this is pipelines (5 milestones overdue, and 10 more due in next 3 mos.)
  • Many of these are made much easier by arrival of BG3, so they should go quickly.  Others are genuine concerns (see slides).
  • A big component of pipelines planning is annual review of DPDD, producing "annotated DPDD."  This is coming up in January.  Revisit what needs to be done with a hard eye toward what is really needed for DR1.  Typically less rosy than just looking at milestones.
  • Some ongoing activities don't currently have associated milestones (shapes for shear estimation, inferred SEDs, CBP pipeline).
  • Concerns registered re. shrinking commissioning on-sky time for shaking things out.

NCSA (Michelle)

  • slides
  • Not a lot of milestones left, maybe need more, or not at this point?
  • "Full delivery of DF conops" milestone should definitely go over to USDF.
  • Twilight and transition plans for '22/'23 needed in greater detail for budget and people planning.
  • Wil: some USDF preliminary plan documents expected from SLAC Jan/Feb and will help clarify.
  • Wil: transition-to-USDF milestones for individual services/sub-systems seem needed

Wrap up discussion:

  • Reminder from Wil: we can use LCR process to move milestones as appropriate if we get to them before they hit the monthly report.  Best is to tie them to construction or test milestones; Wil can help you identify these.
  • Covid impacts:
    • Perhaps 10-20% impact on development efficiency? (preliminary/speculative)
    • More on order of 1yr. due to commissioning / summit delays
  • Late decision re. USDF has also impacted schedules
  • Fritz Mueller: Create shared Google sheet to collect T/CAM budget burn-downs










12:30 Break

Moderator: Colin Slater

Notetaker: Eric Bellm

13:00 Team status


Yusra:

  • Wil: Are you down-weighting velocity for COVID?
    • Yusra: Not explicitly
  • Fritz: If we are looking for EV variances due to variability in efficiency, it may be a bit tricky since it's in the noise
    • Frossie: Maybe in the noise, but will potentially build up.  Even if it's hard to track with our EV metrics, we still need to take care of it in plan
  • Wil: We have all pushed out milestones.  Is it because we have more time or because of efficiency?
    • Frossie: I see it in my variance, but I take care of it in next cycle planning.
  • Wil: Those who can see it in EV metrics, we should capture that if we can.  It would give us a concrete ask for upper management

Michelle:

  • Colin: When can we not ingest in gen2?
    • RHL: We are pretty close now that gen3 ingestion works.  We are blocked on getting full ISR for ComCam.  BOT we can do now.
  • Wil: We should just get new hardware rather than shipping NCSA test stand hardware down to Chile.  The machines are getting old etc.
  • Colin: When you say everything changed for LSSTCam, this is the reorganization K-T put together?
    • Michelle: Yes
    • Colin: Is ComCam also changing?
    • Michelle: No. We had gotten so far down the road we just kept going in that direction.
    • Robert L.: You may find me pushing hard to do the same thing with ComCam that we are doing with LSSTCam
    • Michelle: I'm hesitant to break a working system
    • Taken offline
  • Wil: NTS shipping to Chile clarification, the DAQ will be shipped, but the nodes will not.  This is the plan: https://ittn-029.lsst.io/v/IT-2465-fwv/index.html
  • Wil: There needs to be a small test stand in Tucson before the NTS gets turned off

Frossie:

  • Wil: when UK/Fra deploy our science platform, do they get our auth?  Frossie: Europeans have a CILogon equivalent; she's willing to do the work to integrate with them.  Wil: that's ID, but what about auth?  Frossie: they will use our group management identity, we manage data rights for them
  • Frossie: schedule will be packed next cycle due to DP0 and commissioning, so less ability to respond to interrupts

KT:

  • Colin: what is the problem with the long-haul networks?  K-T: usually transfers using multiple connections work fine, but single connections often die

Leanne:

Fritz:

Ian:

  • Colin: pytrax integrating to SQuaSH–Square?  Ian: pushing metrics, not integrating, sorry
  • RHL: can you update with performance on various datasets?  Ian: HSC bulge data still on deck.  DECam bulge data made it through SFP and template building quite successfully on 99.5% of 20k CCD-visits.  Diffim tests awaiting Postgres.  Single-CCD diffim test looked okay.  (longer discussion of other datasets)

Gregory:

  • frozen, but wanted to give an update on Firefly TAP capabilities
  • Fritz: is Postgres sufficient to drive your demo?  GPDF: yes but there are some detail questions.  Fritz: let us know if there are items that should be prioritized for Kenny



14:10 Focus Friday

We said we'd do Focus Friday provisionally until this Nov meeting. How's it going? Do we want to keep it up?
  • strong votes of support from Frossie, Tim, Michelle
  • K-T reports one anecdotal report: "I have hard time talking to people on the project and now I have 20% less time for it"
  • Fritz: mixed reaction from his team, might prefer every other week (10% instead of 20% of our time)
  • Robert frequently gets stuck on his work on Friday because he can't ask questions, and it pushes people to email, a non-public forum.  Would prefer a no-meetings Friday, or a "no non-urgent questions" rule
  • Leanne also is concerned that communication is still happening but in private rather than public channels.  Has reports from some developers that they are also hindered.  Does agree with having no regular meetings
  • Frossie: wants a form that can run every week, so people will plan ahead–"on a plane to Japan".  Points out that the burden here is differential–some people/teams send mainly outbound Qs, some people/teams get lots of inbound ones, 2 minutes at a time.  Thinks perhaps people should be allowed to ask questions–but not expect or plan for a response
  • Simon: disputes the claim that development is less efficient; it forces him to work through problems and batch questions rather than just pinging Jim every 5 minutes.  Have to avoid demands for attention on Focus Friday unless something is blocking a whole team
  • Tim: appreciates that he doesn't have to catch up on lots of Slack messages, can plan on getting 2 story points
  • Jim: appreciates the time for his own productivity; not representative, as he does get lots of inbound Qs.  Does think quick qs 
  • Yusra: worry is about decisions being made in public channels
  • Ian: also supportive of an expectation of minimal discussion and no guaranteed response, but allow Qs.  Does end up with telecons anyway on Friday due to non-DM folks
  • Simon: liked how Yusra directed conversation off of Slack and onto a relevant ticket
  • Wil: worried about slippery slope from "no messages" to "some messages" to "same as any other day" but will try to create some language that works
    • Wil O'Mullane Draft PR for Focus Friday to allow non-urgent Qs on Slack without expectation of response.
14:15Wrap up

next DMLTs:

  • 2021-02-22/25 - Nominally Tucson - probably virtual
  • 2021 June 8-10
  • PCW ! In person
  • 2021 October 26-28 - Clash with ADASS ... move?
  • 2022 February 15-17
14:30Close

Day 3: 2020-11-19

Moderator:












11:00Close

Pre-Meeting Planning

TopicRequested byTime required (estimate)Notes
Plan to end of project3x30 minutes

Perhaps we should have 3 separate sessions on planning to the end of the project. Some things to cover:

  1. burn rate vs budget ops and construction
  2. milestones and when we will deliver what, esp. 1a/1b requirements from DMSR
    1. inventory of what's remaining
    2. what might get shifted to ops
  3. effort to deliver those - planning packages per milestone?
Status report from the provenance working group30 minsWhat is the status on this?
Status on plans for DM serving photometric redshifts1hr

Leanne Guy will give an overview of the current status of the plans and then open the floor for feedback, input and discussion

New framework for metric computation1hr

Leanne Guy and Simon Krughoff will present the new metric computation framework. 

Next steps for Gen3?

We could discuss Unknown User (gruendl)'s test plan, Jim Bosch's proposal for repo organization of precursor data, Yusra AlSayyad's plans for pipeline conversion and expansion, and Tim Jenness's expectations on pipelines use and feedback. 

Focus Friday15 minWe said we'd do Focus Friday provisionally until this Nov meeting. How's it going? Do we want to keep it up?
Requirements for Science Pipelines to support HiPS images.

AP Integration30 minutesThe AP team is now testing precursor datasets large enough to require real databases.  Do we push forward with Postgres at NCSA?  Try to integrate Cassandra?

More broadly, can we discuss the path towards the DM-AP-16 (Full integration of the Alert Production system within the operational environment) milestone, with a view towards commissioning and pre-operations activities?

Attached Documents

  File Modified
PDF File DMLT-vF2F-112020-PlansForPZ.pdf Nov 17, 2020 by Leanne Guy
PDF File DMLT-vF2F-112020-DMScience-StatusandPlans.pdf Nov 18, 2020 by Leanne Guy
Microsoft Powerpoint Presentation NCSA milestones.pptx Nov 18, 2020 by mbutler
Microsoft Powerpoint Presentation NCSA next 3 months .pptx Nov 18, 2020 by mbutler
PDF File NCSA next 3 months .pdf Nov 18, 2020 by mbutler
PDF File DMLT-vF2F-112020-NewFrameworkForMetricComputation.pdf Nov 18, 2020 by Leanne Guy
PDF File Bellm_AP_Integration_201118.pdf Nov 18, 2020 by Ian Sullivan
PDF File Sullivan-AP-Status-November-2020.pdf Nov 18, 2020 by Ian Sullivan
PDF File Arch S21A Plans.pdf Nov 18, 2020 by Kian-Tat Lim
PDF File DRPActivies202011.pdf Nov 18, 2020 by Yusra AlSayyad
PDF File Nov2020F2FMilestonesSciencePipelines.pdf Nov 18, 2020 by Yusra AlSayyad
PNG File Screenshot 2020-11-18 at 23.09.30.png Nov 19, 2020 by Wil O'Mullane
PDF File DAX End of F20 Status_Plans.pdf Nov 23, 2020 by Fritz Mueller
PDF File DMLT-SUIT-202011.pdf Dec 02, 2020 by Gregory Dubois-Felsmann
PDF File DMLT-HiPS-202011.pdf Dec 02, 2020 by Gregory Dubois-Felsmann

Action Item Summary

DescriptionDue dateAssigneeTask appears on
  • Frossie Economou Will recommend additional Level 3 milestones for implementation beyond just the DAX-9 Butler provenance milestone.   
15 Mar 2022Frossie EconomouDM Leadership Team Virtual Face-to-Face Meeting, 2022-02-15 to 17
  • Kian-Tat Lim Convene a meeting with Colin, Tim, Robert, Yusra to resolve graph generation with per-dataset quantities (likely based on Consolidated DB work).  
18 Mar 2022Kian-Tat LimDM Leadership Team Virtual Face-to-Face Meeting, 2022-02-15 to 17
21 Sep 2022Wil O'MullaneUser Batch meeting-2022-07-13
  • Robert Lupton Identify point of contact on commissioning side for summit schema  
26 Oct 2022Robert LuptonDM Leadership Team Virtual Face-to-Face Meeting - 2022-10-18
  • Colin Slater to write spec for bake-off between htcondor and parsl-over-slurm including who/how to execute the bake-off  
31 Oct 2022Colin SlaterDM Leadership Team Virtual Face-to-Face Meeting - 2022-10-18
  • Robert Lupton Kian-Tat Lim Kick off meeting between commissioning, Fermilab, and summit (Carlos) for summit schema  
09 Nov 2022Robert LuptonDM Leadership Team Virtual Face-to-Face Meeting - 2022-10-18
  • After HTCondor get parsl/slurm updated before bakeoff (bps report, memory multiplier etc) - Tim Jenness  

14 Nov 2022Tim JennessDM Leadership Team Virtual Face-to-Face Meeting - 2022-10-18
15 Nov 2022Gregory Dubois-FelsmannDMLT meeting-2022-10-24
  • Kian-Tat Lim Convene a meeting following up vF2F discussion about what to do with Jenkins  
24 Nov 2022Kian-Tat LimDM Leadership Team Virtual Face-to-Face Meeting - 2022-10-18
13 Feb 2023Gregory Dubois-FelsmannDMLT meeting-2023-01-30
20 Feb 2023Wil O'MullaneDMLT meeting-2023-02-06