...

  • Abstracting access to OCS through butler
  • Fetching data from staging area to disk (e.g., templates, exposures for forced DIA photometry)
  • provisioning hardware for L1: when, what
  • batch processing related needs for L2/L3
  • data backbone for L2/L3
  • cross-site failure recovery
  • cross-site upgrades (are all sites required to be on the same release, etc.?)
  • replicating L3 across different sites / L3 user mydb synchronization across multiple sites
  • distributing DRP products - via network or physical shipping?
  • storage technology for large files (object store)
  • capturing provenance (Cooper starting to think about it)
    • gray area: who captures provenance about OS/hardware
  • how much containerization should we be doing?

    • does it simplify provenance capture?

  • NCSA is writing docs about L2
    • Margaret will expose to Data Access when the docs are in reasonable shape
  • need to bring IN2P3 into the Data Access <-> NCSA discussions (via JCC)

...

SUIT and Architecture

  • User workspace discussion. SUIT presented a diagram of its preliminary thoughts on the workspace. The workspace could be like a user's home directory: users can save their work, install their software, and run an LSST task from the workspace. Users should be able to access extended storage space (at LSST or not) and computation facilities (at LSST or not). iPlant was suggested as a possible option for the workspace. We will have to get the relevant parties together to discuss this further and plan a workshop at some stage.
    Attached file: M31.graffle.pdf
  • SuperTask's role in workspace.

Additional notes from Paul Wefel

 


  • The network made remote NCSA participation (4 people) challenging but eventually managed to get Jason Alt connected.
  • There is a bit of a chicken-and-egg problem here: SUIT would like to know the DM environment / constraints, and DM is probably waiting on SUIT for their environment requirements.
  • High-level overview of the SUIT process
    • Users will come in through a web page / portal
    • Authenticate through the web page
    • Now interacting with SUIT, which wants to act as the user for all processes run
    • Launch Python processes behind SUIT (using the extension architecture that has been proposed)
    • example - looking at an image in detail
    • how the background process is launched is TBD
    • would like the background process to run as the user's UID/GID
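
The notes leave the launch mechanism as TBD, so the following is only a minimal sketch of one way a service could start a background Python process under a given UID/GID. It assumes a POSIX host and Python 3.9+ (for the `user` and `group` arguments of `subprocess.run`); the helper name `launch_as_user` is hypothetical and is not part of any SUIT or DM design.

```python
import os
import subprocess
import sys


def launch_as_user(argv, uid, gid):
    """Run argv under (uid, gid) and return its stdout (hypothetical helper)."""
    kwargs = {}
    # Only request an identity switch when it differs from our own;
    # switching to a different user normally requires elevated privileges.
    if gid != os.getgid():
        kwargs["group"] = gid
    if uid != os.getuid():
        kwargs["user"] = uid
    result = subprocess.run(
        argv, capture_output=True, text=True, check=True, **kwargs
    )
    return result.stdout


if __name__ == "__main__":
    # Demo: run a child as the current user and print the identity it sees.
    child = [sys.executable, "-c", "import os; print(os.getuid(), os.getgid())"]
    print(launch_as_user(child, os.getuid(), os.getgid()).strip())
```

In a real deployment the service itself would need the privileges to switch identity, which is part of what the proposed meeting with NCSA would have to settle.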

Another mechanism being explored for SUIT is real interactive environments, where a VM or container running on a user workstation/laptop is pre-configured with the SUIT software stack.

One question that came up: what is the resource limit for a science user running jobs? (Asked of Jason; couldn’t hear the answer.)

VM vs. batch processing: SUIT understands Don’s point from yesterday on using a batch job.

SUIT would like to have a two- or three-day meeting with NCSA to work out system interaction details (action item that isn't owned).

Jason and Gregory to talk 


Science Pipelines and Architecture

...

Process Control and SQuaRE

...

SUIT and Data Access

We would all like to exercise the SUIT to Data Access system in a basic way regularly, with nightly deployments out of NCSA. The idea is to get this going with a very small repository. Then, when we receive the PanSTARRS data, SUIT can start to use that data to develop new features for users to try and to write a robust set of regression tests.

Jacek would like to get the small data set available around July-August 2016. Before then, SUIT will deliver example queries so we can verify we’re prepared to run the tests. SUIT would like to have the web portal to access PanSTARRS data ready by the end of November, which means that the data should be ready for access by the end of September at the latest.

...

Once the Pan-STARRS data arrives, it should take Data Access about a month to get it ready. Note that we don’t know what format it’s coming in; we will need to think about how to load it. Once loaded, it should be kept available "for ever". It is OK to limit user access during scheduled DB stress tests.

Remote Butler - SUIT really wants a Java client to the remote Butler, for the Firefly service. We need to discuss it with the architecture group, but perhaps we should consider *not* doing the Python Butler client for now, and doing the Java one instead. A Java client is useful on the SUIT server side (for performance reasons; SUIT does not want to fork a Python process). Python access is still needed for SUIT user access.

It would be nice to have a simple prototype in Fall 2016 that demonstrates acquiring and passing credentials.

...

  • need more definitions of interfaces to global catalog. This needs to be discussed with NCSA. Not sure when we will know

  • Since we need it soon, prototyping is useful
  • but keep an eye on existing systems, like Fermi DataCat

...

SUIT and SQuaRE

  • VO protocol support. The VOTable binary format was suggested for transferring large amounts of data. Using VO protocols as an internal interface was also suggested.
  • Calibration and EFD information are important to SQuaRE and to SUIT, since both QA and users would like to have access to them. More discussion and understanding are needed.
  • Documentation. SUIT will invite Jonathan Sick to IPAC for a day or two to work together on documentation this summer.

...


Science Pipelines and

...

SUIT

Large scale visualization

It was agreed that the large scale visualization of HSC data as presented by Robert was a compelling use case for SUIT. There was some discussion about what it would take to achieve this technically. It was agreed that it would be technically feasible to generate PNG (or equivalent) images for such a visualization as part of the data release processing (rather than having the SUIT group post-process the data release). The key is to pre-generate the large images at different resolutions.
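
As a rough illustration of pre-generating images at several resolutions, the sketch below computes a simple image pyramid in which each level halves the previous one. The function names, the 256-pixel tile size, and the 2x2 averaging scheme are assumptions made for this example; they are not taken from the meeting or from any existing SUIT or pipelines code.

```python
def pyramid_levels(width, height, tile=256):
    """List (level, width, height) from full resolution down to a single tile."""
    levels = []
    level, w, h = 0, width, height
    while True:
        levels.append((level, w, h))
        if w <= tile and h <= tile:
            break
        # Halve each dimension, rounding up so odd sizes keep their last pixel.
        w = max(1, (w + 1) // 2)
        h = max(1, (h + 1) // 2)
        level += 1
    return levels


def downsample_2x(pixels):
    """Halve a 2-D list of numbers by averaging 2x2 blocks.

    Blocks on the right and bottom edges average whatever pixels exist.
    """
    rows, cols = len(pixels), len(pixels[0])
    out = []
    for r in range(0, rows, 2):
        out_row = []
        for c in range(0, cols, 2):
            block = [
                pixels[rr][cc]
                for rr in range(r, min(r + 2, rows))
                for cc in range(c, min(c + 2, cols))
            ]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out


if __name__ == "__main__":
    for level, w, h in pyramid_levels(1024, 512):
        print(level, w, h)
```

Applying `downsample_2x` repeatedly to the full-resolution image would produce the pixel data for each level listed by `pyramid_levels`; a production version would more likely use an array library and write tiled PNGs.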

...

Following some work in 2015 to integrate Firefly with the afw.display system, this effort has been languishing. The SUIT group are keen to see Science Pipelines developers using, and providing feedback on, Firefly; Science Pipelines developers are keen to have access to better visualization and debugging tools. However, the barriers to entry for getting Firefly running on individual laptops are -- arguably! -- off-putting.

...

Table-valued functions are a major feature of the SciServer infrastructure. They are not currently available in MariaDB or Qserv (although less-than-optimal workarounds may be possible). Members of the Data Access team will be meeting with Monty Widenius next week and may discuss it with him.

SQuaRE and Architecture

...

SUIT and Process Control

Margaret and Don had another session to attend. Jason from NCSA called in for the discussion. We touched on workspace, resource management, and authorization/authentication. Xiuqin would like to start a regular discussion on the workspace, leading to a 2-3 day face-to-face design meeting of all parties.

...

  • No clear consensus was reached that establishing a new "brand" for the Science Pipelines was a good idea: Mario Juric in particular had reservations about whether this was necessary or desirable.
  • There was some concern that leaving e.g. the middleware unbranded could foster a perceived divide between the science and engineering parts of the project.
  • It was agreed that if a new name were to be chosen, the appropriate top-level product to which it would apply would be lsst_apps. It would exclude the obs_ camera packages.
  • The next step is to write a complete description of what this change would involve so that it is possible to estimate the total cost of making it. This action was assigned to Tim Jenness.

...

Actions

  •   Jacek Becla coordinate Data Access / NCSA discussions (didn't really happen because key people were unavailable / away for ~ two months, now coordination is done through DPS-WG)
  •   Unknown User (npease) visit Princeton ~1 week
  •   Kian-Tat Lim follow up with senior mgmt wrt prioritization of making PanSTARRS data available through Qserv/SUIT
  •   Frossie Economou look into Russell's OSX build issue
  •   Frossie Economou schedule Josh to work with Fabrice on qserv Docker-based test deployment
  •   Frossie Economou advise pipelines (John Swinbank) how to self-maintain lsst-dev stack
  •   Unknown User (xiuqin) start a telecon to kick start the workspace discussion, leading to a 2-3 day workshop
      Unknown User (xiuqin): Workspace has been expanded into the LSST Science Platform. I think Kian-Tat Lim should be leading this effort, since many teams are involved in requirement gathering, design, and implementation. The page Science Platform captures the initial definition from K-T.
  •   Unknown User (xiuqin) study iPlant to see if we can adopt its workspace
      Unknown User (xiuqin), Frossie Economou, Gregory Dubois-Felsmann requirement (DM-6663 is created for this action)
  •   Unknown User (xiuqin), Frossie Economou, Gregory Dubois-Felsmann, Fritz Mueller requirement for remote butler
  •   Unknown User (xiuqin), Jacek Becla, Fritz Mueller butler in Java?
  •   Unknown User (ciardi)   What does user community want in the external file format?
  •  Simon Krughoff set up a meeting to talk about AP communication in the context of association (Data Access, Process Control, Alert Production)  (KSK: Similar to the situation below.  We need to sort all this out at a very fine level, but given the replan, I think we can put it off until after the AHM.  Please disagree Fritz Mueller, Donald Petravick, or Andrew Connolly)
  •  Simon Krughoff verify division of labor in terms of defining standards, authoring packets and publishing streams in the context of the Event Production Pipeline (VOEvent and VOEvent Transport Protocol)
  •  Simon Krughoff add request for butler to think about how repeated access to the same data (calibration products) could be made performant
  •  Simon Krughoff look into adding the ability to get reference catalogs from the butler.  Look into how multiple catalogs and external repositories for reference catalogs can be handled. (DM-6658)
  •  Tim Jenness prepare a complete description of work involved in rebranding lsst_apps.
  •  John Swinbank, Jacek Becla, Simon Krughoff Arrange a full walkthrough of the AP and DRP interaction with Data Access services including relevant members of all three teams. (KSK: I don't think this is relevant right at this moment.  I think we need to finish the replan of the pipelines before this can be maximally useful.  I'm checking it off, but we need to do this at some point.  Please feel free to disagree Fritz Mueller and John Swinbank)
  •  John Swinbank arrange for the Data Access team to be provided with a specification for in-Butler catalogue joins.
  •  Unknown User (xiuqin), John Swinbank Arrange for Firefly-as-a-service on lsst-dev.
  • (This is DM-6662; removing from the list here.)
  •  Unknown User (xiuqin), John Swinbank Arrange for Firefly-as-a-service on lsst-dev. (JDS: I believe this work is being coordinated by SUIT; I have nothing to do here until they ask for support. Unknown User (xiuqin), do you agree?) (Epic DM-5591 was created to address this issue)
  •  John Swinbank, Jim Bosch Compile short-term middleware feature requests and supply to Margaret Gelman well in advance of the next cycle.
  •  John Swinbank, Margaret Gelman Decide on an appropriate division of labour for updating ctrl_pool middleware to address Condor. (JDS: I believe this is obsolete due to the ongoing replanning exercise and discussions at the May DMLT)