2017-12-05 Notes by KTL

Development, Test, and Deployment Plans

Web API Aspect Plans

Enhancements to DAX web services

v0 APIs (custom) exist
Improving v1 APIs (more VO-standard, multiple datasets)
- metaserv (metadata about tables)
- imgserv (cutout interface)
- dbserv (API for general database interaction)

Enhancing Kubernetes deployment

Qserv is already containerized in PDAC
Kubernetes running for Qserv at IN2P3, needs to be refined for deployment on PDAC
Will fold Web services into Kubernetes; they're already containerized

Notebook/API Aspect integration

Whatever changes are needed to integrate for prime use cases

Looking to bring up simple Kubernetes configuration on PDAC for now

Administered by LSP groups
Fritz will provide version numbers; Bill Glick to install

Qserv

Andy Hanushevsky working on next-to-database processing design full time
- First will be map/reduce framework cooperating with Qserv shared scans
- Processing using Mario's Pan-STARRS analysis framework could be examples (people at Caltech are running it)
Enhancing loading/ingest, making it more efficient in human and machine time
- Test would be HSC reprocessing ingest
- What is the scale of the data available on April 1? Colin to find out
- Release to Portal won't be until June
- Could load more NEOWISE data? Not before April; could be useful for next-to-database processing
Working on better dashboarding and instrumentation
L1DB and Transformed EFD
- How do notebook users see L1DB?
- Can something be "left running" and continuously updated?
- Replication load on Live L1DB needs to be analyzed
- Latency of processing of ComCam images needs to be understood
Data replication and distribution will be deployed to PDAC in Jan

MyDB and VOSpace are on the schedule for F18

Joining with Qserv databases still being thought about
Who provides what for the file Workspace (VOSpace)?
- Underlying filesystem provided by NCSA
- If there is DBB, DBB is provided by NCSA
- DAX must provide application-level service for VOSpace and/or WebDAV
- Kian-Tat Lim needs to make this clear in LDM-148 (text should also go into LDM-542 and LDM-129)

Notebook Aspect Plans

Deployment

Currently on GKE, waiting to go into PDAC until rest are ready

Authentication

Tested integration with CILogon on Nebula; should be part of PDAC deployment
Want to build "dual authentication" against GitHub to enable privileged access to repositories

"Software Delivery"

Provide software
Allow users to change it
Allow users to "push it back"

Continuous integration of notebooks

"Magic" repository in GitHub; prepopulates environment for novice users; these need to be tested
Perhaps enable users to have their notebooks tested as well
May be tricky if widgets are embedded in the notebook
EPO also wants CI

No plans for doing anything more for shell prompt

Have been prototyping deploying a Bokeh visualization in a notebook as an app within SQuaSH

When Kubernetes commons is available in LDF, deploy Notebook Aspect on it for internal staff developers

Current deployments don't have access to data, but they do have an NFS server for persistence
Access to GPFS:
- Export GPFS over NFS seems preferred
- GPFS native client requires passwordless ssh access

Instance on GKE now for Commissioning users until commons is available

Interactions with EFD still being investigated

Portal Aspect Plans

Work with DAX team to get access to new data

Planning to integrate with Metaserv to automate UI layout
Would like HSC data but not going to be available this cycle

Security integration with DAX

Would like to build some version of an authenticated, controlled user Workspace

Working on displaying all-sky images using HiPS

Need to integrate with services that provide HiPS, including IPAC-hosted all-sky maps. Generation of those maps by IRSA could provide experience for LSST HiPS generation; existing hipsgen code could be used, though using LSST stack code could be better scientifically.
Need to go from all-sky viewer to specific images (e.g. upon zooming in sufficiently)

When is development effort scheduled for hooks for transitions between Aspects?

Much may already be scheduled to the extent that it is Workspace plus asynchronous query plus query history; see later session

LSP Instances

PDAC is part of the Integration Cluster for integration and testing

Science Validation environment provides service to Science Pipelines developers to improve pipelines

Also usable (eventually) to validate Data Release in preparation (should this be a separate instance?)
Milestone for deploying this by Dec 2018
Use cases needed from science users

US DAC

Chilean DAC

Commissioning Cluster

Where does LSP development occur?

(See further discussion on 06 Dec 2017.)

Connections Between Aspects

Workspace is visible to all Aspects

File-based Workspace
Database-based Workspace ("MyDB")

File-Based Workspace

Web-accessible filesystem like WebDAV or VOSpace

Needs to use one or both
If we can do so from a security perspective, we should expose through both
No conflicts expected
Should we explicitly not provide FTP?
- Fritz Mueller will document in the LSP design document that FTP will not be provided.
Can use shell from notebook to use any transfer mechanism, but not planning to provide direct ssh head node for user logins
Available through Portal through hierarchy browser
- Uses Web APIs to access file-based Workspace
Available through Notebook
- Mounted natively, most likely via NFS, not using FUSE mounts of VOSpace
- Notebook's home directory is persistent, mounted over NFS, shared across all instances in same Kubernetes cluster
- Can mount any other NFS filesystems desired (e.g. today's /datasets)
- Node-local storage is not user-accessible except /tmp in container-ephemeral storage
Questions about VOSpace:
- Does it have more capabilities (e.g. authorization) than can be provided by underlying infrastructure?
- Can it play well with other interfaces that are simultaneously modifying underlying files?

Database-Based Workspace

If the user is inside a notebook, how do they create a table?

MyDB: SQLAlchemy and TAP are both supported (outside the notebook it must be TAP)
- Security model should enable direct SQL client access
- Most Qserv-specific syntax can be replaced by ADQL, but ADQL cannot replace qserv_areaspec_box
- Put all databases behind a single query layer such as TAP/ADQL? Strong desire to do so
- Use TAP internally? Maybe use DB-API which could go to TAP or to a direct SQL
MyQserv: Will have a way to upload data either as replicated or distributed
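
The DB-API option raised above can be illustrated with a minimal sketch: code written against the Python DB-API (PEP 249) only needs a connection object, so the same functions could in principle run against a direct SQL connection or a TAP-backed driver. This is not the DAX design. Here `sqlite3` stands in for the direct SQL backend, a TAP-backed DB-API driver is assumed rather than shown, and the table, column, and function names are hypothetical.

```python
import sqlite3


def create_and_fill(conn, rows):
    """Create a small user table through any DB-API (PEP 249) connection."""
    cur = conn.cursor()
    # Table and column names are hypothetical, for illustration only.
    cur.execute("CREATE TABLE my_sources (id INTEGER, ra REAL, decl REAL)")
    # sqlite3 uses the 'qmark' paramstyle; other drivers may use a different one.
    cur.executemany("INSERT INTO my_sources VALUES (?, ?, ?)", rows)
    conn.commit()


def count_rows(conn):
    """Run a query through the same generic interface."""
    cur = conn.cursor()
    cur.execute("SELECT COUNT(*) FROM my_sources")
    return cur.fetchone()[0]


# Stand-in for a direct SQL connection to a MyDB backend (assumption).
conn = sqlite3.connect(":memory:")
create_and_fill(conn, [(1, 10.0, -5.0), (2, 10.1, -5.1)])
print(count_rows(conn))
```

Only the line that creates `conn` is backend-specific; the paramstyle difference between drivers is the main portability caveat.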

Discovery of tables in SDSS MyDB is difficult

Long discussion about Data Backbone applicability

Need a working group to figure out whether there's a per-user DBB and exactly what metadata is provided by the DBB

Authentication and Authorization

High-level model is in LSE-279:

Users with data rights can self-register if from a recognized institution; otherwise can have human-mediated registration
Can associate with credentials from other identity providers like GitHub or Facebook
All connect with a single internal identity used for all authorization including data rights
Can share data with other users, create groups of users, give rights to groups

Has been about access to data but also needs to be about access to resources

Associated with quotas
Associated with project or other allocation mechanism for resources

Data rights policies are under control of the Operations organization, working group starting to define them; Construction needs to provide sufficiently flexible mechanisms and confirm with working group

Do we need to distinguish data rights by Data Release?
Do we need to expire data rights after a period of time?
Do we need to have rights to different parts of the system (e.g. for misbehavior)?
- Notebook Aspect
Do we need to be able to delegate authorization management to others? Yes
- Non-US PIs have a management interface
- Is there an institutional role within the US (and maybe Chile)? Should it be recommended but not required?
- Users can create their own groups but don't necessarily have control over members

Group quotas are very complicated

CILogon includes a user profile service that exists at NCSA; it will subsume the LSST staff Contacts DB and ActiveDirectory and will also be used for LSST users

Should user preferences be stored in it?
LDAP can be used to get Unix uids, gids, group names, but a little tricky; direct CILogon scope-based API would be nicer; getting the mapping of the official identity to other identities like GitHub would be useful
- Adam Thornton will write up the desired spec for Alex Withers and Jim Basney, for returning both names and IDs from CILogon
Notebooks will run with "real NCSA" Unix uid
NFS has a 16-group limit, which may be too restrictive; ACLs may be a way around it, or else use NFS-server-based group membership (server would look up in LDAP)
GitHub orgs are auto-converted to groups right now, probably do not want this in the operational system

Today JupyterHub gets a list of groups in the token and knows the name of the group required for authorization (but those groups do not have gids, just names)

Relatively easy to add uids and gids to return from CILogon; will be done by end of January ("Phase 1")
All groups will have Unix gids, even self-service groups

Alex Withers (awithers) working on proposing a naming scheme for LSST-related groups in the NCSA user database. Should allow both for centrally-defined groups representing system-level "rights" (e.g., "may use the Notebook Aspect") and for user-defined user collaborations.

How can you determine whether an operation is authorized?

Do you have to try it and fail or can you ask some service/library in advance or does every dataset/table/whatever have an associated list of authorized groups?
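
One of the options in the question above, a per-resource list of authorized groups that can be consulted in advance, could look roughly like the following. This is only a sketch of that one option, not a decision recorded in these notes; the resource names, group names, and the `ACL` table are all hypothetical.

```python
def is_authorized(user_groups, resource_groups):
    """True if the user belongs to at least one group authorized for the resource."""
    return not set(user_groups).isdisjoint(resource_groups)


# Hypothetical mapping from resource to its authorized groups.
ACL = {
    "dr1.object": {"lsst_data_rights"},
    "mydb.alice.sources": {"alice_collab"},
}


def check(user_groups, resource, acl=ACL):
    """Ask in advance whether an operation on `resource` would be allowed."""
    # Resources with no entry are denied by default.
    return is_authorized(user_groups, acl.get(resource, set()))


print(check(["lsst_data_rights", "alice_collab"], "dr1.object"))
```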

How does identity get transferred to database (or, in general, between Aspects)?

OAuth or Kerberos token is transferred when SQL client connection is made

Brian Van Klaveren will write up how tokens obtained from CILogon and/or SciTokens will be passed into the VO/DAX services
Brian Van Klaveren will write up how identities are transferred from the VO/DAX services down to the databases (Qserv and consolidated)

Lifetime of NCSA OAuth token is unclear, as is whether it is specifically associated with the Notebook, or even whether it is a Kerberos token
- Jim Basney: can't be a Kerberos token, must be OAuth, will be able to translate it into further OAuth bearer tokens via SciTokens
- Getting a session bearer token is necessary to be able to switch between Aspects or could use a profile-associated app-based access token
- Token can be instantiated into a notebook container as an environment variable or a file
- Bearer token may not be sufficient for integration with external services like GitHub; their app-based token may be needed
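
A minimal sketch of how notebook code might pick up a token delivered as an environment variable or a file, as suggested above. The variable name `ACCESS_TOKEN` and the file path are hypothetical placeholders, not names agreed in the meeting.

```python
import os
from pathlib import Path


def load_bearer_token(env_var="ACCESS_TOKEN",
                      token_file="/var/run/secrets/token"):
    """Return the token from the environment if set, else from a file, else None.

    Both default names are hypothetical placeholders.
    """
    token = os.environ.get(env_var)
    if token:
        return token.strip()
    path = Path(token_file)
    if path.is_file():
        return path.read_text().strip()
    return None


def auth_header(token):
    """Build the standard HTTP Authorization header for a bearer token."""
    return {"Authorization": f"Bearer {token}"}
```

A file has the advantage that it can be rewritten when the token is refreshed, whereas an environment variable is fixed when the container starts.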

What short-term concrete goal should we set towards which the engineering groups can work?

Brian wants to have a bearer token passed to DAX web services, then integrate a PAM module into Qserv to accept it
DAX is not planning to provide their own OAuth endpoint, assuming that CILogon will provide an app-based access token
SciTokens prototype in Jan/Feb, alpha available in March (but not in production on cilogon.org)
uid/gid mappings available by the end of January
Federation should also be available by the end of January

Bearer tokens likely need to expire in about a week

Need a way to determine when token will expire
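
One possible way to determine expiry, sketched under the assumption that the bearer token is a JWT carrying the standard `exp` claim (SciTokens are JWT-based). An opaque OAuth token would instead need a query to the issuer. The function names are hypothetical, and the sketch reads the payload without verifying the signature.

```python
import base64
import json
from datetime import datetime, timezone


def jwt_expiry(token):
    """Return the token's expiry as an aware UTC datetime, or None if no 'exp'.

    Decodes the payload only; it does NOT verify the signature.
    """
    payload_b64 = token.split(".")[1]
    # JWT segments are base64url without padding; restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    claims = json.loads(base64.urlsafe_b64decode(payload_b64))
    exp = claims.get("exp")
    if exp is None:
        return None
    return datetime.fromtimestamp(exp, tz=timezone.utc)


def seconds_remaining(token, now=None):
    """Seconds until expiry (negative if already expired), or None if unknown."""
    expiry = jwt_expiry(token)
    if expiry is None:
        return None
    now = now or datetime.now(timezone.utc)
    return (expiry - now).total_seconds()
```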

imgserv for VOSpace may need to impersonate the user to allow the filesystem to control access, would require setuid

Could spawn a container (but has 10 sec latency)
Could have an authorization service
Could launch a separate process that immediately gives up setuid
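
The last option, a helper process that gives up its privilege immediately, might be sketched as below. This is an illustration of the general Unix pattern only, not an imgserv design; the function name is hypothetical, and the `_os` parameter exists solely so the call order can be exercised without root.

```python
import os


def drop_privileges(uid, gid, groups=(), _os=os):
    """Permanently switch the current process to the given user.

    Order matters: supplementary groups and gid must be changed while the
    process still has privilege, and the uid must be changed last.
    """
    _os.setgroups(list(groups))
    _os.setgid(gid)
    _os.setuid(uid)
```

After `setuid` succeeds in a process that started as root, the process cannot regain root, so filesystem permission checks then apply to the target user.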

LSP Architecture

Minimal LSP Diagram (for discussion and elaboration)

Databases need PAM to talk to identity management

Are user files globally accessible? Not integrated/replicated between DACs

Separate VOSpace and WebDAV from Web API Aspect, need to add Batch system

Simon's diagram is at https://confluence.lsstcorp.org/pages/viewpage.action?pageId=64703024

Removed direct access from any Aspect to release versions of data; Notebooks have to go through Butler or API Aspect which uses Butler or DBB API
But if you can use the Butler, you can go directly to the files
TAP DB-API could talk to MyDB via native protocol; it would provide ADQL parser to provide isolation from underlying DBMS implementation
API aspect talks directly to databases, not via Butler or DBB API
Should WebDAV sit in front of release images? Perhaps, or it could be an object store, but release images should not directly be accessible via WebDAV because their headers may not; there should be
Simon's diagram has now been superseded by the one above

Need to put some non-NFS interface in front of Data Release files in order to control access and provide scalability; do not expose to outside world; require access from Notebook via Data Butler

Don't put that in front of user files
