Date

Attendees

Agenda

  • Intended/desired use of the integration environment at NCSA. There appear to be hardware and cloud (Nebula) resources reserved.
  • How 'wide' would we need to go in order to serve the intended/desired use(s)?
  • What is the administration model you use at IN2P3 and is it working well (i.e., do you require root or only docker access)?
  • What is the current security model for QServ? This could determine where we land the hardware at NCSA.
  • Is the current hardware sufficient or should we be looking for alternatives (i.e., are there bottlenecks ... memory, cpu, disk, network)?

Discussion

Fritz: Discussed S15 Large Scale Tests. PanStarrs likely to be used for KPI. Nebula being used for some development but need access to larger datasets.

Jason: NCSA will provide access to Nebula for larger datasets, likely SMB from GPFS.

Fritz: Work at IN2P3 focuses on perf/integration testing. DBDev usage still in progress. Understood that hardware is not under warranty. Is ok to let it 'fade away'. Not using Docker in Nebula yet but availability of docker will be required.

Fritz: IN2P3 is 2x25 QServ clusters

Fabio: Will provide QServ environment as long as necessary, under no pressure to free resources. Helps with understanding the technology. Can be used later for WAN testing IN2P3->NCSA

Fritz: Wants PanStarrs @ IN2P3 when available

Fritz: Multiple developers needing to conduct concurrent 'wide' tests. Need more resources. Need stable environment for SUI integration and dedicated environment for development

Jason: 2x25 for replication testing at NCSA plus PanStarrs environment. Can the integration environment use fewer nodes?

Fritz: Data is spatially sharded, so a small number of nodes is not useful. Multi-node parallelism: fewer nodes = larger shards = less performance.

Fritz: Currently using Stripe 82 data, 33.2TB (1.4TB/node + OS)
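As a rough illustration of the point about node count, the sketch below assumes the 33.2TB Stripe 82 dataset is split evenly across worker nodes and ignores replication and OS overhead. The function name and the node counts tried are illustrative only and were not discussed in the meeting.

```python
def data_per_node_tb(total_tb: float, nodes: int) -> float:
    """Average data volume per worker, assuming an even spatial split.

    Assumption: no replication and no per-node OS overhead are counted.
    """
    if nodes <= 0:
        raise ValueError("nodes must be positive")
    return total_tb / nodes


STRIPE82_TB = 33.2  # dataset size quoted in the notes

# Hypothetical cluster sizes, chosen only to show the trend.
for nodes in (5, 10, 25, 50):
    per_node = data_per_node_tb(STRIPE82_TB, nodes)
    print(f"{nodes:>2} nodes: {per_node:.2f} TB per node")
```

Under these assumptions, 25 workers hold roughly 1.3TB each, while 5 workers would each have to scan more than 6TB, which is the "fewer nodes = larger shards" effect described above.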

Yvan: Admin model: 52 nodes (2x25 workers, 1 build, 1 monitor). SSH gateway, all QServ nodes on a private network. Root access not given. Puppet + docker + ganglia + syslog-ng.

Fritz: Model is satisfactory. Timely response from Yvan is critical. Yvan installs/updates packages. Making use of public docker registry; will need local registry at some point. 

Jason: What about the security model of Qserv? How to allow users to access the integration environment? This impacts when/where/how we deploy the integration environment.

Fritz: Security model is a work in progress. Will need a discussion with SUI (Gregory Dubois-Felsmann, xiuqin), QServ, Science users

(ACTION) Jason: I will get this meeting arranged. It impacts our timeline and designs.

Jason: How about using the current hardware as a baseline? Is it sufficient?

Fritz: Hardware is sufficient. More is better (cpu, ram). Work is progressing; expect to grow capabilities, which will require memory-intensive workloads. Multi-master environments coming soon. Secondary Index to require multi TB per Czar (Master) node.

(ACTION) Jason: I will put together a hardware plan and we can discuss with leadership (Kian-Tat Lim, Mario Juric, Jacek Becla)



1 Comment

  1. Please keep Brian Van Klaveren in the loop for the security model/SUI/Qserv/science users meeting.