Date/Time

8am PT

Meeting Online Connections

The following is the BlueJeans Information:

To join the meeting on a computer or mobile phone: https://bluejeans.com/744525929?src=calendarLink

CIARA BlueJeans Account has invited you to a video meeting.
-----------------------------------
Connecting directly from a room system?
1) Dial: 199.48.152.152 or bjn.vc
2) Enter Meeting ID: 744525929

Just want to dial in on your phone?
1) +1.408.740.7256 (United States)
+1.888.240.2560 (US Toll Free)
+1.408.317.9253 (Alternate number)
(http://bluejeans.com/numbers)
2) Enter Meeting ID: 744525929
3) Press #
-----------------------------------
Want to test your video connection?
http://bluejeans.com/111

Meeting Recording

TBD


Attendees

Goals

  • Coordination of networking activities across LSST

Discussion items


NET Action Items Status (Jeff Kantor)

Review open action items:

From Fabio Hernandez (IN2P3) via kollross:

Here is the status of connectivity from CC-IN2P3 to NCSA. We currently have a shared 20 Gbps link which goes: CC-IN2P3 — RENATER — GEANT — Internet2 — Starlight — NCSA.

From mid-2017, we will have dedicated bandwidth of 20 Gbps between CC-IN2P3 and NCSA using the same path mentioned above.

Thanks to the ongoing general deployment of 100 Gbps links by RENATER in France, from 2018 on, it will be technically possible for CC-IN2P3 to have one (or maybe more) 100 Gbps link through GEANT if needed. However, the conditions under which this connectivity would be available need to be discussed and negotiated with all the involved parties (RENATER, GEANT, Internet2, etc.) and are subject to the needs of LSST.

Therefore, it would be very useful for CC-IN2P3 to get a more detailed specification of the bandwidth requirements we should prepare for, in the framework of our foreseen contribution to the LSST project.

Per kollross, NCSA is working on bandwidth requirements between NCSA and IN2P3, based on the in-progress Statement of Work for IN2P3.


Requirements Clarifications and Verification Plans (Jeff Kantor)

Discussion on consistency of requirements in LSE-61, LDM-142, etc.

RL: Does alert production no longer have to be within 60 seconds?
MK: Has not heard any information. Has been working with Jim Parsons. A latency simulator is being used to model the link to Chile. Transfer between nodes is down to 7 seconds. Getting close to being able to transfer data in the required time frame.
JK: There are rumors the alert time might change by a few minutes, but only rumors. In any case, we will drive transfer as fast as possible. The time frame must also include processing. First order: no change as far as the network is concerned.
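As a rough sanity check of the numbers above (a sketch only: the 60 s requirement and the 7 s transfer time are from the discussion, and treating the remainder as the processing budget is an assumption, not an agreed split):

```python
# Illustrative alert-latency budget check. 60 s and 7 s are from the minutes;
# the "rest is processing" breakdown is a simplification for illustration.
ALERT_BUDGET_S = 60   # alert production requirement discussed
TRANSFER_S = 7        # node-to-node transfer time reported by MK

processing_budget_s = ALERT_BUDGET_S - TRANSFER_S
print(f"transfer {TRANSFER_S} s -> {processing_budget_s} s left for processing")
```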


End-to-End Test Plan (Jeronimo Bezerra)

Discussion on updates to plan

JB: 2 major topics. The BW guarantee was only discussed South to North; North to South has not yet been discussed. Sunny and rainy cases: the total BW reservation documented in the spreadsheet is over 100G. If LSST only has 1 wave to connect to AmLight, what is the approach? The BW amount calculated was 130G.

JK: There are 2 sides in LSST (not REUNA). Between LS and SCL primary side is 100G; other side is 40G. 130G is the sum of the allocation using both sides in parallel.

RL: There has not been discussion about using the REUNA lambda.

SJ: In Chile the primary link will be 200G. On the backup link the commitment is 40G. If something happens to the primary link, it could be that the backup is just 40G; if we can activate more, we can have it. Is that limit reflected in the BW allocation?

JK: Need to revisit the sheet; it is oversubscribed in that scenario.

RL: If all links are up, we will have 200G to NCSA. In a failure, the app that must keep running is science data.

JK: Worst case scenario is the 40G backup path LS - SCL.
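A minimal sketch of the oversubscription check being discussed, assuming the roughly 130G total reservation and the 100G primary / 40G backup capacities cited in the minutes; the per-application split is a hypothetical placeholder:

```python
# Sketch: compare total bandwidth reservations against available path capacity
# in normal and failure scenarios. Only the totals and link capacities come
# from the discussion; the per-application split is hypothetical.
reservations_gbps = {
    "science data": 100,   # hypothetical split
    "other traffic": 30,   # hypothetical split
}

scenarios_gbps = {
    "normal (primary 100G + secondary 40G)": 140,
    "primary down (40G backup only)": 40,
}

total = sum(reservations_gbps.values())
for name, capacity in scenarios_gbps.items():
    status = "OK" if total <= capacity else "OVERSUBSCRIBED"
    print(f"{name}: reserved {total}G of {capacity}G -> {status}")
```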

JB: Concern: if the reservation exceeds port capacity, a policy change will be required. Policies are for static provisioning, so a router-by-router configuration change would be required; even with SDN this will be complex. Can the numbers be lowered to fit the profiles within 100G?

(Displayed a presentation showing a simulation.) The idea is to have 1 source and 1 destination, with the red arrow marking the primary path. The next step is to introduce a packet generator and experiment with different QoS techniques: forcing the red path to get QoS, the green path to be best effort, and the blue path to take another path in the network. This will simulate 3 different traffic types; red, blue, and green will each go beyond 100G. The goal is to simulate everything, then simulate different use cases of network events. With normal routing and MPLS, this will require extensive manual configuration, which will not scale with a network of this size. Is it possible to accommodate the traffic types within 100G?
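A self-contained toy model of the QoS behavior JB describes: a guaranteed "red" class plus best-effort "green" and "blue" classes sharing one 100G port. The allocation policy and all figures are illustrative assumptions, not AmLight's actual configuration:

```python
# Toy model: guaranteed classes keep their reservation (up to demand), then
# best-effort classes split the leftover capacity in proportion to demand.
LINK_CAPACITY_G = 100

def allocate(guaranteed: dict, best_effort: dict, capacity: float) -> dict:
    alloc = {}
    remaining = capacity
    for name, demand in guaranteed.items():
        alloc[name] = min(demand, remaining)   # honor the SLA first
        remaining -= alloc[name]
    total_be = sum(best_effort.values())
    for name, demand in best_effort.items():
        share = remaining * demand / total_be if total_be else 0.0
        alloc[name] = min(demand, share)       # best effort gets what is left
    return alloc

# Red exceeding 100G on its own, as in the simulation JB describes:
print(allocate({"red": 110}, {"green": 60, "blue": 40}, LINK_CAPACITY_G))
# -> red capped at 100G, green/blue starved: the port is the bottleneck.
print(allocate({"red": 50}, {"green": 60, "blue": 40}, LINK_CAPACITY_G))
# -> red 50G; green/blue split the remaining 50G proportionally (30G/20G).
```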

JK: So the limit will be a 100G max path, and everything else will be best effort? SDN and/or the application would take care of throttling? Throttling could be feasible. How does the application detect the event?

JB: Not planning to cap you at 45M. In the best case you will be able to use everything. In the worst case, you will have what's in the SLA.

JK: We will configure the reservation for the worst case, but the network will provide the best case. What if the table represented expected usage instead of minimum guarantee?

JB: With SDN, the controller has visibility of the complete topology along with the QoS policies. That is not possible with legacy technologies.

JK: How can we be reactive to failures, but not overly constrain when not in failure mode?

JB: When service is normal, all lanes will be available.

MK: Is the network the right place to deal with failures? For example: 2 links with 2 BGP peerings between the NCSA and LSST routers, with LSST advertising its full address space on both links. Image data will be coming out of a /24 network. NCSA has routes over both peerings, but sets local preference to use one link for camera data, which is reachable through the other link as well. If there is a cut on the primary link, traffic will be routed to the other link, and then all 140G of traffic is being pushed down a single 100G link.
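A toy model of the failover MK raises: the highest local-pref path wins while both peerings are up, and a cut shifts everything onto the backup. The prefix, local-pref values, and capacities are hypothetical (the minutes cite roughly 140G of traffic versus a 100G link):

```python
# Simplified BGP-style path selection: among usable paths to the same prefix,
# prefer the highest local-pref. All values are illustrative.
routes = [
    {"prefix": "198.51.100.0/24", "link": "primary", "local_pref": 200, "up": True},
    {"prefix": "198.51.100.0/24", "link": "backup",  "local_pref": 100, "up": True},
]

def best_path(routes):
    usable = [r for r in routes if r["up"]]
    return max(usable, key=lambda r: r["local_pref"]) if usable else None

print(best_path(routes)["link"])   # primary (higher local-pref)
routes[0]["up"] = False            # simulate a fiber cut on the primary link
print(best_path(routes)["link"])   # backup now carries everything

traffic_g, backup_capacity_g = 140, 100
if traffic_g > backup_capacity_g:
    print(f"{traffic_g}G now on a {backup_capacity_g}G link: congested")
```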

JB: In a best-effort network there would be no worries, but this is a guaranteed-BW network design. Different sources are involved, so that adds complexity to the application. From the network perspective, we have an SLA to follow, so the network must care.

JK: This problem will not be solved today, but the topic must be tackled very soon. Question: do we have the ability to do application-level throttling of the traffic, or do we try to achieve it with configuration of the network?

JB: AmLight has no control of the application. If strategy is to involve application, then an application person needs to join the discussion.


First fiber optic light event planning (Jeff Kantor)

In conjunction with LSST patch design for 2017 and prior to upcoming conferences (ADASS, Supercomputing, etc.) we want to have an event where we transfer traffic from the Summit to NCSA if possible (or the longest segment we can do).

JK: We are approaching the time when the links and equipment will go beyond current capacity. There's a strong desire from LSST management to publicize the capability. The goal is to have a data transfer over the links around the October timeframe, with a secondary desire to accomplish it prior to SC17. We need to identify a configuration and schedule where we can transfer an image or video feed from the Summit out through LS and SCL, and on to NCSA.

The current schedule is that by September the DWDM equipment at AURA and REUNA will be able to provide at least a few x10G to SCL; that is the current discussion on the AURA/REUNA side. How can we, in SCL, take a few 10G streams onto the 100G ring to Florida, and then what options do we have to get to NCSA?

The goal is to show continuity of the fiber paths at a certain BW. This is not operational; it is a one-time event. We do not want to spend extra money, but we can accelerate planned purchases that are not throw-away. What can we achieve in October?

JB: From the AmLight side, not a problem; the challenge will be the use of I2. The biggest challenge will be to have this accomplished by October. If we're going to do this, we must start immediately.

JK: Is there a plan to leverage resources for SC17?

SJ: The purchase list is 20x10G. We can ask to send a card for the demo. Is it possible to simulate using 10G NICs?

JK: We can put image data on laptops. The goal would be 10x10, not 1x100.

RL: October is possible if everything falls into place.

JK: Use October 15 as the internal planning date; it will not be publicized until we know.

JK: On the Miami to NCSA side: Matt, can you support a 4x10 or 6x10 transfer? We can borrow CCDs to create an image.

MK: Confirming the demo config.

JK: As long as 6 files show up on 6 servers, we can put a host on the end to assemble and display the image. We will commission this as a specific action. Someone is needed to architect the design of this demonstration; JK volunteered Jeronimo and he accepted.
Deliverable: Plan and list of components for this demonstration.
The preference is to achieve a functional milestone: network monitors showing traffic; image displayed in La Serena; transfer; image displayed at NCSA. Feasible? The ADASS conference is a secondary goal. There is no internal management requirement to demonstrate at any conference; this is to satisfy an LSST management goal. The message is that LSST has implemented the fiber network and equipment to transfer scientific data from Chile to NCSA, and there is a level of maturity now that allows data transfer above the 20G we've had for quite a while.
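A hypothetical sketch of the final step JK describes: wait for the six transferred files to arrive, then stitch them into one image for display. The file paths, the 3x2 tile layout, and the use of Pillow are all assumptions; the minutes only specify that six files arrive on six servers and a host assembles the image:

```python
# Hypothetical receive-side assembler for the proposed demo. Poll until all
# six tiles have landed, then paste them into a single mosaic for display.
import time
from pathlib import Path
from PIL import Image

TILES = [Path(f"/data/incoming/tile_{i}.png") for i in range(6)]  # assumed paths

while not all(p.exists() for p in TILES):   # wait for every transfer to land
    time.sleep(1)

tiles = [Image.open(p) for p in TILES]
w, h = tiles[0].size
mosaic = Image.new(tiles[0].mode, (w * 3, h * 2))  # assumed 3x2 tile layout
for i, tile in enumerate(tiles):
    mosaic.paste(tile, ((i % 3) * w, (i // 3) * h))
mosaic.save("assembled.png")                # image ready to display at NCSA
```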


Wrap up and next meeting (Jeff Kantor)

Next meeting date and agenda topics:

July 20th is proposed for the next call (an interim meeting before the regular monthly meeting, to establish a baseline for managing bandwidth allocation)


Action items

  • Jeff Kantor: Configure a work team to work on this in the next 2 weeks
  • Julio Ibarra: Poll for an interim meeting by July 20