Date/Time

8am PT

Meeting Online Connections

The following is the BlueJeans Information:

To join the meeting on a computer or mobile phone: https://bluejeans.com/744525929?src=calendarLink

CIARA BlueJeans Account has invited you to a video meeting.
-----------------------------------
Connecting directly from a room system?
1) Dial: 199.48.152.152 or bjn.vc
2) Enter Meeting ID: 744525929

Just want to dial in on your phone?
1) +1.408.740.7256 (United States)
+1.888.240.2560 (US Toll Free)
+1.408.317.9253 (Alternate number)
(http://bluejeans.com/numbers)
2) Enter Meeting ID: 744525929
3) Press #
-----------------------------------
Want to test your video connection?
http://bluejeans.com/111

Meeting Recording

TBD


Attendees

Goals

  • Coordination of networking activities across LSST

Discussion items

Time (PT) | Item | Who | Notes

0800 | NET Action Items Status | Jeff Kantor

Review open action items.

Leads on open "major tasks":

1) 100G End-to-End Test Plan Jeronimo Bezerra
2) Planning Inter-connections Jeronimo Bezerra
3) LSE-78, LDM-142 Documentation updates Jeff Kantor
4) Fall 2017 demo Jeronimo Bezerra
5) Inter-domain QoS planning: Unknown User (rlambert)

6) ESnet collaboration: Jeff Kantor (with other NCSA, AURA, DOE people)

7) AURA - LSST cross-services (e.g. fail-over) Unknown User (rlambert)

Detailed actions and due dates will be assigned and captured in Confluence as we do the work.

ESnet collaboration status Jeff Kantor
ESnet collaboration discussions going well. Up to now, for Florida to Chicago, we have been less specific about how to achieve those connections than from Chile to Florida. Florida to Chicago conceptual baseline: leased service from I2 and dedicated service from ESnet. Deep in discussions with DOE - ASCR and Office of Science. Non-technical governmental discussion is also underway. Prior draft request to ESnet being revised by Donald Petravick and Jeff Kantor based on discussions with DOE.
ESnet contribution would be for operations, not construction, phased with some level of support by 2020. In the interim, we will rely on existing infrastructure using I2 and other network resources. No specific date established for DOE approval. Jeff Kantor estimates an agreement in place in about 1 year. Goal is for no NSF construction or operations money to apply to ESnet. NSF MREFC budget does not include any funds for ESnet. Carrying traffic likely will not happen until 2020. No change to design documents at this time, pending more clarity on the agreement with ESnet.

0810 | Requirements Clarifications and Verification Plans | Jeff Kantor

A quick update:

Unknown User (rlambert) and Jeff Kantor are responsible during construction for all networks (Summit, Summit - Base, International, US to Chicago) except NCSA/IN2P3 (Chicago - Champaign - Lyon). As noted on the NET home page in Confluence (the parent to this page), textual network requirements are spread across several documents, including LSE-61 (Summit - Base - Archive Networks) and the new LTS-577 derived from LSE-60 (Summit Network). These requirements are being migrated into MagicDraw, the System Engineering standard SysML tool (in the SE and TS projects). When that is done, Jeff Kantor will work with SE to produce one combined, "easy to read" document containing all the requirements related to networks. For now we will retain LDM-142 as the document for bandwidths and reservations/allocations; once it is stable, its content will be incorporated into the requirements documents. Jeff Kantor is working on fuller definitions of the 4 traffic types in LDM-142. Note that the Network Verification Matrix already includes the specifications for all the requirements (but not all the descriptive text). The Verification Matrix is also being captured in MagicDraw.

Also, as discussed in prior meetings, the intent is to relegate LSE-78 to be a design document related to the Chilean and International Long-Haul Networks, to baseline LSE-309 as the design document covering Summit - Base and Base LAN, and to have NCSA produce the design document covering Champaign - Chicago - Lyon networks.

0815 | Network Design and End-to-End Test Plan and QoS planning | Jeronimo Bezerra

Continue discussion on LDM-142 utilization table, minimum reservations, and fail-overs. Discussion with DM people did not yield any progress on application-level throttling, so we need to try to accommodate this in the network layers.

Jeronimo Bezerra has established a 10G test bench environment, soon to be 100G (when equipment arrives), and will start testing various QoS, prioritization, and shaping approaches. This will inform the inter-domain QoS scheme and the E2E Test Plan update. He will work closely with Unknown User (rlambert) on this; others can join if they wish.

Unknown User (rlambert) Cisco ACI role in this is TBD. Also, there is a mechanism to reserve 40 Gbps end to end across all domains.

Jeff Kantor Philosophical question: how is failure detected in the network, and what mechanism notifies the application? The only firm requirement is to deliver the Science Transport Data; other data could be dropped, but we need to let application people know whether this is going to be possible and when. Also, the LS to SCL link is the only link that does not reach 200G. Priority is to get 40G to 100G. Telefonica is on the critical path for upgrading the secondary link. The current guaranteed allocation numbers that make up 130G are soft. If 130G is too complicated, scale back allocations to 100G nominal. Still need to handle the worst case of 40G.

Sandra Jaque Fibers are in and DWDM equipment is in Santiago; installation is expected in September, but it will be very tight to make ADASS. Current backup is 4G service off 10G. Backup link upgrade is in 2019. Backup solutions are not yet clear. Minimum is 40G. Working to have the maximum possible; current thinking is the same solution as the primary, i.e. another fiber, possibly to the North, and via Argentina.

Unknown User (kollross) It is useful to distinguish between short "blip" outages, which the network will handle, and longer extended outages where we have to take more action.

Jeronimo Bezerra TCP will handle retransmission of dropped packets during "short outages", so there is no need to worry about that. In general, if we have capacity we will use it.
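As an illustration of the fallback discussed in this item (scaling the 130G allocations back to 100G nominal, with a 40G worst case), here is a minimal sketch of uniform proportional scaling. The function name `scale_allocations`, the traffic-type labels, and the per-type numbers are all hypothetical placeholders; they are not the LDM-142 values. Uniform scaling is itself an assumption: since the only firm requirement is the Science Transport Data, an actual scheme might protect that class first and shrink only the others.

```python
def scale_allocations(allocations_gbps, target_total_gbps):
    """Scale every allocation by the same factor so the total equals the target."""
    current_total = sum(allocations_gbps.values())
    if current_total <= 0:
        raise ValueError("allocations must sum to a positive value")
    factor = target_total_gbps / current_total
    return {name: gbps * factor for name, gbps in allocations_gbps.items()}


# Hypothetical placeholder allocations that happen to sum to 130 Gbps.
# These are NOT the figures from LDM-142.
example_allocations = {
    "traffic_type_a": 40.0,
    "traffic_type_b": 40.0,
    "traffic_type_c": 30.0,
    "traffic_type_d": 20.0,
}

nominal = scale_allocations(example_allocations, 100.0)    # 100G nominal case
worst_case = scale_allocations(example_allocations, 40.0)  # 40G worst case

for name in example_allocations:
    print(f"{name}: nominal {nominal[name]:.1f} Gbps, worst case {worst_case[name]:.1f} Gbps")
```

The relative shares between traffic types are preserved in both cases, which makes it easy to see what each type would get on a degraded link.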

0845 | First fiber optic light event planning

Assignments:

Unknown User (rlambert) Cerro Pachon - La Serena, switch + thumb drives versus switch + laptops

Sandra Jaque La Serena - Santiago, connectivity to ADASS hotel, presentation design, INRIA involvement, fall-back Chile only demonstration

Jeronimo Bezerra Santiago - Chicago, overall architecture/plan, coordinate separate meetings as necessary.

Unknown User (kollross) Chicago - Champaign

Jeff Kantor Data/images to transfer. I have been told that the DAQ at NCSA will not have the capability to load a sample image in this time frame, so we will not be using it.

Discussion:

Unknown User (rlambert) Space in Caseta on Cerro Pachon is limited. Need switch plus thumb drives or switch plus computer(s).

Sandra Jaque ADASS hotel connection should be at least 2 Mbps. Will also talk with INRIA re tiled display.

Jeronimo Bezerra Thumb drives won't show performance. Supercomputing demonstrations require lots of preparation and vendor involvement, and they show memory - memory, not disk - disk.

Unknown User (kollross) Agree that thumb drives won't show performance and memory - memory is most feasible. NCSA has plenty of 10G attached hosts.

Jeff Kantor Need architecture diagrams and a plan for the demo configuration, including work to be done, equipment needed, and timeline. Goal is more than 2 x 10 Gbps; use 4 x 10 as baseline, and try for up to 10 x 10. Up to 10 laptops can be wired up at either end. We want to send data and output iPerf-style information to a web portal. Can be memory - memory (not disk - disk). Internal requirement is just to transfer data, record speed, and show it was received without corruption. External (e.g. ADASS) requirement is higher: display image and statistics, with a web portal at the conference hotel. There is DM software to stitch an image together from separate pieces and display it. Going from Summit to Base could be a separate demonstration if we can do Base - NCSA with more equipment. Suggest that REUNA coordinate the ADASS demo, with a fall-back of a Chile-only demo if the rest is not ready in time. Can do the end-to-end demo later when all is ready. We don't want to spend money just for the demonstration, but we can buy equipment earlier than planned if it is not throw-away.
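The internal requirement above (transfer data, record speed, show it was received without corruption) can be sketched in a few lines. This is only a toy memory-to-memory illustration using a local socket pair inside one process; it is not iPerf and not the demo design, and the names `memory_to_memory_transfer` and `transfer_report` are hypothetical. A real test would run between hosts at each end of the link.

```python
import hashlib
import os
import socket
import threading
import time


def memory_to_memory_transfer(payload, chunk_size=65536):
    """Send payload across a local socket pair; return (received_bytes, seconds)."""
    sender, receiver = socket.socketpair()
    received = bytearray()

    def send_all():
        # Closing the sending socket signals end-of-stream to the receiver.
        with sender:
            sender.sendall(payload)

    worker = threading.Thread(target=send_all)
    start = time.perf_counter()
    worker.start()
    with receiver:
        while True:
            chunk = receiver.recv(chunk_size)
            if not chunk:
                break
            received.extend(chunk)
    worker.join()
    elapsed = time.perf_counter() - start
    return bytes(received), elapsed


def transfer_report(payload):
    """Transfer payload, then report size, time, rate, and a checksum comparison."""
    expected = hashlib.sha256(payload).hexdigest()
    received, elapsed = memory_to_memory_transfer(payload)
    actual = hashlib.sha256(received).hexdigest()
    return {
        "bytes": len(received),
        "seconds": elapsed,
        "gbps": len(received) * 8 / elapsed / 1e9,
        "intact": actual == expected,
    }


# 8 MiB of random bytes held in memory (no disk involved on either side).
report = transfer_report(os.urandom(8 * 1024 * 1024))
print(report)
```

The reported rate here only reflects the local machine, not any network path; the point is the shape of the report (bytes, elapsed time, rate, integrity flag) that could feed a web portal.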


Wrap up and next meeting | Jeff Kantor

Next meeting date and agenda topics:

 Same overall agenda as this time


Action items

  • Jeff Kantor Add link to E2E test plan from Network Verification Plan  
  • Jeff Kantor Add link to LTS-577 on NET home page in Confluence
  • Jeff Kantor Propose change to LDM-142 table with total of 100 Gbps rather than 130 Gbps  
  • Jeronimo Bezerra Schedule next demonstration coordination meeting