Network performance monitoring: Insight into perfSONAR
Szymon Trocha, Poznań Supercomputing and Networking Center
E-infrastructure Autumn Workshops, Chisinau, Moldova, 9 September 2014
Agenda
- Network performance monitoring
- Introduction to the network monitoring tool: perfSONAR
- perfSONAR today
Network expectations
- Massive data sets (North America, Europe, Japan): 25 PB / year
- Transporting high volumes of traffic
Multiple domains
Across network boundaries
- Performance data fragmented and hard to access
- Difficult to find measurement capability
- Multi-domain problem diagnosis difficult and slow
Diagram: a user-to-user path crossing campus, national and backbone networks, each with its own network administrator. Key: X = locally held performance data. Also marked: the last-mile problem.
Network performance metrics
- Throughput / achievable bandwidth
- Latency (one-way delay, round-trip time)
- Packet loss
- Network interface utilization
Network performance monitoring tools
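
As an illustration of the utilization metric, here is a minimal Python sketch that derives average interface utilization from two readings of an octet counter. It assumes SNMP-style monotonically increasing counters that wrap at a fixed width; the function name, parameters and sample values are hypothetical and not taken from perfSONAR.

```python
def utilization_percent(octets_start, octets_end, interval_s,
                        link_speed_bps, counter_bits=64):
    """Average link utilization (%) between two octet-counter samples.

    Assumption: the counter only increases and wraps around at
    2**counter_bits (as SNMP-style interface counters do).
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    if link_speed_bps <= 0:
        raise ValueError("link_speed_bps must be positive")
    modulus = 1 << counter_bits
    # The modulo handles a single counter wrap between the two samples.
    delta_octets = (octets_end - octets_start) % modulus
    bits_per_second = delta_octets * 8 / interval_s
    return 100.0 * bits_per_second / link_speed_bps


# Hypothetical samples: 375 000 000 octets in 60 s on a 1 Gb/s interface.
print(utilization_percent(1_000_000, 376_000_000, 60, 1_000_000_000))
```

Sampling at a fixed interval gives utilization over time; a single pair of readings gives an on-demand value.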
Network performance monitoring tools
- Help find and isolate problems in the network (or hosts) in a timely manner
- Immediate access to the complete picture: no more waiting for others to provide the network monitoring data that affects your users' experience
- Provide a network usage baseline
- Provide a source of network measurements for further diagnostics
- Tackle potential problems that may adversely affect researchers' voice, video or data communications
How monitoring helps in debugging network problems
- Throughput
  - A stream of TCP data shows how much bandwidth one can get from the network
  - Differentiate server problems from path problems
- Delay
  - A train of UDP packets is sent to show impact
  - Notice route changes or asymmetric routes
  - Correlate with traceroute and topology information
- Packet loss
  - Verify reliable end-to-end (e2e) transmission
- Utilization
  - Show port usage
  - Over time or on-demand
- Beware of firewalls
- Community of trust
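
The idea behind a throughput test (push a stream of TCP data and time it) can be sketched in a few lines of Python. This is only a toy illustration over the loopback interface, so it measures the host rather than a network path, and it is not how bwctl or iperf are implemented; all function names are hypothetical.

```python
import socket
import threading
import time


def _sink(server_sock, result):
    """Accept one connection and count the bytes received until EOF."""
    conn, _ = server_sock.accept()
    with conn:
        total = 0
        while True:
            chunk = conn.recv(65536)
            if not chunk:
                break
            total += len(chunk)
    result["bytes"] = total


def measure_tcp_throughput(total_bytes=5_000_000, chunk_size=65536):
    """Send total_bytes over loopback TCP; return (bytes_received, bits_per_second)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]

    result = {}
    receiver = threading.Thread(target=_sink, args=(server, result))
    receiver.start()

    payload = b"\0" * chunk_size
    sent = 0
    start = time.perf_counter()
    with socket.create_connection(("127.0.0.1", port)) as client:
        while sent < total_bytes:
            n = min(chunk_size, total_bytes - sent)
            client.sendall(payload[:n])
            sent += n
    receiver.join()  # the receiver returns once it has seen EOF
    elapsed = time.perf_counter() - start
    server.close()
    return result["bytes"], result["bytes"] * 8 / elapsed


if __name__ == "__main__":
    received, bps = measure_tcp_throughput()
    print(f"received {received} bytes at {bps / 1e6:.1f} Mb/s")
```

Running the sender and receiver on two different hosts is what separates path problems from server problems; on loopback only the host's own limits show up.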
Performance monitoring in practice: bandwidth measurements
- Know your expectations
- Bandwidth Delay Product (BDP)
  - Bandwidth: the link with the smallest bandwidth in the path (the bottleneck)
  - Delay: the round-trip time (RTT)
  - $\mathrm{BDP} = \mathrm{Bandwidth}\,[\mathrm{b/s}] \times \mathrm{RTT}\,[\mathrm{s}]$
- Optimal TCP window size
  - $\text{TCP window} = \mathrm{Bandwidth}\,[\mathrm{b/s}] \times \mathrm{RTT}\,[\mathrm{s}]$, where Bandwidth is the expected bandwidth
- Large BDP (high-speed link, larger RTT) requires large buffers
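
The two definitions above (bottleneck bandwidth and BDP) can be sketched in Python as follows. The path, link speeds and RTT in the example are hypothetical values chosen for illustration, and the function names are not part of any perfSONAR tool.

```python
def bottleneck_bandwidth_bps(link_bandwidths_bps):
    """Bandwidth of a path = its slowest link (the bottleneck)."""
    if not link_bandwidths_bps:
        raise ValueError("the path must contain at least one link")
    return min(link_bandwidths_bps)


def bdp_bytes(bandwidth_bps, rtt_s):
    """Bandwidth Delay Product in bytes: bandwidth [b/s] x RTT [s] / 8."""
    return bandwidth_bps * rtt_s / 8


# Hypothetical path: 10 Gb/s backbone, 1 Gb/s campus uplink, 10 Gb/s backbone.
path_bps = [10_000_000_000, 1_000_000_000, 10_000_000_000]
rtt_s = 0.100  # hypothetical round-trip time of 100 ms

bottleneck = bottleneck_bandwidth_bps(path_bps)
window = bdp_bytes(bottleneck, rtt_s)
print(f"bottleneck: {bottleneck / 1e9:.0f} Gb/s, window needed: {window:,.0f} bytes")
```

Only the slowest link matters for the window size: upgrading the backbone links in this example would leave the result unchanged.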
Performance monitoring in practice: bandwidth measurements
- The bwctl tool
- Uses iperf (iperf2 or iperf3); note the difference!
Performance monitoring in practice: bwctl example
- Madrid <-> Tallinn achievable bandwidth test
- 27 ms delay (one way)
Performance monitoring in practice: calculating the optimal TCP window
- RTT = 2 × 27 ms = 54 ms
- 1 Gb/s network interface
- Calculating the Bandwidth Delay Product:
  $\mathrm{BDP} = 1\ \mathrm{Gb/s} \times 54\ \mathrm{ms} = 1\,000\,000\,000\ \mathrm{b/s} \times 0.054\ \mathrm{s} = 54\,000\,000\ \mathrm{b} = 6\,750\,000\ \mathrm{B}$
- Optimal TCP window = BDP
- TCP window = 6 750 000 bytes
Performance monitoring in practice: actual throughput test
- TCP window: 6 750 000 B
- Measured throughput: 941 Mb/s
Performance monitoring in practice: actual throughput test
- TCP window: 65 536 B
- Measured throughput: 11.5 Mb/s!
- Typical default buffer size of 64 KB
- TCP autotuning is available, but observe the maximum settings
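
The gap between the two test results follows from the window formula read the other way round: a sender that may keep only one window in flight per round trip can move roughly window / RTT. A minimal Python sketch, using the 54 ms RTT and the two window sizes from the slides (the function name is hypothetical, and real stacks add header overhead, congestion control and autotuning on top of this):

```python
def window_limited_throughput_bps(window_bytes, rtt_s):
    """Rough throughput estimate when one TCP window is sent per round trip."""
    if rtt_s <= 0:
        raise ValueError("rtt_s must be positive")
    return window_bytes * 8 / rtt_s


RTT_S = 0.054  # RTT used in the Madrid <-> Tallinn example

for window_bytes in (65_536, 6_750_000):
    mbps = window_limited_throughput_bps(window_bytes, RTT_S) / 1e6
    print(f"window {window_bytes:>9} B -> roughly {mbps:.1f} Mb/s")
```

For the 65 536 B window this gives about 9.7 Mb/s, the same order of magnitude as the measured 11.5 Mb/s; the exact figure depends on the actual RTT during the test and on how the operating system sizes its buffers. For 6 750 000 B the estimate reaches the 1 Gb/s interface speed, consistent with the measured 941 Mb/s once protocol overhead is taken into account.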
Turning FAILs into WINs
- 12 Step programs encourage you to admit your problems and then work toward a solution
- Tuning the network for science takes time
ESnet Science Engagement (engage@es.net) - 10/09/14
Performance monitoring in practice: delay measurements
- The owping tool
- Uses the OWAMP protocol
What OWAMP Tells Us
- OWAMP is a necessity in regular testing; if you aren't using this, you need to be
- Queuing often occurs in a single direction (think what everyone is doing at noon on a college campus)
- Packet loss (and how often/how much occurs over time) is more valuable than throughput
  - If your router is going to drop a 50B UDP packet, it is most certainly going to drop a 1500B/9000B TCP packet
- Overlaying data
  - Compare your throughput results against your OWAMP results; do you see patterns?
  - Alarm on each, if you are alarming (and we hope you are alarming)
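
To make the per-direction view concrete, here is a small Python sketch that summarizes one-way delay and loss from timestamped probe records and applies a simple alarm rule. It is not the owping implementation: the record format, function names and thresholds are hypothetical, and it assumes sender and receiver clocks are synchronized (for example via NTP), which one-way measurements require.

```python
from statistics import median


def summarize_one_way(samples):
    """Summarize probes given as (send_time_s, recv_time_s) tuples.

    A recv_time_s of None marks a lost packet.
    Assumption: both timestamps come from synchronized clocks.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    delays = [recv - send for send, recv in samples if recv is not None]
    lost = len(samples) - len(delays)
    return {
        "sent": len(samples),
        "lost": lost,
        "loss_ratio": lost / len(samples),
        "min_delay_s": min(delays) if delays else None,
        "median_delay_s": median(delays) if delays else None,
        "max_delay_s": max(delays) if delays else None,
    }


def should_alarm(summary, max_loss_ratio=0.0, max_median_delay_s=0.050):
    """Hypothetical alarm rule: any loss above threshold or a high median delay."""
    if summary["loss_ratio"] > max_loss_ratio:
        return True
    median_delay = summary["median_delay_s"]
    return median_delay is not None and median_delay > max_median_delay_s


# Hypothetical data: the forward direction shows queuing and one lost probe,
# while the reverse direction is clean.
forward = [(0.0, 0.027), (1.0, 1.030), (2.0, None), (3.0, 3.045)]
reverse = [(0.0, 0.027), (1.0, 1.027), (2.0, 2.027), (3.0, 3.027)]

for name, samples in (("forward", forward), ("reverse", reverse)):
    summary = summarize_one_way(samples)
    print(name, summary, "ALARM" if should_alarm(summary) else "ok")
```

Summarizing each direction separately is what exposes single-direction queuing; a round-trip measurement would blend the two.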
Monitoring motivation
- Collaboration in R&E
- Effective operations at an international scale
- Shared responsibility
GÉANT launch event performance: musicians in Stockholm, dancers in Kuala Lumpur
http://www.geant.net/events/launchevent/pages/EventHighlights-Day1.aspx
Shared responsibility
- A domain has two roles
- Data supplier
  - Deploys the service instances
  - Provides the required data and functionalities
  - Administers and minimizes unavailability
- Data user
  - Uses the infrastructure, solves issues
  - Updates operational procedures to take the monitoring service into account
  - Raises awareness internally
Introduction to the network performance monitoring tool: perfSONAR
perfSONAR in short
- Performance focused Service Oriented Network monitoring Architecture
- International collaboration for network monitoring
- Two main implementations committed to interoperate:
  - perfSONAR MDM within GÉANT: http://perfsonar.geant.net
  - perfSONAR-PS within Internet2/ESnet: http://psps.perfsonar.net/
- Open OGF protocol to exchange data
- Web-service based
perfSONAR architecture
perfSONAR history
- Interoperable measurement frameworks began in the OGF NMWG
- The perfSONAR project began as a collaboration between Internet2, GÉANT and ESnet
- The original set of software was developed jointly by Internet2, GÉANT and ESnet
- It then ended up with two code bases
- The perfSONAR brand has continued to grow
  - Having multiple implementations
  - Increasing in footprint
- Perception of going towards a single implementation
perfSONAR so far
- Long-standing (over 10 years) R&D efforts in network performance monitoring
- Standardization (OGF)
- Creation of a middleware platform
- Fragmented efforts
- Interoperability through matching functionality
perfSONAR world
- perfSONAR MDM (ps MDM)
- perfSONAR-PS (ps-ps)
- MonIPÊ
Global perfSONAR
- Total number of perfSONAR hosts: 1232
- Total number of domains: 312
- Selected top-level domains:
  - .ca: 75
  - .edu: 324
  - .net: 211
  - .org: 65
  - .uk: 32
  - .gov: 46
  - .jp: 22
Registered in the Lookup Service as of 3.09.2014; see also the Lookup Service Directory Search: http://stats.es.net/servicesdirectory/
perfSONAR strategy
- Strategy refocused on merging fragmented performance monitoring efforts
- User feedback further strengthened the focus on convergence
Result: perfSONAR converged
Benefits
- Deployments will be consistently interoperable due to matching measurement technology
- No requirement for multiple installations in the same location
- Significant increase in publicly usable end points to perform measurements against
- Higher efficiency in user communication and training
- Synergies in software development
perfSONAR today
perfSONAR converged
- Agreements reached on perfSONAR Collaboration
- Multi-phase plan for achieving a converged perfSONAR
  - Moves from common functionality towards a common toolset
  - Integrating software components by functionality
  - Common decision-making process
  - Focus on core, production-worthy services
  - User visualization is part of the converged work and strongly affected by it, but not part of the toolkit
- At the end of 2013, GÉANT, Internet2, Indiana University and ESnet selected the best components from MDM and PS to deliver a consistent, high-quality experience to users, and bundled them into a commonly branded and jointly released platform
- This first operational prototype was successfully demonstrated at TNC2014. It included four measurement points: two in the US (Washington and New York) and two in Europe (Frankfurt and Amsterdam)
Convergence highlighted
- MDM on the ps-ps platform
  - One-way (OW) latency
  - Throughput
- Core integration work between ps-ps and MDM
- Common software release
- Common user support
Platform usability choices
Ease of use
- Live CD image
- Toolkit netinstaller
- Published packages
Ease of administration
Upcoming perfSONAR releases (1)
- MDM
  - OPPD
    - The current BWCTL MP (1.0) will be part of the 3.4 pS-Performance Toolkit
    - OWAMP MP (1.0) will be part of a later 3.4.x release, probably around October/November
    - BWCTL and OWAMP MPs with new MA support (esmond) should come with version 1.1 of both MPs, sometime before the end of 2014
  - perfsonarUI
    - 1.3.2: bug-fix release for compatibility with the 3.4 ps-ps toolkit
    - 1.4 will use the Simple LS and be compatible with the 3.4 toolkit release (to be released in October/November)
    - Not part of the toolkit, but compatibility will be synchronised with it
    - 1.5 will provide support for the new MA (esmond), early 2015
Upcoming perfSONAR releases (2)
- MDM
  - MA
    - Storage in the new toolkit MA (esmond) will be provided for OPPD (BWCTL MP and OWAMP MP)
    - psUI queries to the new MA will be provided by an updated psUI, v1.5, in early 2015
    - A way to migrate data from existing MDM SQL MA deployments to the new MA
  - Continued support and bug fixes for MDM products up to the end of GN3+
    - Existing SQL MA and RRD MA
    - BWCTL MP
    - Existing Debian 7 and CentOS 6 repositories
Upcoming perfSONAR releases (3)
- pS-Performance Toolkit
  - Current version: 3.3.2
  - Upcoming 3.4 planned for the end of September
  - 3.5 scheduled for the first half of 2015
  - The Live CD may be dropped
Quick steps on integration
- Choose your tools
- Get monitoring in place
- Continue regular measurements
- Track your performance issues
Sources of information / public sites
- psps.perfsonar.net
- perfsonar.geant.net
- www.perfsonar.net
- perfsonar.forge.geant.net
Credits
- Roland Karch, FAU Erlangen-Nürnberg
- Domenico Vicinanza, DANTE
- Jason Zurawski, ESnet, LBNL
Contact us: szymon.trocha@psnc.pl
www.geant.net
www.twitter.com/geantnews
www.facebook.com/geantnetwork
www.youtube.com/geanttv