NUIT Tech Talk: Trends in Research Data Mobility
Pascal Paschos, NUIT Academic & Research Technologies, Research Computing Services
Matt Wilson, NUIT Cyberinfrastructure, Telecommunication and Network Services
Overview
- Dedicated research networks empower Big Data flows
- The Chicago region as a research network hub
- Preview of Northwestern's pilot campus research network (HPSNet: High Performance Science Network)
Size of Data
Big Data Flows
- Big data flows strain the capacity of existing networks
- The need for big data mobility is critical in:
  - Academic research
    - LHC (25 PB annually; up to 200 TB of daily data transfers)
    - LSST (20 TB nightly)
    - NASA climatological data
    - Genome sequences and biomedical data
  - Business and market intelligence and practices
  - Policy and governmental strategies
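To put these volumes in perspective, the sketch below estimates how long each would take to cross a single link. It is an illustration only: it assumes decimal units (1 TB = 10^12 bytes), an ideal link that sustains its full line rate with no protocol overhead, and 1 Gbps and 10 Gbps as example rates. The function name `transfer_hours` is hypothetical.

```python
def transfer_hours(size_tb: float, rate_gbps: float) -> float:
    """Hours needed to move size_tb terabytes over an ideal rate_gbps link.

    Assumes 1 TB = 1e12 bytes and a link that runs at full line rate
    the whole time (no protocol overhead, no congestion, no loss).
    """
    bits = size_tb * 1e12 * 8            # terabytes -> bits
    seconds = bits / (rate_gbps * 1e9)   # bits / (bits per second)
    return seconds / 3600


if __name__ == "__main__":
    # Volumes quoted on the slide; link rates are illustrative choices.
    volumes = [("LHC daily transfers", 200), ("LSST nightly", 20)]
    for label, size_tb in volumes:
        for rate_gbps in (1, 10):
            hours = transfer_hours(size_tb, rate_gbps)
            print(f"{label}: {size_tb} TB at {rate_gbps} Gbps = {hours:.1f} h")
```

Under these assumptions, 20 TB takes roughly 44 hours at 1 Gbps and about 4.4 hours at 10 Gbps, so a nightly data set of that size cannot keep pace on a 1 Gbps connection.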
Drivers of Research Data Mobility
- Research results should be reproducible and, ideally, extensible
- Local infrastructure limitations
- Fostering collaborations
- Federal funding policy
- Cross-institutional and cross-discipline leveraging of core competencies and resources
- Open access to data and applications advances research
- Aggregation and management of community data in centralized repositories
Modes of Research Data Mobility
- How do researchers share or move data?
  - Researcher to self (local-to-nonlocal)
  - Researcher A to Researcher B (local-to-local or local-to-nonlocal)
  - Researcher to community resource (e.g. access to a repository containing experimental or simulation data)
  - Researcher to computing/storage facilities (academic or commercial cloud)
    - Extending problem size
    - Speeding up time to solution
    - For data preservation, aggregation or distribution
- Data mobility can be bursty, continuous or periodic
Network Definition
- A collection of infrastructure elements that allows for the transmission of data (bits of information) between two physical locations
- Hardware (cables, servers, routers, transceivers, switches, bridges) arranged in an optimal physical topology
- Transport protocols and software administer the internal functions of the communication (e.g. TCP) and handle the data stream (e.g. Globus Online / GridFTP)
Network Qualifiers
A network is qualified in general by:
- The bandwidth, or maximum throughput: the rate at which bits of information are transmitted along a communication path
- The latency: the round-trip delay in the propagation of a signal, due to the finite signal speed, time sitting in the outgoing queue, and time for hardware to process the signal (e.g. convert a light signal to a digital signal, write data to disk)
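Bandwidth and latency interact: their product, commonly called the bandwidth-delay product (BDP), is the amount of data that must be in flight to keep a path full. The sketch below is an illustration that is not part of the talk. The 10 Gbps rate, the 50 ms round-trip time and the 64 KiB window are assumed example values, and both function names are hypothetical.

```python
def bdp_bytes(bandwidth_bps: float, rtt_s: float) -> float:
    """Bandwidth-delay product: bytes in flight needed to fill the path."""
    return bandwidth_bps * rtt_s / 8


def window_limited_bps(window_bytes: float, rtt_s: float) -> float:
    """Upper bound on throughput when at most window_bytes can be
    unacknowledged per round trip, regardless of link bandwidth."""
    return window_bytes * 8 / rtt_s


if __name__ == "__main__":
    bandwidth_bps = 10e9   # assumed: 10 Gbps path
    rtt_s = 0.050          # assumed: 50 ms round-trip time

    print(f"BDP: {bdp_bytes(bandwidth_bps, rtt_s) / 1e6:.1f} MB in flight")

    window = 64 * 1024     # assumed: 64 KiB send/receive window
    capped = window_limited_bps(window, rtt_s)
    print(f"64 KiB window caps throughput at {capped / 1e6:.1f} Mbps")
```

With these example values the path needs about 62.5 MB in flight to stay full, while a 64 KiB window limits a single stream to roughly 10.5 Mbps, which is why long, fast paths need large buffers on the end hosts.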
Dedicated Research Network (DRN)
Compared to a general-purpose (commodity) network:
- A Dedicated Research Network only transports research data
- DRNs are non-commercial
- DRNs are optimized for large data flows (elephant flows vs. mouse flows)
DRNs (cont'd)
- DRNs use advanced high-performance infrastructure to minimize latency, increase throughput, avoid congestion, prevent packet loss and monitor performance
- DRNs allow for interactivity requirements such as experiment control and remote data analysis
- DRNs do not employ a firewall for inspection of incoming data
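One way to see why packet loss and latency matter so much for large flows is the widely cited Mathis approximation for steady-state TCP throughput, rate ≈ (MSS / RTT) · (C / √p) with C ≈ √(3/2). This model is brought in here for illustration and does not come from the talk; it describes classic loss-based congestion control, and the MSS, RTT and loss values below are assumed examples. The function name is hypothetical.

```python
import math


def mathis_throughput_bps(mss_bytes: float, rtt_s: float, loss_rate: float) -> float:
    """Approximate steady-state TCP throughput (Mathis et al. model).

    rate ~= (MSS / RTT) * (C / sqrt(p)), with C = sqrt(3/2).
    Only a rough upper bound for loss-based congestion control.
    """
    if not 0 < loss_rate <= 1:
        raise ValueError("loss_rate must be in (0, 1]")
    c = math.sqrt(3 / 2)
    return (mss_bytes * 8 / rtt_s) * c / math.sqrt(loss_rate)


if __name__ == "__main__":
    mss_bytes = 1460   # assumed: typical Ethernet MSS
    rtt_s = 0.050      # assumed: 50 ms round-trip time
    for loss_rate in (1e-6, 1e-4, 1e-2):
        mbps = mathis_throughput_bps(mss_bytes, rtt_s, loss_rate) / 1e6
        print(f"loss {loss_rate:.0e}: about {mbps:.1f} Mbps per stream")
```

In this model throughput falls with the square root of the loss rate and inversely with round-trip time, so even a small loss rate on a long path can hold a single stream far below the link rate.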
Examples of Research Networks
- National: Internet2; U.S. DoE Energy Sciences Network (ESnet); CANARIE
- Regional (Chicago): MREN; CIC OmniPoP
- International: StarLight; AmLight; GÉANT
National Research Networks
International Research Networks
Research Networks (World & Domestic)
- Chicago is a hub for international research traffic, and iCAIR at Northwestern University is an important part of that status
- Chicago is also a hub for regional (e.g. MREN) and national (e.g. Internet2 and ESnet) traffic
- StarLight, the "optical STAR TAP" (Science, Technology, and Research Transit Access Point), is hosted on the NU Chicago campus
- Designed and developed by researchers, for researchers, StarLight interconnects advanced networks worldwide and is a proving ground for next-generation national and international optical networks
- Most major research networks in North America are present at StarLight
The Last Mile Problem: The Case of Northwestern
- Chicago provides ample options for connectivity to regional, national and international research networks
- No dedicated bandwidth for research traffic between Chicago and Evanston
- Network service to the end user is limited to 1 Gbps connections in most cases
Campus Research Network at Northwestern
- A joint proposal to the NSF from NUIT and the Office of Research, endorsed by a number of faculty, was funded in 2013
- This NSF funding begins to build out our campus research network plan:
  - Separate network infrastructure
  - Inter-campus backbone
  - 10 Gbps connections on campus
  - Connectivity to external research networks
  - Support for data transfer nodes in the Evanston data center
The Campus Research Network (CRN) Model
- Bandwidth: high-speed 10 Gbps connectivity to on- and off-campus facilities
- Separate physical infrastructure
- Friction-free network paths: no firewall and no contention
- High-performance hardware
- Security appropriate to science workflows
- Dedicated Data Transfer Nodes (DTNs)
- Performance measurement nodes (perfSONAR software)
HPSNet Overview
Future CRN Expansion
- The challenge is to expand the CRN beyond the pilot funded by the NSF grant
- Funding would need to come from campus researchers
- Costs would be highly location-dependent
- Central IT is continuing to look at funding future incremental expansion of this network service and its capabilities
- Input from the NU research community is needed to identify needs
Questions?