Hari Subramoni
Senior Research Associate, Dept. of Computer Science and Engineering
The Ohio State University, Columbus, OH 43210-1277
Tel: (614) 961-2383, Fax: (614) 292-2911, E-mail: subramoni.1@osu.edu
http://www.cse.ohio-state.edu/~subramon

Education:
B.Tech., June 2004, University College of Engineering, Trivandrum - Computer Science and Engineering
Ph.D., August 2013, The Ohio State University, Columbus, Ohio - Computer Science and Engineering

Employment:
Aug '13 - Current: Senior Research Associate, The Ohio State University
Apr '08 - Jul '13: Graduate Research Assistant, The Ohio State University
Jun '11 - Aug '11: Temporary Student Intern, Lawrence Livermore National Laboratory
Oct '09 - Dec '09: Research Intern, IBM TJ Watson Research Center
Dec '06 - Aug '07: Member Technical Staff, Force10 Networks, India
Aug '04 - Nov '06: Software Engineer, ISoftTech-Sasken Solutions, India

Research Interests:
Parallel computer architecture, network-based computing, exascale computing, network-topology-aware computing, QoS, power-aware LAN-WAN communication, fault tolerance, virtualization, high-performance job startup and cloud computing.

Projects:
Involved in the design, development, testing and distribution (across various major/minor releases, 1.2p1 through 2.1rc1) of the MVAPICH2/MVAPICH2-X software stack, an open-source implementation of the MPI-3.1 specification over modern high-speed networks such as InfiniBand, 10/40 GigE, iWARP and RDMA over Converged Ethernet (RoCE). This software is used by more than 2,275 organizations worldwide in 74 countries and powers some of the top supercomputing centers in the world, including the 7th-ranked Stampede, the 11th-ranked Pleiades and the 15th-ranked Tsubame 2.5. As of Jan '15, more than 230,000 downloads have taken place from the project's site. The software is also distributed by many InfiniBand, 10GigE/iWARP and RoCE vendors in their software distributions. The MVAPICH2-X package provides support for hybrid MPI+PGAS (UPC and OpenSHMEM) programming models with a unified communication runtime for emerging exascale systems (see the brief sketch at the end of this section).

Some of the specific projects worked on while at The Ohio State University are: a scalable network topology discovery mechanism for InfiniBand networks; network-topology-aware, high-performance MPI libraries; high-performance, fault-tolerant MPI over modern multi-rail systems; high-performance, power-aware file-transfer mechanisms over wide-area networks; scalable middleware for financial applications using emerging networking protocols such as AMQP; QoS-aware, high-performance MPI; high-performance job startup for MPI; and high-performance, scalable middleware for Big Data over InfiniBand, iWARP and RoCE.

During the summer internship at Lawrence Livermore National Laboratory, studied the impact network topology can have on the performance of HPC applications and investigated techniques to lower that impact through the design of network-topology-aware, high-performance MPI libraries. During the internship at the IBM T.J. Watson Research Center, examined the impact NUMA architectures can have on the communication performance of low-latency, on-line trading systems.
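
The hybrid MPI+PGAS model mentioned above is easiest to see in a short program that mixes OpenSHMEM one-sided operations with MPI collectives in a single executable, which is the usage pattern a unified communication runtime such as MVAPICH2-X is meant to support. The sketch below is illustrative only and is not taken from the MVAPICH2 sources; the file name, the oshcc compile command, the initialization order and the assumption that MPI ranks and OpenSHMEM PEs coincide are all assumptions, not statements about the library.

    /* hybrid.c - minimal hybrid MPI + OpenSHMEM sketch (illustrative only).
     * Build with the OpenSHMEM compiler wrapper of the installation, e.g.
     * "oshcc hybrid.c", and launch with the usual MPI/OpenSHMEM launcher. */
    #include <stdio.h>
    #include <mpi.h>
    #include <shmem.h>

    int main(int argc, char *argv[])
    {
        MPI_Init(&argc, &argv);
        shmem_init();                       /* both models share one runtime */

        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* PGAS side: symmetric-heap buffers and a one-sided put to the next
         * PE (assumes MPI rank i corresponds to OpenSHMEM PE i). */
        long *src = (long *) shmem_malloc(sizeof(long));
        long *dst = (long *) shmem_malloc(sizeof(long));
        *src = rank;
        *dst = 0;
        shmem_barrier_all();
        shmem_long_put(dst, src, 1, (rank + 1) % size);
        shmem_barrier_all();

        /* MPI side: a two-sided collective over the same set of processes. */
        long sum = 0;
        MPI_Reduce(dst, &sum, 1, MPI_LONG, MPI_SUM, 0, MPI_COMM_WORLD);
        if (rank == 0)
            printf("sum of ranks received via OpenSHMEM puts = %ld\n", sum);

        shmem_free(src);
        shmem_free(dst);
        shmem_finalize();
        MPI_Finalize();
        return 0;
    }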

Select Awards and Recognitions:
Outstanding Graduate Research Award, Department of Computer Science and Engineering, The Ohio State University, 2013.
First Author of a Best Paper/Best Student Paper Finalist, The International Conference for High Performance Computing, Networking, Storage and Analysis (SC 2012).
IEEE Student Travel Award, IEEE Cluster 2011.
Selected as Student Volunteer, International Conference for High Performance Computing, Networking, Storage and Analysis (SC 2008).

Proposal Writing Experience:
Helped in various phases of proposal writing for the following successful proposals and interacted with PIs from multiple institutions in defining the overall proposal direction.
Innovations to Transition a Campus Core Cyberinfrastructure to Serve Diverse and Emerging Researcher Needs, National Science Foundation (OCI), 2012-2014.
Unified Runtime for Supporting Hybrid Programming Models on Heterogeneous Architecture, National Science Foundation (SHF), 2012-2015.
A Comprehensive Performance Tuning Framework for the MPI Stack, National Science Foundation (SI2-SSI), 2012-2015.
HPC Application Energy Measurement and Optimization Tools, DOE SBIR/STTR Phase-I, 2010.
Designing QoS-Aware MPI and File Systems Protocols for Emerging InfiniBand Clusters, National Science Foundation, 2010.
Topology-Aware MPI Communication and Scheduling for Petascale Systems, National Science Foundation (STCI), 2009.
Six more proposals (five to NSF and one to DOE) are under review.

Publications:

Invited Papers, Book Chapters, Journal Publications:
1. S. Sur, S. Potluri, K. Kandalla, H. Subramoni, K. Tomko and D. K. Panda, Co-Designing MPI Library and Applications for InfiniBand Clusters, Computer, 06 Sep. 2011, IEEE Computer Society Digital Library.
2. D. K. Panda, S. Sur, H. Subramoni and K. Kandalla, Network Support for Collective Communication, Encyclopedia of Parallel Computing, Sep. 2011.
3. H. Subramoni, F. Petrini, V. Agarwal and D. Pasetto, Intra-Socket and Inter-Socket Communication in Multi-core Systems, Computer Architecture Letters 9(1):13-16, 2010.

Refereed Conference/Workshop Publications:
1. S. Chakraborty, H. Subramoni, A. Moody, A. Venkatesh, J. Perkins and D. K. Panda, Non-blocking PMI Extensions for Fast MPI Startup, Int'l Symposium on Cluster, Cloud, and Grid Computing (CCGrid 2015), May 2015 (accepted for publication).
2. A. Venkatesh, H. Subramoni, K. Hamidouche and D. K. Panda, A High Performance Broadcast Design with Hardware Multicast and GPUDirect RDMA for Streaming Applications on InfiniBand Clusters, IEEE International Conference on High Performance Computing (HiPC 2014), Dec. 2014.
3. H. Subramoni, K. Kandalla, J. Jose, K. Tomko, K. Schulz, and D. Pekurovsky, Designing Topology Aware Communication Schedules for Alltoall Operations in Large InfiniBand Clusters, Int'l Conference on Parallel Processing (ICPP '14), Sep. 2014.
4. S. Chakraborty, H. Subramoni, J. Perkins, A. Moody, M. Arnold and D. K. Panda, PMI Extensions for Scalable MPI Startup, EuroMPI '14, Sep. 2014.
5. H. Subramoni, K. Hamidouche, A. Venkatesh, S. Chakraborty and D. K. Panda, Designing MPI Library with Dynamic Connected Transport (DCT) of InfiniBand: Early Experiences, IEEE International Supercomputing Conference (ISC '14), June 2014.
6. H. Subramoni, D. Bureddy, K. Kandalla, K. Schulz, B. Barth, J. Perkins, M. Arnold, and D. K. Panda, Design of Network Topology Aware Scheduling Services for Large InfiniBand Clusters, IEEE Cluster '13, Sep. 2013.

7. S. Potluri, D. Bureddy, K. Hamidouche, A. Venkatesh, K. Kandalla, H. Subramoni, and D. K. Panda, MVAPICH-PRISM: A Proxy-based Communication Framework using InfiniBand and SCIF for Intel MIC Clusters, Int'l Conference on Supercomputing (SC '13), Nov. 2013.
8. K. Hamidouche, S. Potluri, H. Subramoni, K. Kandalla and D. K. Panda, MIC-RO: Enabling Efficient Remote Offload on Heterogeneous Many Integrated Core (MIC) Clusters with InfiniBand, Int'l Conference on Supercomputing (ICS '13), June 2013.
9. S. Potluri, D. Bureddy, H. Wang, H. Subramoni and D. K. Panda, Extending OpenSHMEM for GPU Computing, Int'l Parallel and Distributed Processing Symposium (IPDPS '13), May 2013.
10. H. Subramoni, S. Potluri, K. Kandalla, B. Barth, J. Vienne, J. Keasler, K. Tomko, K. Schulz, A. Moody and D. K. Panda, Design of a Scalable InfiniBand Topology Service to Enable Network-Topology-Aware Placement of Processes, Int'l Conference on Supercomputing (SC '12), Nov. 2012. (Best Paper and Best Student Paper Finalist)
11. H. Subramoni, J. Vienne and D. K. Panda, A Scalable InfiniBand Network-Topology-Aware Performance Analysis Tool for MPI, Int'l Workshop on Productivity and Performance (PROPER '12), in conjunction with EuroPar, Aug. 2012.
12. N. S. Islam, M. W. Rahman, J. Jose, R. Rajachandrasekar, H. Wang, H. Subramoni, C. Murthy and D. K. Panda, High Performance RDMA-Based Design of HDFS over InfiniBand, Int'l Conference on Supercomputing (SC '12), Nov. 2012.
13. R. Rajachandrasekar, J. Jaswani, H. Subramoni and D. K. Panda, Minimizing Network Contention in InfiniBand Clusters with a QoS-Aware Data-Staging Framework, Cluster '12, Sep. 2012.
14. K. Kandalla, A. Buluç, H. Subramoni, K. Tomko, J. Vienne, L. Oliker, and D. K. Panda, Can Network-Offload based Non-Blocking Neighborhood MPI Collectives Improve Communication Overheads of Irregular Graph Algorithms?, International Workshop on Parallel Algorithms and Parallel Software (IWPAPS 2012), in conjunction with Cluster, Sep. 2012.
15. J. Huang, X. Ouyang, J. Jose, M. W. Rahman, H. Wang, M. Luo, H. Subramoni, C. Murthy and D. K. Panda, High-Performance Design of HBase with RDMA over InfiniBand, Int'l Parallel and Distributed Processing Symposium (IPDPS '12), May 2012.
16. K. Kandalla, U. Yang, J. Keasler, T. Kolev, A. Moody, H. Subramoni, K. Tomko, J. Vienne and D. K. Panda, Designing Non-blocking Allreduce with Collective Offload on InfiniBand Clusters: A Case Study with Conjugate Gradient Solvers, Int'l Parallel and Distributed Processing Symposium (IPDPS '12), May 2012.
17. S. P. Raikar, H. Subramoni, K. Kandalla, J. Vienne and D. K. Panda, Designing Network Failover and Recovery in MPI for Multi-Rail InfiniBand Clusters, Int'l Workshop on System Management Techniques, Processes, and Services (SMTPS), in conjunction with Int'l Parallel and Distributed Processing Symposium (IPDPS '12), May 2012.
18. J. Jose, H. Subramoni, K. Kandalla, M. W. Rahman, H. Wang, S. Narravula and D. K. Panda, Scalable Memcached Design for InfiniBand Clusters using Hybrid Transports, Int'l Symposium on Cluster, Cloud, and Grid Computing (CCGrid 2012), May 2012.
19. H. Subramoni, K. Kandalla, J. Vienne, S. Sur, B. Barth, K. Tomko, R. McLay, K. Schulz and D. K. Panda, Design and Evaluation of Network Topology-/Speed-Aware Broadcast Algorithms for InfiniBand Clusters, Cluster '11, Sep. 2011.
20. J. Jose, H. Subramoni, M. Luo, M. Zhang, J. Huang, M. W. Rahman, N. S. Islam, X. Ouyang, H. Wang, S. Sur and D. K. Panda, Memcached Design on High Performance RDMA Capable Interconnects, Int'l Conference on Parallel Processing (ICPP '11), Sep. 2011.
21. N. Dandapanthula, H. Subramoni, J. Vienne, K. Kandalla, S. Sur, D. K. Panda, and R. Brightwell, INAM - A Scalable InfiniBand Network Analysis and Monitoring Tool, 4th Int'l Workshop on Productivity and Performance (PROPER 2011), in conjunction with EuroPar, Aug. 2011.
22. K. Kandalla, H. Subramoni, J. Vienne, K. Tomko, S. Sur and D. K. Panda, Designing Non-blocking Broadcast with Collective Offload on InfiniBand Clusters: A Case Study with HPL, Hot Interconnects '11, Aug. 2011.
23. K. Kandalla, H. Subramoni, K. Tomko, D. Pekurovsky, S. Sur and D. K. Panda, High-Performance and Scalable Non-Blocking All-to-All with Collective Offload on InfiniBand Clusters: A Study with Parallel 3D FFT, Int'l Supercomputing Conference (ISC), June 2011.

24. H. Subramoni, P. Lai, S. Sur and D. K. Panda, Improving Application Performance and Predictability using Multiple Virtual Lanes in Modern Multi-Core InfiniBand Clusters, Int'l Conference on Parallel Processing (ICPP '10), Sep. 2010.
25. M. Luo, S. Potluri, P. Lai, E. P. Mancini, H. Subramoni, K. Kandalla, S. Sur and D. K. Panda, High Performance Design and Implementation of Nemesis Communication Layer for Two-sided and One-Sided MPI, Workshop on Parallel Programming Models and System Software (P2S2 2010), in conjunction with Int'l Conference on Parallel Processing (ICPP '10), Sep. 2010.
26. H. Subramoni, K. Kandalla, S. Sur and D. K. Panda, Design and Evaluation of Generalized Collective Communication Primitives with Overlap using ConnectX-2 Offload Engine, Int'l Symposium on Hot Interconnects (HotI), Aug. 2010.
27. H. Subramoni, P. Lai, R. Kettimuthu and D. K. Panda, High Performance Data Transfer in Grid Environment Using GridFTP over InfiniBand, Int'l Symposium on Cluster Computing and the Grid (CCGrid '10), May 2010.
28. H. Subramoni, F. Petrini, V. Agarwal and D. Pasetto, Streaming, low-latency communication in on-line trading systems, Int'l Workshop on System Management Techniques, Processes, and Services (SMTPS), in conjunction with Int'l Parallel and Distributed Processing Symposium (IPDPS '10), April 2010.
29. K. Kandalla, H. Subramoni, A. Vishnu and D. K. Panda, Designing Topology-Aware Collective Communication Algorithms for Large Scale InfiniBand Clusters: Case Studies with Scatter and Gather, The 10th Workshop on Communication Architecture for Clusters (CAC '10), in conjunction with Int'l Parallel and Distributed Processing Symposium (IPDPS '10), April 2010.
30. P. Lai, H. Subramoni, S. Narravula, A. Mamidala and D. K. Panda, Designing Efficient FTP Mechanisms for High Performance Data-Transfer over InfiniBand, Int'l Conference on Parallel Processing (ICPP '09), Sep. 2009.
31. H. Subramoni, P. Lai, M. Luo and Dhabaleswar K. Panda, RDMA over Ethernet - A Preliminary Study, Workshop on High Performance Interconnects for Distributed Computing (HPIDC '09), in conjunction with Cluster, Sep. 2009.
32. H. Subramoni, M. Koop, and D. K. Panda, Designing Next Generation Clusters: Evaluation of InfiniBand DDR/QDR on Intel Computing Platforms, 17th Annual Symposium on High-Performance Interconnects (HotI '09), Aug. 2009.
33. K. Kandalla, H. Subramoni, G. Santhanaraman, M. Koop and D. K. Panda, The 9th Workshop on Communication Architecture for Clusters (CAC '09), in conjunction with Int'l Parallel and Distributed Processing Symposium (IPDPS '09), May 2009.
34. H. Subramoni, G. Marsh, S. Narravula, P. Lai and D. K. Panda, Design and Evaluation of Benchmarks for Financial Applications using Advanced Message Queuing Protocol (AMQP) over InfiniBand, Workshop on High Performance Computational Finance (in conjunction with SC '08), Austin, TX, November 2008.
35. S. Narravula, H. Subramoni, P. Lai, R. Noronha and D. K. Panda, Performance of HPC Middleware over InfiniBand WAN, Int'l Conference on Parallel Processing (ICPP '08), Portland, Oregon, Sep. 2008.

Presentations:
1. H. Subramoni, K. Kandalla, J. Jose, K. Tomko, K. Schulz, and D. Pekurovsky, Designing Topology Aware Communication Schedules for Alltoall Operations in Large InfiniBand Clusters, Int'l Conference on Parallel Processing (ICPP '14), Sep. 2014.
2. H. Subramoni, K. Hamidouche, A. Venkatesh, S. Chakraborty and D. K. Panda, Designing MPI Library with Dynamic Connected Transport (DCT) of InfiniBand: Early Experiences, IEEE International Supercomputing Conference (ISC '14), June 2014.
3. H. Subramoni, D. Bureddy, K. Kandalla, K. Schulz, B. Barth, J. Perkins, M. Arnold, and D. K. Panda, Design of Network Topology Aware Scheduling Services for Large InfiniBand Clusters, IEEE Cluster '13, Sep. 2013.
4. H. Subramoni, S. Potluri, K. Kandalla, B. Barth, J. Vienne, J. Keasler, K. Tomko, K. Schulz, A. Moody, and D. K. Panda, Design of a Scalable InfiniBand Topology Service to Enable Network-Topology-Aware Placement of Processes, SC '12, November 2012. (Best Paper and Best Student Paper Finalist)
5. H. Subramoni, J. Vienne, and D. K. Panda, A Scalable InfiniBand Network Topology-Aware Performance Analysis Tool for MPI, 5th Workshop on Productivity and Performance (PROPER), Aug. 2012.
6. H. Subramoni, K. Kandalla, J. Vienne, S. Sur, B. Barth, K. Tomko, R. McLay, K. Schulz and D. K. Panda, Design and Evaluation of Network Topology-/Speed-Aware Broadcast Algorithms for InfiniBand Clusters, Cluster '11, Sep. 2011.

7. H. Subramoni, P. Lai, S. Sur and D. K. Panda, Improving Application Performance and Predictability using Multiple Virtual Lanes in Modern Multi-Core InfiniBand Clusters, Int'l Conference on Parallel Processing (ICPP '10), Sep. 2010.
8. K. Kandalla, E. P. Mancini, S. Sur and D. K. Panda, Designing Power-Aware Collective Communication Algorithms for InfiniBand Clusters, Int'l Conference on Parallel Processing (ICPP '10), Sep. 2010.
9. H. Subramoni, P. Lai, R. Kettimuthu and D. K. Panda, High Performance Data Transfer in Grid Environment Using GridFTP over InfiniBand, Int'l Symposium on Cluster Computing and the Grid (CCGrid), May 2010.
10. H. Subramoni, P. Lai, M. Luo and Dhabaleswar K. Panda, RDMA over Ethernet - A Preliminary Study, Workshop on High Performance Interconnects for Distributed Computing (HPIDC '09), in conjunction with Cluster, Sep. 2009.
11. H. Subramoni, M. Koop and D. K. Panda, Designing Next Generation Balanced Clusters with InfiniBand QDR and Intel Nehalem Architecture, Symposium on High Performance Interconnects (HotI '09), Aug. 2009.
12. H. Subramoni, G. Marsh, S. Narravula, P. Lai and D. K. Panda, Design and Evaluation of Benchmarks for Financial Applications using Advanced Message Queueing Protocol (AMQP) over InfiniBand, Workshop on High Performance Computational Finance (WHPCF '08), in conjunction with SC '08, November 2008.

Invited Tutorials:
1. InfiniBand and High-Speed Ethernet for Dummies, SuperComputing (SC '14), Nov. 2014.
2. Designing and Using High-End Computing Systems with InfiniBand and High-Speed Ethernet, SuperComputing (SC '14), Nov. 2014.
3. Optimization and Tuning of MPI and PGAS Applications using MVAPICH2 and MVAPICH2-X, XSEDE '14, July 2014.
4. InfiniBand and High-Speed Ethernet: Overview, Latest Status and Trends, Int'l Conference on High Performance Switching and Routing (HPSR '14), July 2014.
5. InfiniBand and High-Speed Ethernet: Overview, Latest Status and Trends, Int'l Supercomputing Conference (ISC '14), June 2014.
6. InfiniBand and High-Speed Ethernet for Dummies, SuperComputing (SC '13), Nov. 2013.
7. Advanced Topics in InfiniBand and High-speed Ethernet for Designing HEC Systems, SuperComputing (SC '13), Nov. 2013.
8. InfiniBand and High-Speed Ethernet for Dummies, Int'l Supercomputing Conference (ISC '13), June 2013.
9. Advanced Topics in InfiniBand and High-speed Ethernet for Designing HEC Systems, Int'l Supercomputing Conference (ISC '13), June 2013.
10. InfiniBand and High-Speed Ethernet for Dummies, SuperComputing (SC '12), Nov. 2012.
11. Designing High-End Computing Systems with InfiniBand and High-Speed Ethernet, SuperComputing (SC '12), Nov. 2012.
12. InfiniBand and High-Speed Ethernet for Dummies, Int'l Supercomputing Conference (ISC '12), June 2012.
13. Designing High-End Computing Systems and Programming Models with InfiniBand and High-Speed Ethernet, Int'l Supercomputing Conference (ISC '12), June 2012.
14. InfiniBand and High-Speed Ethernet for Dummies, SuperComputing (SC '11), Nov. 2011.
15. Designing High-End Computing Systems with InfiniBand and High-Speed Ethernet, SuperComputing (SC '11), Nov. 2011.

Invited Talks:
1. Optimizing and Tuning Techniques for Running MVAPICH2 over IB, InfiniBand User Group Meeting (IBUG), April 2014.
2. MVAPICH2 Project Update and GPUDirect RDMA, HPC Advisory Council European Conference, June 2013.
3. Experiencing HPC for Undergraduates - Graduate Student Perspective, SuperComputing (SC '12), Nov. 2012.
4. MVAPICH2 Project Update and Big Data Acceleration, HPC Advisory Council European Conference, June 2012.

Selected Professional Activities:
IEEE Member.
Session Chair, Int'l Workshop on Parallel Programming Models and Systems Software for High-End Computing (P2S2), 2014.
Program Committee Member, International Conference on Parallel Processing (ICPP), 2014.
Organizing Committee Member, MVAPICH User Group Meeting (MUG), 2013, 2014.
Attendee, MPI Forum (MPI-3.1).
Reviewer, SuperComputing (SC), 2013.
Reviewer, Transactions on Parallel and Distributed Systems (TPDS), 2013.
Reviewer, Journal of Parallel and Distributed Computing (JPDC), 2009, 2010, 2011, 2012, 2013, 2014.
Reviewer, International Conference on Parallel Processing (ICPP), 2009, 2010, 2011.
Reviewer, International Parallel & Distributed Processing Symposium (IPDPS), 2010, 2011, 2013.
Reviewer, IEEE Cluster, 2010, 2011.
Member, Career Guidance and Placement Cell at undergraduate university.