Operating Systems Design 16. Networking: Sockets

Similar documents
Network packet capture in Linux kernelspace

ELEN 602: Computer Communications and Networking. Socket Programming Basics

Network Programming with Sockets. Process Management in UNIX

The POSIX Socket API

CS 416: Opera-ng Systems Design

TCP/IP - Socket Programming

Socket Programming. Request. Reply. Figure 1. Client-Server paradigm

Tutorial on Socket Programming

Introduction to Socket Programming Part I : TCP Clients, Servers; Host information

Implementing Network Software

UNIX Sockets. COS 461 Precept 1

EITF25 Internet Techniques and Applications L5: Wide Area Networks (WAN) Stefan Höst

ICT SEcurity BASICS. Course: Software Defined Radio. Angelo Liguori. SP4TE lab.

SEP Packet Capturing Using the Linux Netfilter Framework Ivan Pronchev

Objectives of Lecture. Network Architecture. Protocols. Contents

What is CSG150 about? Fundamentals of Computer Networking. Course Outline. Lecture 1 Outline. Guevara Noubir noubir@ccs.neu.

Computer Networks Network architecture

Socket Programming. Srinidhi Varadarajan

Network Programming TDC 561

IMPROVING PERFORMANCE OF SMTP RELAY SERVERS AN IN-KERNEL APPROACH MAYURESH KASTURE. (Under the Direction of Kang Li) ABSTRACT

Socket Programming. Kameswari Chebrolu Dept. of Electrical Engineering, IIT Kanpur

IP Network Layer. Datagram ID FLAG Fragment Offset. IP Datagrams. IP Addresses. IP Addresses. CSCE 515: Computer Network Programming TCP/IP

Limi Kalita / (IJCSIT) International Journal of Computer Science and Information Technologies, Vol. 5 (3), 2014, Socket Programming

Introduction to Socket programming using C

Ethernet. Ethernet. Network Devices

Network Layer: Network Layer and IP Protocol

Protocols and Architecture. Protocol Architecture.

Networks. Inter-process Communication. Pipes. Inter-process Communication

Socket Programming in C/C++

Transport Layer Protocols

Transport Layer. Chapter 3.4. Think about

Protocols. Packets. What's in an IP packet

Indian Institute of Technology Kharagpur. TCP/IP Part I. Prof Indranil Sengupta Computer Science and Engineering Indian Institute of Technology

Xinying Wang, Cong Xu CS 423 Project

Linux Kernel Networking. Raoul Rivas

Lectures on distributed systems: Client-server communication. Paul Krzyzanowski

Chapter 11. User Datagram Protocol (UDP)

Internet Architecture and Philosophy

Chapter 9. IP Secure

Overview of Computer Networks

Overview of TCP/IP. TCP/IP and Internet

Presentation of Diagnosing performance overheads in the Xen virtual machine environment

Mobile IP Network Layer Lesson 02 TCP/IP Suite and IP Protocol

TCP Offload Engines. As network interconnect speeds advance to Gigabit. Introduction to

Understanding TCP/IP. Introduction. What is an Architectural Model? APPENDIX

Have both hardware and software. Want to hide the details from the programmer (user).

Fundamentals of Symbian OS. Sockets. Copyright Symbian Software Ltd.

Linux Kernel Architecture

Introduction to IP v6

Overview. Securing TCP/IP. Introduction to TCP/IP (cont d) Introduction to TCP/IP

Network-Oriented Software Development. Course: CSc4360/CSc6360 Instructor: Dr. Beyah Sessions: M-W, 3:00 4:40pm Lecture 2

Virtualization: TCP/IP Performance Management in a Virtualized Environment Orlando Share Session 9308

Optimizing Point-to-Point Ethernet Cluster Communication

Application Architecture

Enabling Linux* Network Support of Hardware Multiqueue Devices

Lab 6: Building Your Own Firewall

How do I get to

The Performance Analysis of Linux Networking Packet Receiving

INTRODUCTION UNIX NETWORK PROGRAMMING Vol 1, Third Edition by Richard Stevens

Introduction to Analyzer and the ARP protocol

RARP: Reverse Address Resolution Protocol

Overview - Using ADAMS With a Firewall

CCNA R&S: Introduction to Networks. Chapter 5: Ethernet

Lab 1: Packet Sniffing and Wireshark

NS3 Lab 1 TCP/IP Network Programming in C

Packet Sniffing and Spoofing Lab

Performance of Software Switching

Basic Networking Concepts. 1. Introduction 2. Protocols 3. Protocol Layers 4. Network Interconnection/Internet

The Lagopus SDN Software Switch. 3.1 SDN and OpenFlow. 3. Cloud Computing Technology

LECTURE 4 NETWORK INFRASTRUCTURE

Programmation Systèmes Cours 9 UNIX Domain Sockets

Computer Networks UDP and TCP

Chapter 11 I/O Management and Disk Scheduling

Introduction To Computer Networking

Unix Network Programming

Linux MDS Firewall Supplement

Networking Test 4 Study Guide

Datacenter Operating Systems

Guide to TCP/IP, Third Edition. Chapter 3: Data Link and Network Layer TCP/IP Protocols

IP - The Internet Protocol

PART OF THE PICTURE: The TCP/IP Communications Architecture

CPS221 Lecture: Layered Network Architecture

Lab 4: Socket Programming: netcat part

Internet Concepts. What is a Network?

Chapter 3. Internet Applications and Network Programming

Session NM059. TCP/IP Programming on VMS. Geoff Bryant Process Software

Internet Ideal: Simple Network Model

10/100/1000 Ethernet MAC with Protocol Acceleration MAC-NET Core

Overview - Using ADAMS With a Firewall

Computer Networks/DV2 Lab

Project 4: IP over DNS Due: 11:59 PM, Dec 14, 2015

Internet Packets. Forwarding Datagrams

Final for ECE374 05/06/13 Solution!!

Troubleshooting Tools

Locating Cache Performance Bottlenecks Using Data Profiling

Transport and Network Layer

Computer Networks - Xarxes de Computadors

ACHILLES CERTIFICATION. SIS Module SLS 1508

First Workshop on Open Source and Internet Technology for Scientific Environment: with case studies from Environmental Monitoring

Internet Control Protocols Reading: Chapter 3

Transcription:

Operating Systems Design 16. Networking: Sockets Paul Krzyzanowski pxk@cs.rutgers.edu 1

Sockets IP lets us send data between machines TCP & UDP are transport layer protocols Contain port number to identify transport endpoint (application) The most popular abstraction for transport layer connectivity: sockets 2

Sockets Dominant API for transport layer connectivity Created at UC Berkeley for 4.2BSD Unix (1983) Design goals Communication between processes should not depend on whether they are on the same machine Communication should be efficient Interface should be compatible with files Support different protocols and naming conventions Sockets is not just for the Internet Protocol family 3

Socket Socket = Abstract object from which messages are sent and received Looks like a file descriptor Application can select particular style of communication Virtual circuit, datagram, message-based, in-order delivery Unrelated processes should be able to locate communication endpoints Sockets can have a name Name should be meaningful in the communications domain 4

Connection-Oriented (TCP) socket operations Client Create a socket Name the socket (assign local address, port) Connect to the other side Server Create a socket Name the socket (assign local address, port) Set the socket for listening Wait for and accept a connection; get a socket for the connection read / write byte streams read / write byte streams close the socket close the socket close the listening socket 5

Connectionless (UDP) socket operations Client Create a socket Name the socket (assign local address, port) Create a socket Server Name the socket (assign local address, port) Send a message Receive a message Receive a message Send a message close the socket close the socket 6

Programming with sockets 7

Socket-related system calls Sockets are the interface the operating system provides for access to the network Next: a connection-oriented example 8

Step 1 Create a socket int s = socket(domain, type, protocol) AF_INET SOCK_STREAM SOCK_DGRAM useful if some families have more than one protocol to support a given service Conceptually similar to open BUT - open creates a new reference to a possibly existing object - socket creates a new instance of an object 9

Step 2 Name the socket (assign address, port) int error = bind(s, addr, addrlen) socket Address structure struct sockaddr* length of address structure 10

Step 3a (server) Set socket to be able to accept connections int error = listen(s, backlog) socket queue length for pending connections 11

Step 3b (server) Wait for a connection from client int snew = accept(s, clntaddr, &clntalen) new socket for this session socket pointer to address structure length of address structure s is only used for managing the queue of connection requests 12

Step 3 (client) Connect to server int error = connect(s, svraddr, svraddrlen) socket address structure struct sockaddr* length of address structure 13

Step 4 Exchange data Connection-oriented I/O read/write recv/send (extra flags) Connectionless I/O: no need for connect, listen, accept sendto/recvfrom sendmsg/recvmsg 14

Step 5 Close connection shutdown(s, how) how: 0: can send but not receive 1: cannot send more data 2: cannot send or receive (=0+1) 15

Socket Internals 16

Logical View Socket layer data socket Transport layer TCP data Network layer IP TCP data Protocol input queue Network interface (driver) layer Ethernet ethernet IP TCP data Device interrupt socket buffer 17

Data Path Application writes data to socket From app Packet received by device From device Move packet to transport layer [TCP creates packet buffer & header] Packet for host? drop Move packet to network layer [IP fills in header] Forward packet? Move packet to network layer [IP fills in header] Move packet to transport layer internal drop external Look up route Move packet to socket Move packet to net device driver [packet goes on send queue] Data goes to application buffer To app Transmit the packet To device 18

OS Network Stack System call Interface Socket-related system calls Generic network interface Sockets implementation layer Network Protocols Transport Layer Network Layer TCP/IPv4, UDP/IPv4, TCP/IPv6 IPv4, IPv6 Abstract Device Interface netdevice, queuing discipline Device Driver Link Layer Ethernet, Wi-Fi, SLIP 19

System call interface Two ways to communicate with the network: 1. Socket-specific call (e.g., socket, bind, shutdown) Directed to sys_socketcall (socket.c) Goes to the target function 2. File call (e.g., read, write, close) File descriptor socket Sockets reside in the process s file table Direct parallel of the VFS structure A socket s f_ops field points to a set of functions for socket operations A socket structure acts as a queuing point for data being transmitted & received A socket has send and receive queues associated with it High & low watermarks to avoid resource exhaustion 20

Sockets layer All network communication takes place via a socket Two socket structures one within another 1. Generic sockets (aka BSD sockets) struct socket 2. Protocol-specific sockets (e.g., INET socket) struct sock socket structure Keeps all the state of a socket including the protocol and operations that can be performed on it Some key members of the structure: struct proto_ops *ops: protocol-specific functions that implement socket operations Common functions to support a variety of protocols TCP, UDP, IP, raw ethernet, other networks struct inode *inode: points to in-memory inode associated with the socket struct sock *sk: protocol-specific (e.g., INET) socket E.g., this contains TCP/IP and UDP/IP specific data for an INET (Internet Address Domain) socket 21

Socket Buffer: struct sk_buff Component for managing the data movement for sockets through the networking layers Contains packet & state data for multiple layers of the protocol stack Don t waste time copying parameters & packet data from layer to layer of the network stack Data sits in a socket buffer (struct sk_buff) As we move through layers, data is only copied twice: 1. From user to kernel space 2. From kernel space to the device (via DMA if available) associated device source device sk_buff: next prev head data tail end dev dev_rx sk sk_buff sk_buff socket sk_buff sk_buff IP TCP data Packet data 22

Socket Buffer: struct sk_buff Each sent or received packet is associated with an sk_buff: Packet data in data->, tail-> Total packet buffer in head->, end-> Header pointers (MAC, IP, TCP header, etc.) Add or remove headers without reallocating memory Identifies device structure (net_device) rx_dev: points to the network device that received the packet dev: identifies net device on which the buffer operates If a routing decision has been made, this is the outbound interface Each socket (connection stream) is associated with a linked list of sk_buffs 23

Keeping track of packet data head data tail end ethernet tail room Allocate new socket buffer data skb = alloc_skb(len, GFP_KERNEL); No packet data: head = data = tail 24

Keeping track of packet data head data tail end head room tail room Make room for protocol headers. skb_reserve(skb, header_len) For IPv4, use sk->sk_prot->max_header Data size is still 0 25

Keeping track of packet data head data tail end head room User data tail room Add user data 26

Keeping track of packet data head data tail end head room TCP User data tail room Add TCP header 27

Keeping track of packet data head data tail end head room IP TCP User data tail room Add IP header 28

Keeping track of packet data head data tail end ethernet ethernet IP TCP User data Add ethernet header The outbound packet is complete! 29

Network protocols Define the specific protocols available (e.g., TCP, UDP) Each networking protocol has a structure called proto Associated with an address family (e.g., AF_INET) Address family is specified by the programmer when creating the socket Defines socket operations that can be performed from the sockets layer to the transport layer Close, connect, disconnect, accept, shutdown, sendmsg, recvmsg, etc. Modular: one module may define one or more protocols Initialized & registered at startup Initialization function: registers a family of protocols The register function adds the protocol to the active protocol list 30

Abstract device interface Layer that interfaces with network device drivers Common set of functions for low-level network device drivers to operate with the higher-level protocol stack 31

Abstract device interface Send a packet to a device Send sk_buff from the protocol layer to a device dev_queue_xmit function enqueues an sk_buff for transmission to the underlying driver Device is defined in sk_buff Device structure contains a method hard_start_xmit: driver function for actually transmitting the data in the sk_buff Receive a packet from a device & send to protocol stack Receive an sk_buff from a device Driver receives a packet and places it into an allocated sk_buff sk_buff passed to the network layer with a call to netif_rx Function enqueues the sk_buff to an upper-layer protocol's queue for processing through netif_rx_schedule 32

Device drivers Drivers to access the network device Examples: ethernet, 802.11n, SLIP Modular, like other devices Described by struct net_device Initialization Driver allocates a net_device structure Initializes it with its functions dev->hard_start_xmit: defines how to transmit a packet Typically the packet is moved to a hardware queue Register interrupt service routine Calls register_netdevice to make the device available to the network stack 33

Sending a message Write data to socket Socket calls appropriate send function (typically INET) Send function verifies status of socket & protocol type Sends data to transport layer routine (typically TCP or UDP) Transport layer Creates a socket buffer (struct sk_buff) Copies data from application layer; fills in header (port #, options, checksum) Passes buffer to the network layer (typically IP) Network layer Fills in buffer with its own headers (IP address, options, checksum) Look up destination route IP layer may fragment data into multiple packets Passes buffer to link layer: to destination route s device output function Link layer: move packet to the device s xmit queue Network driver Wait for scheduler to run the device driver Sends the link header Transmit packet via DMA 34

Routing IP Network layer Two structures: 1. Forwarding Information Base (FIB) Keeps track of details for every known route 2. Cache for destinations in use (hash table) If not found here then check FIB. 35

Receiving a message part 1 Interrupt from network card: packet received Network driver top half Allocate new sk_buff Move data from the hardware buffer into the sk_buff (DMA) Call netif_rx, the generic network reception handler This moves the sk_buff to protocol processing (it s a work queue) When netif_rx returns, the service routine is finished Repeat until no more packets in the device buffers If the packet queue is full, the packet is discarded netif_rx is called in the interrupt service routine Must be quick. Main goal: queue the packet. 36

Receiving a packet part 2 Bottom half Bottom half = softirq = work queues Tuples containing < operation, data > Kernel schedules work to go through pending packet queue Call net_rx_action() Dequeue first sk_buff (packet) Go through list of protocol handlers Each protocol handler registers itself Identifies which protocol type they handle Go through each generic handler first Then go through the receive function registered for the packet s protocol 37

Receiving an IP packet part 3 Network layer Ethernet Protocol: IP IP is a registered as a protocol handler for ETH_P_IP packets Packet header identifies next level protocol E.g., Ethernet header states encapsulated protocol is IPv4 IPv4 header states encapsulated protocol is TCP IP handler will either route the packet, deliver locally, or discard Send either to an outgoing queue (if routing) or to the transport layer Look at protocol field inside the IP packet Calls transport-level handlers (tcp_v4_rcv, udp_rcv, icmp_rcv, ) IP handler includes Netfilter hooks Additional checks for packet filtering, port translation, and extensions 38

Receiving an IP packet part 4 Transport layer Next stage (usually): tcp_v4_rcv() or udp_rcv() Check for transport layer errors Look for a socket that should receive this packet (match local & remote addresses and ports) Call tcp_v4_do_rcv: passing it the sk_buff and socket (sock structure) Adds sk_buff to the end of that socket s receive queue The socket may have specific processing options defined If so, apply them Wake up the process (ready state) if it was blocked on the socket 39

Lots of Interrupts! Assume: Non-jumbo maximum payload size: 1500 bytes TCP acknowledgement (no data): 40 bytes Median packet size: 413 bytes Assume a steady flow of network traffic at: 1 Gbps: ~300,000 packets/second 100 Mbps: ~30,000 packets/second Even 9000-byte jumbo frames give us: 1 Gbps: 14,000 packets per second 14,000 interrupts/second One interrupt per received packet Network traffic can generate a LOT of interrupts!! 40

Interrupt Mitigation: Linux NAPI Linux NAPI: New API (c. 2009) Avoid getting thousands of interrupts per second Disable network device interrupts during high traffic Re-enable interrupts when there are no more packets Polling is better at high loads; interrupts are better at low loads Throttle packets If we get more packets than we can process, leave them in the network card s buffer and let them get overwritten (same as dropping a packet) Better to drop packets early than waste time processing them 41

The End 42