Seminar RVS
MC-FTP (Multicast File Transfer Protocol): Simulation and Comparison with BitTorrent
Dominic Papritz
Universität Bern

Overview
> Related work
> MC-FTP
> BitTorrent
> Future work
> References

Related work
> Slurpie [1]
  reduces server load and download time of a conventional server-based file download (HTTP, FTP)
  data is split into chunks
  creates a mesh-based overlay network between the downloading clients
  clients download from peers and contact the server only if no peer has the chunk
> Bullet [2]
  is used for high-bandwidth data dissemination
  improves the efficiency of tree-based overlay networks by building a mesh on top of the overlay tree
  improves recovery over the mesh by disseminating disjoint data to peers
> newer P2P applications tend to use different models of multicast overlay networks

MC-FTP
> runs on top of multicast (independent of the multicast technology)
  native IP multicast
    not widely supported in the Internet
    local networks
    Mbone
  ID-based multicast overlay network
    can be used in the Internet without limitations
    file sharing
> can be used for:
  file sharing between users
  improving classical server-based 1:n data dissemination
> a file is split into chunks of equal size, except for the last chunk (chunk size 256 KB; a larger size can be chosen for large files)
> each chunk in a file is identified by an index, starting at 0
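
The chunking rule on this slide can be sketched in Python as follows. This is a minimal illustration, not the MC-FTP implementation; the names `chunk_ranges` and `DEFAULT_CHUNK_SIZE` are assumptions introduced here.

```python
DEFAULT_CHUNK_SIZE = 256 * 1024  # 256 KB, the chunk size named on the slide


def chunk_ranges(file_size, chunk_size=DEFAULT_CHUNK_SIZE):
    """Return a list of (index, offset, length) tuples, one per chunk.

    Indices start at 0. Every chunk has chunk_size bytes except
    possibly the last one, which holds the remainder of the file.
    """
    if file_size < 0 or chunk_size <= 0:
        raise ValueError("file_size must be >= 0 and chunk_size > 0")
    ranges = []
    offset = 0
    index = 0
    while offset < file_size:
        length = min(chunk_size, file_size - offset)
        ranges.append((index, offset, length))
        offset += length
        index += 1
    return ranges
```

For a 600 KB file this yields three chunks: two of 256 KB and a final one of 88 KB.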

MC-FTP: File Descriptor
> a file is represented by a File Descriptor
> the File Descriptor keeps all file-related information:
  MD5 hash of the file as file ID
  filename
  file size
  chunk size
  list of hashes, one for each chunk (its length gives the number of chunks)
> very similar to the torrent file in BitTorrent
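
A File Descriptor with the fields listed on this slide could look roughly like the sketch below. The slide names MD5 only for the file ID; using MD5 for the per-chunk hashes as well is an assumption, as are the names `FileDescriptor`, `build_descriptor` and `chunk_is_valid`.

```python
import hashlib
from dataclasses import dataclass
from typing import List


@dataclass
class FileDescriptor:
    file_id: str             # MD5 hash of the whole file (hex string)
    filename: str
    file_size: int           # in bytes
    chunk_size: int          # in bytes
    chunk_hashes: List[str]  # one hash per chunk, in index order


def build_descriptor(filename, data, chunk_size=256 * 1024):
    """Build a descriptor for in-memory file content (bytes)."""
    # Assumption: chunks are hashed with MD5, like the file ID.
    chunk_hashes = [
        hashlib.md5(data[offset:offset + chunk_size]).hexdigest()
        for offset in range(0, len(data), chunk_size)
    ]
    return FileDescriptor(
        file_id=hashlib.md5(data).hexdigest(),
        filename=filename,
        file_size=len(data),
        chunk_size=chunk_size,
        chunk_hashes=chunk_hashes,
    )


def chunk_is_valid(descriptor, index, chunk):
    """Check a received chunk against the hash stored for its index."""
    return hashlib.md5(chunk).hexdigest() == descriptor.chunk_hashes[index]
```

A receiver would call `chunk_is_valid` after each downloaded chunk, which is one way to detect corrupted chunks.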

MC-FTP: File Management Group & File Leader
> each File Descriptor has one corresponding File Management Group (represented by a multicast group)
> the File Leader is the central point of coordination
  manages all participating peers
  manages who sends what, when and how
> a File Management Group is used to distribute all status/control messages between the peers and the File Leader

MC-FTP: File Leader
> the File Leader stores information about all peers:
  peer ID
  fingerprint of the peer's public key
  maximum available upload and download rate of the peer
  chunk vector recording which chunks the peer has downloaded successfully
  last seen (a peer is removed after a timeout period)
  corruption rating
> memory estimation (can be reduced by increasing the chunk size):
  with a file size of 220 MB, more than half a million peers can be stored in less than 100 MB of memory
  with a file size of 512 MB, more than half a million peers can be stored in less than 200 MB of memory
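
The memory estimate can be reproduced with a short calculation. The sketch assumes 256 KB chunks, one bit per chunk in the chunk vector, and a fixed per-peer overhead of 64 bytes for the remaining fields; the overhead value and the function names are assumptions made here, not figures from the slides.

```python
MB = 1024 * 1024


def chunk_vector_bytes(file_size, chunk_size=256 * 1024):
    """Size of a one-bit-per-chunk vector, rounded up to whole bytes."""
    chunks = -(-file_size // chunk_size)  # ceiling division
    return -(-chunks // 8)


def leader_memory_bytes(peers, file_size, chunk_size=256 * 1024, overhead=64):
    """Estimated File Leader memory for the given number of peers.

    overhead is a hypothetical per-peer cost for ID, key fingerprint,
    rates, timestamp and rating.
    """
    return peers * (chunk_vector_bytes(file_size, chunk_size) + overhead)


small = leader_memory_bytes(500_000, 220 * MB)  # 110-byte vector per peer
large = leader_memory_bytes(500_000, 512 * MB)  # 256-byte vector per peer
```

Under these assumptions half a million peers need about 87 million bytes for the 220 MB file and 160 million bytes for the 512 MB file, which stays within the bounds given on the slide. Doubling the chunk size halves the chunk vector.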

MC-FTP: Status Messages
> each peer:
  creates a unique ID and a public/private key pair per file
    used for peer identification on the File Leader
  has its maximum upload and download rate specified (initially by the user)
  joins the File Management Group (multicast group)
  waits until the first Status Message Request arrives from the File Leader, containing:
    peer ID and public key of the File Leader
  replies with a Status Message encrypted with the File Leader's public key, containing:
    peer ID and fingerprint of the public key
    file ID (hash)
    maximum available upload and download bandwidth
    chunk vector of currently downloaded chunks (bit vector)
    optional data for peer rating
> Status Message Requests are sent out periodically
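
The content of a Status Message and its chunk bit vector might be modelled as in the following sketch. The field names, the bit order (least significant bit first) and the helper names are assumptions; encryption with the File Leader's public key is left out.

```python
from dataclasses import dataclass


def pack_chunk_vector(have, num_chunks):
    """Pack a set of chunk indices into a bit vector.

    Bit i is set when chunk i has been downloaded. Assumed layout:
    chunk i lives in byte i // 8 at bit position i % 8.
    """
    vector = bytearray((num_chunks + 7) // 8)
    for index in have:
        if not 0 <= index < num_chunks:
            raise ValueError("chunk index out of range")
        vector[index // 8] |= 1 << (index % 8)
    return bytes(vector)


def unpack_chunk_vector(vector, num_chunks):
    """Return the set of chunk indices whose bit is set."""
    return {i for i in range(num_chunks) if vector[i // 8] & (1 << (i % 8))}


@dataclass
class StatusMessage:
    peer_id: str
    key_fingerprint: str    # fingerprint of the peer's public key
    file_id: str            # hash identifying the file
    max_upload_rate: int
    max_download_rate: int
    chunk_vector: bytes     # packed bit vector of downloaded chunks
```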

MC-FTP: Traffic on File Leader
> network load is related to memory usage on the File Leader
> each peer sends its whole status to the File Leader
> with half a million peers connected (file size: 220 MB), around 100 MB must also be transferred to the File Leader within one Status Message Request interval
  with a 2 Mbit/s connection to the File Leader this takes at least 400 seconds (100 MB = 800 Mbit; 800 Mbit / 2 Mbit/s = 400 s)
  can be solved by smaller Status Messages that contain only updates instead of the full chunk vector
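
The 400 second figure follows from a simple division, shown here as a small sketch. The function name is an assumption, and the 32 bytes per peer used for the update-only case is a purely hypothetical value chosen to illustrate the effect of smaller Status Messages.

```python
def transfer_time_seconds(total_bytes, link_bits_per_second):
    """Lower bound on transfer time: payload bits divided by link rate."""
    return total_bytes * 8 / link_bits_per_second


# Full status from all peers: about 100 MB over a 2 Mbit/s link.
full_status = transfer_time_seconds(100 * 10**6, 2 * 10**6)      # 400.0

# Hypothetical update-only messages of 32 bytes from 500,000 peers.
updates_only = transfer_time_seconds(500_000 * 32, 2 * 10**6)    # 64.0
```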

MC-FTP: Sending Groups
> a Sending Group is a multicast group maintained by the File Leader
> a Sending Group contains:
  a peer as sender
  the rate to send out data
  the chunk(s) to send: which ones, in which order

MC-FTP: Keep Alive Messages
> the File Leader periodically sends Keep Alive Messages through the File Management Group to the peers, containing:
  the File Leader's peer ID
  the current Sending Groups specified by the File Leader, each with:
    a multicast group address
    peer ID of the sender
    the constant send rate
    the chunk(s) to send (one or many, order, repetition)
> peers must join the multicast group and start sending when they are assigned as the sender of a Sending Group
> peers may join the multicast group and start downloading, according to their chunks and download rate
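
How a peer might react to a Keep Alive Message can be sketched as below. The data classes mirror the fields on the Sending Groups and Keep Alive slides; the joining policy (join a group only if it offers a missing chunk and its send rate still fits the remaining download rate) is an assumption for illustration, as are all names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SendingGroup:
    mcast_address: str
    sender_id: str
    send_rate: int      # constant send rate, e.g. in bytes per second
    chunks: List[int]   # chunk indices in sending order


@dataclass
class KeepAlive:
    leader_id: str
    sending_groups: List[SendingGroup]


def react_to_keep_alive(msg, my_id, my_chunks, free_download_rate):
    """Return (addresses to send on, addresses to join as receiver)."""
    must_send, may_join = [], []
    for group in msg.sending_groups:
        if group.sender_id == my_id:
            # Mandatory: this peer was assigned as the sender.
            must_send.append(group.mcast_address)
        elif (group.send_rate <= free_download_rate
              and any(c not in my_chunks for c in group.chunks)):
            # Optional: the group offers a missing chunk and fits the budget.
            may_join.append(group.mcast_address)
            free_download_rate -= group.send_rate
    return must_send, may_join
```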

MC-FTP: extended protocol
> the File Management Group (multicast address) corresponding to a File Descriptor can be retrieved from a DHT
> the File Descriptor could also be obtained through a DHT-based search engine
> a new File Leader must be negotiated if none exists or the old one is corrupted
  no single point of failure
> multicast addresses reserved or released by the File Leader are also stored in a DHT
  a new File Leader can recover the addresses in use
> more reliable transfer by using erasure codes
> the File Leader's managing/disseminating algorithm can be optimized for different usage scenarios

BitTorrent [3]
> BitTorrent is a P2P network that uses tit-for-tat as a cooperative method for achieving Pareto efficiency
> it consists of:
  one tracker per file, used to find other peers
  downloaders/peers
  a torrent file
> peers are themselves responsible for optimizing their download
> tracker communication is done over HTTP or HTTPS
> the peer protocol operates over TCP
> files are also split into chunks of fixed size (here called pieces)

BitTorrent: torrent file
> the torrent file holds all information a peer needs to download a file
  URL of the tracker
  file info (considering only one file)
    name of the file
    piece length/size
    file size
    SHA-1 hash of each piece
  the file ID is generated as the SHA-1 hash of the file info
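
The file ID computation can be illustrated with a minimal sketch. BitTorrent serializes the file info with bencoding before hashing it; the small encoder below covers only the types needed here, and the helper names `bencode`, `build_info` and `file_id` are assumptions of this example.

```python
import hashlib


def bencode(value):
    """Minimal bencoder for ints, strings, bytes, lists and dicts."""
    if isinstance(value, int):
        return b"i" + str(value).encode("ascii") + b"e"
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return str(len(value)).encode("ascii") + b":" + value
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must appear in sorted order.
        items = sorted(
            ((k.encode("utf-8") if isinstance(k, str) else k, v)
             for k, v in value.items()),
            key=lambda item: item[0],
        )
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError("unsupported type: %r" % type(value))


def build_info(name, data, piece_length):
    """Single-file info dictionary: name, size, piece length, piece hashes."""
    pieces = b"".join(
        hashlib.sha1(data[offset:offset + piece_length]).digest()
        for offset in range(0, len(data), piece_length)
    )
    return {"name": name, "length": len(data),
            "piece length": piece_length, "pieces": pieces}


def file_id(info):
    """SHA-1 hash of the bencoded file info (20 raw bytes)."""
    return hashlib.sha1(bencode(info)).digest()
```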

BitTorrent: Tracker
> the tracker receives information from all peers and gives them random lists of peers
> single point of failure
  newer versions of BitTorrent can use a DHT to obtain other peers (trackerless)
> the GET request consists of:
  file ID
  peer ID
  peer IP
  peer port
> the tracker responds with:
  interval: number of seconds between regular requests
  list of peers, containing ID, IP and port of each peer
> peers may send additional requests at unscheduled times if they need more peers
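
A tracker GET request with the four fields listed on this slide could be assembled as in the sketch below. The parameter names `info_hash`, `peer_id`, `port` and `ip` are the ones used by BitTorrent trackers; the function name and the tracker URL in the usage note are hypothetical, and the transfer counters that real clients also report are omitted.

```python
from urllib.parse import urlencode


def announce_url(tracker_url, info_hash, peer_id, port, ip=None):
    """Build the tracker request URL for one file.

    info_hash and peer_id are raw bytes; urlencode percent-escapes them.
    """
    params = {"info_hash": info_hash, "peer_id": peer_id, "port": port}
    if ip is not None:
        params["ip"] = ip
    return tracker_url + "?" + urlencode(params)
```

A call such as `announce_url("http://tracker.example/announce", file_id, peer_id, 6881)` returns the URL a peer would fetch at every request interval.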

BitTorrent: Peer protocol
> peer connections are symmetrical
> a peer first tries to perform a handshake with a new peer
  checks for the expected file ID and peer ID
> each downloader reports to all of its peers which pieces it has
> peers download pieces from all peers they can
> peers upload to other peers according to the Choking Algorithm
> piece selection: rarest first
  a peer first downloads the piece that the fewest of its peers have
  this piece has the best chance of being requested by other peers
> to avoid delays between pieces, which lower transfer rates, a peer:
  splits pieces into sub-pieces
  always keeps some number of sub-piece requests pipelined
  completes a piece before requesting sub-pieces of other pieces
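
Rarest-first piece selection can be sketched in a few lines. The function name and the input layout (a dict from peer ID to the set of piece indices that peer reported) are assumptions; ties are broken by the lowest piece index to keep the example deterministic, which is a simplification made here.

```python
from collections import Counter


def rarest_first(my_pieces, peer_pieces):
    """Pick the missing piece held by the fewest connected peers.

    Returns None when no peer offers a piece that is still missing.
    """
    availability = Counter()
    for pieces in peer_pieces.values():
        # Count only pieces this peer does not have yet.
        availability.update(pieces - my_pieces)
    if not availability:
        return None
    return min(availability, key=lambda piece: (availability[piece], piece))
```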

BitTorrent: Choking Algorithm
> choking is a temporary refusal to upload
> a peer always unchokes a fixed number of peers (currently 4)
> which peers to unchoke is based strictly on the current download rate from each peer
> peers recalculate which peers to choke or unchoke every 10 seconds
  enough time for TCP to reach full transfer capacity
  avoids fibrillation (rapid switching between choke and unchoke)
> optimistic unchoke
  unchokes a peer regardless of its current download rate
  which peer to optimistically unchoke is rotated every third rechoke period
    enough time for the upload to reach full transfer capacity
    enough time for the unchoked peer to reciprocate
    enough time for the download to reach full transfer capacity
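
One rechoke round can be sketched as follows. The sketch assumes that the optimistic unchoke occupies one of the four slots and that the optimistic peer is rotated round-robin over a fixed peer list; both choices, and all names, are assumptions for illustration rather than the reference client's exact behaviour.

```python
def rechoke(download_rates, optimistic_peer=None, slots=4):
    """Return the set of peers to unchoke in this round.

    download_rates maps peer ID to the current download rate from it.
    The optimistic peer, if given, takes one slot regardless of its rate.
    """
    regular_slots = slots - (1 if optimistic_peer is not None else 0)
    ranked = sorted(
        (peer for peer in download_rates if peer != optimistic_peer),
        key=lambda peer: download_rates[peer],
        reverse=True,
    )
    unchoked = set(ranked[:regular_slots])
    if optimistic_peer is not None:
        unchoked.add(optimistic_peer)
    return unchoked


def optimistic_candidate(peers, round_number, rotate_every=3):
    """Rotate the optimistic unchoke every third rechoke round."""
    return peers[(round_number // rotate_every) % len(peers)]
```

With one rechoke round every 10 seconds, `rotate_every=3` keeps the same optimistic peer for 30 seconds.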

Future Work
> full implementation of MC-FTP & BitTorrent (with tracker) in ns-2
> develop/evaluate several File Leader management algorithms
  optimized for file sharing
  optimized for other kinds of data dissemination
> evaluate observables for comparing MC-FTP with BitTorrent
  average download time
  packet loss
  reception of duplicate data
  chunk corruption
  etc.
> evaluate File Leader negotiation/election
> how to detect malicious peers
> evaluate different DHTs
> evaluate whether erasure coding gives more efficiency
  which erasure code to use?

Questions?

References
[1] Rob Sherwood, Ryan Braud and Bobby Bhattacharjee, Slurpie: A Cooperative Bulk Data Transfer Protocol, IEEE INFOCOM 2004
[2] Dejan Kostic, Adolfo Rodriguez, Jeannie Albrecht and Amin Vahdat, Bullet: High Bandwidth Data Dissemination Using an Overlay Mesh, 2003
[3] Bram Cohen, Incentives Build Robustness in BitTorrent, 2003