A Scalable Multi-Server Cluster VoIP System

Ming-Cheng Liang, Li-Tsung Huang, Chun-Zer Lee, Min Chen, Chia-Hung Hsu
mcliang@nuk.edu.tw, {kpa.huang, chunzer.lee}@gmail.com, {minchen, chhsu}@nchc.org.tw

Department of Electrical Engineering, National University of KaoHsiung, 700 KaoHsing Ta-Sheh Rd, Nan-Tsu District, KaoHsiung City, 811, Taiwan
National Center for High-Performance Computing, No. 7, R&D Rd. VI, Hsinchu Science Park, Hsinchu, Taiwan, R.O.C. 30076

ABSTRACT
Peer-to-peer mode is a desirable operation mode for a SIP-based VoIP system because it imposes the least traffic load on the network. However, it usually fails when one or both of the clients are behind a NAT or firewall. Client-server mode can usually solve the NAT problem, but reducing the transmission delay becomes challenging when the server load is high. For the VoIP system to operate efficiently, it is necessary to keep the system in peer-to-peer mode whenever possible. In this research, a multi-server cluster VoIP system is proposed. The server cluster consists of a peer-to-peer based master server with a group of slave servers. The slave servers normally operate in client-server mode. For a service that does not need client-server operation, the master server directly handles the necessary SIP signaling and leaves the clients in peer-to-peer mode for their data stream transmission, which reduces the network load. If a service needs to operate in client-server mode, one of the slave servers is chosen for the operation. The number of slave servers can be changed according to the server load and other factors.

Keywords
SIP, VoIP, Server Cluster

1. INTRODUCTION
With the development of broadband networks and related network technologies, the packet network has become highly competitive with the circuit network.
Many Internet Service Providers (ISPs) use broadband packet networks to provide real-time services that previously could be provided only by circuit networks. Voice over Internet Protocol (VoIP) is one of the most frequently cited of these services because of the popularity of voice communication in the traditional telephone network. In particular, a properly designed SIP [1] based VoIP service can provide end-to-end voice quality that is almost indistinguishable from that of the traditional circuit-based PSTN service.

SIP is an application-layer protocol that can be used to establish, modify, and terminate multimedia sessions such as Internet telephony calls [1]. It works independently of the transport-layer protocol and does not depend on the type of session being established. SIP is text-based, so its messages are easy for humans to read and the protocol has low complexity. As a result, this signaling protocol is more flexible and easier to implement than H.323 [11] and MGCP [2]. SIP mainly adopts a text-based message definition similar to that of HTTP, so it can easily be adapted to diverse Internet environments.

Video services based on SIP, e.g., video conferencing and video on demand, can be transmitted in real time over a broadband Internet connection. The transmission of such multimedia communication services is one of the advantages of the Internet over the traditional PSTN. In particular, a multimedia VoIP service based on SIP can be as simple to use as a PSTN telephone, with almost the same quality as the PSTN, while also carrying real-time video, which is hard to achieve in the PSTN.

A SIP server usually has two modes of operation: peer-to-peer mode and client-server mode. Session control is the basic means of communication in the SIP network architecture. At registration, the user agent (UA) will
send a SIP REGISTER message to the location server, and the user's information will be stored at the SIP proxy. When a call is initiated or a termination is requested, the SIP proxy routes these SIP requests to the proper agent according to the stored information and responds to the end user agent for call setup purposes.

In the peer-to-peer transmission mode, the SIP proxy handles only the call setup messages. After the call setup session is created and each agent's information has been sent to the other agent, a data session is created directly between the UAs to carry the RTP [3] stream, using the SDP [4] information inside the SIP messages. The SIP proxy is no longer involved in the call until an agent requests to tear down the connection. In the peer-to-peer connection structure, each call imposes only a very small overhead on the SIP proxy, because the RTP stream does not need to be handled by the SIP proxy. The peer-to-peer mode of operation is very popular if both user agents are located in the public domain or within the same separate domain. However, when one or both of the agents are behind a NAT (Network Address Translation) [5] [6] or a firewall [7] [8], the RTP transmission will fail because the other agent is unknown to the NAT/firewall. Several modifications to the network structure, such as ALG [9], SBC [10], STUN [12], TURN [13], and RTP proxies, have been proposed to allow peer-to-peer communication to pass through the NAT/firewall. However, these proposals involve either modification of the NAT/firewall or the setup of separate proxies that use different protocols.

Figure 1. A Peer-to-Peer Mode

In the client-server transmission mode, not only the setup signals but also the RTP stream pass through the SIP proxy. A Back-to-Back User Agent (B2BUA, Figure 2) is installed in the SIP proxy for the client-server mode of operation. The B2BUA is a server that acts as a user agent to both ends of a SIP call. It maintains complete call state and participates in all call requests. Each call is tracked from beginning to end, allowing the operators of the B2BUA to offer value-added features to the call. In the client-server mode, the SIP proxy has very strong control over the call, and the control signal path is the same as the RTP path. So, when one or both of the user agents are behind a NAT/firewall, the SIP call can still be completed without problems. Additional services, e.g., conference calls and agent call monitoring, can also be provided by the SIP proxy without difficulty. However, because the SIP proxy needs to handle the RTP stream, the server load is very high.

Figure 2. A Client-Server Mode

In order to keep both the light-load characteristic of the peer-to-peer mode of operation and the controllability of the client-server mode of operation, a server cluster structure is proposed in this paper. In this proposal, a master server operating in peer-to-peer mode is bundled with a group of slave servers operating in client-server mode. The server cluster behaves transparently as the situation of a user agent changes, and the call setup relations are changed according to the needs of the user agent. The system structure of the proposed server cluster is discussed next.
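To make the NAT problem described in this section concrete, the following minimal Python sketch checks whether the connection address advertised in an SDP body is a private (RFC 1918) address, which a peer on the public network cannot reach directly. The function names, the sample SDP bodies, and the simple rule that a private address implies client-server mode are assumptions made for illustration; they are not taken from the proposed system.

```python
import ipaddress

# RFC 1918 private address blocks (not routable on the public Internet).
RFC1918 = [ipaddress.ip_network(n)
           for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]

def sdp_connection_address(sdp: str) -> str:
    """Return the address of the first SDP 'c=' line, e.g. 'c=IN IP4 10.0.0.5'."""
    for line in sdp.splitlines():
        if line.startswith("c="):
            return line.split()[2]  # tokens: 'c=IN', 'IP4', '<address>'
    raise ValueError("SDP body has no c= line")

def advertises_private_address(sdp: str) -> bool:
    """True if the media address in the SDP lies in an RFC 1918 block."""
    ip = ipaddress.ip_address(sdp_connection_address(sdp))
    return any(ip in net for net in RFC1918)

def choose_mode(caller_sdp: str, callee_sdp: str) -> str:
    """Hypothetical rule: fall back to client-server if either side is private."""
    if advertises_private_address(caller_sdp) or advertises_private_address(callee_sdp):
        return "client-server"
    return "peer-to-peer"

# Illustrative SDP bodies (addresses are examples only).
PUBLIC_SDP = "v=0\r\nc=IN IP4 203.0.113.10\r\nm=audio 49170 RTP/AVP 0\r\n"
NAT_SDP = "v=0\r\nc=IN IP4 192.168.1.20\r\nm=audio 49170 RTP/AVP 0\r\n"
```

With these samples, `choose_mode(PUBLIC_SDP, PUBLIC_SDP)` selects peer-to-peer mode, while `choose_mode(PUBLIC_SDP, NAT_SDP)` selects client-server mode, because RTP sent to 192.168.1.20 from outside the private network would never arrive.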
2. PROPOSED SERVER CLUSTER SYSTEM
2.1. System Architecture
In this multi-server cluster structure, the transmission mode for a user agent can be dynamically adjusted between client-server mode and peer-to-peer mode according to the agent's situation. Initially, the master server is the main communication entry point for the UA and stores the UA information in the location server. The master server handles the setup signaling, e.g., call establishment and termination, and chooses the proper mode of operation for the UA. A slave server takes commands from the master server to take over a UA. It then handles SIP messages, acts as a server for audio and video transmission, and provides additional services for the UA.

Figure 3. The architecture of Multi-Server Cluster VoIP System.

The system structure is shown in Figure 3. When both agents are in the public network and operate normally, the master server uses the peer-to-peer transmission mode. This allows both agents to communicate directly so that the server load is minimized. If an agent is behind a NAT/firewall, the master server decides that it is most suitable for the agent to operate in client-server mode and hands the agent to a slave server.

By examining the information inside the SIP headers (Receiver, Contact) and the IP and UDP headers, one can determine whether a user agent is behind a NAT/firewall. If the agent is behind a NAT/firewall, the client-server mode is chosen. In this mode, each audio/video signal is received and then sent again by the server. As a result, the server needs twice the processing capacity to handle the data. Therefore, in this paper a slave server group is set up to share the system load created by agents that need to operate in client-server mode. When it is determined that a UA needs to operate in client-server mode, the master server modifies the SDP information in the SIP message during the initialization process. A slave server in the group is chosen dynamically to handle the audio/video transmission. No change to the client software or the NAT/firewall software is necessary for this operation.

2.2. System design
If the multi-server cluster system is operated in peer-to-peer mode, the message flow is as defined in Figure 4. This message flow is similar to that of the basic SIP proxy structure. Call setup messages are handled and transferred solely by the master server. After call setup is completed, the RTP stream is transmitted directly between the agents.

Figure 4. System Flow - Master

When it is determined that client-server mode is necessary for an agent, a different process is followed. A slave server in the group is chosen and an internal request process is initiated, as shown in Figure 5. The master rewrites the c parameter in the SDP messages to the IP address of the chosen slave server, and the user will transmit the multimedia message
according to the IP address defined in the SDP. The slave group can be designed for load sharing to balance the load of the multimedia transmission.

Figure 5. System Flow - Master and Slave

In Step 2 of Figure 4 and Figure 5, the check and dispatch of the request is handled by the master server. The pseudo code for this operation is shown below. The master server checks whether a special request has been established for the user (e.g., a special flag is marked, or a specific line in the code is marked). For example, 123 may indicate that the call is to be transferred to voicemail. Accordingly, a slave server is assigned to carry out this additional service. When a user is behind a NAT/firewall, a slave server is assigned to provide the client-server mode of operation. If neither condition applies, the agent can operate in peer-to-peer mode. In this case, the master server stays in control and allows the RTP stream to be communicated directly between the users.

PROCEDURE Select_Slave
    fix caller's SDP "c" parameter
    dynamic select slave server
    rewrite target ip and port

PROCEDURE Invite_Process
    Query special flag from DB
    IF special flag is set THEN:
        add special prefix to request line
        CALL Select_Slave
        return
    END IF
    Query from/to's NAT flag from DB
    IF NAT flag is set THEN:
        add NAT prefix to request line
        CALL Select_Slave
        return
    END IF
    IF special flag is unset AND NAT flag is unset THEN:
        authorize user
        transaction relay
    END IF

3. DISCUSSION
In a traditional SIP-based VoIP network, the SIP proxy is responsible only for handling the setup messages. An RTP proxy is necessary if client-server transmission, as shown in Figure 6, is needed. This structure, SER (SIP Express Router) with an RTP proxy, is used as the benchmark for comparison with our multi-server cluster system. The comparison is based on system integration, codec support, and additional services.

Figure 6. SER with RTP proxy

System Integration:
Similarly, the system structure of SER with RTP proxy can be used to distribute the multimedia traffic load. However, specially designed messages between SER and the RTP proxy are necessary, and the RTP proxy is used only to relay the multimedia data. In the multi-server cluster system, all the communication protocols are SIP based, so the system can be easily integrated with different SIP-based devices.

Codec Supporting:
The SER + RTP proxy system has no codec conversion capability; the call will be terminated if the codecs used by the two agents are different. In the multi-server cluster system, the slave server assumes the role of codec conversion and converts automatically if the two parties use different codec schemes.

Additional Service:
In the SER + RTP proxy structure, SER provides only the basic phone connection functions and the RTP proxy only passes the media. It is hard to provide additional services under these circumstances. In the multi-server cluster system, however, both the master and slave servers are fully functional IP-PBX systems. Additional PSTN services, non-PSTN services, voice mail, monitoring, etc. can be provided, so a feature-rich service can be achieved by this multi-server cluster system.

4. CONCLUSION
A scalable multi-server cluster VoIP system structure is proposed in this paper. This server cluster system integrates both the client-server and peer-to-peer transmission modes. It allows the service mode to be changed dynamically according to the agent's situation. Normally, an agent operates in peer-to-peer mode under the master server. For an agent behind a NAT/firewall, the transmission mode is changed to client-server mode and the agent is handled by a slave server. When a resource-intensive service, such as monitoring or voice mail, is required by the agent, the transmission is changed to client-server mode accordingly. The number of slave servers can be flexibly changed according to the load of the server system. Using this multi-server cluster VoIP system structure, the size of the system is scalable and the network load is minimized.

5. REFERENCES
[1] J. Rosenberg, et al., SIP: Session Initiation Protocol, IETF RFC 3261, June 2002.
[2] F. Andreasen, B. Foster, Media Gateway Control Protocol (MGCP), IETF RFC 3435, January 2003.
[3] H. Schulzrinne, et al., RTP: A Transport Protocol for Real-Time Applications, IETF RFC 3550, July 2003.
[4] M. Handley, V. Jacobson, SDP: Session Description Protocol, IETF RFC 2327, April 1998.
[5] K. Egevang, P. Francis, The IP Network Address Translator (NAT), IETF RFC 1631, May 1994.
[6] Newport Networks Ltd., Solving the NAT Traversal Issues for Multimedia over IP Services, www.newport-networks.com, 2006.
[7] Dawen Zhou, Benxiong Huang, Yijun Mo, Distributed Architecture of VOIP for firewall/nat Traversing, International Conference on Wireless Communications, Networking and Mobile Computing, 2005, 23-26 Sept.
[8] P. Koski, J. Ylinen, P. Loula, The SIP-Based System Used in Connection with a Firewall, Advanced International Conference on Telecommunications and International Conference on Internet and Web Applications and Services, 2006, 19-25 Feb.
[9] Jae Cheon Han, Wook Hyun, Sun Ok Park, Il Jin Lee, Mi Young Huh, Shin Gak Kang, An application level gateway for traversal of SIP transaction through NATs, Advanced Communication Technology, The 8th International Conference, 2006, 20-22 Feb.
[10] G. Camarillo, et al., Functionality of Existing Session Border Controllers (SBC), IETF Draft, February 14, 2005.
[11] Packet Based Multimedia Communication Systems, ITU-T Rec. H.323, Feb. 1998.
[12] J. Rosenberg, J. Weinberger, C. Huitema, R. Mahy, STUN - Simple Traversal of User Datagram Protocol (UDP) Through Network Address Translators (NATs), IETF RFC 3489, March 2003.
[13] J. Rosenberg, R. Mahy, C. Huitema, Traversal Using Relay NAT (TURN), draft-rosenberg-midcom-turn-08 (work in progress), September 2005.
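
As an illustration of the Invite_Process and Select_Slave procedures of Section 2.2, the following Python sketch shows one way the master's dispatch decision and the SDP rewrite could be expressed. It is a sketch under stated assumptions, not the authors' implementation: the `Slave` record, the least-loaded selection rule, and the returned tuple are hypothetical, since the paper states only that a slave is selected dynamically and that the SDP c parameter is rewritten. Request-line prefixing, port rewriting, and user authorization are omitted.

```python
from dataclasses import dataclass

@dataclass
class Slave:
    """Hypothetical record for one slave server in the group."""
    ip: str
    active_calls: int = 0

def rewrite_sdp_connection(sdp: str, new_ip: str) -> str:
    """Replace the address in every SDP 'c=' line with new_ip."""
    out = []
    for line in sdp.splitlines():
        if line.startswith("c="):
            net_type, addr_type, _old_addr = line[2:].split()
            line = f"c={net_type} {addr_type} {new_ip}"
        out.append(line)
    return "\r\n".join(out) + "\r\n"

def select_slave(slaves, sdp):
    """Pick a slave (assumed rule: fewest active calls) and point the SDP at it."""
    slave = min(slaves, key=lambda s: s.active_calls)
    slave.active_calls += 1
    return slave, rewrite_sdp_connection(sdp, slave.ip)

def invite_process(special_flag: bool, nat_flag: bool, slaves, sdp):
    """Return (mode, slave_or_None, sdp) for an incoming INVITE."""
    if special_flag or nat_flag:
        # Special service or NAT/firewall: media must pass through a slave.
        slave, new_sdp = select_slave(slaves, sdp)
        return "client-server", slave, new_sdp
    # Neither flag set: the master relays signaling only, media flows directly.
    return "peer-to-peer", None, sdp
```

For example, with two slaves carrying 3 and 1 active calls, an INVITE from a user whose NAT flag is set is handed to the second slave, and the c line of its SDP is rewritten to that slave's address; an INVITE with neither flag set is returned unchanged in peer-to-peer mode.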