End Device Support for AAA in SIP Conferencing

Antti Poikela
Helsinki University of Technology
aspoikel@cc.hut.fi

Abstract

This study is a literature survey of current problems and solutions for authentication, authorization and accounting in SIP conferencing, with emphasis on the challenges in mobile end devices. In this study I will show that SIP authentication is in a mature state, while conferencing authorization is still far from ready.

KEYWORDS: SIP, Conferencing, AAA, Authentication, Authorization, Accounting, Mobile

1 Introduction

Session Initiation Protocol (SIP) is a protocol for negotiating and creating sessions with other peers. The best-known example of SIP usage is a VoIP call: SIP is often used to contact the target entity and to negotiate the required session and media parameters.

SIP conferencing refers to a SIP session with multiple participants. The session control in a SIP conferencing session can take one of several layouts. [20]

Loosely coupled conference. This model has no central component and no signaling between participants.

Fully distributed multi-party conferencing. All conference participants create separate SIP sessions with all other participants. There are no central components.

Tightly coupled conference. All participants create a SIP session with a conferencing server (called the focus), which can be one of the session participants. Thus, the signaling forms a star topology.

As this paper concentrates on mobile devices, I consider the third conferencing model the most interesting one. The mobile world is still badly suited for peer-to-peer (p2p) communications, since most networks have firewalls that prevent incoming connections to mobile devices, and mobility may cause the connection to be lost from time to time. Additionally, a centralized focus offers a natural place to implement session control, and it is the most interesting model from the AAA point of view.
AAA stands for authentication, authorization and accounting. Authentication is the process of validating that the user is who he claims to be. In its simplest form, the user provides a username and a password, which the network elements verify before accepting the user. Authorization is usually performed after authentication. The authorization process grants or denies the user access to specific resources or services. Finally, accounting tracks how resources or services are used. Accounting information is usually used to charge the customer for the services he has consumed.

As with any other call, conference participation is usually limited to a known set of users. Any kind of participation control (authorization) requires a reliable method of user authentication. And, if the conference is to be offered as a service, a method for accounting must exist. In this paper, I look at possible solutions for SIP conference authentication, authorization and accounting.

A tightly coupled conference typically consists of a conferencing focus, a conference policy server, various media mixers and a conference notification service. These are functional elements and can be implemented within one physical server if so required. [20]

In this study, I follow the work of the IETF Centralized Conferencing (XCON) Working Group, which introduces a somewhat more detailed conferencing framework. [9] In this framework, a conferencing service consists of a conference control server, a floor control server, one or more foci, and a notification service. Each of these elements has its own protocol for communicating with the corresponding client on the user equipment (UE), and each of the elements interacts with the conference data. The framework does not define these protocols, but the working group has separate protocol drafts.
The 3rd Generation Partnership Project (3GPP, a standardization collaboration of telecommunications-related entities) has a specification that defines conferencing in the IMS (IP Multimedia Subsystem). [2] IMS is an architecture for delivering IP-based multimedia content to mobile users. The conferencing model used by the 3GPP IMS definition is the aforementioned tightly coupled conference, with a separate application server (AS) implementing the functionality of the conferencing focus.

Unless otherwise noted, all references to 3GPP specifications in this study refer to the latest one available, which in most cases is 3GPP Release 8. As these latest specifications are not frozen, they are subject to change.

In section 2 I describe the level of SIP support in current mobile phones. Section 3 is about SIP authentication. Two authentication cases are considered: standard SIP authentication defined by IETF, and IMS authentication defined by 3GPP. In section 4 I look at the ongoing standardization work on conferencing authorization and describe a couple of alternative solutions proposed during the process. Section 5 briefly summarizes how accounting relates to SIP conferencing, and what requirements it places on the user equipment and the conferencing server environment. Finally, in section 6 I draw conclusions about the current situation and possible future development of AAA in SIP conferencing.

2 End Device SIP Support

Almost all end devices that support custom software installations can run applications that require SIP by implementing the SIP stack at the application level. However, it helps to have a SIP stack by default, especially in mobile environments.

Firstly, in the case of IMS, operator support is needed for SIP account provisioning. Conferencing servers that reside within an operator IMS are subject to the same security and accessibility requirements as other telecommunications services, so operators will require control over user account provisioning. Furthermore, telecommunications operators may block unauthorized SIP traffic altogether in order to protect the income generated by operator core services.

Secondly, even if a mobile device has multiple simultaneous SIP applications running, only one registration is required when using a native SIP stack. And, of course, using a standard SIP stack is likely to speed up application development. On the downside, fixing bugs and adding new features to a native mobile phone SIP stack probably requires updating the device firmware.

SIP capability is a relatively new feature in mobile phones. The most notable platform currently having built-in SIP support is Symbian. SIP is required in Symbian version 9.2, though Nokia has chosen to include it already in mobile phones using Symbian version 9.1. [22] Other manufacturers will include the stack with version 9.2 phones.

Phone manufacturers use different user interfaces on the Symbian platform. The dominant interface, S60, is used e.g. by Nokia, LG, Panasonic, and Samsung. An alternative interface, UIQ, is used e.g. in some Sony Ericsson and Motorola mobile phones. The most recent version of the S60 interface is called S60 3rd edition. All Nokia S60 3rd edition devices have at least Symbian version 9.1 and thus contain a SIP stack.
SIP support can be enabled in some S60 2nd edition devices, too, but this requires installation of additional components. Currently there are 30 S60 3rd edition and 13 S60 2nd edition mobile phones by Nokia. [21] The first 3rd edition device was launched in April 2005, and S60 3rd edition is currently the dominant Nokia smartphone platform. The last 2nd edition model was released in April 2006, and it is highly unlikely that any new 2nd edition devices will be released. Additionally, LG and Motorola have each announced one device, and Samsung five devices, that use Symbian version 9.2 and thus have SIP functionality available.

Most Symbian devices that have a SIP stack also include SIP support for Java ME (Java Micro Edition, a set of APIs targeted at devices with restricted resources). JSR180 (Java Specification Request 180: SIP API for Java ME) defines SIP support for Java ME. Sony Ericsson's Java Platform 8 (JP-8) supports JSR180. Currently the platform can be found in seven phone models, though not all of them are yet available to consumers. [15]

Other commonly used mobile device platforms do not currently support SIP out of the box. However, SIP conferencing is naturally not limited to devices with native SIP support, as nothing prevents an application developer from including a SIP stack of his own. Most readily available SIP stacks are not 3GPP compliant, though.

3 Authentication

Authentication in SIP conferencing basically only requires that the underlying SIP session is authenticated. Standard methods for SIP authentication exist, namely HTTP digest authentication specified by IETF and an authentication scheme defined by 3GPP. [4, 13] The SIP stacks included in mobile devices support both the IETF and the 3GPP authentication modes. The authentication mode usually cannot be chosen on the fly, but has to be explicitly specified as part of the SIP profile configuration.

Authentication is meaningful only within the user's own realm, i.e.
within one SIP proxy or IMS. Some kind of inter-domain authentication is required if users from multiple realms want to conference together, but this is outside the scope of this study.

A conference-unaware user agent (UA) is a UA that can be used in conferencing even though it has no concept of a conference. Such a UA treats conference calls as normal calls between two user agents. Of course, this requires that all conference-related call control and media mixing is done by the conferencing focus. Conversely, a conference-aware UA understands the concept of conferencing.

In a tightly coupled conference, each participant authenticates with the central focus exactly as in any other SIP authentication. As far as authentication is concerned, a conference-aware user agent gains no benefit over a conference-unaware UA.

3.1 IETF SIP Authentication

When a network entity (user agent server, UAS) has to authenticate an unauthenticated user (user agent client, UAC), it first rejects the request with a SIP response (401 Unauthorized) carrying a challenge that describes the required authentication scheme and its parameters. The user should then re-send the original request with properly calculated credentials. [13]

If the credentials are correct, the user is treated as authenticated and the request is processed. If the credentials are not correct, the UAS may repeat its challenge or reject the request with 403 Forbidden. If the UAC has no credentials for the current UAS, it may retry the request as an anonymous user, and it is up to the UAS whether such users are allowed.

3.2 3GPP SIP Authentication

3GPP SIP authentication is identical to the IETF authentication scheme as far as the message flow is concerned: in this model, too, authentication is requested using 401 Unauthorized, rejected using 403 Forbidden, and accepted using 200 OK. The details, however, differ.
IMS authentication is done using the user's IM Subscriber Identity Module (ISIM), if possible. ISIM is an application located on the user's UICC (Universal Integrated Circuit Card, also known as a SIM card), providing IMS security data and services. [1] In practice, SIP account information currently has to be specified either in the mobile phone's integrated SIP stack settings or in the application's own preferences, depending on the platform used.

To use a conferencing service in an IMS, the user must register to the IMS. SIP proxies and servers within an IMS are called Call Session Control Functions (CSCF). When the user initiates a SIP session in an IMS environment, the UE first sends a REGISTER message to a Proxy-CSCF (P-CSCF), which may reside in a visited network if the user is roaming. From the P-CSCF, the message is directed to an Interrogating-CSCF (I-CSCF), which asks the Home Subscriber Server (HSS) which SIP registrar should be used for this connection. The HSS names an appropriate registrar, called the Serving-CSCF (S-CSCF), which finally does the registration processing. To do that, the S-CSCF needs to connect to the same HSS to fetch the user's profile details. [11]

Each IMS subscriber has one or more private identities (IMPI) and public identities (IMPU). Authentication is done using the IMPI, and the IMPU is used as an identity in communication with other parties. [5] The user may be authenticated during registration, re-registration, deregistration and registration of additional public identities.

When the S-CSCF receives an initial REGISTER message, it responds with 401 Unauthorized that includes an authentication challenge. When the UE receives this message, it shall first check whether the message is valid, using integrity protection. Integrity protection is done between the UE and the P-CSCF. [1] If the message is valid, the UE resends the REGISTER message with the authentication parameters populated as defined.
[4] Upon receiving the REGISTER with the authentication challenge response, the S-CSCF checks the authentication response within the message. If the response matches the expected value, a 200 OK message is sent and the user identity is considered registered.

3.2.1 Authentication at the AS

3GPP defines a standard way to authenticate users at an application server (AS). [4] The definition states that when the AS receives a SIP message without authentication credentials, it shall first check whether the message should be treated as an anonymous request. Anonymity is expressed by including a Privacy header with a value of "id" or "user" in the SIP request. If the request is not anonymous, the AS checks whether the request contains a P-Asserted-Identity header. If the header is present, the user has already been authenticated within the same trusted domain and no further authentication actions are required. If the header is not present, the AS has to answer with 401 Unauthorized and proceed to authenticate the user according to standard SIP authentication. [13] This might happen if the AS is located outside a trusted domain.

4 Authorization

Users can join an existing conference session directly using a conference URI, or they can be invited by a third party using the REFER method. [20, 2] This only specifies the method of joining itself, not how the users should be authorized to join the conference. In-conference authorization requires application-level solutions. In this section, I look at various suggestions for solving the authorization problem.

Conference authorization is implemented with a conference policy. A conference policy is a defined set of rules that allow or restrict operations on a conference.
[20, 9] The conference policy could restrict the number of users in a public conference, allow only a predefined set of users to join (using access lists) if the conference is private, or contain highly complex rules that suit the situation at hand.

In the IETF conferencing framework, the conference policy modification mechanism is left undefined. [20] 3GPP states that a conference policy control protocol is considered an essential part of a complete conferencing standard, but the current specifications do not address the issue. Thus, there is currently no standardized conference control protocol for mobile applications. [2]

For conference authorization to be meaningful, conference administrators need control over the authorization rules. Conference policy manipulation could be done, for example, using a web application or some proprietary protocol, but as stated by 3GPP, a standardized way would be important, especially in the mobile telecommunications world where users are used to interoperable solutions. This, in turn, means that authorized users would need to be able to manipulate the conference policies using conference-aware mobile devices, and a standard protocol would thus be required.

As mentioned before, the IETF XCON working group focuses on developing protocols for the needs of tightly coupled conferences. The work has resulted in four RFCs, two active drafts and numerous now expired drafts. One of the specified deliverables of the group is an authorization control mechanism. The expired drafts include proposals for policy control, too, but none of the finished RFCs or active documents address the issue. Nor are there any road maps available that would clarify the future of policy control standardization.

In the absence of finished standards, I will now present some of the alternative methods for conference policy control that the XCON group has proposed in the past.
None of them is in active development, but the variety of possible solutions highlights the fact that there are multiple different approaches to realizing the authorization protocol. [14]

The XCON working group has specified requirements for conference policy control. The requirements have been laid out in a draft called Requirements for Conference Policy Control Protocol. [18]

4.1 Conference Policy Control Protocol

Conference Policy Control Protocol (CPCP) is an IETF draft that specifies a protocol for controlling conference policies. The protocol enables manipulation of conference authorization rules by defining an XML schema for a conference policy rule document. The specification does not address the issue of delivering the document to the conference policy server. [17]

CPCP implements user authorization with access lists. A privileged user can send the conferencing focus a document that lists users who will be invited or referred to the conference, or users can be barred from joining by blacklisting them.

Version 6.1.0 of the 3GPP IM conferencing specification defined CPCP as the conference control protocol, but all references to CPCP have been removed from later versions. [2]

4.2 Conference Policy Manipulation Using XCAP

Another IETF draft proposes policy manipulation using the XML Configuration Access Protocol (XCAP). In this scenario, the conference policy server has an XCAP interface that privileged users can access in order to manipulate conference policies. [16]

The draft only proposes the policy transport and manipulation protocol. The protocol could be used to transport CPCP policy rule documents to the conference policy server or to modify elements in an existing policy rule document.

4.3 Centralized Conferencing Manipulation Protocol

Centralized Conferencing Manipulation Protocol (CCMP) is an IETF draft that suggests a Simple Object Access Protocol (SOAP) based protocol for conference object creation, manipulation and deletion. [8] This would include conference participant control. The protocol requires that the conference policy server exposes a web service, based e.g. on the Conference Information Data Model [19], that the UE can access. SOAP works by exchanging XML documents over HTTP.

The Conference Control Package [7] is a draft complementing CCMP. It proposes that the SIP control framework [10] could be used to deliver the messages defined by CCMP. The SIP control framework is a draft proposing a way of controlling devices with two new SIP messages, CONTROL and REFER.
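
The access-list style of authorization described in section 4.1 can be illustrated with a short Python sketch. It is not CPCP itself and does not follow the draft's XML schema: the class, the field names and the rule that the blacklist takes precedence over the allow list are assumptions made only for illustration. The participant limit mentioned earlier as a possible rule for a conference is included as a second kind of rule.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class ConferencePolicy:
    """Hypothetical in-memory conference policy (not the CPCP schema)."""
    private: bool = False                     # private: only listed users may join
    allowed: Set[str] = field(default_factory=set)
    blocked: Set[str] = field(default_factory=set)
    max_participants: Optional[int] = None    # None means no limit

    def may_join(self, user_uri: str, current_participants: int) -> bool:
        """Decide whether an already authenticated user may join."""
        if user_uri in self.blocked:          # assumed: the blacklist always wins
            return False
        if self.private and user_uri not in self.allowed:
            return False
        if (self.max_participants is not None
                and current_participants >= self.max_participants):
            return False
        return True


policy = ConferencePolicy(
    private=True,
    allowed={"sip:alice@example.com", "sip:bob@example.com"},
    blocked={"sip:bob@example.com"},
    max_participants=2,
)
print(policy.may_join("sip:alice@example.com", current_participants=1))  # True
print(policy.may_join("sip:bob@example.com", current_participants=1))    # False
print(policy.may_join("sip:carol@example.com", current_participants=0))  # False
print(policy.may_join("sip:alice@example.com", current_participants=2))  # False
```

The check itself runs entirely on the server side. What the drafts above try to standardize is how a privileged user's device would create and modify such a policy remotely.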
5 Accounting

SIP conference accounting may consist, for example, of conference session length, the number of conference participants, the network resources used, or any other indicator that is considered suitable. Accounting is done transparently in the system offering the conferencing service, and no UE support is required. Therefore, accounting is not fully within the scope of this study.

5.1 SIP Accounting

IETF has proposed requirements for SIP session accounting in a draft called AAA Requirements for IP Telephony/Multimedia. The proposal includes, among others, a requirement that SIP servers be able to gather session length and other information usable in accounting, a requirement that the accounting messages reflect changes in the SIP session if the changes affect charging, and a requirement that the Home AAA Server can initiate deregistration of the user's SIP session. The draft has not progressed to an RFC and has now expired. [12]

5.2 Accounting in the IMS

Accounting for transferred data is a standard accounting procedure for telecommunications operators. However, operators try to avoid the role of being merely a bit pipe, and the whole IMS movement is part of the effort to maintain the operator's role as a service provider. Naturally, accounting is part of the standards track. Accounting ability is one of the main drivers for telecommunications operators, and is naturally a requirement for conferencing within an operator IMS.

IMS charging is standardized by 3GPP. [6] There are two charging mechanisms: offline charging and online charging. Offline charging does not affect the service that is being charged, while online charging can affect the service. Charging is done by creating Charging Data Records (CDR) that include user and service identification, and the network elements and resources used by the billable event. IMS uses the Diameter protocol [3] for both offline and online charging.

Application servers can use both offline and online charging.
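
The difference between the two charging mechanisms can be sketched in Python as follows. This only illustrates the idea: the record fields mirror the CDR contents listed above, while the class and function names, the prepaid credit model and the sample values are assumptions and do not come from the 3GPP specifications. No Diameter messages are involved.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ChargingRecord:
    """Hypothetical, simplified stand-in for a Charging Data Record."""
    user_id: str            # user identification
    service_id: str         # service identification
    network_element: str    # element that reported the billable event
    units: int              # resources used by the billable event


def charge_offline(record: ChargingRecord, log: List[ChargingRecord]) -> bool:
    """Offline charging: store the record; the service is never affected."""
    log.append(record)
    return True


def charge_online(record: ChargingRecord, credit: int) -> Tuple[bool, int]:
    """Online charging: grant the service only if the credit suffices.

    Returns (granted, remaining_credit).
    """
    if record.units > credit:
        return False, credit
    return True, credit - record.units


log: List[ChargingRecord] = []
event = ChargingRecord("sip:alice@example.com", "conference", "conference-as", 30)
print(charge_offline(event, log))        # True
print(charge_online(event, credit=100))  # (True, 70)
print(charge_online(event, credit=10))   # (False, 10)
```

In the offline case the conference always proceeds and the record is billed later. In the online case the decision can stop a participant from joining, which is why online charging can affect the service.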
The AS can acquire the available charging function addresses from the SIP message header fields. The charging function address is inserted by the P-CSCF so that network components within the IMS can charge the session, and the P-CSCF also makes sure that the charging function address is removed from all messages delivered outside the IMS.

6 Conclusions

SIP conference authentication is realized by authenticating the underlying SIP session. SIP authentication is well standardized and in a mature state. Currently there are tens of SIP-capable mobile devices on the market, but only among the high-end devices of a couple of manufacturers. Additionally, there are some devices with closed SIP stacks that are used, for example, for VoIP signaling. It is clear that SIP support will increase in the future.

Conference authorization itself is a mechanism within the conferencing server and requires no mobile device support. However, controlling the authorization rules requires UE support. Currently there are no standardized mechanisms for conference policy control, and support for policy control in current mobile devices is therefore non-existent. Since standardization is usually quite a long process, it might still take years before a standard is finalized. On top of that, it takes time to create mobile devices supporting the standard and for them to enter the market.

Conference accounting is handled in the service environment and requires no support from the UE. In an IMS environment, standardized accounting exists, but other environments probably need proprietary methods to handle SIP conference accounting.

SIP conferencing can be used with currently available solutions, but the user agents have to be conference-unaware or use proprietary protocols. It is unlikely that proprietary solutions would gain a large following among telecommunications operators, since proprietary methods make wide interoperation quite improbable. Proprietary solutions may be implemented in the Internet world, but in that case accounting becomes much more complicated.

References

[1] 3GPP. Access security for IP-based services. Technical report, 3rd Generation Partnership Project, October html-info/33203.htm.

[2] 3GPP. Conferencing using the IP Multimedia (IM) Core Network (CN) subsystem. Technical report, 3rd Generation Partnership Project, September html-info/24147.htm.

[3] 3GPP. Diameter charging applications. Technical report, 3rd Generation Partnership Project, October html-info/32299.htm.

[4] 3GPP. IP multimedia call control protocol based on Session Initiation Protocol (SIP) and Session Description Protocol (SDP). Technical report, 3rd Generation Partnership Project, September html-info/24229.htm.

[5] 3GPP. IP Multimedia Subsystem (IMS). Technical report, 3rd Generation Partnership Project, September html-info/23228.htm.

[6] 3GPP. IP Multimedia Subsystem (IMS) charging. Technical report, 3rd Generation Partnership Project, October 2007. http://www.3gpp.org/ftp/Specs/html-info/32260.htm.

[7] M. Barnes and C. Boulton. A Conference Control Package for the Session Initiation Protocol. Technical report, The Internet Engineering Task Force, May 2007. http://tools.ietf.org/html/draft-boulton-xcon-conference-control-package-01.

[8] M. Barnes, C. Boulton, and H. Schulzrinne. Centralized Conferencing Manipulation Protocol. Technical report, The Internet Engineering Task Force, February 2007. http://tools.ietf.org/html/draft-barnes-xcon-ccmp-02.

[9] M. Barnes, C. Boulton, and O. Levin. A Framework for Centralized Conferencing. Technical report, The Internet Engineering Task Force, November 2007. http://www.ietf.org/internet-drafts/draft-ietf-xcon-framework-10.txt.

[10] C. Boulton, T. Melanchuk, S. McGlashan, and A. Shiratzky. A Control Framework for the Session Initiation Protocol (SIP). Technical report, The Internet Engineering Task Force, February 2007. http://tools.ietf.org/html/draft-boulton-sip-control-framework-05.

[11] G. Camarillo and G. Blanco. The Session Initiation Protocol (SIP) P-User-Database Private-Header (P-Header). RFC 4457, The Internet Engineering Task Force, April 2006. http://www.ietf.org/rfc/rfc4457.txt.

[12] H. B. et al. AAA Requirements for IP Telephony/Multimedia. Technical report, The Internet Engineering Task Force, March 2002. http://www.softarmor.com/wgdb/docs/draft-calhoun-sip-aaa-reqs-04.txt.

[13] J. R. et al. SIP: Session Initiation Protocol. RFC 3261, The Internet Engineering Task Force, June 2002. http://ietf.org/rfc/rfc3261.txt.

[14] The Internet Engineering Task Force. Centralized Conferencing (XCON) Working Group. Technical report, The Internet Engineering Task Force, September 2007. http://www.ietf.org/html.charters/xcon-charter.html.

[15] New W890 and K660 revealed, two Java Platform 8 (JP-8) phones supporting MSA, November 2007. http://developer.sonyericsson.com/site/global/newsandevents/latestnews/newnov07/p_neww890k660_jp8phones_msa.jsp.

[16] H. Khartabil. An Extensible Markup Language (XML) Configuration Access Protocol (XCAP) Usages for Conference Policy Manipulation and Conference Policy Privileges Manipulation. Technical report, The Internet Engineering Task Force, October 2004. http://tools.ietf.org/html/draft-ietf-xcon-cpcp-xcap-03.

[17] H. Khartabil, P. Koskelainen, and A. Niemi. The Conference Policy Control Protocol (CPCP). Technical report, The Internet Engineering Task Force, October 2004. http://tools.ietf.org/html/draft-ietf-xcon-cpcp-01.

[18] P. Koskelainen and H. Khartabil. Requirements for Conference Policy Control Protocol. Technical report, The Internet Engineering Task Force, August 2004. http://www.tools.ietf.org/id/draft-ietf-xcon-cpcp-reqs-04.txt.

[19] O. Novo, G. Camarillo, and D. Morgan. Conference Information Data Model for Centralized Conferencing (XCON). Technical report, The Internet Engineering Task Force, April 2007. http://tools.ietf.org/html/draft-ietf-xcon-common-data-model-06.

[20] J. Rosenberg. A Framework for Conferencing with the Session Initiation Protocol (SIP). RFC 4353, The Internet Engineering Task Force, February 2006. http://ietf.org/rfc/rfc4353.txt.

[21] S60.com. S60 phones. http://www.s60.com/life/s60phones.

[22] M. Shackman. What's new for developers in Symbian OS v9.2?, July 2006. http://developer.symbian.com/main/downloads/papers/whatsnew9.2/what's_new_symbian_os_v9.2.pdf.