Building Robust Signaling Networks




Ericsson White Paper
Uen 284 23-3268, July 2015

Building Robust Signaling Networks
MEETING THE CHALLENGES OF THE RISING SIGNALING STORM

Distributed signaling network robustness, built on the concept of three protection lines, gives operators a robust and scalable network architecture that goes beyond the capabilities of existing node-level overload protection mechanisms. Applying this concept enables mobile networks to handle the growth in signaling without unnecessary over-dimensioning, and ensures service delivery to consumers even under heavy signaling load, node failures and malicious activities that lead to signaling storms.

Introduction

Today's consumers increasingly expect high availability from communication and data services. In this environment, network failure scenarios can trigger a massive amount of signaling (a signaling storm) caused by automatic reconnection requests from many connected devices. Robust and scalable network solutions are therefore required to optimize operator revenue and maximize the consumer experience.

This white paper discusses best practices and provides recommendations for building highly scalable and robust signaling networks. It introduces a robust, distributed signaling network based on the concept of three protection lines as the recommended network architecture. This concept provides a scalable and robust signaling network beyond the capabilities of existing network protection mechanisms at node level.

The principles presented in this paper are valid for both Signaling System #7-based and Diameter-based signaling. However, the paper focuses solely on Diameter signaling, which is an important control protocol for LTE networks and IMS.

Challenges for the signaling network

Current developments in telecommunication technologies and markets underline the importance of flexible and robust congestion control mechanisms for maximizing performance and service availability. The traditional approach of protecting individual nodes has been around since the introduction of GSM: overload in the network is addressed with dedicated protection mechanisms in the overloaded nodes. Standardization bodies promoted these protection mechanisms, which became widely adopted in the industry. They served their purpose successfully until they were confronted with the increased complexity of mobile networks and with new usage scenarios not considered in earlier specifications.

Nowadays, a constellation of different network access technologies, such as 2G, 3G, LTE, Wi-Fi and fixed, coexist to provide seamless access to voice and data services. The huge penetration of smartphones has dramatically increased data consumption and bandwidth requirements, and smartphone subscriptions are forecast to more than double by 2020 [1]. The emergence of the Internet of Things (IoT) means networks must simultaneously face new usage scenarios, which in some cases drastically increase signaling demands. In addition, these connected devices can be a source of signaling storms when they issue simultaneous connection requests after network disturbances.

The continuous modernization of operator networks, with increasing centralization of resources in higher-capacity systems, means that signaling storms have a bigger impact on the networks, because incidents on centralized resources are likely to affect a larger number of users. Affected users will retry, thereby initiating a snowball effect that could overload the signaling network multiple times. As Figure 1 summarizes, signaling networks therefore face significant challenges.
As a result, operators are demanding new overload protection mechanisms that enable them to provide the generally accepted five nines (99.999 percent) of network availability, or even more.

The journey that the industry has begun toward cloud computing, with the transformation of current network nodes into virtualized network functions (VNFs), seems at first glance to be a handy way to cope with signaling storms in the network. The reality, however, is quite different: this approach may create an illusion of infinite resources and lead operators to underestimate the importance of overload protection mechanisms. Scale-out mechanisms in the cloud will provide increased flexibility to handle steady growth, but they are not fast enough to cope with sudden signaling peaks, which escalate more quickly than the network functions can scale.

[Figure 1: Challenges for the signaling network. The figure shows the nodes P-CSCF, SGSN-MME, PGW, DSC, HSS, OCS and PCRF, surrounded by six challenges:
> > Multiple access technologies (WLAN, 2G, 3G, fixed, LTE): subscribers move instantly between access technologies
> > Smartphone subscriptions: the number of smartphone subscriptions is constantly increasing
> > Network complexity: increasing network complexity requires a higher amount of signaling
> > Data traffic growth: growth in data traffic implies more signaling traffic
> > Centralization of resources: node outages affect more subscribers
> > Internet of Things: more connected devices impose new traffic patterns upon the signaling network.]

Under these circumstances, it is clear that the protection strategies that have been standardized and widely adopted in mobile networks are no longer sufficient. High traffic peaks and network failure scenarios can result in massive signaling storms, leading to lengthy outages of network services. In its Annual Incident Reports 2012, ENISA (the European Union Agency for Network and Information Security) reports that there were 79 major telecom outages in 2012 [2]. System failures were the root cause of 75 percent of these incidents. Each overload-related incident affected an average of around 9.4 million user connections.

One of the biggest effects of network outages and service degradation is an increase in the rate of subscriber churn. At the same time, operators spend USD 15 billion a year to overcome network outages and service degradations [3]. On average, operators spend 1.5 percent of their annual revenues on dealing with these issues. Some even estimate this figure to be as high as 5 percent.

One strategy to mitigate overload problems is to over-dimension the network for the peak signaling load. This adds complexity and implies higher opex and capex, an additional financial burden that puts an operator in a less competitive position. Over-dimensioning does seem feasible for covering signaling peaks that are two to three times above the average traffic load. But when a signaling storm occurs, the load can easily rise to 10 times the average, stretching the need for over-dimensioning to unrealistic limits. In addition, over-dimensioning the network does not eliminate the risk of a reduction in the overall signaling capacity. Typical node behavior during overload is depicted in Figure 2. Up to the engineered capacity, the message throughput is in line with the offered traffic, and it still increases slightly when overload is reached.
In cases of massive overload, however, the throughput drops heavily as the offered load increases further.

Another strategy to address overload is to blindly reject signaling messages. The problem with this strategy is that the throughput of successful services delivered to the consumer is heavily reduced, even if only a small percentage of messages are rejected, as depicted in Figure 3. For instance, a successful VoLTE call requires the successful processing of roughly 20 signaling sequences. If only one message in this sequence is rejected, the call fails.

The recommended strategy is to aim for a robust and scalable network architecture. The chosen network architecture and protection mechanisms need to be capable of handling the growth in signaling without unnecessary over-dimensioning, and of ensuring service delivery to consumers during heavy signaling load, node failures and malicious activities that lead to signaling storms.

[Figure 2: Processed throughput in relation to offered load. Horizontal axis: offered load; vertical axis: processed throughput. The curve is divided into normal operation (up to the engineered capacity) and overload.]

[Figure 3: Success rate of the signaling traffic in relation to the consumer service delivery. Horizontal axis: success rate of the signaling traffic; vertical axis: consumer service delivery. A 100% success rate of the signaling traffic corresponds to a 100% success rate of the consumer service delivery. A small reduction in the signaling success rate leads to a huge reduction in consumer service delivery, and the reduction grows with the number of messages needed per service delivery.]
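To make the effect in Figure 3 concrete, the following sketch computes the share of services that complete when each signaling message is accepted with a given probability. It is a simplified illustration rather than a model taken from the paper: it assumes that a service needs a fixed number of messages (20, following the VoLTE example above) and that messages are rejected independently of each other. The function name `service_success_rate` is hypothetical.

```python
def service_success_rate(message_success_rate: float, messages_per_service: int) -> float:
    """Share of services for which every signaling message is accepted.

    Assumption (for illustration only): each message is rejected
    independently and with the same probability, so the per-message
    success rates simply multiply.
    """
    if not 0.0 <= message_success_rate <= 1.0:
        raise ValueError("message_success_rate must be between 0 and 1")
    if messages_per_service < 1:
        raise ValueError("messages_per_service must be at least 1")
    return message_success_rate ** messages_per_service


if __name__ == "__main__":
    # 20 messages per service, as in the VoLTE example.
    for rejected in (0.01, 0.05, 0.10):
        delivered = service_success_rate(1.0 - rejected, 20)
        print(f"{rejected:.0%} of messages rejected: {delivered:.1%} of services delivered")
```

Under these assumptions, rejecting 5 percent of the messages leaves only about 36 percent of 20-message services intact, which illustrates why blind rejection is so costly for the consumer.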

End-to-end strategy and principles to achieve robust signaling networks

The objectives for a robust signaling network are:
> > to reduce the network impact of smartphone and device signaling
> > to maximize throughput in cases of overload
> > to recover quickly from overload and failure scenarios
> > to keep the signaling network maintainable
> > to keep the signaling network scalable.

A robust and distributed signaling network should be based on the following end-to-end strategy and principles:
> > a careful network architecture that provides the basis for a robust and scalable signaling network
> > optimization of the signaling traffic, minimizing the amount of signaling needed to manage the network services
> > a distributed and coordinated overload protection mechanism across several network elements to maximize the throughput in peak load scenarios.

NETWORK ARCHITECTURE

The signaling network architecture should be based on simplicity, using just the amount of infrastructure and features needed to keep the network manageable. The signaling network should be divided into smaller, manageable components. Modular network design enables operators to isolate problems within a module while the rest of the network continues to function. This means fewer users are affected and the overall uptime of the network is increased.

The basic mechanism to protect against physical failure of the transport plane is redundancy: redundant components and more than one physical path through the transport network from the source client to the destination client. QoS can be used to prevent control-plane failure.

Overload and failure situations are handled best when the overload protection is distributed across several network elements. Each network element should itself be redundant. Availability can be further enhanced by deploying redundant nodes of each type at different geographical sites.
Scalability should be supported at node and network level with the aim of:
> > managing the growth in signaling traffic efficiently
> > being flexible enough to extend an established network configuration
> > having a long-term strategy to resolve overload cases.

OPTIMIZATION

One aim of optimization is to minimize signaling and thereby reduce the network impact of smartphone and device signaling. Recommended ways to achieve this are:
> > to reduce paging traffic by paging first in the last known location for non-time-critical traffic, before the paging request is extended to other parts of the network
> > to limit the effect of a shorter LTE idle timer in the user equipment (UE), which increases the number of connection setup requests from the UE, by performing authentication only at every 10th or 20th connection setup
> > to drop or reject excess traffic from misbehaving UEs
> > to drop or reject traffic from malicious attacks.
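The recommendation to authenticate only at every 10th or 20th connection setup can be pictured as a per-subscriber counter. The sketch below is a hypothetical illustration of that idea, not an excerpt from any MME implementation: the class name `SelectiveAuthPolicy`, the subscriber identifier format and the choice to always authenticate the first setup are assumptions.

```python
from collections import defaultdict


class SelectiveAuthPolicy:
    """Decides whether a connection setup should trigger authentication.

    Assumption: the first setup seen for a subscriber is always
    authenticated, and after that only every `interval`-th setup is.
    """

    def __init__(self, interval: int = 10) -> None:
        if interval < 1:
            raise ValueError("interval must be at least 1")
        self.interval = interval
        # Number of connection setups seen so far, per subscriber.
        self._setups = defaultdict(int)

    def should_authenticate(self, subscriber_id: str) -> bool:
        seen = self._setups[subscriber_id]
        self._setups[subscriber_id] = seen + 1
        # Setups number 1, interval + 1, 2 * interval + 1, ... are authenticated.
        return seen % self.interval == 0


if __name__ == "__main__":
    policy = SelectiveAuthPolicy(interval=10)
    decisions = [policy.should_authenticate("subscriber-001") for _ in range(20)]
    print(f"{sum(decisions)} authentications for {len(decisions)} connection setups")
```

With an interval of 10, only a tenth of the connection setups generate authentication signaling toward the HSS, which is the reduction the recommendation aims for.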

A second aspect of optimization is to distribute the load evenly in the network. One recommendation is to build an appropriate structure with pooled network resources for easy capacity expansion and even load distribution. This allows for a much more efficient use of network resources.

DISTRIBUTED AND COORDINATED OVERLOAD CONTROL

Each network element should provide a working overload control mechanism to prevent its own resources from being overloaded. This is also an essential function for pooled resources, as there would otherwise be a risk that a peak in the signaling traffic could bring down one pooled device after the other. At network level, overload should be handled as close to the overload source as possible to minimize recovery time. Propagation of a signaling peak through the network must be prevented at any cost in order to avoid a service outage on a larger scale.

An example of the propagation of a signaling peak is described below. After a major network outage, which could be the result of a transport network or Mobility Management Entity (MME) failure, a large number of UEs will discover the network and try to re-attach to it. This multitude of re-attaches causes a signaling storm, which will affect large parts of the network. Typical signaling scenarios under such circumstances are:
> > The MME completes the authentication process, updates the location in the Home Subscriber Server (HSS), and reestablishes the bearers.
> > The serving gateway must also recreate the bearers.
> > The packet data network gateway (PDN-GW) recreates the bearers and reestablishes sessions to, for example, the policy and charging rules function (PCRF).
> > A new full IMS registration is needed for each VoLTE UE.
> > The HSS needs to provide authentication information to the MME and register the location (additional transactions are needed for the IMS registration of VoLTE subscribers).
During overload, the signaling throughput can be optimized by intelligent traffic prioritization. Ways of achieving intelligent traffic prioritization are given below.
> > Adding some application logic to the traffic management function of selected nodes in the network can enable those nodes to determine whether a signaling message belongs to a new subscriber transaction or to an ongoing one. In cases of overload, a message that triggers a new subscriber transaction is rejected in favor of a message related to an ongoing subscriber transaction. This optimizes the throughput at the application level.
> > Signaling traffic in overload situations can be prioritized based on importance, with emergency calls and priority services placed ahead of delay-tolerant access, such as energy meters.
> > An overload protection system should adapt to the current situation, allowing higher throughput when the overload eases and throttling more of the traffic as the overload gets worse. Fixed-rate throttling limits cannot meet the requirements of different situations: in some overload scenarios the network can handle more traffic, and in others much less.
> > The throughput of the system under overload can be optimized by the concept of throughput elasticity, where the latency is allowed to increase. However, it is important that the maximum available latency budget is never exceeded on an end-to-end level.
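The first three prioritization ideas above can be combined in a small admission sketch. It is an illustration under stated assumptions, not a description of any product: the class names, the `transaction_id` field and the load thresholds (0.80, 0.90 and 1.10 of engineered capacity) are all hypothetical. As the reported load rises, the sketch sheds delay-tolerant traffic first, then new transactions, and ongoing transactions last, while emergency traffic is never shed.

```python
from dataclasses import dataclass
from enum import IntEnum


class Priority(IntEnum):
    """Lower value means more important; shedding starts at the highest value."""
    EMERGENCY = 0
    ONGOING_TRANSACTION = 1
    NEW_TRANSACTION = 2
    DELAY_TOLERANT = 3


@dataclass(frozen=True)
class Message:
    transaction_id: str
    emergency: bool = False
    delay_tolerant: bool = False


class PriorityAdmission:
    # (load threshold, most important class that is shed at that load).
    # The numbers are illustrative assumptions, not recommended values.
    SHED_STEPS = (
        (0.80, Priority.DELAY_TOLERANT),
        (0.90, Priority.NEW_TRANSACTION),
        (1.10, Priority.ONGOING_TRANSACTION),
    )

    def __init__(self) -> None:
        self.load = 0.0          # load relative to engineered capacity
        self._ongoing = set()    # transactions that were already admitted

    def update_load(self, load: float) -> None:
        """Adapt to the current situation: called whenever a new load reading arrives."""
        self.load = load

    def classify(self, msg: Message) -> Priority:
        if msg.emergency:
            return Priority.EMERGENCY
        if msg.transaction_id in self._ongoing:
            return Priority.ONGOING_TRANSACTION
        if msg.delay_tolerant:
            return Priority.DELAY_TOLERANT
        return Priority.NEW_TRANSACTION

    def admit(self, msg: Message) -> bool:
        priority = self.classify(msg)
        for threshold, shed_from in self.SHED_STEPS:
            if self.load >= threshold and priority >= shed_from:
                return False
        self._ongoing.add(msg.transaction_id)
        return True

    def finish(self, transaction_id: str) -> None:
        """Forget a transaction once its last message has been processed."""
        self._ongoing.discard(transaction_id)
```

Because the shedding decision depends on the current load reading rather than on a fixed rate limit, the same controller admits more traffic as the overload eases and less as it worsens.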

Distributed signaling network robustness: the concept of three protection lines

The principles for achieving a robust signaling network are best represented by the network architecture depicted in Figure 4. A robust and distributed signaling network should follow the concept of three protection lines to protect the operator's service offering from being affected by a signaling storm.

The first protection line comprises the components that act as entry points to the core network for smartphone and device signaling. Examples are the Serving GPRS Support Node-Mobility Management Entity (SGSN-MME), the proxy call session control function (P-CSCF) and the PDN-GW.

The second line of protection consists of the nodes that provide routing capabilities for the signaling traffic. It is typically represented by a Diameter Signaling Controller (DSC) or a Signaling Transfer Point.

The third line of protection is represented by the end systems hosting the application data and logic. User data management systems such as the HSS are assigned to this third line of protection.

The three lines of protection provide:
> > a distributed network architecture that allows for an end-to-end overload protection solution in distributed layers
> > the ability to cover failures, misconfiguration or misoperation in one protection line with the next, higher protection line
> > maximized signaling throughput during overload conditions
> > a scalable and maintainable network architecture
> > efficient use of network resources by distributing the signaling load evenly in the network
> > optimized signaling procedures to minimize the signaling traffic.

FIRST PROTECTION LINE

The SGSN-MME, as part of the Evolved Packet Core (EPC), is an example of the first line of protection.
> > Representing the entry point to the core network, the EPC is closest to potential overload sources outside of the core network. The most efficient way to minimize the recovery time is to apply optimization and overload protection of the signaling traffic in the first protection line.
> > Nodes in the EPC aim to optimize the signaling. One strategy is to perform smart and adaptive paging to minimize the number of paging requests. A second strategy is to limit excessive signaling from individual UEs.
> > Nodes in the EPC should have a proven overload protection function that shields the nodes from failure and ensures high throughput during extreme overload. This is achieved by proper prioritization of services and subscribers, and by distinguishing between initial traffic for a subscriber and subsequent signaling.

[Figure 4: Network picture of distributed signaling network robustness, the concept of three protection lines. First line: P-CSCF, SGSN-MME, PGW. Second line: DSC. Third line: HSS, OCS, PCRF. Annotations: end-to-end overload protection solution; scalability and maintainability; backup always available (if one layer fails, another can take over); maximize throughput under overload conditions.]

SECOND PROTECTION LINE

An example of the second line of protection is the DSC hosting the function of a Diameter Agent as specified in RFC 6733, Diameter Base Protocol.
> > A Diameter Agent simplifies the network architecture and reduces the number of connections that need to be maintained in the network. The DSC acts as a centralized signaling router in the network.
> > The Diameter Agent itself should provide a proven overload protection mechanism. It should be able to process offered traffic that exceeds the dimensioned capacity of the node several times over, for cases where the first protection line cannot sufficiently limit a traffic peak.
> > The Diameter Agent should provide load balancing capabilities toward other interfacing Diameter peers. This ensures optimal usage of network elements and reduces load peaks on individual Diameter servers such as the PCRF, the online charging system and the HSS, which typically reside in the third protection line.
> > The load balancing function should consider the varying capacity of the interfacing Diameter peers.
> > The load balancing function could steer the preference for which Diameter peers are primarily used as the routing target, and thereby favor local servers over remote servers.
> > Nevertheless, the load balancing could be adjusted dynamically based on the traffic actually sent, which would actively reduce traffic peaks that would otherwise be sent to already overloaded Diameter peers.
> > The Diameter Agent should provide traffic shaping capabilities for interfacing Diameter servers or clients. This actively prevents the propagation of signaling storms in the network.
> > By shaping outgoing Diameter traffic, the Diameter Agent prevents interfacing servers from being overloaded. It thus offers further protection to peers against potential signaling bursts from nodes inside the network or from roaming partners.
> > By shaping incoming Diameter traffic, the Diameter Agent prevents Diameter clients from using network resources beyond operator-determined limits. A prime example is a restarting MME node that could flood the HSS servers, degrading the service for all other MME nodes.
> > Typically, a Diameter Agent has very limited knowledge of the semantics of Diameter applications. To prioritize traffic in an intelligent way, the Diameter Agent could be configured with semantically relevant data, so that in congestion situations messages of ongoing sessions are treated with higher priority than messages related to new sessions. This adds the capability to perform application-aware traffic management in the Diameter Agent.
> > The Diameter Agent is also in control of all Diameter interfaces in a network. An overload indication on one interface can lead to a reduction of traffic on another interface. For example, when a charging server becomes overloaded, the DSC can initiate throttling of initial attach messages. This reduces the number of new subscribers, following the principle of addressing overload as close to its source as possible.

THIRD PROTECTION LINE

User data management systems and, in particular, user databases sit at the end of the signaling chain.
During a network overload scenario, user databases will therefore naturally be under pressure and are likely to become the first overloaded component in the network. At the same time, user databases are the heart of telecommunications networks: they are vital for delivering consumer services, and a failure or degradation of user database performance might compromise the complete network.

In the user data management space, 3GPP standardized a data-layered architecture named User Data Convergence [4]. In this architecture, traditional network databases in the core network such as the home location register (HLR), the HSS, authentication, authorization and accounting (AAA), and the policy controller are split into Application Front Ends (AFEs), which handle the business logic, and a user data repository (UDR), which takes the role of the user database storing the user data.

In order to minimize the severity and duration of overload incidents, user data management systems should:
> > secure their availability and user data integrity at all times, no matter how severe the overload incident might be. For that purpose, overload protection functionalities are a must for both AFEs and UDRs.
> > maximize end-to-end useful throughput during overload. Different complementary strategies can be applied, such as:
    > > ensuring throughput elasticity during overload with an adequate latency tradeoff
    > > intelligent traffic throttling: cooperative load regulation in the user data management systems (Front End (FE) and UDR).

As previously shown in Figure 2, a system will typically experience throughput degradation with increasing levels of overload, and databases are no exception. User data management systems require more intelligent throttling of excess traffic during overload to ensure that the resources of the UDR are fully used to process useful end-to-end traffic. The cornerstone of this mechanism is the cooperation between the AFEs in the network (such as the HLR, the HSS and AAA) and the UDR. The UDR should constantly monitor its resource utilization levels, such as the response time and the length of the buffers. As soon as any of these resources reaches its limit, the situation is reported back to the AFE as an overload indication.
Based on the overload indication sent from the UDR, the AFEs should throttle the traffic according to the following principles:
> > The most important signaling messages are not throttled.
> > Ongoing operations that involve several messages to the UDR are prioritized over new operations. Typically, a Mobile Application Part (MAP)/Diameter operation received in the AFE involves several messages sent to the UDR.

The throttling level should be continuously adjusted based on a dynamic, real-time feedback loop. This ensures that the UDR always performs at its maximum capacity and avoids the throughput degradation caused by devoting resources to rejecting the excess traffic. Figure 5 shows the expected behavior of a user data management system with cooperative load regulation between the AFE and a UDR with throughput elasticity during overload, compared with a standard system.

[Figure 5: Overload performance behaviour of a user data management system (AFE+UDR) with cooperative load regulation. Horizontal axis: offered load (MAP, Diameter); vertical axis: processed throughput (MAP, Diameter). Marked levels: dimensioned capacity, engineered capacity and maximum throughput; regions: normal operation and overload. Two curves are compared: with standard node-level UDR overload control, and with AFE-UDR intelligent throttling.]
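The feedback loop between UDR and AFE described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the mechanism of any particular product: the class names, the monitoring limits (50 ms response time, 90 percent buffer fill) and the adjustment rule (halve the admitted share on an overload indication, raise it by 5 percentage points otherwise) are hypothetical choices. Ongoing operations are simply exempted from throttling here as the simplest way to prioritize them over new operations.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class UdrStatus:
    response_time_ms: float
    buffer_fill: float  # 0.0 (empty) to 1.0 (full)


class UdrMonitor:
    """Turns UDR resource readings into an overload indication (limits are assumed)."""

    def __init__(self, max_response_ms: float = 50.0, max_buffer_fill: float = 0.9) -> None:
        self.max_response_ms = max_response_ms
        self.max_buffer_fill = max_buffer_fill

    def overloaded(self, status: UdrStatus) -> bool:
        return (status.response_time_ms > self.max_response_ms
                or status.buffer_fill > self.max_buffer_fill)


class AfeThrottle:
    """AFE-side throttle that follows the overload indications of the UDR."""

    def __init__(self, step_up: float = 0.05, back_off: float = 0.5, floor: float = 0.1) -> None:
        self.step_up = step_up
        self.back_off = back_off
        self.floor = floor
        self.admit_fraction = 1.0  # share of new operations currently admitted

    def on_feedback(self, udr_overloaded: bool) -> float:
        """Adjust the throttling level after each indication from the UDR."""
        if udr_overloaded:
            self.admit_fraction = max(self.floor, self.admit_fraction * self.back_off)
        else:
            self.admit_fraction = min(1.0, self.admit_fraction + self.step_up)
        return self.admit_fraction

    def admit(self, critical: bool, ongoing: bool, sample: Optional[float] = None) -> bool:
        # The most important messages and ongoing operations are not throttled.
        if critical or ongoing:
            return True
        # New operations are admitted with probability admit_fraction.
        if sample is None:
            sample = random.random()
        return sample < self.admit_fraction
```

A typical loop would call `UdrMonitor.overloaded` on each fresh status reading, pass the result to `AfeThrottle.on_feedback`, and then consult `admit` for every incoming MAP or Diameter operation.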

Conclusion

Operators are confronted with increasing complexity in mobile networks. At the same time, signaling traffic is continuously increasing, driven by the growth of smartphone traffic and new usage scenarios presented by the IoT. The performance and availability of the signaling network are essential for service delivery to the customer. Existing overload protection mechanisms that focus only on individual overloaded nodes cannot prevent larger-scale outages. Any outage in the signaling network will lead to service interruptions, causing financial losses and increasing the risk of subscriber churn.

A robust and distributed signaling network that follows the concept of three protection lines provides an end-to-end overload protection solution that fulfills the objectives of a robust signaling network. A robust and scalable network architecture, optimization of the signaling traffic, and a distributed and coordinated overload protection mechanism together boost the availability of the signaling network beyond today's levels.

References

[1] Ericsson, February 2015, Ericsson Mobility Report, Mobile World Congress Edition, available at: http://www.ericsson.com/res/docs/2015/ericsson-mobility-report-feb-2015-interim.pdf
[2] ENISA (European Union Agency for Network and Information Security), August 2013, Annual Incident Reports 2012, available at: http://www.enisa.europa.eu/activities/resilience-and-ciip/incidents-reporting/annual-reports/annual-incident-reports-2012-1/annual-incident-reports-2012/at_download/fullreport
[3] Heavy Reading, October 2013, Mobile Network Outages & Service Degradations: A Heavy Reading Survey Analysis, available at: http://www.heavyreading.com/details.asp?sku_id=3103&skuitem_itemid=1524&promo_code=&aff_code=&next_url=%2flist.asp%3fpage_type%3dall_reports
[4] 3GPP TS 23.335, accessed June 2014, User Data Convergence (UDC); Technical realization and information flows; Stage 2, available at: http://www.3gpp.org/dynareport/23335.htm

GLOSSARY

AAA                 authentication, authorization and accounting
AFE                 Application Front End
DSC                 Diameter Signaling Controller
ENISA               European Union Agency for Network and Information Security
EPC                 Evolved Packet Core
FE                  Front End
HLR                 home location register
HSS                 Home Subscriber Server
IMS                 IP Multimedia Subsystem
IoT                 Internet of Things
MAP                 Mobile Application Part
MME                 Mobility Management Entity
OCS                 Online Charging System
PCRF                policy and charging rules function
P-CSCF              proxy call session control function
PDN-GW              packet data network gateway
SGSN                Serving GPRS Support Node
Signaling Network   Part of a telecommunications network carrying signaling traffic
UDR                 user data repository
UE                  user equipment
VNF                 virtualized network function

2015 Ericsson AB. All rights reserved.