Building Heavy Load Messaging System

Similar documents
GigaSpaces Real-Time Analytics for Big Data

Zend and IBM: Bringing the power of PHP applications to the enterprise

Integration Maturity Model Capability #5: Infrastructure and Operations

MySQL. Leveraging. Features for Availability & Scalability ABSTRACT: By Srinivasa Krishna Mamillapalli

On- Prem MongoDB- as- a- Service Powered by the CumuLogic DBaaS Platform

Scaling Database Performance in Azure

WHITE PAPER RUN VDI IN THE CLOUD WITH PANZURA SKYBRIDGE

Open source business rules management system

Software-Defined Networks Powered by VellOS

How To Test A Web Server

Digital Messaging Platform. Digital Messaging Platform. AgilityHarmony. Orchestrate more meaningful relationships between you and your customers

Solr Cloud vs Replication

MakeMyTrip CUSTOMER SUCCESS STORY

Open Source Business Rules Management System Enables Active Decisions

Developing a Highly Available, Dynamic Hybrid Cloud Environment

THE WINDOWS AZURE PROGRAMMING MODEL

This document has for purpose to elaborate on how Secomea have addressed all these topics with a solution consisting of the three components:

Lambda Architecture for Batch and Real- Time Processing on AWS with Spark Streaming and Spark SQL. May 2015

Migration Scenario: Migrating Backend Processing Pipeline to the AWS Cloud

Performance TesTing expertise in case studies a Q & ing T es T

Accelerating Time to Market:

Optimize Your SharePoint 2010 Content Using Microsoft s New Storage Guidance

Responsive, resilient, elastic and message driven system

PLUMgrid Toolbox: Tools to Install, Operate and Monitor Your Virtual Network Infrastructure

IBM PureFlex System. The infrastructure system with integrated expertise

COMP5426 Parallel and Distributed Computing. Distributed Systems: Client/Server and Clusters

EMC NETWORKER AND DATADOMAIN

Distributed Software Development with Perforce Perforce Consulting Guide

Scalable Architecture on Amazon AWS Cloud

Integrating Web Messaging into the Enterprise Middleware Layer

CLOUD DEVELOPMENT BEST PRACTICES & SUPPORT APPLICATIONS

Cloud Computing and Advanced Relationship Analytics

ARC VIEW. Stratus High-Availability Solutions Designed to Virtually Eliminate Downtime. Keywords. Summary. By Craig Resnick

Building Multi-Site & Ultra-Large Scale Cloud with Openstack Cascading

Real Time Data Processing using Spark Streaming

REDEFINE SIMPLICITY TOP REASONS: EMC VSPEX BLUE FOR VIRTUALIZED ENVIRONMENTS

Cisco Data Center Services for OpenStack

Cisco Integrated Video Surveillance Solution: Expand the Capabilities and Value of Physical Security Investments

Software Development In the Cloud Cloud management and ALM

Cloud CRM. Scalable solutions for enterprise deployment

90% of your Big Data problem isn t Big Data.

Ikasan ESB Reference Architecture Review

SYSPRO Integration SYSPRO Integration Framework

DLT Solutions and Amazon Web Services

THE ENSIGHTEN PROMISE. The Power to Collect, Own and Activate Omni-Channel Data

API Architecture. for the Data Interoperability at OSU initiative

Intel IT s Cloud Journey. Speaker: [speaker name], Intel IT

Migration and Disaster Recovery Underground in the NEC / Iron Mountain National Data Center with the RackWare Management Module

A Survey Study on Monitoring Service for Grid

Software Life-Cycle Management

COMPARISON OF VMware VSHPERE HA/FT vs stratus

CUMULUX WHICH CLOUD PLATFORM IS RIGHT FOR YOU? COMPARING CLOUD PLATFORMS. Review Business and Technology Series

EXECUTIVE SUMMARY CONTENTS. 1. Summary 2. Objectives 3. Methodology and Approach 4. Results 5. Next Steps 6. Glossary 7. Appendix. 1.

CLOUD COMPUTING SOLUTION - BENEFITS AND TESTING CHALLENGES

Database Development Best Practices. Database Development Best Practices. Copyright 2006 Quest Software

SysAid Cloud Architecture Including Security and Disaster Recovery Plan

Chapter 5. Regression Testing of Web-Components

How Solace Message Routers Reduce the Cost of IT Infrastructure

IBM WebSphere MQ File Transfer Edition, Version 7.0

Databricks. A Primer

Cisco Unified Data Center Solutions for MapR: Deliver Automated, High-Performance Hadoop Workloads

BigMemory & Hybris : Working together to improve the e-commerce customer experience

Technology Insight Series

emind Webydo Moves to the Google Cloud Platform (GCP) with Emind For a Scalable Cloud Customers Stories by Overview About Webydo

Everything You Need To Know About Cloud Computing

Cloud Computing Patterns Fundamentals to Design, Build, and Manage Cloud Applications

Scaling Out With Apache Spark. DTL Meeting Slides based on

GLOBAL DIGITAL ENTERTAINMENT CONTENT AND SERVICES PROVIDER JESTA DIGITAL MIGRATES TO THE ULTRAESB

Pluribus Netvisor Solution Brief

Sentinet for BizTalk Server SENTINET 3.1

ENZO UNIFIED SOLVES THE CHALLENGES OF OUT-OF-BAND SQL SERVER PROCESSING

Web Application Deployment in the Cloud Using Amazon Web Services From Infancy to Maturity

VMware Hybrid Cloud. Accelerate Your Time to Value

Solving I/O Bottlenecks to Enable Superior Cloud Efficiency

Planning and Implementing Disaster Recovery for DICOM Medical Images

Table of Contents. Abstract. Cloud computing basics. The app economy. The API platform for the app economy

1 Product. Open Text is the leading fax server vendor in the world. *

Efficient database auditing

Databricks. A Primer

EVOLVED DATA CENTER ARCHITECTURE

Cloud Computing for Control Systems CERN Openlab Summer Student Program 9/9/2011 ARSALAAN AHMED SHAIKH

Nine Use Cases for Endace Systems in a Modern Trading Environment

Transcription:

CASE STUDY Building Heavy Load Messaging System About IntelliSMS Intelli Messaging simplifies mobile communication methods so you can cost effectively build mobile communication into your business processes; Marketing or Operations. Whether you are an application provider, telecommunication services company or a large business we have the service and application offerings to support your needs. About SoftwareMill SoftwareMill, Typesafe Consulting partner, is a software house based in Poland that specializes in delivering customized software solutions. Their clients come from the telecommunications, banking, logistics and entertainment industries. A wide range of systems can be found in their portfolio, starting from high-performance mobile application backends, through fault-tolerant messaging gateways, big-data reporting and credit management systems. The Challenge IntelliSMS is present on the SMS market for a long time, since the year 2000, and through the years evolved to offer a wide range of SMS-related services: a wholesale enterprise SMS gateway, bulk and individual SMS sends, account and billing management for resellers, REST API for integrating with applications, 2-way SMS - replying to unique messages, and many others. A couple generations of the MessageCore platform, backing the above mentioned services, were developed.

The previous system, MessageCore4 (MC4), worked well in the beginning, but as more and more customers signed up and message volumes increased scalability problems started to surface, which hindered the business. Also, the code base grew large and hard to maintain, due to a number of features being added and removed over the years. That s when IntelliSMS turned to SoftwareMill to take part in developing the next generation system, MC5. The core requirements were to create a scalable architecture which would accommodate the increasing messaging volume. Another important requirement was for the messaging to be reliable. A large part of the messaging traffic is alert messaging, so it is vital that the messages are not lost. Combining a replicated, persistent message storage, which will deliver messages even in case of a downstream service not being available or a server crashing permanently, with the scalability and performance requirements, provided a challenging task for our team to work on. When creating the new system, an important factor was to maintain a clean, extendable code base, where new features can be implemented and plugged in relatively quickly, without a need to alter existing code in a significant way.

The Solution We started by drafting a system architecture. While we are strong believers in the Agile methodologies of developing and delivering software, and we try to minimize the amount of upfront planning and design, some planning is always necessary, to roughly guide the development. A good architecture allows the team to meet the basic system requirements, while being flexible enough to allow specific design decisions to be made as late as possible. The architecture of MC5 assumed dividing the system into a number of independent components: a submission server, where users submitted sms send request, and queries for sms status workers, which through a series of pluggable handlers provided message routing, billing, reacting to reply SMS messages SMSC communication servers, which send the messages downstream to specific SMS providers reporting servers, to provide statistics and insights into SMS sending/delivery pattern

Each component is independent, can be scaled independently, and doesn t have to work at the same time as the other components. This is especially important for the SMSC communication servers, which connect to third-party providers; such links are inherently unreliable, and can go up and down unpredictably. The components are written using pure Java; some of them are deployed in a servlet container, but most as stand-alone, fat-jars, which allows for easy management and fast provisioning. This adds to the simplicity of the system, which results in a lower operations and maintenance overhead. The components communicate asynchronously through a persistent, replicated message queue. The queue implementation is pluggable, and currently we are using a custom solution based on MongoDB. As we are also using MongoDB for other purposes, and as MongoDB is great both because of the ease of installation, use and performance for streaming, replicated operations, it was a natural choice. Our Mongo-based queue easily handles thousands of persistent, replicated messages per second, and can be scaled using the Starling/Kestrel model, by adding new sets of replicated nodes and using a random set of nodes when sending/receiving messages. Apart from messaging, the system needs to persist SMS information at each stage of processing the message: when the SMS is accepted for sending, routed, delivered, and finally billed. This forms a natural stream of events, which get persisted to a Mongo collection. Not only we are able to handle very large volume of events efficiently (each event is immutable, so the Mongo collection is essentially append-only), but we get a clear audit trail for each message, in case we need to debug the system behavior. Finally, the reporting system uses SQL database, which is populated using event sourcing from the Mongo collections. This can also be done efficiently, thanks to how Mongo works and the immutable, streaming nature of events.

An SQL database was chosen for reporting because of the familiarity of the system users with the language and the flexibility in forming queries, and getting responses quickly. Designing for performance and scalability is one thing, making sure the system is actually performant and scalable is another. That is why one of the first tasks we completed, after a skeleton of the system was done, was setting up an overnight, automated stress test, where the system was sending and receiving simulated SMS messages over 4 hours. This let us iron out many integration issues, on the Operating System, Load Balancer, Mongo and application levels. As the test runs each day, after a day of work we quickly know when we have degraded the system s performance. As mentioned in the introduction, code quality and maintainability was key when developing the project. We are using a number of now standard methods, like (fast) unit and (slower) integration tests, adhering to clean code rules, paying attention to naming, keeping classes small, with a single responsibility and so on. Each change is also code-reviewed by other team members, both to catch bugs and design flows, but also to spread the knowledge about the system. Remote project delivery But technology isn t everything. Equally, if not more important is the way teams collaborate to deliver software. As IntelliSMS is based in Melbourne, Australia, and SoftwareMill in Warsaw, Poland, we formed a truly distributed team (SoftwareMill is additionally a fully distributed company, our employees come from many parts of the country), working remotely across two quite different time zones. We tried to keep the formalism of our cooperation as low as possible. When it seemed beneficial, we use tools such as TinyPM or Confluence. We had no strict requirement documents, instead we tried to communicate as frequently as possible to deliver what will be the highest value for IntelliSMS, and hence the highest value for IntelliSMS users. That way we could adapt quickly based on the continuously gathered feedback.

Moreover, at SoftwareMill we have a number of proven methods for effective asynchronous and synchronous communication in a distributed setup. In the end the fact that we work remotely was barely noticeable. The Result The system is in production for almost two years now; it has proved to work well under load, and the architecture met the original requirements. The code base is kept in a good shape, allowing existing developers to modify the code quickly and new developers to join the project without problems. We have delivered a performant and resilient SMS platform, which can be scaled to meet future traffic demands, and evolved to include new features, to meet emerging business needs. With a very difficult development task with some very high benchmarks SoftwareMill has delivered a great result and done so very cost effectively. - Peter Humphries, Executive Director at Intelli Messaging. Contact We take your mind off software development. Just drop us a line: - we ll get back to you! SoftwareMill delivers custom software solutions: web applications, back end systems and enterprise solutions. We specialize in Java, Scala and Cloud technologies with particular interest in JBoss, Amazon Web Services and Big Data projects. We develop software solutions with care and strong belief in the agile approach.