EXECUTIVE SUMMARY
Big Data is no longer an uncommon term in the technology industry. It's of great interest to many leading IT providers and archiving companies. But what is Big Data? While many have formed their own definition, Sonian believes that Big Data generally refers to a set of data or information so large that it requires new architectures to store, manage, search and analyze its value.
"More enterprise data will be created in the next three years than in the history of the planet. Customers need to preserve this information, and will pay for features that mine data for actionable intelligence."
Greg Arnette, Founder and CTO, Sonian, Inc.
Big Data is not new and has been around for a long time. Nearly every day, 2.5 quintillion bytes of data are created. Of all the data that currently exists, 90% has been created within the last two years. With large amounts of business information living in traditional mailboxes and computers, in addition to traditional enterprise software systems, managing non-transactional data has become even more daunting. From log files to click stream data, web indexing, email messages and attachments, social media posts and more, internet data centers are collecting massive volumes of data that need to be processed at a low cost in order to drive monetary value. According to International Data Corporation (IDC), enterprise data doubles every 18 months. IT struggles to manage all this growth cost-effectively, and there is huge, untapped value in all that data. The cloud, however, can handle and organize this enterprise data.
THE IMPORTANCE OF BIG DATA
Companies must understand the importance of Big Data in order to address the unstructured information overload. Big Data is a key component as an enterprise looks to address its business agility, which cannot evolve without IT agility. With an aggregate view of all its data, a company can answer difficult questions from customers or analysts.
Most importantly, by analyzing their Big Data for trends, statistics, and other actionable information, companies can decide on their next move and grow their business. Marketing and sales departments want a 360-degree lifetime view of the customer: a centralized view of the customer's retail, website, email and telephone interactions that can be pulled up immediately when a customer walks in the door or talks with a call center agent. Likewise, as more enterprises engage in social media to answer customer questions, correct misunderstandings or learn from customer feedback, responding to Facebook or Twitter updates in real time or near real time becomes significantly more necessary.
In today's technology world, it is vital for an organization to understand that it doesn't make sense to grow storage capacity linearly. Instead, a company should analyze and understand the information it has so it can identify how much of that information is valuable, establish processes, and leverage tools that enable it to keep the information that is valuable and discard the information that is not. This is a key driver in the decision to move to cloud computing. Cloud computing is helping to democratize the data management issues that face companies today and in the years to come.
THE BIG DATA MARKET OPPORTUNITY
Big Data has created a large market opportunity for technology vendors. These vendors fall into two main categories: storage vendors, and data warehousing and BI vendors. Storage vendors focus on the sheer size of the information, while data warehousing and BI vendors focus on the need for advanced statistical and predictive analytical capabilities to sift through the vast volumes of information. Today, some of the largest technology companies are investing in Big Data. Sonian partners with many of these large companies, which enables our customers to receive a best-of-breed cloud computing solution.
The Numbers Speak for Themselves
In 2010, the Digital Universe (a fancy term for all the data created by consumers and businesses on earth, including video, audio, documents, etc.) will grow by 1.2 zettabytes, or 1.2 million petabytes. There were 150 exabytes of traffic on the Internet in 2009 and 245 exabytes in 2010, and the Internet could hit 1,000 exabytes of traffic by 2015 thanks to more than one billion people joining the web. By 2020, the Digital Universe will be 44 times as large as it was in 2009. By 2020, much of this data will be held in cloud environments or will be "touched by cloud," meaning that it transits through a cloud service or is temporarily held in a cloud application.
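The growth figures quoted above imply compound annual rates that are easy to check. The following is a minimal Python sketch that uses only the numbers cited in this section; the helper name `annual_growth_rate` is illustrative and is not part of any Sonian product.

```python
def annual_growth_rate(growth_factor: float, years: float) -> float:
    """Return the compound annual growth rate implied by a total growth factor."""
    return growth_factor ** (1.0 / years) - 1.0


# Digital Universe: 44 times as large in 2020 as in 2009 (11 years).
digital_universe_rate = annual_growth_rate(44, 2020 - 2009)

# Internet traffic: 150 exabytes in 2009, 245 in 2010, a possible 1,000 in 2015.
traffic_rate_2009_2010 = annual_growth_rate(245 / 150, 1)
traffic_rate_2010_2015 = annual_growth_rate(1000 / 245, 2015 - 2010)

# Unit check: 1 zettabyte = 1,000 exabytes = 1,000,000 petabytes.
PETABYTES_PER_ZETTABYTE = 1_000_000
growth_2010_petabytes = 1.2 * PETABYTES_PER_ZETTABYTE

print(f"Digital Universe, 2009 to 2020: {digital_universe_rate:.0%} per year")
print(f"Internet traffic, 2009 to 2010: {traffic_rate_2009_2010:.0%} per year")
print(f"Internet traffic, 2010 to 2015: {traffic_rate_2010_2015:.0%} per year")
print(f"1.2 zettabytes = {growth_2010_petabytes:,.0f} petabytes")
```

Under these figures, a 44-fold increase over eleven years works out to roughly 41% compound growth per year, which is why adding storage capacity linearly cannot keep pace.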
"Almost every industry today has petabytes of customer, business, network, and web data but few organizations have the right tools to extract optimal insights from this glut of information."
Stacey Higginbotham, Conference Chair and Senior Writer at GigaOM
TAMING BIG DATA WITH THE CLOUD
A recent statement by Joe Corvaia, Vice President, Solutions Engineering, at Broadview Networks sums up the business value of cloud computing and Big Data: "Cloud computing has evolved from terminology used in connection with big data centers and large IT departments to something understood by business decision-makers at all levels. Business owners are increasingly aware of how the cloud model can change the way they consume and use information technology. Now that it has become more common knowledge in board rooms, we can expect to have many more discussions about how cloud can increase productivity, improve operations, and reduce the total cost of IT ownership."
Cloud Computing: Big Data Technology paper, Booz Allen
McKinsey & Company published a whitepaper in May 2011 titled "Big Data: The next frontier for innovation, competition and productivity." The report makes clear that not only the volume but also the value of Big Data will matter, giving corporations an important tool as they look for ways to compete. The research is not focused solely on unstructured data and other non-SQL sources; it points out that structured data is also growing at a tremendous rate. The importance of Big Data is called out in one of the many key statements found within the report: "Simply making data more easily accessible to relevant stakeholders in a timely manner can create tremendous value." At Sonian we believe that Big Data is the future of the enterprise, and that tackling Big Data will determine the winners and losers in the next wave of cloud computing innovation.
GETTING YOUR ARMS AROUND BIG DATA
Today, it is standard practice to consider a company handling over a petabyte (1,000 terabytes) of data a Big Data company. But Big Data should not be treated as just a size problem. Many companies hold fewer than hundreds of terabytes, and they too can have a Big Data problem, because the consumption and nature of information have dramatically changed. As a Big Data company, Sonian views the growth challenges of Big Data across several dimensions: quantity, size, and speed.
Quantity refers to a large quantity of small items, like a Twitter data feed. Each tweet is limited to 140 characters, but with over 200 million users, Twitter can accumulate a lot of data.
Size refers to a small quantity of large items, as at Netflix. This video distribution company can stream hundreds of thousands of movies and shows, each about 4 GB in size, which we term large.
Speed refers to the ability to quickly find a digital needle in a gigantic data haystack.
Because Big Data is typically unstructured, new kinds of search mechanisms (including alternative indexing strategies) are required to handle its quantity and size.
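To make the idea of an indexing strategy concrete, here is a minimal Python sketch of an inverted index, one common way to search unstructured text without scanning every item. It is an illustration only and does not describe Sonian's actual implementation; the function names and the sample documents are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its distinct alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def build_inverted_index(documents: dict[str, str]) -> dict[str, set[str]]:
    """Map each word to the set of document ids that contain it."""
    index: dict[str, set[str]] = defaultdict(set)
    for doc_id, text in documents.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index: dict[str, set[str]], query: str) -> set[str]:
    """Return the ids of documents that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    # Look up each word's document set, then keep only ids present in all of them.
    return set.intersection(*(index.get(word, set()) for word in words))


# Hypothetical sample data mixing emails and a social media post.
documents = {
    "email-1": "Quarterly contract review attached",
    "email-2": "Lunch on Friday?",
    "tweet-1": "Contract signed, review next week",
}
index = build_inverted_index(documents)
print(sorted(search(index, "contract review")))
```

The index is built once as items are ingested, so a query touches only the entries for its own words rather than the whole data set, which is what makes finding the needle fast as the haystack grows.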
Size, speed and quantity each have their own unique challenges. Together, they create an enormous challenge for today's enterprise corporations. Managing information can be time consuming and costly, and can quickly consume valuable human resources. Big Data is hard to manage on-premises. The backup, recovery, and indexing processes can all get out of control with ever-growing data sets.
SECURING BIG DATA
While size, speed and quantity are important dimensions of managing Big Data, security is equally important. Security is vital to corporations because enterprise data is private data, not public data. Organizations are beginning to realize the tremendous risk that comes with the explosion of unstructured information in email, images, log files, user-generated content, documents, videos, blogs, contracts, wikis, web content and more. To add to that risk, about 80 percent of the data in an enterprise is unstructured, meaning it cannot easily be organized into databases.
Leveraging the public cloud, Sonian provides customers with a 99.99% data retention SLA by using numerous geographically separate data centers interconnected by high-speed networks. Unlike other SaaS archiving providers, Sonian provides a private data silo for you and your company. We store data in multiple clouds, providing customers with a Global RAID that is better than anything offered on-premises or in a single cloud. On top of this highly reliable storage infrastructure, we offer a three-layered security system. Sonian uses Secure Sockets Layer (SSL) to encrypt all communication between the web browser and the data center. In addition, our use of a processing pipeline ensures performance and data privacy between our customer accounts.
YOUR EMAIL DATA IS BIG DATA
Email management continues to be a significant challenge for most organizations.
With email volumes growing at 20% each year, organizations across the globe face the daunting task of managing this Big Data problem by adding new hardware, software, maintenance, and administration to accommodate the explosive volume of email. To complicate matters, many organizations are obligated to satisfy specific email retention and retrieval regulations, which cannot be met with traditional out-of-the-box email server/client products (PST files).
Corporate management needs control over an organization's data and access to historical email. Emails are permanent records of a company and represent versions of past conversations and documents. They are considered business records and need to be managed accordingly to mitigate risk. Regulations and internal policies treat email as a form of business communication that requires regulatory monitoring and oversight.
IT technicians are overburdened with technology issues. Daily challenges include finding and recovering email from backup tapes, CDs and failed hard drives, controlling mailbox sizes, and managing increasing email volumes.
THE SONIAN SOLUTION FOR BIG DATA
Today, Sonian's cloud-powered archiving system ingests and indexes over 15 million objects per day, and that number continues to grow. With enterprise data growing at such a fast rate, the cloud is becoming an important ingredient in meeting these enterprise storage needs. Sonian's use of cloud infrastructure to deliver SaaS services enables enterprise companies to become more agile by making the complex simpler. Since Sonian was founded in 2007, the company has accumulated over 8,000 customers. Providing a secure service and fast search, Sonian currently manages over a petabyte of data.
When your enterprise wants to mash up your data with other public and private data sources, Sonian has the answer. Within a matter of hours, using no hardware or software, Sonian customers can store and search all types of data (emails, attachments, social media, etc.) with unlimited retention, for as long as that data is needed. Ultimately, Sonian provides a safe place to park enterprise data for long-term compliance and analytical benefits.
Sonian's Mission is to Organize Enterprise Data and Make it Accessible and Useful.