DATAOPT SOLUTIONS What Is Big Data?
WHAT IS BIG DATA?
It's more than just large amounts of data, though that's definitely one component. The more interesting dimension is the types of data involved. Big Data is increasingly about more complex structures and how you go about capturing and analyzing transaction and interaction data. We also see many different data sources, such as image, text, voice, and machine data, whose structure can change depending on the analysis. Big Data is also about big analytics: applying more sophisticated algorithms to get deeper insight out of the data without compromising on scale. At the end of the day, the most important thing about the data is what you can do with it. Unlike other techniques such as BI, Big Data allows people to tap into much larger amounts of data on the fly. It's about using the data to discover new insights.
BUSINESS REQUIREMENT ON BIG DATA
Because the business requirement is to analyze vast amounts of ever-changing structured and unstructured Big Data almost instantaneously, companies will be hard pressed to do this on their own. But because Big Data stored in the cloud can be accessed from anywhere the internet is available and can be analyzed almost instantaneously by third-party service providers, outsourcing companies can offer their clients value-added Big Data analytics services. Clients no longer need the heavy investments in specialized hardware and software that traditional data analytics required. This will significantly bring down the costs (especially fixed costs) associated with building and maintaining an analytics infrastructure and solution center. I expect all major IT services and consulting companies to invest heavily in building delivery capability in Big Data and analytics to tap into this opportunity.
WHAT MAKES IT BIG DATA?
Volume: The amount of data generated by companies and their customers, competitors, and partners continues to grow exponentially.
Velocity: Data continues changing at an increasing rate of speed, making it difficult for companies to capture and analyze.
Variety: It's no longer enough to collect just transactional data. Analysts are increasingly interested in new data types. These data types add richness that supports more detailed analyses.
Complexity: With more details and sources, the data is more complex and difficult to analyze.
CHARACTERISTICS OF BIG DATA
AN APPETITE OF DATA
BIG DATA SOURCES
NEW DATA CATEGORIES
How does your enterprise's data suddenly balloon from gigabytes to hundreds of terabytes and then on to petabytes? One way is that you start working with entirely new classes of information. While much of this new information is relational in nature, much is not. In the past, most relational databases held records of complete, finalized transactions. In the world of Big Data, sub-transactional data plays a big part, too. Here are a few examples:
- Click trails through a website
- Shopping cart manipulation
- Tweets
- Feedback and comments
- Text messages
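To make the difference between sub-transactional data and a finalized transaction concrete, here is a minimal Python sketch. The event tuples, SKU values, and function names are hypothetical and chosen only for illustration. A traditional relational record would hold just the final cart, while the raw event trail also shows what the shopper added and later removed.

```python
from collections import Counter

# Hypothetical shopping cart events (action, sku, quantity).
# These are the sub-transactional records; none of this is real data.
EVENTS = [
    ("add", "A-100", 2),
    ("add", "B-200", 1),
    ("remove", "A-100", 1),
    ("add", "C-300", 1),
    ("remove", "C-300", 1),
]

def final_cart(events):
    """Collapse the event trail into the finalized transaction."""
    cart = Counter()
    for action, sku, qty in events:
        cart[sku] += qty if action == "add" else -qty
    # Keep only items that are still in the cart at the end.
    return {sku: qty for sku, qty in cart.items() if qty > 0}

def abandoned_items(events):
    """Items that were added at some point but are not in the final cart.
    This information exists only in the sub-transactional data."""
    added = {sku for action, sku, _ in events if action == "add"}
    return sorted(added - set(final_cart(events)))

order = final_cart(EVENTS)
dropped = abandoned_items(EVENTS)
```

The `order` value is all a finalized-transaction table would store. The `dropped` list can only be computed if the individual cart events were captured in the first place.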
WHY NOT REPLACE ANALYTICAL RELATIONAL DATABASES WITH HADOOP?
Analytical relational databases were created for rapid access to large data sets by many concurrent users. Typical analytical databases support SQL and connectivity to a large ecosystem of analysis tools. They efficiently combine complex data sets, automate data partitioning and indexing techniques, and provide complex analytics on structured data. They also offer security, workload management, and service-level guarantees on top of a relational store. Thus, the database shields the user from the mundane tasks of partitioning data and optimizing query performance. Since Hadoop is founded on a distributed file system and not a relational database, it removes the requirement for a data schema. Unfortunately, Hadoop also eliminates the benefits of an analytical relational database, such as interactive data access and a broad ecosystem of SQL-compatible tools. Integrating the best parts of Hadoop with the benefits of analytical relational databases is the optimum solution for a big data analytics architecture.
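The contrast above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Hadoop's actual API: the standard-library `sqlite3` module stands in for an analytical relational database, and plain Python functions imitate the map, shuffle, and reduce steps of the MapReduce programming model over schema-less text lines. The sample sales figures and all function names are hypothetical.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sample data in two forms: typed rows for the relational
# store, and raw text lines as they might sit in a distributed file system.
SALES_ROWS = [("north", 120.0), ("south", 80.0), ("north", 30.0), ("east", 50.0)]
RAW_LINES = ["north,120.0", "south,80.0", "north,30.0", "east,50.0"]

def totals_with_sql(rows):
    """Relational style: declare a schema once, then ask a declarative question."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    result = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region"
    ).fetchall()
    conn.close()
    return dict(result)

def map_phase(line):
    """MapReduce style: the structure is imposed by the job, not the store."""
    region, amount = line.split(",")
    yield region, float(amount)

def totals_with_mapreduce(lines):
    groups = defaultdict(list)
    for line in lines:
        for key, value in map_phase(line):
            groups[key].append(value)  # shuffle: group values by key
    # Reduce: combine the values collected for each key.
    return {key: sum(values) for key, values in groups.items()}

sql_totals = totals_with_sql(SALES_ROWS)
mr_totals = totals_with_mapreduce(RAW_LINES)
```

Both paths produce the same totals. The SQL version gets parsing, grouping, and optimization from the database, while the MapReduce-style version needs no schema up front but leaves parsing and grouping logic to the person writing the job.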
WHAT CAN YOU DO WITH BIG DATA?
Big Data has the potential to revolutionize the way you do business. It can provide new insights into everything about your enterprise, including the following:
- The way your customers locate and interact with you
- The way you deliver products and services to the marketplace
- The position of your organization vs. your competitors
- Strategies you can implement to increase profitability
- And many more
THE CHALLENGES OF CONVERTING BIG DATA
Addressing the multiple challenges posed by big data volumes is not easy. Unlike transactional data, which can be stored in a stable schema that changes infrequently, interactional data types are more dynamic. They require an evolving schema, which is defined dynamically, often on the fly at query runtime. The ability to load data quickly, and to evolve the schema over time if needed, is a tremendous advantage for analysts who want to reduce the time to valuable insights. Some data formats may not fit well into a schema without heavy pre-processing, or may need to be loaded and stored in their native format. Dealing with this variety of data types efficiently can be difficult. As a result, many organizations simply delete this data or never bother to capture it at all.
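The idea of a schema defined at query runtime can be sketched as follows. This is a minimal Python illustration, assuming the interaction records are kept in their native form as one JSON object per line. The sample records, field names, and function names are hypothetical.

```python
import json

# Hypothetical raw interaction records, stored as-is with no fixed schema.
RAW_LINES = [
    '{"user": "u1", "event": "click", "page": "/home"}',
    '{"user": "u2", "event": "tweet", "text": "great service", "lang": "en"}',
    '{"user": "u1", "event": "cart_add", "sku": "A-100", "qty": 2}',
]

def read_with_schema(lines, fields):
    """Apply a schema at read time: keep only the requested fields and
    fill in None where a record does not carry a field."""
    for line in lines:
        record = json.loads(line)
        yield {name: record.get(name) for name in fields}

def discover_fields(lines):
    """Collect the union of field names seen so far (the evolving schema)."""
    seen = set()
    for line in lines:
        seen.update(json.loads(line).keys())
    return sorted(seen)

# The analyst chooses the schema when asking the question, not when loading.
rows = list(read_with_schema(RAW_LINES, ["user", "event", "sku"]))
all_fields = discover_fields(RAW_LINES)
```

Because nothing is rejected at load time, a new record type with new fields can be captured immediately. The cost is that every query has to cope with missing fields, which is the `None` handling in `read_with_schema`.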