The Rise of Big Data Spurs a Revolution in Big Analytics
REVOLUTION ANALYTICS EXECUTIVE BRIEFING

The Rise of Big Data Spurs a Revolution in Big Analytics
By Norman H. Nie, CEO, Revolution Analytics

The enormous growth in the amount of data that the global economy now generates has been well documented, but the magnitude of its potential to drive competitive advantage has not. It is my hope that this briefing urges all stakeholders (executives who must fund analytics initiatives, IT teams that support them, and data scientists who uncover and communicate meaningful insight) to go boldly in the direction of Big Analytics. This opportunity is enormous and without precedent.

Growth in Data Volumes

Big Data refers to datasets that are terabytes to petabytes (and even exabytes) in size, generated by a wide range of sources, including mobile devices and other digital systems used in both the public and private sectors. The massive size of Big Data goes beyond the ability of typical database software tools to capture, store, manage, and analyze it effectively. Previously, limits on hardware, processing power, and data storage were the Achilles' heel of Big Data analysis, but recent technological innovations have overcome those limitations. Organizations today are collecting a broader variety of data at a greater velocity than ever before, resulting in immense growth in data volume. According to a 2011 study by the McKinsey Global Institute, organizations across many sectors capture trillions of bytes of information about their customers, suppliers, and operations through digital systems. For example, millions of networked sensors embedded in mobile phones, automobiles, and other products are continually creating and communicating data. The result is a projected 40 percent annual growth in the volume of data generated. Fifteen of 17 sectors in the U.S. economy have more data stored per company than the U.S.
Library of Congress, which had collected more than 235 terabytes of data as of April 2011. By any definition, that's Big Data. Clearly, we have an abundance of data. But what makes this a genuinely revolutionary moment is that, thanks to Moore's Law, we also have the computing power and the ability to store, retrieve, and analyze such data, all at an affordable price. Recent innovations have had the combined effect of providing commodity multi-core processors and distributed computing frameworks that allow software such as R to exploit this new generation of hardware. Other innovations that build on these changes include external-memory algorithms, data chunking, and the ability to execute both in distributed computing environments. Today's computing hardware is more than capable of crunching through datasets that were considered insurmountable just a few years ago.

Copyright 2011 Revolution Analytics
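To make external-memory algorithms and data chunking concrete, here is a minimal sketch in base R. It illustrates the general technique only (it is not RevoScaleR's actual implementation): a linear model is fit by accumulating the cross-products X'X and X'y one chunk at a time, so only a single chunk ever needs to sit in memory.

```r
# Chunked least squares: stream over the data, accumulate X'X and X'y,
# then solve the normal equations. The accumulators stay tiny (2 x 2 here)
# no matter how many rows are processed.
set.seed(1)
n <- 10000; chunk_size <- 1000
x <- runif(n)
y <- 2 + 3 * x + rnorm(n)                 # simulated data for illustration

xtx <- matrix(0, 2, 2)
xty <- matrix(0, 2, 1)
for (start in seq(1, n, by = chunk_size)) {
  idx <- start:(start + chunk_size - 1)   # in practice, read this chunk from disk
  X <- cbind(1, x[idx])                   # intercept column + predictor
  xtx <- xtx + crossprod(X)               # accumulate X'X
  xty <- xty + crossprod(X, y[idx])       # accumulate X'y
}
beta <- solve(xtx, xty)                   # matches coef(lm(y ~ x))
print(beta)
```

Because the accumulators are small regardless of the row count, the same pattern extends naturally to chunks read from disk or to partial results combined across the nodes of a cluster.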
Advances in Analytics and Visualization through the Evolution of Second-Generation Big Analytics Platforms

Until recently, only clustered samples of observations were used for statistical analysis, despite the burgeoning size of datasets. Reliance on a sample of the population within a universe of observations was a necessity, given both the cost and limitations of hardware and the inability of software to work with big data. The method of sampling thus compromised the accuracy and depth of analysis possible. While most organizations have invested heavily in first-generation analytics applications, recent advances in second-generation Big Analytics platforms such as Revolution R Enterprise have improved both analytical and organizational performance. Big Analytics platforms are optimized for Big Data and can exploit today's massive datasets, thanks to recent performance advancements coupled with innovative analytic platforms. Beyond the analytic routines themselves, data visualization techniques and the way analytics are executed on various kinds of hardware have drastically improved. Armed with the advantages of Big Data, advanced computing hardware, and a new generation of software capable of exploiting both, the potential impact of Big Analytics is far from trivial. For example, McKinsey estimates that Big Data could help the U.S. health care industry generate $300 billion in value every year, resulting in an 8 percent reduction in national spending on health care. Big Data could help U.S. retailers boost margins by 60 percent. And according to McKinsey, governments in Europe could save $149 billion by using Big Data to improve operational efficiency.

Open Source R plus Revolution Analytics: A Duo of Disruptive Forces in Analytics

Revolution R Enterprise is a driving force in the Big Analytics Revolution.
It is based on R, an open source programming language designed expressly for statistical analysis. With more than two million users and growing, R has emerged as the de facto standard for computational statistics and predictive analytics. The worldwide R community has generated more than 4,300 packages (specialized techniques and tools) that can be downloaded free from sites such as CRAN (the Comprehensive R Archive Network). In practice, this means that if you are looking for an advanced analytic technique, you can probably find what you need, or something very close to it, with a few keystrokes and the click of a mouse. Despite the many advantages of working with open-source R, one challenge is its in-memory computation model, which limits the amount of data that can be analyzed to the size of available RAM. Building on the unique architecture of R as a functional programming language, Revolution Analytics has engineered groundbreaking additions to R in its analytics software, Revolution R Enterprise. Today, Revolution R Enterprise is the fastest, most economical, and most powerful predictive analytics product on the market. Revolution R Enterprise also provides a highly efficient file structure along with data chunking capabilities to enable the processing of big data sets across all available computational resources, whether on a single server or in a distributed environment. In addition, it includes multi-threaded external-memory algorithms for the key statistical procedures used to analyze Big Data. These two features, delivered in Revolution R Enterprise as the RevoScaleR package, provide for analysis of virtually unlimited data sizes with an almost linear increase in performance. Furthermore, Revolution R Enterprise can work with data stored in large-scale data warehousing architectures, including Hadoop and IBM Netezza.
Thus, inherent in the Revolution R architecture is the ability to integrate with additional high-speed, distributed computing platforms as they emerge. These additions are all designed to benefit from Revolution Analytics' ongoing parallelization of statistical software to run on a wide variety of platforms.
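The split-and-combine pattern behind such parallelization can be sketched with base R's parallel package. This is a stand-in for illustration only; RevoScaleR's distributed engine is a far more sophisticated system.

```r
# Compute a grand mean by farming chunks out to local worker processes
# and combining their partial sums -- the same shape as a distributed job.
library(parallel)

set.seed(2)
x <- rnorm(1e5)
chunks <- split(x, rep(1:4, length.out = length(x)))   # 4 chunks

cl <- makeCluster(2)                                   # 2 local workers
partials <- parLapply(cl, chunks,
                      function(ch) c(s = sum(ch), n = length(ch)))
stopCluster(cl)

totals <- Reduce(`+`, partials)                        # combine partial results
grand_mean <- totals[["s"]] / totals[["n"]]            # equals mean(x)
```

The key property is that each worker returns only a small summary, so the combine step is cheap no matter how large the data behind each chunk is.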
Business and Strategic Advantages of Analyzing Big Data

The new wave of technological innovation and the rise of Big Data represent a significant turning point: a clear opportunity to break from past practices and revolutionize data analytics. Google's innovative MapReduce was a programming model designed to process large datasets, and it served as a first step towards Big Data processing. Today, new platforms have emerged in the market to exploit Big Data and engage in Big Analytics. What, then, can Big Analytics do that could not be accomplished with smaller datasets? Let's look at five specific advantages of moving towards Big Analytics.

Move Beyond Linear Approximation

First, Big Analytics allows one to move beyond linear approximation models towards more sophisticated and complex models. While it is true that bigger doesn't always mean better, small datasets often limit our ability to make accurate predictions and assessments. Previous limitations in technology and data confined us to certain statistical techniques, particularly linear approximation models. Linear approximation models describe the relationship between two variables of interest and support causal inference and prediction. When the explanatory variable and the outcome variable have a linear relationship on an X-Y axis, linear models are an optimal tool for prediction and analysis. The method is commonly used in business for forecasting and financial analysis. For example, investors and traders may fit linear models to see how prices, stocks, or revenues rise or fall with factors such as timing, the level of GDP, and the like. However, using linear models on non-linear relationships limits our ability to make accurate predictions. For example, an investor may want to determine the trend and long-term movement of GDP or stock prices over an extended period of time.
However, certain events or specific periods may influence the outcome in ways a linear model cannot capture. To handle such possibilities, more sophisticated models are necessary, and access to large datasets now makes accurate, precise versions of those models practical. One example is the conversion of a continuous variable into a set of discrete, categorical variables. When we have a variable with a wide range of values, say annual income or corporate sales revenue, we may want to observe how specific values or intervals within that variable influence the outcome of interest. We can convert a continuous variable into discrete, categorical variables (or factors, as we refer to them in R) by creating indicator variables that take on the value 1 if the observation falls within the category and 0 otherwise. While increasing the complexity and sophistication of the model, the shift from a continuous to a categorical variable can pin down the exact relationship more precisely.
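In R, the conversion is straightforward: cut() bins a continuous variable into a factor, and model.matrix() expands that factor into 0/1 indicator variables. The income values below are simulated for illustration.

```r
# Discretize a continuous variable into categories, then dummy-code it.
set.seed(42)
income <- round(runif(8, 20000, 150000))   # simulated annual incomes

# cut() turns the continuous values into a factor with three levels
bracket <- cut(income,
               breaks = c(0, 50000, 100000, Inf),
               labels = c("low", "mid", "high"))

# model.matrix() expands the factor into one 0/1 indicator per category
dummies <- model.matrix(~ bracket - 1)
print(cbind(income, dummies))
```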
An example illustrates this concept. Suppose we want to determine the relationship between total years of education, total years of experience in the workforce, and annual earnings. To do so, we can rely on U.S. Census data and obtain a sample of the population with millions of observations. Regressing years of education and years in the labor force on annual earnings as continuous variables shows that both variables have a significant and positive linear impact on annual earnings. While the linear approximation appears to work, appearances can be deceiving. In fact, the linear model above is under-predictive, inadequate, and misleading. For both years of education and years in the labor force, the linear approximation hides as much as it reveals. If we convert years of education and years in the labor force into factors and regress them on annual earnings, we can make better predictions. It turns out that education has no discernible impact on earnings until an individual has attained a high school degree. After attaining a high school diploma, each additional year and degree (e.g., BA, MA, and professional degrees) brings higher earnings and greater rewards. A linear model reflects this impact far less accurately. Similarly, while earnings grow explosively year by year for the first 15 years in the workforce, growth is slow and marginal for the next decade, and income remains constant or actually declines afterwards. Think of what such a simple change in model specification means to an organization in terms of predicting revenues, frequency of sales, or manufacturing costs and production yields. More sophisticated models can lead to greater precision and accuracy in forecasts. The reason for using traditional approximation techniques was straightforward: we had no choice, because we simply didn't have enough data or the technological capabilities.
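A small simulation makes the mechanics concrete. The data are synthetic, standing in for the Census sample, with a high-school threshold built in; the factor model recovers the non-linear payoff that the straight-line fit averages away.

```r
# Earnings with a threshold at 12 years of education: flat before,
# rising after. Compare a continuous fit with a factor (dummy) fit.
set.seed(7)
educ <- sample(8:20, 5000, replace = TRUE)          # years of education
earnings <- ifelse(educ < 12,
                   25000,                           # no payoff pre-diploma
                   25000 + 4000 * (educ - 11)) +    # rising payoff after
            rnorm(5000, sd = 3000)

linear_fit <- lm(earnings ~ educ)                   # straight-line model
factor_fit <- lm(earnings ~ factor(educ))           # one dummy per year

# The factor model explains more of the variance
c(linear = summary(linear_fit)$r.squared,
  factor = summary(factor_fit)$r.squared)
```

Because the factor model estimates a separate mean for each year of education, it can bend wherever the data bend, which is exactly what the threshold pattern requires.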
However, this is no longer the case: we can vastly improve predictive power by moving beyond linear approximation models, with access to larger datasets, powerful technological capabilities, and, most importantly, effective tools like Revolution R that can juggle both developments. It is also worth noting that analyzing this dataset, which is large but not large enough to qualify as Big Data, took just five seconds on a commodity 8-core laptop. The recent turn towards Big Data and the emergence of tools like Revolution R that are capable of handling such data allow for greater accuracy and precision. By moving away from the limitations of past models and techniques, our ability to make predictions and assessments has drastically improved, and this advance benefits every sector of our economy. As businesses and other organizations come under greater pressure to produce more accurate and precise forecasts, the demand for working with big data, and for the most efficient and effective tools to do so, has never been greater.

Data Mining and Scoring

Second, Big Analytics allows for predictive analyses utilizing techniques such as data mining and statistical modeling. When the goal of analysis is to build the best possible model of reality, such techniques coupled with Big Data vastly improve prediction and scoring. These techniques can handle hundreds of thousands of coefficients based on millions of observations in large datasets to produce scores for each observation. For example, credit worthiness, purchasing propensity, trading volumes, and churn among mobile telephone clients can all be approached this way.
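As a toy sketch of scoring, consider a simulated churn dataset (the variable names here are invented stand-ins, not a real schema): fit a model on historical observations, then produce a score, in this case a predicted churn probability, for every record.

```r
# Logistic regression as a scoring model for customer churn.
set.seed(99)
n <- 20000
usage  <- rpois(n, 30)                    # e.g., monthly usage
tenure <- runif(n, 0, 10)                 # years as a customer
p_true <- plogis(-1 + 0.05 * usage - 0.4 * tenure)
churn  <- rbinom(n, 1, p_true)            # 1 = customer left

fit <- glm(churn ~ usage + tenure, family = binomial)

# Score every observation: predicted probability of churn
scores <- predict(fit, type = "response")
summary(scores)
```

Once the model is fit, scoring new records is a single predict() call, which is why scoring adds little or no time on top of estimation.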
Here's an example, based on the data made famous in the American Statistical Association's 2009 Data Expo Challenge. The data set contains information on all U.S. domestic flights between 1987 and 2008, and includes more than 123 million observations. It nicely illustrates the opportunities that Big Data sets provide for building models that are not feasible with smaller data sets. For example, a linear regression can estimate time spent in the air as a function of the interactions of origin and destination airports plus the effects of unique carriers and days of the week. This model, which produces over 122,000 coefficients, could not be estimated with a small dataset. If you replace time in the air with minutes of daily cell phone use, or with determinants of trading volume taking into account thousands of conditions on hundreds of millions of trades, you can see the potential economic value of this approach. As more companies adopt cloud-computing strategies, predicting peak cloud loads will become another critical use for these kinds of newer, non-traditional analytic techniques. Again, it is worth noting that the processing time for this problem on an 8-core standard laptop is approximately 140 seconds with Revolution Analytics' RevoScaleR package. Scoring on the basis of this model requires little or no additional time. Thus, even predictive analysis like data mining and statistical modeling is revolutionized by Big Data.

Big Data and Rare Events

Third, Big Data vastly improves the ability to locate and analyze the impact of rare events that might escape detection in smaller data sets with more limited variables and observations. Big Data analytics are much more likely to find the Black Swans, the uncommon events that often get overlooked as possible outcomes. Consider the analysis of casualty insurance claims. In a typical year, the vast majority of policyholders file no claims; only a fraction do.
However, within this small fraction, a still smaller group files costly claims of large magnitude and ultimately determines the loss or profitability of the whole portfolio. Claims related to floods, hurricanes, earthquakes, and tornadoes are relatively rare, but they are disproportionately large and costly. Underwriters use special variants of generalized linear models (Poisson and Gamma regressions) to analyze claims data. These datasets are enormous and tax the capacity limits of legacy software, tying up powerful servers for dozens of hours to complete a single analysis. With newer predictive analytics designed with Big Data in mind, however, one can radically reduce the cost and time of handling these types of files. Imagine the strategic business value of being able to predict rare events with greater accuracy.
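A minimal sketch of the claim-frequency side of such an analysis, on a simulated portfolio: glm() with the Poisson family and an exposure offset is the standard base-R route (in practice an underwriter would pair this with a Gamma model for claim severity).

```r
# Poisson regression for claim counts, with policy-years as exposure.
set.seed(11)
n <- 50000
exposure <- runif(n, 0.1, 1)                       # fraction of a year in force
risk <- factor(sample(c("low", "high"), n, replace = TRUE,
                      prob = c(0.9, 0.1)),
               levels = c("low", "high"))
rate <- ifelse(risk == "high", 0.4, 0.05)          # true claims per policy-year
claims <- rpois(n, rate * exposure)                # most policies file no claims

fit <- glm(claims ~ risk, family = poisson, offset = log(exposure))

# Estimated rate ratio of high-risk to low-risk policyholders
exp(coef(fit)[["riskhigh"]])
```

Note how the simulation reproduces the structure described above: the overwhelming majority of rows are zeros, and the model's job is to quantify the small group that drives the losses.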
Extracting and Analyzing Low Incidence Populations

Fourth, Big Data helps to extract and analyze cases that focus on low incidence populations. In many instances, data scientists have trouble locating low incidence populations within a large data set. Low incidence populations might be rare manufacturing failures, treatment outcomes of patients with rare diseases, or extremely narrow segments of high-value customers; they can also refer to individuals with genetic disorders within a sample population. The challenge of locating low incidence populations is made even more difficult when only a fraction of the dataset is sampled for analysis, a prevalent concern in the past, when data scientists worked from samples drawn from the total population. The few observations one has to work with make it difficult to capture the outcome of interest and to make predictive assessments and forecasts. Analyzing such low incidence populations requires screening and extracting enormous numbers of observations to find enough cases in what may be as little as 0.001 percent of a population. In situations like these, new hardware and software systems utilizing parallel computing are capable of generating insights that go beyond the scope of traditional analytics. Take 23andMe, a revolutionary personal genomics and biotechnology company at the forefront of technology, which uses the power of the internet and recent advances in DNA analysis to engage in genetic research. In addition to providing members with personal genetic information through web-based interactive tools, 23andMe offers members an opportunity to participate in its research studies and advance genetic research. Through this initiative, 23andMe has built a database of over 100,000 members and is currently researching over 100 diseases, conditions, and traits.
With its rapidly expanding genomic database and its unique ability to pinpoint specific groups of interest, 23andMe sits at the intersection of scientific and technological discovery. For example, 23andMe's most recently published study reports the discovery of two new genetic regions associated with Parkinson's disease. Wired magazine observed that, by harnessing big data and computational power, 23andMe completed this research in a mere eight months; Wired estimated that a typical NIH-funded research project would have taken six years to reach the same conclusion. 23andMe is an exemplary case of how the new revolution in big data can influence scientific discovery and research. Recent developments and the presence of Big Data allow for greater predictive and analytical power. With larger datasets, data scientists are far less likely to miss the cases they are looking for, which makes the extraction and analysis of low incidence populations much more effective.
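The screening step itself is simple; the challenge is scale. A sketch with a simulated population, with incidence set to roughly 0.001 percent:

```r
# Screen a large simulated population for a very rare trait.
set.seed(5)
n <- 1e6
has_trait <- runif(n) < 1e-5              # ~0.001% incidence
biomarker <- rnorm(n) + 2 * has_trait     # trait shifts a measured value

cases <- which(has_trait)                 # extraction: only a handful survive
length(cases)

# With so few cases, the full population is needed to estimate the effect
mean(biomarker[cases]) - mean(biomarker[!has_trait])
```

A sample of even one percent of this population would be expected to contain almost no cases at all, which is why analyzing the full dataset, rather than a sample, matters here.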
Big Data and the Conundrum of Statistical Significance

Finally, Big Data allows one to move beyond inference and statistical significance towards meaningful and accurate analysis. With first-generation tools, analyses were based on experimentation and inference, and data scientists and statisticians sought to maximize statistical significance in their models. The concept of statistical significance emerged from experimental designs with random assignment of small numbers of observations, necessitated by hardware constraints, limited analytical tools, and the difficulty of obtaining the entire universe of cases. Small, representative samples of the population are tested against chance and generalized to the entire population. In this regard, the methods and tools employed were aimed at maximizing the significance of models: results with the strongest probability estimates (p-values) were both vital and necessary. By contrast, the core advantage of Big Analytics is its use of massively larger datasets that represent much, if not all, of the population of interest. When the data incorporate essentially the entire population, there is far less need to emphasize statistical significance, and one can concentrate on making accurate and meaningful analyses. Perhaps this is the real value of Big Data: it is what it is. Big Data does not depend on techniques to smooth over sampling issues, a concern that haunted smaller datasets, which could be biased and unreflective of the population of interest. Big Analytics implemented with second-generation tools like Revolution R Enterprise leads to more accurate, reliable, and useful analyses for decision-making in the real world.

Conclusion

As this briefing has demonstrated, Big Analytics brings a wide range of advances and benefits and the opportunity for greater innovation in business models, products, and services.
Access to larger datasets with powerful tools means greater accuracy, transparency, and predictive power. Through Big Analytics, data scientists and statisticians from all sectors of the economy are now empowered to experiment with large datasets, discover new opportunities and needs, expose greater variability, and improve performance and forecasting. The combination of increasing access to Big Data, affordable high-performance hardware, and the emergence of second-generation analytical tools like Revolution R Enterprise gives stakeholders compelling reasons to invest in Big Analytics. The convergence of these trends means that data analysts need no longer rely on traditional analytic methods that were developed largely to cope with the inherent challenges of using small data sets and running complex analyses on expensive configurations of hardware and software. More important, this convergence provides organizations, for the first time in history, with the capabilities required to analyze large data sets quickly and cost-effectively. The dawn of the Big Data Revolution brings us to a turning point in the history of data analysis, a unique moment that eclipses the traditions of the past. This revolution is neither theoretical nor trivial; it represents a genuine leap forward and a clear opportunity to realize enormous gains in efficiency, productivity, revenue, and profitability. The Age of Big Analytics is here.
About the Author

Norman H. Nie is President and CEO of Revolution Analytics, a company aimed at expanding the use of R to galvanize the predictive analytics market he helped create as co-founder of SPSS. With a distinguished career in both business and academia, Nie brings strategic vision, experienced management, and proven execution. Nie co-invented the Statistical Package for the Social Sciences (SPSS) while a graduate student at Stanford University and co-founded a company around it; he served as president and CEO through 1992 and as Chairman of the Board of Directors thereafter. Under his leadership, SPSS grew to become a leader and pioneer in statistical analysis. He is also Chairman of the Board of Directors of Knowledge Networks, a survey research firm he co-founded. Nie is Professor Emeritus in Political Science at the University of Chicago and at Stanford University, where he founded the Stanford Institute for the Quantitative Study of Society (SIQSS). He is a two-time recipient of the Woodrow Wilson Award (for the best book in political science published in the prior year) and was recognized with a lifetime achievement award by the American Association for Public Opinion Research (AAPOR). He was recently appointed a Fellow of the American Academy of Arts and Sciences. Nie was educated at the University of the Americas in Mexico City, Washington University in St. Louis, and Stanford University, where he received a Ph.D. in political science.

About Revolution Analytics

Revolution Analytics delivers advanced analytics software at half the cost of existing solutions. Led by predictive analytics pioneer and SPSS co-founder Norman Nie, the company brings high performance, productivity, and enterprise readiness to open-source R, the most powerful statistics software in the world. Leading organizations including Merck, Bank of America, and Acxiom rely on Revolution R Enterprise for their data analysis, development, and mission-critical production needs.
Revolution Analytics is committed to fostering the growth of the R community and offers free licenses of Revolution R Enterprise to academia. Revolution Analytics is headquartered in Palo Alto, Calif., and backed by North Bridge Venture Partners and Intel Capital.

Contact Us

Join the R Revolution at [email protected]
Big Data, Big Traffic. And the WAN
Big Data, Big Traffic And the WAN Internet Research Group January, 2012 About The Internet Research Group www.irg-intl.com The Internet Research Group (IRG) provides market research and market strategy
Big Data Analytics in R
Big Data Analytics in R Big Opportunity, Big Challenge September, 2011 Revolution Confidential Most advanced statistical analysis software available Half the cost of commercial alternatives The professor
ICD-10 Advantages Require Advanced Analytics
Cognizant 20-20 Insights ICD-10 Advantages Require Advanced Analytics Compliance alone will not deliver on ICD-10 s potential to improve quality of care, reduce costs and elevate efficiency. Organizations
Get to Know the IBM SPSS Product Portfolio
IBM Software Business Analytics Product portfolio Get to Know the IBM SPSS Product Portfolio Offering integrated analytical capabilities that help organizations use data to drive improved outcomes 123
There s no way around it: learning about Big Data means
In This Chapter Chapter 1 Introducing Big Data Beginning with Big Data Meeting MapReduce Saying hello to Hadoop Making connections between Big Data, MapReduce, and Hadoop There s no way around it: learning
!!!!! BIG DATA IN A DAY!
BIG DATA IN A DAY December 2, 2013 Underwritten by Copyright 2013 The Big Data Group, LLC. All Rights Reserved. All trademarks and registered trademarks are the property of their respective holders. EXECUTIVE
White Paper. How Streaming Data Analytics Enables Real-Time Decisions
White Paper How Streaming Data Analytics Enables Real-Time Decisions Contents Introduction... 1 What Is Streaming Analytics?... 1 How Does SAS Event Stream Processing Work?... 2 Overview...2 Event Stream
Navigating Big Data business analytics
mwd a d v i s o r s Navigating Big Data business analytics Helena Schwenk A special report prepared for Actuate May 2013 This report is the third in a series and focuses principally on explaining what
BIG DATA & ANALYTICS. Transforming the business and driving revenue through big data and analytics
BIG DATA & ANALYTICS Transforming the business and driving revenue through big data and analytics Collection, storage and extraction of business value from data generated from a variety of sources are
Big data big talk or big results?
Whitepaper 28.8.2013 1 / 6 Big data big talk or big results? Authors: Michael Falck COO Marko Nikula Chief Architect [email protected] Businesses, business analysts and commentators have
CONNECTING DATA WITH BUSINESS
CONNECTING DATA WITH BUSINESS Big Data and Data Science consulting Business Value through Data Knowledge Synergic Partners is a specialized Big Data, Data Science and Data Engineering consultancy firm
Big Data. White Paper. Big Data Executive Overview WP-BD-10312014-01. Jafar Shunnar & Dan Raver. Page 1 Last Updated 11-10-2014
White Paper Big Data Executive Overview WP-BD-10312014-01 By Jafar Shunnar & Dan Raver Page 1 Last Updated 11-10-2014 Table of Contents Section 01 Big Data Facts Page 3-4 Section 02 What is Big Data? Page
KnowledgeSTUDIO HIGH-PERFORMANCE PREDICTIVE ANALYTICS USING ADVANCED MODELING TECHNIQUES
HIGH-PERFORMANCE PREDICTIVE ANALYTICS USING ADVANCED MODELING TECHNIQUES Translating data into business value requires the right data mining and modeling techniques which uncover important patterns within
Scalable Data Analysis in R. Lee E. Edlefsen Chief Scientist UserR! 2011
Scalable Data Analysis in R Lee E. Edlefsen Chief Scientist UserR! 2011 1 Introduction Our ability to collect and store data has rapidly been outpacing our ability to analyze it We need scalable data analysis
We are Big Data A Sonian Whitepaper
EXECUTIVE SUMMARY Big Data is not an uncommon term in the technology industry anymore. It s of big interest to many leading IT providers and archiving companies. But what is Big Data? While many have formed
PLA 7 WAYS TO USE LOG DATA FOR PROACTIVE PERFORMANCE MONITORING. [ WhitePaper ]
[ WhitePaper ] PLA 7 WAYS TO USE LOG DATA FOR PROACTIVE PERFORMANCE MONITORING. Over the past decade, the value of log data for monitoring and diagnosing complex networks has become increasingly obvious.
High-Performance Analytics
High-Performance Analytics David Pope January 2012 Principal Solutions Architect High Performance Analytics Practice Saturday, April 21, 2012 Agenda Who Is SAS / SAS Technology Evolution Current Trends
Big Data and Healthcare Payers WHITE PAPER
Knowledgent White Paper Series Big Data and Healthcare Payers WHITE PAPER Summary With the implementation of the Affordable Care Act, the transition to a more member-centric relationship model, and other
Strategic Decisions Supported by SAP Big Data Solutions. Angélica Bedoya / Strategic Solutions GTM Mar /2014
Strategic Decisions Supported by SAP Big Data Solutions Angélica Bedoya / Strategic Solutions GTM Mar /2014 What critical new signals Might you be missing? Use Analytics Today 10% 75% Need Analytics by
Big Data at Cloud Scale
Big Data at Cloud Scale Pushing the limits of flexible & powerful analytics Copyright 2015 Pentaho Corporation. Redistribution permitted. All trademarks are the property of their respective owners. For
Reaping the Rewards of Big Data
Reaping the Rewards of Big Data TABLE OF CONTENTS INTRODUCTION: 2 TABLE OF CONTENTS FINDING #1: BIG DATA PLATFORMS ARE ESSENTIAL FOR A MAJORITY OF ORGANIZATIONS TO MANAGE FUTURE BIG DATA CHALLENGES. 4
The big data revolution
The big data revolution Friso van Vollenhoven (Xebia) Enterprise NoSQL Recently, there has been a lot of buzz about the NoSQL movement, a collection of related technologies mostly concerned with storing
Big Data Hope or Hype?
Big Data Hope or Hype? David J. Hand Imperial College, London and Winton Capital Management Big data science, September 2013 1 Google trends on big data Google search 1 Sept 2013: 1.6 billion hits on big
Leading Genomics. Diagnostic. Discove. Collab. harma. Shanghai Cambridge, MA Reykjavik
Leading Genomics Diagnostic harma Discove Collab Shanghai Cambridge, MA Reykjavik Global leadership for using the genome to create better medicine WuXi NextCODE provides a uniquely proven and integrated
Big Data Challenges. technology basics for data scientists. Spring - 2014. Jordi Torres, UPC - BSC www.jorditorres.
Big Data Challenges technology basics for data scientists Spring - 2014 Jordi Torres, UPC - BSC www.jorditorres.eu @JordiTorresBCN Data Deluge: Due to the changes in big data generation Example: Biomedicine
Hadoop for Enterprises:
Hadoop for Enterprises: Overcoming the Major Challenges Introduction to Big Data Big Data are information assets that are high volume, velocity, and variety. Big Data demands cost-effective, innovative
Achieving customer loyalty with customer analytics
IBM Software Business Analytics Customer Analytics Achieving customer loyalty with customer analytics 2 Achieving customer loyalty with customer analytics Contents 2 Overview 3 Using satisfaction to drive
High-Performance Business Analytics: SAS and IBM Netezza Data Warehouse Appliances
High-Performance Business Analytics: SAS and IBM Netezza Data Warehouse Appliances Highlights IBM Netezza and SAS together provide appliances and analytic software solutions that help organizations improve
International Journal of Advanced Engineering Research and Applications (IJAERA) ISSN: 2454-2377 Vol. 1, Issue 6, October 2015. Big Data and Hadoop
ISSN: 2454-2377, October 2015 Big Data and Hadoop Simmi Bagga 1 Satinder Kaur 2 1 Assistant Professor, Sant Hira Dass Kanya MahaVidyalaya, Kala Sanghian, Distt Kpt. INDIA E-mail: [email protected]
Business Analytics and the Nexus of Information
Business Analytics and the Nexus of Information 2 The Impact of the Nexus of Forces 4 From the Gartner Files: Information and the Nexus of Forces: Delivering and Analyzing Data 6 About IBM Business Analytics
Delivering the power of the world s most successful genomics platform
Delivering the power of the world s most successful genomics platform NextCODE Health is bringing the full power of the world s largest and most successful genomics platform to everyday clinical care NextCODE
Big Data 101: Harvest Real Value & Avoid Hollow Hype
Big Data 101: Harvest Real Value & Avoid Hollow Hype 2 Executive Summary Odds are you are hearing the growing hype around the potential for big data to revolutionize our ability to assimilate and act on
Outlook for Cloud technology in Big Data and Mobility
2014 Outlook for Cloud technology in Big Data and Mobility Wayne Collette, CFA, Senior Portfolio Manager Cloud technology is a significant and ongoing trend affecting the technology industry. A new breed
HIGH PERFORMANCE ANALYTICS FOR TERADATA
F HIGH PERFORMANCE ANALYTICS FOR TERADATA F F BORN AND BRED IN FINANCIAL SERVICES AND HEALTHCARE. DECADES OF EXPERIENCE IN PARALLEL PROGRAMMING AND ANALYTICS. FOCUSED ON MAKING DATA SCIENCE HIGHLY PERFORMING
Flash Memory Arrays Enabling the Virtualized Data Center. July 2010
Flash Memory Arrays Enabling the Virtualized Data Center July 2010 2 Flash Memory Arrays Enabling the Virtualized Data Center This White Paper describes a new product category, the flash Memory Array,
In-Memory Analytics for Big Data
In-Memory Analytics for Big Data Game-changing technology for faster, better insights WHITE PAPER SAS White Paper Table of Contents Introduction: A New Breed of Analytics... 1 SAS In-Memory Overview...
Data Management Practices for Intelligent Asset Management in a Public Water Utility
Data Management Practices for Intelligent Asset Management in a Public Water Utility Author: Rod van Buskirk, Ph.D. Introduction Concerned about potential failure of aging infrastructure, water and wastewater
Understanding the Value of In-Memory in the IT Landscape
February 2012 Understing the Value of In-Memory in Sponsored by QlikView Contents The Many Faces of In-Memory 1 The Meaning of In-Memory 2 The Data Analysis Value Chain Your Goals 3 Mapping Vendors to
Introduction to Strategic Supply Chain Network Design Perspectives and Methodologies to Tackle the Most Challenging Supply Chain Network Dilemmas
Introduction to Strategic Supply Chain Network Design Perspectives and Methodologies to Tackle the Most Challenging Supply Chain Network Dilemmas D E L I V E R I N G S U P P L Y C H A I N E X C E L L E
Why your business decisions still rely more on gut feel than data driven insights.
Why your business decisions still rely more on gut feel than data driven insights. THERE ARE BIG PROMISES FROM BIG DATA, BUT FEW ARE CONNECTING INSIGHTS TO HIGH CONFIDENCE DECISION-MAKING 85% of Business
Interactive data analytics drive insights
Big data Interactive data analytics drive insights Daniel Davis/Invodo/S&P. Screen images courtesy of Landmark Software and Services By Armando Acosta and Joey Jablonski The Apache Hadoop Big data has
Generating the Business Value of Big Data:
Leveraging People, Processes, and Technology Generating the Business Value of Big Data: Analyzing Data to Make Better Decisions Authors: Rajesh Ramasubramanian, MBA, PMP, Program Manager, Catapult Technology
The Business Analyst s Guide to Hadoop
White Paper The Business Analyst s Guide to Hadoop Get Ready, Get Set, and Go: A Three-Step Guide to Implementing Hadoop-based Analytics By Alteryx and Hortonworks (T)here is considerable evidence that
Build an effective data integration strategy to drive innovation
IBM Software Thought Leadership White Paper September 2010 Build an effective data integration strategy to drive innovation Five questions business leaders must ask 2 Build an effective data integration
Virtual Data Warehouse Appliances
infrastructure (WX 2 and blade server Kognitio provides solutions to business problems that require acquisition, rationalization and analysis of large and/or complex data The Kognitio Technology and Data
