E-Book
Big-data analytics explained: current trends, technologies and issues

As business intelligence and data analytics become more and more critical to business success, organizations in various data-intensive industries are looking to analyze larger and larger data sets. But big-data analytics is complicated and costly, and there's an array of technology choices to consider. Before launching an initiative, be sure you know what you're getting into. This e-book, designed for IT, BI and analytics professionals as well as business executives, gives readers a comprehensive overview of big-data analytics trends, technologies and challenges. Get real-world insight and advice on developing, implementing and managing a big-data analytics program for your organization.

Sponsored By:
Table of Contents
Big-data analytics: easier said than done but doable
Five first steps to creating an effective big-data analytics program
The wrong way: Worst practices in big-data analytics programs
Big-data analytics programs require tech savvy, business know-how
Resources from Hewlett-Packard
Big-data analytics: easier said than done but doable
By Lyndsay Wise, SearchBusinessAnalytics.com Contributor

Big data has become one of the most talked-about trends (and yes, buzzwords) within the business intelligence (BI), analytics and data management markets. A growing number of organizations are looking to BI and analytics vendors to help them answer business questions in big-data environments. Unfortunately, gaining visibility into pools of big data is easier said than done. And with vendors marketing a wide variety of technology offerings aimed at addressing the challenges of big-data analytics, businesses may be hard-pressed to identify the one that best meets their needs.

So, what is big data really? A recent story by the IT publication eWeek offered the following take on it, based partly on Gartner Inc.'s definition of the term: "Big data refers to the volume, variety and velocity of structured and unstructured data pouring through networks into processors and storage devices, along with the conversion of such data into business advice for enterprises."

That hits the mark in terms of data management and the analytics part of the equation, but it misses the essential aspect of the business challenges surrounding big data: complexity. For instance, big-data installations often involve information from social media networks, emails, sensors, Web activity logs and other data sources that doesn't fit easily into traditional data warehouse systems. And in many cases, disparate data needs to be pulled together in order to make sense of it on a broader level. That can have big implications for business rules, table joins and other components of big-data analytics systems. The complexity of big data is what really makes it different from more conventional data when it comes to storage and query management, and it's the main reason why analytical database and data analytics software vendors have had to step up their game to help companies deal with big data.
Understanding big data is the first step in assessing your technology needs and putting a big-data analytics plan in place. The second is understanding the market and the current
trends that are affecting organizations looking to derive business value, and competitive advantages, from increasingly large and diverse data sets.

Many businesses have always had large data sets, of course. But now, more and more companies are storing terabytes and terabytes of information, if not petabytes. In addition, they're looking to analyze key data multiple times daily or even in real time, a change from traditional BI processes for examining historical data on a weekly or monthly basis. And they want to process more and more complex queries that involve a variety of different data sets. That might include transaction data from enterprise resource planning and customer relationship management systems, plus social media and geospatial data, internal documents and other forms of information. Increasingly, companies also want to give business users self-service BI capabilities and make it easier for them to understand analytical findings.

All of that can play into a big-data analytics strategy, and technology vendors are addressing those needs in different ways. Many database and data warehouse vendors are focusing on the ability to process large amounts of complex data in a timely fashion. Some are using columnar data stores in an effort to enable quicker query performance, or providing built-in query optimizers, or adding support for open source technologies such as Hadoop and MapReduce. In-memory analytics tools may help speed up the analysis process by reducing the need to transfer data from disk drives, while data virtualization software and other real-time data integration technologies can be used to assemble information from disparate data sources on the fly. Ready-made analytics applications are being tailored to vertical markets that routinely have to deal with big data; for instance, the telecommunications, financial services and online gaming industries.
Data visualization tools can simplify the process of presenting the results of big-data analytics queries to corporate executives and business managers.

Organizations that fit into the categories described above in relation to their data and analytics needs should begin by considering the following issues and questions, among others, before creating an implementation plan and finalizing their big-data infrastructure choices:
The required timeliness of data, as not all databases support real-time data availability.
The interrelatedness of data and the complexity of the business rules that will be needed to link various data sources to get a broad view of corporate performance, sales opportunities, customer behavior, risk factors and other business metrics.
The amount of historical data that needs to be included for analysis purposes. If one data source only contains two years of information but five are required, how will that be handled?
Which technology vendors have experience with big-data analytics in the same industry, and what is their track record?
Who is responsible for the various data entities within an organization, and how will those people be involved in the big-data analytics initiative?

Those considerations don't constitute in-depth requirements planning, but they can help businesses get started on the road to deploying a big-data analytics system and identifying the technology that will support it.

About the Author: Lyndsay Wise is president and founder of WiseAnalytics, an independent analyst firm based in Toronto that focuses on business intelligence, master data management and unstructured data management.
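The historical-data question in the checklist above (one source holds two years of information, but five are required) lends itself to a quick inventory check before any infrastructure decision is made. The following is only a minimal sketch: the source names, dates and the five-year requirement are hypothetical values chosen for illustration, and the function names are not taken from any product.

```python
from datetime import date

def years_of_history(earliest, as_of):
    """Approximate span, in years, between a source's earliest record and as_of."""
    return (as_of - earliest).days / 365.25

def coverage_gaps(sources, required_years, as_of):
    """Return {source name: missing years} for every source that falls short."""
    gaps = {}
    for name, earliest in sources.items():
        available = years_of_history(earliest, as_of)
        if available < required_years:
            gaps[name] = round(required_years - available, 1)
    return gaps

# Hypothetical inventory: source name -> date of its earliest usable record.
SOURCES = {
    "erp_transactions": date(2005, 1, 1),
    "web_activity_logs": date(2010, 1, 1),
}
AS_OF = date(2012, 1, 1)

# Assumed requirement for this illustration: five years of history.
gaps = coverage_gaps(SOURCES, required_years=5, as_of=AS_OF)
print(gaps)
```

A report like this does not answer how a shortfall should be handled; it only makes the gap visible early, so that data owners can decide whether to backfill, shorten the analysis window or exclude the source.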
Five first steps to creating an effective big-data analytics program
By Lyndsay Wise, SearchBusinessAnalytics.com Contributor

Organizations embarking on big-data analytics projects require a strong implementation plan to make sure that the analytics process works for them.

Choosing the technology that will be used is only half the battle when preparing for a big-data initiative. Once a company identifies the right database software and analytics tools and begins to put the technology infrastructure in place, it's ready to move to the next level and develop a real strategy for success. The importance of effective project management processes to creating a successful big-data analytics program also cannot be overstated. The following five tips offer advice on steps that businesses should take to help ensure a smooth deployment:

Decide what data to include and what to leave out. By their very nature, big-data analytics initiatives involve large data sets. But that doesn't mean all of a company's data sources, or all of the information within a relevant data source, will need to be analyzed. Organizations need to identify the strategic data that will lead to valuable analytic insights. For instance, what combination of information can help pinpoint key customer-retention factors? Or what data is required to uncover hidden patterns in stock market transactions? Focusing on a project's business goals in the planning stages can help an organization home in on the exact analytics that are required, after which it can and should look at the data needed to meet those business goals. In some cases, that indeed will mean including everything. In others, though, it means using only a subset of the big data on hand.

Build effective business rules and then work through the complexity they create. Coping with complexity is the key aspect of most big-data analytics initiatives.
In order to get the right analytical outputs, it's essential to include business-focused data owners in the process to make sure that all of the necessary business rules are identified in advance. Once the rules are documented, technical staffers can assess how much complexity they create and the work required to turn the data inputs into relevant and valuable findings. That leads into the next phase of the implementation, as discussed below.
Translate business rules into relevant analytics in a collaborative fashion. Business rules are just the first step in developing effective big-data analytics applications. Next, IT or analytics professionals need to create the analytical queries and algorithms required to generate the desired outputs. But that shouldn't be done in a vacuum. The better and more accurate the queries are in the first place, the less redevelopment will be required. Many projects require continual reiterations due to a lack of communication between the project team and business departments. Ongoing communication and collaboration lead to a much smoother analytics development process.

Have a maintenance plan. In addition to the initial development work, a successful big-data analytics initiative requires ongoing attention and updates. Regular query maintenance and keeping on top of changes in business requirements are important, but they represent only one aspect of managing an analytics program. As data volumes continue to increase and business users become more familiar with the analytics process, more questions that they want answered will inevitably arise. The analytics team must be able to keep up with the additional requests in a timely fashion. Also, one of the requirements when evaluating big-data analytics hardware and software options is assessing their ability to support iterative development processes in dynamic business environments. An analytics system will retain its value over time if it can adapt to changing requirements.

Keep your users in mind: all of them. With interest growing in self-service business intelligence (BI) capabilities, it shouldn't be shocking that a focus on end users is a key factor in big-data analytics programs. Having a robust IT infrastructure that can handle large data sets and both structured and unstructured information is important, of course.
But so is developing a system that is usable and easy to interact with, and doing so means taking the varying needs of users into account. Different types of people (from senior executives to operational workers, business analysts and statisticians) will be accessing big-data analytics applications in one way or another, and their adoption of the tools will help ensure overall project success. That requires different levels of interactivity that match user expectations and the amount of experience they have with analytics tools; for instance, building dashboards and data visualizations to present findings in an easy-to-understand way to business managers and workers who aren't inclined to run their own big-data analytics queries.
There's no one way to ensure big-data analytics success. But following a set of frameworks and best practices, including the tips outlined above, can help organizations keep their big-data initiatives on track. The technical details of a big-data installation are intensive and need to be examined in depth. That isn't enough, though: Both the technical aspects and the business factors need to be taken into account to make sure that organizations get the desired outcomes from their big-data analytics investments.
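The second and third tips above (document the business rules with data owners, then translate them into analytics collaboratively) can be made concrete by writing each rule as a small, named function that business and technical staff can review together. The sketch below is only an illustration under stated assumptions: the "at risk" rule, its 90-day and two-ticket thresholds, and the record fields are all hypothetical and do not come from the article.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Customer:
    customer_id: str
    last_purchase: date
    open_support_tickets: int

# Hypothetical rule, worded the way a business data owner might state it:
# "A customer is at risk if they have not purchased in more than 90 days
#  and have two or more open support tickets."
def is_at_risk(customer, today):
    days_inactive = (today - customer.last_purchase).days
    return days_inactive > 90 and customer.open_support_tickets >= 2

def at_risk_ids(customers, today):
    """Apply the documented rule to every record and list the matching IDs."""
    return [c.customer_id for c in customers if is_at_risk(c, today)]

TODAY = date(2012, 6, 30)
CUSTOMERS = [
    Customer("C1", date(2012, 6, 1), 3),   # recent buyer, so the rule does not match
    Customer("C2", date(2012, 1, 15), 2),  # inactive and has open tickets
    Customer("C3", date(2012, 2, 1), 0),   # inactive but has no open tickets
]

print(at_risk_ids(CUSTOMERS, TODAY))
```

Keeping the rule in one reviewable place means a change in business requirements becomes a change to a single function, which is one way to limit the redevelopment cycles described in the tips.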
The wrong way: Worst practices in big-data analytics programs
By Rick Sherman, SearchBusinessAnalytics.com Contributor

Big-data analytics is hot. Read any IT publication or website and you'll see business intelligence (BI) vendors and their systems integration partners pitching products and services to help organizations implement and manage big-data analytics systems. The ads and the big-data analytics press releases and case studies that vendors are rushing out might make you think it's easy, that all you need for a successful deployment is a particular technology.

If only it were that simple. While BI vendors are happy to tell you about their customers who are successfully leveraging big data for analytics uses, they're not so quick to discuss those who have failed. There are many potential reasons why big-data analytics projects fall short of their goals and expectations. You can find lots of advice on big-data analytics best practices; below are some worst practices so you know what to avoid.

If we build it, they will come. This repeats the classic mistake made when organizations develop their first data warehousing or BI system. Too often, IT as well as BI and analytics program managers get sold on the technology hype and forget that business value is their first priority; data analysis technology is simply a tool used to generate that value. Instead of blindly adopting and deploying something, big-data analytics proponents first need to determine the business purposes that would be served by the technology in order to establish a business case, and then choose and implement the right analytics tools for the job at hand. Without a solid understanding of business requirements, the danger is that project teams will end up creating a big-data disk farm that really isn't worth anything to the organization, earning them an unwanted spot in the data doghouse.

Assuming that the software will have all the answers.
Building an analytics system, especially one involving big data, can be complex and resource-intensive. As a result, many organizations hope the software they deploy will be a silver bullet that magically does it all for them. People should know better, of course, but still they have hope. Software does
help, sometimes dramatically. But big-data analytics is only as good as the data being analyzed and the analytical skills of those using the tools.

Not understanding that you need to think differently. Often, people keep trying what has worked for them in the past, even when confronted with a different situation. In the case of big data, some organizations assume that "big" just means more transactions and large data volumes. It may, but many big-data analytics initiatives involve unstructured and semi-structured information that needs to be managed and analyzed in fundamentally different ways than is the case with the structured data in enterprise applications and data warehouses. As a result, new methods and tools might be required to capture, cleanse, store, integrate and access at least some of your big data.

Forgetting all the lessons of the past. Sometimes enterprises go to the other extreme and think that everything is different with big data and they have to start from scratch. This mistake can be even more fatal to a big-data analytics project's success than thinking that nothing is different. Just because the data you're looking to analyze is structured differently doesn't mean the fundamental laws of data management have been rewritten.

Not having the requisite business and analytical expertise. A corollary to the misconception that the technology can do it all is the belief that all you need are IT staffers to implement big-data analytics software. First, in keeping with the theme of generating business value mentioned above, an effective big-data analytics initiative has to incorporate extensive business and industry knowledge into both the system design stage and ongoing operations. Second, many organizations underestimate the extent of analytical skills that are needed. If big-data analysis is only about building reports and dashboards, enterprises likely can leverage their existing BI expertise.
However, big-data analytics typically involves more advanced processes, such as data mining and predictive analytics. That requires analytics professionals with statistical, actuarial and other sophisticated skills, which might mean new hiring for organizations that are making their first forays into advanced analytics.

Treating the project like it's a science experiment. Too often, companies measure the success of big-data analytics programs merely by the fact that data is being collected and then analyzed. In reality, collecting and analyzing the data is just the beginning. Analytics only produces business value if it is incorporated into business processes, enabling business
managers and users to act upon the findings to improve organizational performance and results. To be truly effective, an analytics program also needs to include a feedback loop for communicating the success of actions taken as a result of analytical findings, followed by a refinement of the analytical models based on the business results.

Promising and trying to do too much. Many big-data analytics projects fall into a big trap: Proponents oversell how fast they can deploy the systems and how significant the business benefits will be. Over-promising and under-delivering is the surest way to get the business to walk away from any technology, and it often sets back the use of the particular technology within an organization for a long time, even if many other enterprises are achieving success. In addition, when you set expectations that the benefits will come easily and quickly, business executives have a tendency to underestimate the required level of involvement and commitment. And when a sufficient resource commitment isn't there, the expected benefits usually don't come easily or quickly, and the project is labeled as a failure.

Big-data analytics can produce significant business value for an organization, but it also can go horribly wrong if you aren't careful and don't learn from the mistakes made by other companies. Don't be the next poster child for how not to manage a big-data analytics deployment.

About the Author: Rick Sherman is the founder of Athena IT Solutions, a Stow, Mass.-based firm that provides data warehouse and business intelligence consulting, training and vendor services. In addition to having more than 20 years of IT experience, Sherman writes on IT topics and is a frequent speaker at industry events. He blogs at The Data Doghouse and can be reached at rsherman@athena-solutions.com.
Big-data analytics programs require tech savvy, business know-how
By Rick Sherman, SearchBusinessAnalytics.com Contributor

Articles and case studies written about big-data analytics implementations proclaim the successes of organizations, typically with a focus on the technologies used to build the systems. Product information is useful, of course, but it's also critical for enterprises embarking on these projects to be sure they have people with the right skills and experience in place to successfully leverage big-data analytics tools.

Knowledge of new or emerging big-data technologies, such as Hadoop, is often required, especially if a project involves unstructured or semi-structured data. But even in such cases, Hadoop know-how is only a small portion of the overall staffing requirements; skills are also needed in areas such as business intelligence (BI), data integration and data warehousing. Business and industry expertise is another must-have for a successful big-data analytics project, and many organizations also require advanced analytics skills, such as predictive analytics and data mining capabilities.

"Data scientist" is a new job title that is getting some industry buzz. But it's really a combination of skills from the areas listed above, including predictive modeling, both statistical and business analysis, and data integration development. I've seen some job postings for data scientists that require a statistical degree as well as an industry-specific one. Having both degrees and matching professional experience is great, but what are the chances of finding job candidates who do? Pretty slim!

Taking a more realistic look at what to plan for, I describe below the key skill sets and roles that should be part of big-data analytics deployments.
Instead of listing specific job titles, I've kept it general; organizations can bundle together the required roles and responsibilities in various ways based on the capabilities and experience levels of their existing workforce and new employees that they hire.

Business knowledge. In addition to technical skills, companies engaging in big-data analytics projects need to involve people with extensive business and industry expertise.
That must include knowledge of a company's own business strategy and operations as well as its competitors, current industry conditions and emerging trends, customer demographics and both macro- and microeconomic factors. Much of the business value derived from big-data analytics comes not from textbook key performance indicators but rather from insights and metrics gleaned as part of the analytics process. That process is part science (i.e., statistics) and part art, with users doing what-if analysis to gain actionable information about an organization's business operations. Such findings are only possible with the participation of business managers and workers who have firsthand knowledge of business strategies and issues.

Business analysts. This group also has an important role to play in helping organizations to understand the business ramifications of big data. In addition to doing analytics themselves, business analysts can be tasked with things such as gathering business and data requirements for a project, helping to design dashboards, reports and data visualizations for presenting analytical findings to business users, and assisting in measuring the business value of the analytics program.

BI developers. These people work with the business analysts to build the required dashboards, reports and visualizations for business users. In addition, depending on internal needs and the BI tools that an organization is using, they can enable self-service BI capabilities by preparing the data and the required BI templates for use by business executives and workers.

Predictive model builders. In general, predictive models for analyzing big data need to be custom-built. Deep statistical skills are essential to creating good models; too often, companies underestimate the required skill level and hire people who have used statistical tools, including models developed by others, but don't have knowledge of the mathematical principles underlying predictive models.
Predictive modelers also need some business knowledge and an understanding of how to gather and integrate data into models, although in both cases they can leverage other people's more extensive expertise in creating and refining models. Another key skill for modelers to have is the ability to assess how comprehensive, clean and current the
available data is before building models. There often are data gaps, and a modeler has to be able to close them.

Data architects. A big-data analytics project needs someone to design the data architecture and guide its development. Typically, data architects will need to incorporate various data structures into an architecture, along with processes for capturing and storing data and making it available for analysis. This role involves traditional IT skills such as data modeling, data profiling, data quality, data integration and data governance.

Data integration developers. Data integration is important enough to require its own developers. These folks design and develop integration processes to handle the full spectrum of data structures in a big-data environment. In doing so, they ideally ensure that integration is done not in silos but as part of a comprehensive data architecture. It's also best to use packaged data integration tools that support multiple forms of data, including structured, unstructured and semi-structured sources. Avoid the temptation to develop custom code to handle extract, transform and load operations on pools of big data; hand coding can increase a project's overall costs, not to mention the likelihood of an organization being saddled with undocumented data integration processes that can't scale up as data volumes continue to grow.

Technology architects. This role involves designing the underlying IT infrastructure that will support a big-data analytics initiative. In addition to understanding the principles of designing traditional data warehousing and BI systems, technology architects need to have expertise in the new technologies that often are used in big-data projects. And if an organization plans to implement open source tools, such as Hadoop, the architects need to understand how to configure, design, develop and deploy such systems.
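The pre-modeling check described for predictive modelers above (how comprehensive, clean and current the available data is) can be approximated with a simple profile of a sample of records. This is a minimal sketch, not a substitute for the data profiling tools a data architect would use; the record layout, field names and sample values are hypothetical.

```python
from datetime import date

def profile(records, required_fields, key_field, as_of):
    """Summarize completeness, duplicate keys and staleness for a list of dict records."""
    total = len(records)

    # Comprehensive: share of records with a non-empty value for each required field.
    completeness = {}
    for field in required_fields:
        filled = sum(1 for r in records if r.get(field) not in (None, ""))
        completeness[field] = filled / total if total else 0.0

    # Clean: count of records whose key repeats an earlier record's key.
    keys = [r.get(key_field) for r in records]
    duplicate_keys = len(keys) - len(set(keys))

    # Current: days since the most recent update in the sample.
    updates = [r["updated"] for r in records if r.get("updated")]
    staleness_days = (as_of - max(updates)).days if updates else None

    return {
        "rows": total,
        "completeness": completeness,
        "duplicate_keys": duplicate_keys,
        "staleness_days": staleness_days,
    }

# Hypothetical sample of customer records.
RECORDS = [
    {"customer_id": "C1", "region": "East", "updated": date(2012, 6, 1)},
    {"customer_id": "C2", "region": "", "updated": date(2012, 6, 20)},
    {"customer_id": "C3", "region": None, "updated": date(2012, 5, 10)},
    {"customer_id": "C4", "region": "West", "updated": date(2012, 6, 25)},
]

report = profile(RECORDS, ["customer_id", "region"], "customer_id", as_of=date(2012, 6, 30))
print(report)
```

A profile like this only surfaces gaps, such as the half-empty region field in the sample; deciding how to close them still takes the business knowledge and statistical judgment the roles above describe.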
It can be exciting to tackle a big-data analytics project and to acquire new technologies to support the analytics process. But don't be lulled into thinking that tools alone spell big-data success. The real key to success is ensuring that your organization has the skills it needs to effectively manage large and varied data sets as well as the analytics that will be used to derive business value from the available data.
Resources from Hewlett-Packard

Monetize: Big Data with Real time Analytics
Vertica Customer Uses Cases
Real-Time Analytics Strategies from the Real-World

About Hewlett-Packard

Hewlett-Packard is one of the world's largest computer companies and the foremost producer of test and measurement instruments. The company's more than 29,000 products are used by people for personal use and in industry, business, engineering, science, medicine and education. In addition, the company makes networking products, medical electronic equipment, instruments and systems for chemical analysis, handheld calculators and electronic components.

HP is among the top 20 on the Fortune 500 list. The company had net revenue of $42.9 billion in its 1997 fiscal year. More than 56 percent of its business comes from outside the United States, and more than two-thirds of that is from Europe. Other principal markets are Japan, Canada, Australasia, the Far East and Latin America. HP ranks among the top 10 U.S. exporters. HP is No. 5 among Fortune's Most Admired Companies and No. 10 among Fortune's Best Companies to Work for in America.