Best Practices for Scaling a Big Data Analytics Project
Putting an effective "big data" analytics plan in place can be a challenging proposition; thankfully, many proven data management and business intelligence best practices translate well to big data analytics. Discover best practices for scaling your big data project once you get started.

Familiar Disciplines
By: Beth Stackpole, Contributor

With new terms, new skill sets, new products and new providers, the world of big data analytics can seem unfamiliar, but tried-and-true data management best practices do hold up well in this still-emerging discipline.

As with any business intelligence (BI) and data warehouse initiative, experts say it's critical to have a clear understanding of an organization's data management requirements and a well-defined strategy before venturing too far down the big data analytics path. Big data analytics is widely hyped, and companies across all sectors are being flooded with new data sources and ever-larger amounts of information. Yet, making a big investment to attack the big data problem without first figuring out how doing so can really add value to the business is one of the most serious missteps for would-be users.

"Don't get too hung up on the technology -- start from a business perspective and have the conversation between the CIO, data scientists and businesspeople to figure out what the business objectives are and what value can be derived, and drive backwards from there," said David Menninger, an analyst at Ventana Research Inc. who focuses on BI, analytics and information management technologies.

Defining exactly what data is available and mapping out how an organization can best leverage those resources is a key part of that exercise. CIOs, IT managers and BI and data warehouse professionals need to examine what data is being retained, aggregated and utilized and compare that with what data is being thrown away, Menninger said. It's also critical, he added, to consider external data sources that are currently not being tapped but could be a compelling addition to the mix.

Even if companies aren't sure how and when they plan to jump into big data analytics, there are benefits to going through this kind of an evaluation sooner rather than later, according to Menninger. And beginning the process of capturing data can also make you better prepared for the eventual leap. "Even if you don't know what you're going to use it for, start capturing the information," he said. "Otherwise, there is a missed opportunity, because you won't have that rich history of information [to draw on]."

Start small with big data

Analyzing big data sets is yet another instance where it makes sense to define small, high-value opportunities and use them as a starting point. As companies expand the data sources and types of information they're looking to analyze, and start to create the all-important analytical models that can help them uncover patterns and correlations in both structured and unstructured data, they need to be vigilant about homing in on the findings that are most important to their stated business objectives.

"If you end up in a place where all you're doing is looking for new patterns and you can't do anything with them, you've hit a dead spot," said Gartner Inc. analyst Yvonne Genovese.

ComScore Inc., a Reston, Va.-based company that tracks Internet usage and provides Web analytics and marketing intelligence services to corporate customers, knew early on that it would need some sort of big data strategy. But it picked very targeted spots and built out its big data analytics program over time.

"We started with small bites -- taking individual [data] flows and migrating them into different systems," said Will Duckworth, ComScore's vice president of software engineering. "If you're working with any kind of scale, you can't roll something like this out overnight."
Scale is something ComScore is very conscious of, given the amount of data the company processes. Back in 2009, when it started collecting 300 million records a day, Duckworth began searching in earnest for a new set of systems and a technology infrastructure that could handle ComScore's data processing needs -- now totaling 23 billion records a day and still growing -- in a far more cost-efficient fashion.

... but don't forget to think big

Leveraging open source Hadoop technologies and emerging packaged analytics tools, Duckworth has been able to make the open source environment more familiar to business analysts trained in using SQL. He says companies need to consider scale as a primary factor when mapping out a big data analytics roadmap.

"You have to consider what the ramp-up will look like -- how much data will you be putting in six months from now, how many more servers will you need to handle that, is the software up to the task," he explained. "People don't think about how much it is going to grow or how popular the solution might be once it's rolled into production."

The other thing companies commonly lose sight of as they get enveloped in the new normal that is big data is that the old normal rules around data management still apply. "Information governance practices are just as important today with the notion of big data as they were yesterday with data warehousing," said Marcus Collins, another Gartner analyst. "Even though companies want flexibility in terms of processing, remember that information is a corporate asset and should be treated as such."
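
The ramp-up questions Duckworth raises (how much data six months from now, how many more servers) lend themselves to a back-of-envelope projection. The following is a minimal sketch in Python, not anything described in the article: the function names, the monthly growth rate and the per-server capacity are all hypothetical, and only the 300 million records-a-day starting point echoes a figure quoted above. It assumes steady compound growth, which real workloads rarely follow exactly.

```python
import math


def project_daily_records(current_per_day: float, monthly_growth: float, months: int) -> float:
    """Compound the current daily volume forward at a fixed monthly growth rate."""
    return current_per_day * (1 + monthly_growth) ** months


def servers_needed(records_per_day: float, records_per_server_per_day: float) -> int:
    """Round up, since a partially loaded server still has to be provisioned."""
    return math.ceil(records_per_day / records_per_server_per_day)


if __name__ == "__main__":
    # Illustrative inputs only; the growth rate and capacity are assumptions.
    current = 300_000_000   # records per day today
    growth = 0.10           # assumed 10% growth per month
    capacity = 50_000_000   # assumed records one server can process per day

    for months in (0, 6, 12):
        volume = project_daily_records(current, growth, months)
        print(f"month {months:2d}: {volume:,.0f} records/day, "
              f"{servers_needed(volume, capacity)} servers")
```

Plugging in measured growth and capacity numbers, and rerunning the projection as they change, gives a rough early warning of when the current infrastructure will run out of headroom.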