Characterizing Big Data Management
Issues in Informing Science and Information Technology, Volume 12, 2015

Cite as: Rossi, R., & Hirama, K. (2015). Characterizing Big Data Management. Issues in Informing Science and Information Technology, 12, Retrieved from

Rogério Rossi & Kechi Hirama
University of São Paulo, São Paulo, Brazil

Abstract

Big data management is a reality for an increasing number of organizations in many areas and represents a set of challenges involving big data modeling, storage and retrieval, analysis and visualization. Technological resources, people and processes are crucial to the management of big data in any kind of organization, allowing information and knowledge drawn from a large volume of data to support decision-making. Big data management can be supported by these three dimensions: technology, people and processes. Hence, this article discusses the technological dimension, which is related to the storage, analytics and visualization of big data; the human aspects of big data; and the process management dimension, which addresses big data management from both a technological and a business perspective.

Keywords: Big Data, Big Data Management, Big Data Challenges, Big Data Analytics, Decision-Making.

Introduction

Big data refers to the idea that a vast amount of data cannot be treated, processed and analyzed in a simplified way. According to Bughin, Chui & Manyika (2010), data are now found in many environments in volumes never seen before, doubling every 18 months as a result of many types of databases, such as proprietary databases, databases derived from Web communities, and other types of intelligent data assets. Manyika et al. (2011) consider that big data refers to datasets whose size goes beyond what typical existing database tools can create, store, manage and analyze; they also point to the need to create new technologies for managing big data.
Several areas currently hold data volumes ranging from dozens of terabytes to multiple petabytes (thousands of terabytes). Fisher, DeLine, Czerwinski & Drucker (2012), however, consider that big data most often refers to the notion that the volume of data cannot be treated, processed and analyzed in a simplified way, and so requires much more robust technologies and techniques, and people with new skills, to manage these large data sets.

Material published as part of this publication, either on-line or in print, is copyrighted by the Informing Science Institute. Permission to make digital or paper copy of part or all of these works for personal or classroom use is granted without fee provided that the copies are not made or distributed for profit or commercial advantage AND that copies 1) bear this notice in full and 2) give the full citation on the first page. It is permissible to abstract these works so long as credit is given. To copy in all other cases or to republish or to post on a server or to redistribute to lists requires specific permission and payment of a fee. Contact [email protected] to request redistribution permission.

Editor: Eli Cohen

As can be seen in Borkar, Carey & Li (2012b), actions related to big data reach various sectors for specific purposes, such as: 1) governments and businesses tracking the content of several Web social networks to perform sentiment analysis; 2) public sector organizations monitoring health research and various networks to evaluate and treat epidemics; 3) commercial marketing evaluating the actions of people on social networks in order to understand the behavior of potential customers. Borkar, Carey & Li (2012b) argue that supporting organizations in data-intensive computing, research and analysis, together with the ability to store data, poses significant challenges to big data management.

For Chen, Chiang & Storey (2012), the big data era has reached several sectors, from government and e-commerce to healthcare organizations. The abundance of data in critical, high-social-impact sectors calls for discussion of data and analytics characteristics. Examples of areas that deal with these vast amounts of data include 1) e-commerce and market intelligence, 2) e-government, 3) science, 4) health insurance, and 5) security. Grover (2014) points out that the ability to extract knowledge from vast amounts of stored data creates opportunities for big data systems, which can be applied in many sectors, such as 1) healthcare, 2) mobile networks, 3) video surveillance, 4) media and entertainment, 5) life sciences, 6) transportation, and 7) study environment.

Russom (2013) considers that sensors spread throughout the world produce enormous amounts of machine data, and highlights the challenges of capturing and managing these vast amounts of data, which are generated continuously, in real time and in multi-structured formats. In any sector, some issues must be considered to achieve satisfactory results from big data management. Manyika et al. (2011) consider the following factors relevant to extracting the best results from big data management: 1) definition of data policies; 2) specific technology and techniques; 3) organizational change and talents; 4) data access; and 5) infrastructure.
As organizations in various industries apply big data, data are being collected and stored in proportions unimaginable in the past. Examples can be seen in Bughin, Chui & Manyika (2010), who present results related to Facebook, which in just two years quintupled its user database, reaching 500 million users; and in Manyika et al. (2011), who point out that in 2010 alone global companies stored more than seven exabytes of data (one exabyte corresponds to one billion gigabytes). However, Bughin, Chui & Manyika (2010) show that executives in different sectors are concerned about the difficulty of extracting the best results from big data, that is, of capitalizing on the abundance of data and managing knowledge better in order to support decision-making.

Russom (2013) identifies some preliminary difficulties in managing big data: 1) people from business or technology areas who do not have adequate skills; 2) inadequate data management infrastructure; and 3) the immature handling of types of data from different sources, such as semi-structured or unstructured data. "Data are flooding in at rates never seen before", as Bughin, Chui & Manyika (2010, p.7) state; thus, the application and use of big data are increasing, and a vast amount of data with varying structures is being used by organizations in many sectors. These data are categorized by Russom (2013) as follows: 1) structured data; 2) complex data (hierarchical or legacy sources); 3) semi-structured data; 4) web logs; 5) unstructured data; 6) social media data; and 7) machine-generated data (sensors, RFID, devices).
Thus, the ability to manage big data, i.e., a high volume of data with an extensive variety of data types that should provide rapid responses, is a reality for organizations, which must handle the relevant challenges of big data management along the dimensions of people, technology, and process management.
The purpose of this article is to characterize big data management and its relationship with these three dimensions, namely: 1) highlighting the importance and purpose of big data management; 2) addressing and discussing specific needs involving big data management; 3) discussing several technological, human and process aspects related to big data management; and 4) presenting the difficulties and challenges of analyzing the vast amount of data and visualizing the results.

To meet these objectives, this article is organized as follows: section two presents a theoretical development of big data management; section three considers related works that also treat the aspects and characteristics of big data management; section four discusses the people, process, and technology dimensions involving big data management; section five presents the difficulties and challenges of managing big data; and finally, section six presents the conclusion and proposals for future work.

Theoretical Development of Big Data

Glatt (2014) presents historical factors concerning the term big data, mentioning that the term has been used since the completion of the 1880 census in the United States. At that time, with no technology or advanced techniques for data collection and organization, the vast amount of data took seven years to process and finally show results. Borkar, Carey & Li (2012b) go back to the 1970s to consider the word 'big' separately: at that time it referred to megabytes, and over time it came to mean gigabytes and then terabytes. Currently, the authors mention, the word 'big' in the term big data refers to petabytes and exabytes.
Bedeley & Iyer (2014) suggest that the term big data was introduced in computing in 2005 to define a large volume of data that traditional data management technologies were not able to manage or process due to its complexity and volume. While in the area of computing the term has been employed recently, researchers in other areas are presenting results since

According to Chen, Chiang & Storey (2012), in a survey of publications citing the keywords 'business intelligence', 'business analytics' and 'big data', the growth of the latter is notable: in 2001 only one study referenced the term, and in 2011, 95 used the specific term big data. Luzivan & Meirelles (2014) contribute to research on the evolution of the term big data in scientific reports, presenting results showing that in scientific journals 15 reports employed the term in 2010 and 380 did so in 2013. Bedeley & Iyer (2014) present results across top-tier IS journals (journals that occupy the top spots according to the MIS journal rankings), showing that, in the field of business, only 16 articles mentioning the term big data were identified.

These results offer a quantitative insight into the related research, although Bedeley & Iyer (2014) also propose a qualitative view of the technical and scientific reports identified. The results demonstrate the need for studies and research in the area, given that needs related to big data management are a reality for an increasing number of organizations. As Gartner (2012) states, 85% of companies' infrastructure will be overloaded by big data until

Moreover, as mentioned by Luzivan & Meirelles (2014), several authors have pointed to a lack of academic studies that address big data under a broader and integrative analysis.
Regarding the definition of the word 'big' in the term big data, Borkar, Carey & Li (2012b) mention that it varies over time, from megabytes (1970s) to exabytes (2014). For Luzivan & Meirelles (2014), the same volume of data can be seen as large in one context and small in another, or as large at one observed moment and small at another. For Jacobs (2009, p. 40), "what makes most big data big is repeated observations over time and/or space". However, Demchenko, Laat & Membrey (2014) argue that 'big' is not restricted to volume, but also refers to variety, velocity, value, and veracity, which together make up the Big Data 5V Properties.

The full term big data has diverse definitions in the recent scientific literature. Definitions can be found in Manyika et al. (2011) and in Russom (2013), among other scientific reports. This article, however, adopts a definition proposed in a draft framework of the NIST (National Institute of Standards and Technology), linked to the US Department of Commerce: "Big data consists of extensive datasets, primarily in the characteristics of volume, velocity, and/or variety that require a scalable architecture for efficient storage, manipulation, and analysis" (NIST, 2014a, p. 5).

Reflections on big data must effectively address business competitiveness and support decision-making, which should also be related to information science and knowledge engineering. Turban, Aronson & Liang (2005) and Laudon & Laudon (2007) have long addressed the elements that integrate the application of information management and knowledge management to business. For Brynjolfsson & McAfee (2012), big data management is responsible for seeking to glean intelligence from data and for translating that into business advantage.
To define big data in a practical and effective manner within an organization, one may consider the Big Data 5V properties presented by Demchenko, Laat & Membrey (2014): 1) volume, 2) variety, 3) velocity, 4) value, and 5) veracity. It is essential to characterize the environment, which can first consider the combination of volume and variety of data to be processed to generate intelligence and competitive advantage for the business. Defining and clarifying the aspects of the big data scenario enables the organization to align itself with the technologies and techniques that are specific to big data, and requires it to have better control of processes and human resources with the specific skills needed for big data management.

Brynjolfsson & McAfee (2012) show that business executives are questioning whether big data is just another way to say analytics. This makes explicit the need to define and clarify the particular aspects of managing big data in a consistent and realistic way that meets expectations. Currently, organizations are not concerned with whether big data is needed, because it is more than a necessity; it is a reality that must be managed.

Big data reflects existing scenarios in organizations across multiple sectors. There is a vast amount of data with varied structures, such as semi-structured, unstructured or multi-structured data, and there is a need to provide quick responses through effective mechanisms for big data management, considering new technologies, organization and process changes, and the right people.
Related Works on Big Data Management

Current work on big data management includes diverse studies involving technological issues, data management and data analysis; there are also studies that link big data to business intelligence or to other consolidated information technology approaches. Related studies on managing big data usually propose two approaches: one that strongly addresses the technological and technical issues of institutionalizing and maintaining a big data infrastructure, and another that seeks to meet business goals. Big data management in this research is not restricted to management based exclusively on information technology, but also considers the involvement of human resources as well as the organizational processes for managing big data.

For Russom (2013), there is a difference between managing big data at a technological level and managing big data in order to support business objectives successfully. Hence, the author proposes two relevant questions regarding big data management: 1) how effectively does an organization have the technological capacity to manage big data?; and 2) does the management of big data have the ability to support the business goals? In fact, information engineering and knowledge management, according to Turban, Aronson & Liang (2005), help organizations increase their capacity for competitive intelligence and decision-making. In this sense, for business objectives, managing big data becomes essential to obtain favorable results from the vast amount of data.

Russom (2013) presents evidence from a survey of a number of North American, European and Asian companies showing that only 3% of the organizations were considered to be at a relatively mature state of managing big data. The largest share of the organizations participating in the survey (37%) reported that they were discussing it without any commitment to institutionalizing big data. Regarding when these organizations expect to have big data in production, the largest share (22%) believes that this will happen only in three years or more, while 10% of the respondents expect to implement the management of big data within 6 months.
A case study of the banking industry presented by Bedeley & Iyer (2014) shows that this sector generates and processes a huge volume of data continuously, owing to the high competitiveness of the sector and the significant increase in its customer database. Mobile banking and e-banking also contribute to the increased volume of data in the sector. This requires that data capture, storage, processing, and analysis strategies, i.e., the management of big data, be supported by high technology to provide the best results.

Brynjolfsson & McAfee (2012) suggest that organizations managing big data should particularly consider five areas:
1) leadership, since the era of big data means not just more data, but the ability to extract results;
2) talent management, considering that the most crucial people are the data scientists and professionals with the skills to deal with the vast volume of data and to organize large data sets that are not only in structured format;
3) technology, as an important component of the big data strategy; although the available technology for managing big data has improved significantly, it is still novel for many IT departments and needs to be integrated;
4) decision-making, which reflects the need to maximize cross-functional cooperation between the people who manage the data and the people who use them; people who understand the business problems must be close to the relevant data and to the people who know effective techniques for extracting the best results; and
5) company culture, since a data-driven organization should cease to be guided solely by hunches and stop relying on the traditional HiPPO (highest-paid person's opinion) approach.

Implementation strategies for big data management should be considered by organizations and can be found in NIST (2014b) and Brynjolfsson & McAfee (2012).
NIST (2014b) considers four steps that favor the strategic assessment of big data management: 1) identifying and including stakeholders, 2) identifying potential roadblocks, 3) defining achievable goals, and 4) defining "finished" and "success" at the beginning of the project. For Brynjolfsson & McAfee (2012), some steps can guide the use and application of big data management without huge investments in IT, through a piecemeal approach to building capacity for big data management: 1) selecting a business unit to test big data actions, with a team of data scientists; 2) identifying five business opportunities based on big data and prototyping a solution for each within a given period (approximately five weeks); and 3) implementing an innovation process with four steps: a) experimentation, b) measurement, c) sharing, and d) replication.

NIST (2014b) presents two scales for organizations to consider in relation to big data management. The first concerns organizational readiness: 1) no big data, 2) ad hoc, 3) opportunistic, 4) systematic, 5) managed, and 6) optimized. The second deals with organizational adoption: 1) no adoption, 2) project, 3) program, 4) divisional, 5) cross-divisional, and 6) enterprise. The scales in NIST (2014b) are relevant because they give visibility into where the organization stands regarding the management of big data, i.e., when the management of big data, in a business and technology approach, is able to provide intelligence to improve competitiveness and decision-making.

People, Process, and Technology for Managing Big Data

As the principal component of an information system, data are collected, qualified, stored and processed by information systems to deliver results that satisfy their users. For Laudon & Laudon (2007), information systems comprise three dimensions: people, technology and organization (emphasizing the need for organizational processes). In this sense, these dimensions should be considered for the intensive management of big data in organizations: people, technology, and processes. To improve competitive advantage and decision-making, organizations treat information as a fundamental asset.
In the information era, and more precisely in the era of digital information, this smart asset becomes increasingly necessary for business survival. O'Brien & Marakas (2013) argue that information systems have three key business roles: 1) supporting processes and operations, 2) supporting decision-making by agents of the organization, and 3) supporting strategies for competitive advantage. To support decision-making, information systems must meet some basic requirements, such as the type of support offered, the frequency and form of information presentation, the format of information, and the method of processing information (O'Brien & Marakas, 2013). The need for accurate, fast and concise information makes it a costly, yet extremely necessary, asset for organizations. Big data is hence also considered an important tool in this scenario, where it can be treated as a fundamental input to decision-making and competitive advantage.

Fisher et al. (2012) argue that many decision makers, from company executives to government agencies to researchers and scientists, would like to base their decisions and actions on information. Therefore, big data analytics as a new discipline is a workflow that distills terabytes of low-value data, transforming them, in some cases, into a single bit of high-value data. The ability to generate information, even just a single bit of high-value data, from a large amount of data with different structures is part of what characterizes big data management. For successfully managing big data, the three dimensions of technology, processes and people can favor the environments where big data is found. The characteristics of these three dimensions are therefore detailed as follows, in a view that shows big data management as requiring specific technologies and techniques together with people with different profiles who are involved in various organizational processes, in business or technology areas.

People dimension - people related to big data management need new skills. According to Manyika et al. (2011), there may be limits to the innate human ability, that is, to human sensory and cognitive faculties, to process the data torrent. There are limitations in human abilities to understand and consume the vast and varied data sets related to big data. The need for new skills is not restricted to those who manipulate, process and manage the big data technology environment; above all, the abilities of users and decision makers must be considered, since they have to view an extremely large data set to obtain the information needed for important decisions. For Bughin, Chui & Manyika (2010), using experimentation with big data as an essential component of decision-making requires new capabilities, as well as organizational and cultural changes.

People involved with big data management currently hold positions with varying titles. Russom (2013) presents the following positions as the three most commonly used to manage big data: 1) Data Architect, 2) Data Analyst, and 3) BI Manager or DW Manager. Although Data Scientist appears as a position related to big data, as a specific professional to handle its management, it is not the position most often adopted by organizations; it is considered for managing big data on a par with the Application Developer, the Business Analyst, and the System Analyst or System Architect. NIST (2014b), however, proposes specific actors and roles (Figure 1) for big data management: 1) Data Provider, 2) Data Consumer, 3) Big Data Application Provider, 4) Big Data Framework Provider, and 5) System Orchestrator.
Chen, Chiang & Storey (2012) argue that the United States alone will need between 140,000 and 190,000 professionals with deep analytical skills, as well as 1.5 million managers with data-savvy know-how to analyze big data to make effective decisions. If these proportions are extrapolated, professionals with this profile will have profound relevance in the global technology and business scenario.

For Brynjolfsson & McAfee (2012, p. 65), "big data power does not erase the need for human insight". One of the most critical aspects of big data is its impact on how decisions are made and who gets to make them. The ability to manage big data technologically does not override the ability that big data gives to the decision maker. These are important aspects that should be considered by the teams within organizations that manage big data as a backdrop to real competitive advantage.
Figure 1: Actors and roles for big data management (NIST, 2014b)

Groups capable of managing big data in an organization, such as the data warehouse group, the central IT team, or the business units or departments, must possess appropriate skills and pay attention to training programs that promote proper use in order to obtain better results from big data management. Russom (2013) proposes 10 top properties for big data management, one of which is to "get training (and maybe new staff)"; its focus should lie in training and hiring data analysts, data scientists, and data architects who can develop the applications for data exploration, discovery analytics, and real-time monitoring. Brynjolfsson & McAfee (2012) consider that the people who understand the problems need to work together with the right people who manage big data technologies to obtain better results from this vast amount of data.

Process dimension - processes for big data management are related to the actions performed in both the technological and the business environment: specific processes in the technology area, where tools are used and specific techniques applied to manage big data; and business processes, which are responsible for generating the data and for using them accurately once processed. However, for big data these processes are sometimes interrelated, since business activities and technical activities must be performed concurrently, culminating, for example, in big data analytics.

According to Fisher et al. (2012), there are a number of challenges involving big data, and one of them concerns the analysis that must be performed on a vast mass of data that may have different structures. For this challenge, the authors present a pipeline of five steps, representing a data management process, to provide the best results from an analytical visualization. The pipeline denotes the state of practice for data analysis of a large volume of data. It was modeled on the software development waterfall model. The big data pipeline proposed by Fisher et al. (2012) considers the steps shown in Figure 2.

Figure 2: The Big Data Pipeline (Fisher et al., 2012): Acquire Data, Choose Architecture, Shape Data into Architecture, Code/Debug, Reflect

Acquire data - determines where the data are extracted from, that is, how to discover the data source and format the relevant subsets to meet the desired outcomes. Sometimes the data may be stored in schemas that hinder their use. In this case, there are opportunities to improve standards for data storage, streamlining the search for and formatting of data;
Choose architecture - considers such items as cost and performance. Sometimes the analysis of vast amounts of data requires programming abstractions substantially different from those designed for traditional environments. A Cloud Computing environment, in particular, imposes nonlinear costs on access and storage and changes what occurs in the environment;
Shape data into architecture - ensures compatibility when uploading data to the selected platform, in a way that is compatible with the computation and with data distribution. It is relevant to consider that cloud computing environments use different storage engines from conventional desktops;
Code/debug - suggests the use of specific languages such as R, Python or Pig (a data manipulation language) combined with Hadoop technology; and
Reflect - corresponds to a debugging step that favors the visualization and interpretation of results.
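The five steps of the pipeline can be illustrated with a minimal Python sketch. It is not taken from Fisher et al. (2012): the function names, the tiny in-memory log, the record-count threshold and the two-partition layout are all hypothetical, chosen only to show how each step hands its output to the next.

```python
import csv
import io
from collections import Counter

# Hypothetical sample data standing in for a much larger source.
RAW_LOG = "user,action\nana,login\nbob,login\nana,purchase\n"

def acquire_data(raw: str) -> list[dict]:
    """Acquire data: read the source and keep the relevant fields."""
    return list(csv.DictReader(io.StringIO(raw)))

def choose_architecture(n_records: int) -> str:
    """Choose architecture: an assumed rule of thumb based on record count."""
    return "single-machine" if n_records < 1_000_000 else "cluster"

def shape_data(records: list[dict], n_partitions: int = 2) -> list[list[dict]]:
    """Shape data: split records round-robin into partitions for distribution."""
    partitions = [[] for _ in range(n_partitions)]
    for i, record in enumerate(records):
        partitions[i % n_partitions].append(record)
    return partitions

def run_analysis(partitions: list[list[dict]]) -> Counter:
    """Code/debug: count actions in each partition and merge the counts."""
    total = Counter()
    for partition in partitions:
        total.update(record["action"] for record in partition)
    return total

def reflect(result: Counter) -> list[str]:
    """Reflect: render a small text summary for interpretation."""
    return [f"{action}: {count}" for action, count in sorted(result.items())]

records = acquire_data(RAW_LOG)
architecture = choose_architecture(len(records))
summary = reflect(run_analysis(shape_data(records)))
```

In a real setting the Reflect step would usually feed a visualization tool, and its findings would send the analyst back to an earlier step, which is why the waterfall analogy describes the order of the steps rather than a single pass.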
To support decision-making, this pipeline can be considered both for the corporate environment, i.e., to provide answers to business leaders who still rely on techniques such as data mining, machine learning and visualization, and for scientific research, where it can provide answers through stringent mechanisms for data analysis in which theories and hypotheses can be tested.

For Bizer, Boncz, Brodie & Erling (2011), a five-step methodological overview can address the challenges of extracting results from the Big Data World, and this vision includes the following steps:
Defining the concern - considers the problem to be solved by manipulating data in an environment that holds a vast amount of data;
Search - searches within the vast amount of existing data, i.e., the Big Data World, for elements that can point to answers to the problem;
Transform - where the ETL (Extract, Transform, Load) technique is used to extract, from vast amounts of data, the data that are relevant to the solution of the problem, to transform them and to store them for processing;
Entity resolution - checks the data, ensuring their relevance to the solution of the problem, considering different levels of abstraction; and finally
Solve the problem - involves actions on the relevant pre-selected data to compute the solution using specific computational domains.

As illustrated, the approaches to managing big data include both business and technology actions, so people who manage big data need a joint and broader vision. NIST (2014b) presents a roadmap with four categories: 1) data services, 2) usage services, 3) capabilities, and 4) vertical orchestrator; as well as nine features: 1) storage framework, 2) processing framework, 3) resource managers framework, 4) infrastructure architecture, 5) information architecture, 6) standard integration framework, 7) application framework, 8) business operations, and 9) business intelligence. These features defined for big data management provide value statements for technology and organizational readiness, as presented in Table 1.

Table 1: Big data technology and organizational features (NIST, 2014b)

Feature | Value statement
1. Storage framework | Define how big data is logically organized, distributed and stored.
2. Processing framework | Define how big data is operated in the big data environment.
3. Resource manager framework | Resource management solutions are required because big data storage and processing frameworks are distributed.
4. Infrastructure architecture | Requires the ability to operate with sufficient network and infrastructure backbone.
5. Information architecture | Data itself needs to be reviewed for its informational value.
6. Standard integration framework | Integration with appropriate standards to assist cross-product integration and knowledge.
7. Application framework | Considering how applications will interact with a big data solution.
8. Business operations | Business operations need to be able to strategize, deploy, and operate big data solution, as big data is more than just technology.
9. Business intelligence | The end value of big data: presenting data as information, intelligence, and insight.

The value statements defined by NIST (2014b) support the perception that processes in big data management are integrated across business and technology areas.

Technology dimension - the big data environment involves several technologies and techniques for collecting, storing, processing and analyzing data. Some of these technologies and techniques have emerged specifically in the big data era; other existing ones have been improved for big data. Many techniques and technologies for managing big data have been developed and adapted to add capacity to the analysis that must be performed on big data.

Manyika et al. (2011) present some relevant techniques that draw on statistics and computer science theories and culminate in big data analytics, among which are:
Data Mining - technique used to extract patterns from vast amounts of data by combining statistical methods and machine learning with data management;
Genetic Algorithms - technique used for optimization, applied to nonlinear problems;
11 Rossi & Hirama Machine Learning - technique that uses artificial intelligence principles and considers the development of algorithms for recognizing complex patterns in large volumes of data and propose intelligent decisions; Neural Networks - consider the assumptions of biological neural networks to inspire computational models in identifying patterns in vast amounts of data being used for pattern recognition and optimization. There are still a number of technologies and techniques in Manyika et al. (2011) that are related to big data environments, which include: Hadoop - the open source framework for processing large volumes of data in distributed systems, inspired by tools such as MapReduce and GFS (Google File System) from Google company; MapReduce - software framework introduced by Google company to process high volumes of data is also part of the implementation of the Hadoop technology. According to Kulkarni & Khandewal (2014), to address some limitations identified for MapReduce its new generation was proposed, called YARN (Yet Another Resource Negotiator); Business Intelligence - refers to a type of application based on software developed to display and to analyze the data; and Cloud Computing - technology refers to a computing paradigm with a high level of computational resources sometimes configured as distributed systems to provide services through digital networks. Agrawal, Das & Abbadi (2011) consider that the field of big data analysis, MapReduce paradigm as well as its open source implementation, Hadoop, corresponds to technologies that have been adopted by the industry as well as by the academia. Bakshi (2012) also considers the relevance of the related unstructured data, such as texts, electronic messages ( s) that use NoSQL (Not Only SQL) database for managing and manipulating unstructured data. 
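The MapReduce paradigm mentioned above can be illustrated with a small, single-machine sketch. It is not Hadoop code and does not use any Hadoop API; it only mimics the three conceptual phases (map, shuffle, reduce) on an in-memory word count. The function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample sentences are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(document: str) -> Iterator[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one input record."""
    for word in document.lower().split():
        yield word, 1


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group all emitted values by key, as happens between map and reduce."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Combine all values collected for one key into a single result."""
    return key, sum(values)


def word_count(documents: Iterable[str]) -> Dict[str, int]:
    """Run the three phases sequentially on an in-memory collection."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical input records, used only to exercise the sketch.
    sample = ["big data needs big storage", "data drives decisions"]
    print(word_count(sample))
```

In a distributed framework, the map and reduce functions would run in parallel on many nodes and the shuffle would move data across the network; here everything runs in one process, which is enough to show how the work is decomposed.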
Borkar, Carey & Li (2012a) and Borkar, Carey & Li (2012b) describe the Asterix Project, which began at UC Irvine in early 2009 to create a new parallel, semi-structured information management system. The top layer of the Asterix software stack is a fully parallel DBMS (Database Management System) with a flexible data model (ADM - Asterix Data Model) and a query language (AQL - Asterix Query Language) for describing, querying, and analyzing data.

Hall et al. (2009) present WEKA (Waikato Environment for Knowledge Analysis), which aims to provide a comprehensive collection of machine learning algorithms and data preprocessing tools to researchers and practitioners. WEKA has specific characteristics for data mining activities that can be applied to the big data environment.

With these techniques and technologies available, the term 'analytics' has been used constantly for big data, and for Fisher et al. (2012) the term is often used broadly to cover any data-driven decision-making. Analytics in the corporate world uses statistics, data mining, machine learning, and visualization to answer questions that business executives pose. In the academic world, research scientists analyze data sets to form theories and test hypotheses.

The tools for big data analytics, as in Demchenko, Laat & Membrey (2014), are currently offered by major big data technology providers, such as Amazon Elastic MapReduce and Dynamo, Microsoft Azure HDInsight, IBM Big Data Analytics, Cloudera, and others. Russom (2013) presents some vendor platforms and tools for managing big data, such as:
Cloudera - a leader in enterprise analytic data management powered by Apache Hadoop;
Oracle Big Data Appliance - integrates and optimizes all the hardware and software components needed to build comprehensive analytic applications;
Pentaho - presents Pentaho Data Integration (PDI), an enterprise-class, graphical ETL tool; and
SAP HANA - provides smart data access to push queries into Hive on Hadoop.

Manyika et al. (2011) mention other technologies and tools that support big data, such as:
Cassandra - an open source (free) database management system for handling huge amounts of data in a distributed system;
MongoDB - a cross-platform document-oriented database; and
Dynamo - a proprietary distributed data storage system developed by Amazon.

Besides tools and new technologies, new programming languages related to the big data environment have emerged, such as R, an open source programming language for statistical computing and graphics; PIG, a language for data manipulation coupled with the Hadoop technology; and AQL (Asterix Query Language), a language comparable to PIG.

However, Chen, Chiang & Storey (2012) state that data analytics continues to be an active area of research, given that statistical machine learning techniques such as Bayesian networks, Hidden Markov Models, support vector machines and reinforcement learning have been applied to data, text, and web analytics applications. For Brynjolfsson & McAfee (2012), the technologies and techniques for big data require a skill set that is new to most IT departments, which will need to work hard to integrate all the relevant internal and external sources of data in order to increase decision-making capability based on big data.

Challenges on Big Data Management

The challenges of big data management refer mainly to the structural problems of managing large volumes of data, and especially to the difficulties inherent in extracting meaning from this mass of data.
For Borkar, Carey & Li (2012b), information has great potential value for many purposes if captured and aggregated effectively. Big data refers to data collected and stored in proportions unimaginable in the past. According to Bizer et al. (2011), in the big data world, the databases are unbelievably large in scale, scope, distribution, heterogeneity, and supporting technologies. As an example, according to Dounde (2014), the global data generated from the beginning until the year 2003 can be estimated at about 5 exabytes (one exabyte corresponds to one billion gigabytes), and the volume of data generated until 2012 is equal to 2.7 zettabytes (one zettabyte corresponds to a thousand exabytes). This volume is expected to grow threefold until [...].

However, the challenges are not restricted to extracting, storing and managing vast amounts of data; they also concern the semantic analysis of these data. As shown by Bakshi (2012), this is related to the need for new skills among technology professionals and users, to changes in organizational culture, and to the integration environment. Bizer et al. (2011) consider two classes of primary challenges to big data: 1) engineering - efficiency in data management at unimaginable scales, and 2) semantic - identification of meaning, considering the information that is relevant to specific goals.

Alexander, Hoise & Szalay (2011) also consider that one of the challenges of big data is that data cannot simply be moved or made available for analysis. The data should be analyzed in situ, and/or specific methods must be developed to extract smaller collections of relevant data that can be analyzed to provide the expected results.

Russom (2013) presents the results of a survey conducted with professionals from American, European and Asian companies, and points out that for most of them, big data represents an opportunity that enables data exploration and predictive analytics to discover new facts about customers, markets, partners, costs, and operations. Only a tiny minority considers big data management a problem; although big data poses technical challenges, data volume is a showstopper for only a few organizations. The three most relevant barriers to big data management according to Russom (2013) are: 1) inadequate staffing or skills; 2) lack of governance or stewardship and lack of business sponsorship; and 3) data integration complexity, and data ownership and other policies.

Borkar, Carey & Li (2012b) consider that big data analytics and management is being touted as a critical challenge in the current computing landscape, given that governments and businesses are tracking the content of blogs and tweets to perform sentiment analysis; likewise, health insurance organizations are monitoring search trends to check the progress of epidemics. Social scientists are also studying information from different social networks so that it can be used more effectively for the public good.

Grover (2014) considers that big data environments create significant opportunities and challenges.
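The data volumes quoted from Dounde (2014) earlier in this section rest on simple unit arithmetic. A minimal sketch of that arithmetic follows, assuming SI decimal prefixes (1 gigabyte = 10^9 bytes, 1 exabyte = 10^18 bytes, 1 zettabyte = 10^21 bytes); the constant names and the `convert` helper are illustrative and are not taken from any cited source.

```python
from fractions import Fraction

# Unit sizes in bytes, assuming SI decimal prefixes (not binary units).
GIGABYTE = 10**9
EXABYTE = 10**18
ZETTABYTE = 10**21


def convert(amount, from_unit, to_unit):
    """Convert an amount between units using exact rational arithmetic."""
    return Fraction(amount) * from_unit / to_unit


if __name__ == "__main__":
    # Figures as cited in the text: 5 exabytes (until 2003), 2.7 zettabytes (until 2012).
    until_2003_in_gb = convert(5, EXABYTE, GIGABYTE)
    until_2012_in_eb = convert("2.7", ZETTABYTE, EXABYTE)
    growth = until_2012_in_eb / 5  # ratio of the 2012 volume to the 2003 volume
    print(f"5 EB   = {until_2003_in_gb} GB")
    print(f"2.7 ZB = {until_2012_in_eb} EB")
    print(f"2012 volume / 2003 volume = {growth}")
```

Under these assumptions, 5 exabytes equal 5 billion gigabytes and 2.7 zettabytes equal 2,700 exabytes, i.e., 540 times the 2003 estimate.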
Specifically, technology organizations must find ways to cope with security and other technical challenges of managing massive volumes of data, such as: 1) heterogeneity and incompleteness of data, 2) scale (large and rapidly increasing volumes of data), 3) context awareness, 4) performance issues, 5) security and privacy, and 6) other challenges (timeliness of data analysis, distributed storage structures, content validation, stream processing and real-time analytics).

Brynjolfsson & McAfee (2012) show that the five large areas of big data management (leadership, talent management, technology, decision-making, and company culture) still pose challenges to organizations: talent and organizational changes should be reviewed to favor the big data world, and the technologies, although extremely relevant and already on offer, are sometimes not easily integrated into the environment. Moreover, there is a need for greater cooperation and integration among the people who understand the problems and those who master the technologies, in order to support decision-making.

Opportunities and challenges related to studies and research on big data can also be seen in Luzivan & Meirelles (2014). The authors join others in stating that there is a significant need for broad participation by academic researchers. They also highlight gaps in the field of Information Science related to big data, and present five groups of issues that may inspire research related to big data: 1) multidisciplinary studies, 2) methodological standards, 3) structure, 4) ethics, security and access, and 5) human capital formation.

Conclusion and Further Works

Big data management is a reality for many types of organizations and poses a challenge to computer science and information technology. The primary characteristics of big data are its volume and variety; other characteristics, such as velocity, value, and veracity, have also been considered.
Big data influences the public and private sectors, science and the economy, and areas such as education and healthcare, among others. The big data scenario calls for important actions related to providing new technologies and techniques and integrating them with existing technologies to promote the expected results. The specific needs related to big data involve the ability to manage the infrastructure and its semantic capacity, i.e., the ability to improve decision-making.

Big data management involves several aspects, both in the technological field and in how to manage teams (technical or users); it is also strongly related to the processes involved, which cover the collection, storage, and retrieval of data using technical and user knowledge. Thus, the human and technological dimensions and the related processes for big data management can provide more favorable conditions for this new scenario.

However, the three dimensions require further study and research. Even though the technological conditions for big data management have evolved, they are not enough by themselves, because they need to be integrated and operated by qualified personnel. Hence the demand emerges for new skills and trained professionals to use the new technological arsenal. In addition, the processes must be more effective, sometimes requiring major revisions, from either the user's or the technology point of view. Both sides need to review their work processes to provide better results from the vast amounts of data presented on a daily basis. The process affects how data are generated, stored, and retrieved, as well as their presentation and visualization.

The difficulties and challenges in the field of engineering and the semantic value that can be provided by big data point to numerous future works in this area. As for engineering issues, the necessary infrastructure for the big data environment and its integration is a fundamental aspect. Moreover, human issues must be addressed to operate big data; the semantic value extracted from big data requires more robust processes, whether business or technical processes.
Analysis and visualization are both especially critical processes for big data management.

References

Agrawal, D., Das, S., & Abbadi, A. E. (2011). Big Data and Cloud Computing: current state and future opportunities. Proceedings of International Conference on Extending Database Technology (EDBT), 1.
Alexander, F. J., Hoise, A., & Szalay, A. (2011). Big Data. IEEE Computing in Science & Engineering, 13.
Bakshi, K. (2012). Considerations for Big Data: Architecture and Approach. IEEE Aerospace Conference.
Bedeley, R. T., & Iyer, L. S. (2014). Big Data Opportunities and Challenges: The Case of Banking Industry. Proceedings of the Southern Association for Information Systems Conference, 1, 1-6.
Bizer, C., Boncz, P., Brodie, M. L., & Erling, O. (2011). The Meaningful Use of Big Data: four perspectives, four challenges. SIGMOD Record, 40(4).
Borkar, V. R., Carey, M. J., & Li, C. (2012a). Inside "Big Data Management": Ogres, Onions, or Parfaits? Proceedings of the 15th International Conference on Extending Database Technology, 1.
Borkar, V. R., Carey, M. J., & Li, C. (2012b). Big Data Platforms: What's next? XRDS: Crossroads, The ACM Magazine for Students - Big Data, 19(1).
Brynjolfsson, E., & McAfee, A. (2012). Big Data: the management revolution. Harvard Business Review Press, 90(10).
Bughin, J., Chui, M., & Manyika, J. (2010). Clouds, big data, and smart assets: ten tech-enabled business trends to watch. McKinsey Global Institute. Retrieved November 6, 2014, from n_tech-enabled_business_trends_to_watch
Chen, H., Chiang, R. H. L., & Storey, V. C. (2012). Business Intelligence and Analytics: from Big Data to Big Impact. MIS Quarterly, 36(4).
Demchenko, Y., Laat, C., & Membrey, P. (2014). Defining architecture components of the big data ecosystem. Proceedings of the International Conference on Collaboration Technologies and Systems, 1.
Dounde, R. G. (2014). Evolution of data into big data and big data management. Proceedings of the 4th IRF International Conference. Retrieved November 8, 2014.
Fisher, D., DeLine, R., Czerwinski, M., & Drucker, S. (2012). Interactions with big data analytics. Interactions of the ACM, 19(3).
Gartner. (2012). Market trends: Big data opportunities in vertical industries. Retrieved November 9, 2014.
Glatt, M. (2014). Big Data: more than just a buzzword? Bowdoin DCSI Digital and Computational Studies Initiative. Retrieved November 8, 2014.
Grover, N. (2014). Big Data Architecture, Issues, Opportunities and Challenges. International Journal of Computer & Electronics Research, 3(1).
Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., & Witten, I. H. (2009). The WEKA Data Mining Software: An Update. SIGKDD Explorations, 11(1).
Jacobs, A. (2009). The Pathologies of Big Data. Communications of the ACM, 52(8).
Kulkarni, A. P., & Khandewal, M. (2014). Survey on Hadoop and Introduction to YARN. International Journal of Emerging Technology and Advanced Engineering (IJETAE), 4(5).
Laudon, K. C., & Laudon, J. P. (2007). Essentials of management information systems. São Paulo: Pearson Prentice Hall.
Luzivan, S. S., & Meirelles, F. S. (2014). Big data: publication evolution and research opportunities. Proceedings of 11th International Conference on Information Systems and Technology Management (CONTECSI), 1.
Manyika, J., Chui, M., Brown, B., Bughin, J., Dobbs, R., Roxburgh, C., & Byers, A. H. (2011). Big Data: the next frontier for innovation, competition, and productivity. McKinsey Global Institute. Retrieved November 6, 2014.
NIST (National Institute of Standards and Technology) Big Data Working Group (NBD-WG). (2014a). Draft NIST Big Data Interoperability Framework: Volume 1, Definitions. NIST (National Institute of Standards and Technology). Retrieved November 6, 2014.
NIST (National Institute of Standards and Technology) Big Data Working Group (NBD-WG). (2014b). Draft NIST Big Data Interoperability Framework: Volume 7, Technology Roadmap. NIST (National Institute of Standards and Technology). Retrieved November 22, 2014.
O'Brien, J. A., & Marakas, G. M. (2013). Introduction to information systems. Porto Alegre: AMGH.
Russom, P. (2013). Managing Big Data. TDWI - The Data Warehousing Institute. Retrieved November 6, 2014.
Turban, E., Aronson, J. E., & Liang, T. (2005). Decision support systems and intelligent systems. New Jersey: Pearson Prentice Hall.
Biographies

Rogério Rossi received his B.S. in Mathematics from the University Center Foundation Santo André and his M.S. and Ph.D. in Electrical Engineering, both from Mackenzie Presbyterian University. He is in a postdoctoral program at the University of São Paulo, conducting research related to Complex Systems, Big Data and the Internet of Things (IoT). He is an Adjunct Professor for Information Technology and Computer Science courses in graduate and undergraduate programs in São Paulo. He has done research in the fields of software quality and quality for digital educational solutions, and has publications in this area. He is a member of IACSIT (International Association of Computer Science and Information Technology) and has worked as a reviewer for the InSite Conferences 2013 and 2015 and the e-skills Conference 2014; he also presented papers at the InSite Conferences in Montreal, Canada (2012) and Porto, Portugal (2013).

Kechi Hirama received his B.S., M.S., Ph.D. and Associate Professor degrees in Computer Engineering from Escola Politécnica of the University of São Paulo, São Paulo, Brazil in 1980, 1988, 1996 and 2008, respectively. He worked for 15 years in the Control and Automation area in research organizations, and since 1996 he has been a Professor in the Department of Computer and Digital Systems Engineering of Escola Politécnica of the University of São Paulo. His interests include Complex Systems, System Dynamics, Big Data and the Internet of Things (IoT).
BIG DATA IN THE CLOUD : CHALLENGES AND OPPORTUNITIES MARY- JANE SULE & PROF. MAOZHEN LI BRUNEL UNIVERSITY, LONDON
BIG DATA IN THE CLOUD : CHALLENGES AND OPPORTUNITIES MARY- JANE SULE & PROF. MAOZHEN LI BRUNEL UNIVERSITY, LONDON Overview * Introduction * Multiple faces of Big Data * Challenges of Big Data * Cloud Computing
Datenverwaltung im Wandel - Building an Enterprise Data Hub with
Datenverwaltung im Wandel - Building an Enterprise Data Hub with Cloudera Bernard Doering Regional Director, Central EMEA, Cloudera Cloudera Your Hadoop Experts Founded 2008, by former employees of Employees
Business Intelligence and Big Data Analytics: An Overview
Communications of the IIMA Volume 14 Issue 3 Double Issue 3/4 Article 1 2014 Business Intelligence and Big Data Analytics: An Overview Xin James Fairfield University, [email protected] Follow this and
Manifest for Big Data Pig, Hive & Jaql
Manifest for Big Data Pig, Hive & Jaql Ajay Chotrani, Priyanka Punjabi, Prachi Ratnani, Rupali Hande Final Year Student, Dept. of Computer Engineering, V.E.S.I.T, Mumbai, India Faculty, Computer Engineering,
How To Turn Big Data Into An Insight
mwd a d v i s o r s Turning Big Data into Big Insights Helena Schwenk A special report prepared for Actuate May 2013 This report is the fourth in a series and focuses principally on explaining what s needed
Interactive data analytics drive insights
Big data Interactive data analytics drive insights Daniel Davis/Invodo/S&P. Screen images courtesy of Landmark Software and Services By Armando Acosta and Joey Jablonski The Apache Hadoop Big data has
Big Data and Your Data Warehouse Philip Russom
Big Data and Your Data Warehouse Philip Russom TDWI Research Director for Data Management April 5, 2012 Sponsor Speakers Philip Russom Research Director, Data Management, TDWI Peter Jeffcock Director,
Towards a Thriving Data Economy: Open Data, Big Data, and Data Ecosystems
Towards a Thriving Data Economy: Open Data, Big Data, and Data Ecosystems Volker Markl [email protected] dima.tu-berlin.de dfki.de/web/research/iam/ bbdc.berlin Based on my 2014 Vision Paper On
Microsoft Big Data. Solution Brief
Microsoft Big Data Solution Brief Contents Introduction... 2 The Microsoft Big Data Solution... 3 Key Benefits... 3 Immersive Insight, Wherever You Are... 3 Connecting with the World s Data... 3 Any Data,
Information Visualization WS 2013/14 11 Visual Analytics
1 11.1 Definitions and Motivation Lot of research and papers in this emerging field: Visual Analytics: Scope and Challenges of Keim et al. Illuminating the path of Thomas and Cook 2 11.1 Definitions and
BIG DATA & DATA SCIENCE
BIG DATA & DATA SCIENCE ACADEMY PROGRAMS IN-COMPANY TRAINING PORTFOLIO 2 TRAINING PORTFOLIO 2016 Synergic Academy Solutions BIG DATA FOR LEADING BUSINESS Big data promises a significant shift in the way
Data Virtualization A Potential Antidote for Big Data Growing Pains
perspective Data Virtualization A Potential Antidote for Big Data Growing Pains Atul Shrivastava Abstract Enterprises are already facing challenges around data consolidation, heterogeneity, quality, and
Data Refinery with Big Data Aspects
International Journal of Information and Computation Technology. ISSN 0974-2239 Volume 3, Number 7 (2013), pp. 655-662 International Research Publications House http://www. irphouse.com /ijict.htm Data
This Symposium brought to you by www.ttcus.com
This Symposium brought to you by www.ttcus.com Linkedin/Group: Technology Training Corporation @Techtrain Technology Training Corporation www.ttcus.com Big Data Analytics as a Service (BDAaaS) Big Data
Analytics in the Cloud. Peter Sirota, GM Elastic MapReduce
Analytics in the Cloud Peter Sirota, GM Elastic MapReduce Data-Driven Decision Making Data is the new raw material for any business on par with capital, people, and labor. What is Big Data? Terabytes of
BIG Data. An Introductory Overview. IT & Business Management Solutions
BIG Data An Introductory Overview IT & Business Management Solutions What is Big Data? Having been a dominating industry buzzword for the past few years, there is no contesting that Big Data is attracting
IBM AND NEXT GENERATION ARCHITECTURE FOR BIG DATA & ANALYTICS!
The Bloor Group IBM AND NEXT GENERATION ARCHITECTURE FOR BIG DATA & ANALYTICS VENDOR PROFILE The IBM Big Data Landscape IBM can legitimately claim to have been involved in Big Data and to have a much broader
Chapter 5 Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization
Turban, Aronson, and Liang Decision Support Systems and Intelligent Systems, Seventh Edition Chapter 5 Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization
IDC MaturityScape Benchmark: Big Data and Analytics in Government. Adelaide O Brien Research Director IDC Government Insights June 20, 2014
IDC MaturityScape Benchmark: Big Data and Analytics in Government Adelaide O Brien Research Director IDC Government Insights June 20, 2014 IDC MaturityScape Benchmark: Big Data and Analytics in Government
ANALYTICS BUILT FOR INTERNET OF THINGS
ANALYTICS BUILT FOR INTERNET OF THINGS Big Data Reporting is Out, Actionable Insights are In In recent years, it has become clear that data in itself has little relevance, it is the analysis of it that
Apache Hadoop: The Big Data Refinery
Architecting the Future of Big Data Whitepaper Apache Hadoop: The Big Data Refinery Introduction Big data has become an extremely popular term, due to the well-documented explosion in the amount of data
The 3 questions to ask yourself about BIG DATA
The 3 questions to ask yourself about BIG DATA Do you have a big data problem? Companies looking to tackle big data problems are embarking on a journey that is full of hype, buzz, confusion, and misinformation.
Integrating Hadoop. Into Business Intelligence & Data Warehousing. Philip Russom TDWI Research Director for Data Management, April 9 2013
Integrating Hadoop Into Business Intelligence & Data Warehousing Philip Russom TDWI Research Director for Data Management, April 9 2013 TDWI would like to thank the following companies for sponsoring the
Big Data and Analytics: Challenges and Opportunities
Big Data and Analytics: Challenges and Opportunities Dr. Amin Beheshti Lecturer and Senior Research Associate University of New South Wales, Australia (Service Oriented Computing Group, CSE) Talk: Sharif
BIG DATA: FROM HYPE TO REALITY. Leandro Ruiz Presales Partner for C&LA Teradata
BIG DATA: FROM HYPE TO REALITY Leandro Ruiz Presales Partner for C&LA Teradata Evolution in The Use of Information Action s ACTIVATING MAKE it happen! Insights OPERATIONALIZING WHAT IS happening now? PREDICTING
Business Intelligence. A Presentation of the Current Lead Solutions and a Comparative Analysis of the Main Providers
60 Business Intelligence. A Presentation of the Current Lead Solutions and a Comparative Analysis of the Main Providers Business Intelligence. A Presentation of the Current Lead Solutions and a Comparative
Big Data Analytics for Space Exploration, Entrepreneurship and Policy Opportunities. Tiffani Crawford, PhD
Big Analytics for Space Exploration, Entrepreneurship and Policy Opportunities Tiffani Crawford, PhD Big Analytics Characteristics Large quantities of many data types Structured Unstructured Human Machine
Foundations of Business Intelligence: Databases and Information Management
Foundations of Business Intelligence: Databases and Information Management Wienand Omta Fabiano Dalpiaz 1 drs. ing. Wienand Omta Learning Objectives Describe how the problems of managing data resources
BIG DATA TRENDS AND TECHNOLOGIES
BIG DATA TRENDS AND TECHNOLOGIES THE WORLD OF DATA IS CHANGING Cloud WHAT IS BIG DATA? Big data are datasets that grow so large that they become awkward to work with using onhand database management tools.
How To Use Big Data In Education
www.ijcsi.org 58 The Use of Big Data in Education Athanasios S. Drigas 1 and Panagiotis Leliopoulos 2 1 Institute of Informatics & Telecommunications, Telecoms Lab - Net Media Lab, N.C.S.R. Demokritos
DATA VISUALIZATION: When Data Speaks Business PRODUCT ANALYSIS REPORT IBM COGNOS BUSINESS INTELLIGENCE. Technology Evaluation Centers
PRODUCT ANALYSIS REPORT IBM COGNOS BUSINESS INTELLIGENCE DATA VISUALIZATION: When Data Speaks Business Jorge García, TEC Senior BI and Data Management Analyst Technology Evaluation Centers Contents About
Big Data Analytics. Copyright 2011 EMC Corporation. All rights reserved.
Big Data Analytics 1 Priority Discussion Topics What are the most compelling business drivers behind big data analytics? Do you have or expect to have data scientists on your staff, and what will be their
White Paper: Datameer s User-Focused Big Data Solutions
CTOlabs.com White Paper: Datameer s User-Focused Big Data Solutions May 2012 A White Paper providing context and guidance you can use Inside: Overview of the Big Data Framework Datameer s Approach Consideration
Native Connectivity to Big Data Sources in MicroStrategy 10. Presented by: Raja Ganapathy
Native Connectivity to Big Data Sources in MicroStrategy 10 Presented by: Raja Ganapathy Agenda MicroStrategy supports several data sources, including Hadoop Why Hadoop? How does MicroStrategy Analytics
BIG DATA IN BUSINESS ENVIRONMENT
Scientific Bulletin Economic Sciences, Volume 14/ Issue 1 BIG DATA IN BUSINESS ENVIRONMENT Logica BANICA 1, Alina HAGIU 2 1 Faculty of Economics, University of Pitesti, Romania [email protected] 2 Faculty
Big Data. Lyle Ungar, University of Pennsylvania
Big Data Big data will become a key basis of competition, underpinning new waves of productivity growth, innovation, and consumer surplus. McKinsey Data Scientist: The Sexiest Job of the 21st Century -
Foundations of Business Intelligence: Databases and Information Management
Chapter 5 Foundations of Business Intelligence: Databases and Information Management 5.1 Copyright 2011 Pearson Education, Inc. Student Learning Objectives How does a relational database organize data,
CONNECTING DATA WITH BUSINESS
CONNECTING DATA WITH BUSINESS Big Data and Data Science consulting Business Value through Data Knowledge Synergic Partners is a specialized Big Data, Data Science and Data Engineering consultancy firm
Why Big Data in the Cloud?
Have 40 Why Big Data in the Cloud? Colin White, BI Research January 2014 Sponsored by Treasure Data TABLE OF CONTENTS Introduction The Importance of Big Data The Role of Cloud Computing Using Big Data
Are You Ready for Big Data?
Are You Ready for Big Data? Jim Gallo National Director, Business Analytics April 10, 2013 Agenda What is Big Data? How do you leverage Big Data in your company? How do you prepare for a Big Data initiative?
International Journal of Innovative Research in Computer and Communication Engineering
FP Tree Algorithm and Approaches in Big Data T.Rathika 1, J.Senthil Murugan 2 Assistant Professor, Department of CSE, SRM University, Ramapuram Campus, Chennai, Tamil Nadu,India 1 Assistant Professor,
