Data. Data and database. Aniel Nieves-González. Fall 2015



Similar documents
Foundations of Business Intelligence: Databases and Information Management

5.5 Copyright 2011 Pearson Education, Inc. publishing as Prentice Hall. Figure 5-2

Databases and Information Management

Foundations of Business Intelligence: Databases and Information Management

Databases & Business Intelligence Part 1

Foundations of Business Intelligence: Databases and Information Management

Chapter 6. Foundations of Business Intelligence: Databases and Information Management

Alexander Nikov. 5. Database Systems and Managing Data Resources. Learning Objectives. RR Donnelley Tries to Master Its Data

Course MIS. Foundations of Business Intelligence

Hadoop. MPDL-Frühstück 9. Dezember 2013 MPDL INTERN

BUILDING OLAP TOOLS OVER LARGE DATABASES

History of Database Systems

Demystified CONTENTS Acknowledgments xvii Introduction xix CHAPTER 1 Database Fundamentals CHAPTER 2 Exploring Relational Database Components

Chapter 6 8/12/2015. Foundations of Business Intelligence: Databases and Information Management. Problem:

Foundations of Business Intelligence: Databases and Information Management

Relational Database Basics Review

Foundations of Business Intelligence: Databases and Information Management

Chapter 6 FOUNDATIONS OF BUSINESS INTELLIGENCE: DATABASES AND INFORMATION MANAGEMENT Learning Objectives

David M. Kroenke and David J. Auer Database Processing 11 th Edition Fundamentals, Design, and Implementation. Chapter Objectives

Managing Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database

It s about you What is performance analysis/business intelligence analytics? What is the role of the Performance Analyst?

DATABASE MANAGEMENT SYSTEM

Chapter 5 Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization

Ursuline College Accelerated Program

Types & Uses of Databases

TIM 50 - Business Information Systems

Technology in Action. Alan Evans Kendall Martin Mary Anne Poatsy. Eleventh Edition. Copyright 2015 Pearson Education, Inc.

BIG DATA IS MESSY PARTNER WITH SCALABLE

Foundations of Business Intelligence: Databases and Information Management

How To Handle Big Data With A Data Scientist

Lambda Architecture. Near Real-Time Big Data Analytics Using Hadoop. January Website:

Original Research Articles

Session 1: IT Infrastructure Security Vertica / Hadoop Integration and Analytic Capabilities for Federal Big Data Challenges

A Tour of the Zoo the Hadoop Ecosystem Prafulla Wani

Chapter 5. Warehousing, Data Acquisition, Data. Visualization

full file at

uncommon thinking ORACLE BUSINESS INTELLIGENCE ENTERPRISE EDITION ONSITE TRAINING OUTLINES

BIG DATA What it is and how to use?

Online Courses. Version 9 Comprehensive Series. What's New Series

ORACLE OLAP. Oracle OLAP is embedded in the Oracle Database kernel and runs in the same database process

<Insert Picture Here> Extending Hyperion BI with the Oracle BI Server

HadoopRDF : A Scalable RDF Data Analysis System

IBM BigInsights for Apache Hadoop

Databases in Organizations

IT Workload Automation: Control Big Data Management Costs with Cisco Tidal Enterprise Scheduler

Parallel Data Warehouse

Tagetik Extends Customer Value with SQL Server 2012

Module 3: File and database organization

Establish and maintain Center of Excellence (CoE) around Data Architecture

The Celebrus v8 Big Data Engine. Powering real-time personalisation, one-to-one data-driven marketing & advanced customer analytics.

Advanced In-Database Analytics

BIG DATA IN THE CLOUD : CHALLENGES AND OPPORTUNITIES MARY- JANE SULE & PROF. MAOZHEN LI BRUNEL UNIVERSITY, LONDON

David M. Kroenke and David J. Auer Database Processing 12 th Edition

Big Data on Microsoft Platform

low-level storage structures e.g. partitions underpinning the warehouse logical table structures

Course DSS. Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization

CHAPTER 5: BUSINESS ANALYTICS

Data Mining in the Swamp

Introduction. Introduction: Database management system. Introduction: DBS concepts & architecture. Introduction: DBS versus File system

ISM 318: Database Systems. Objectives. Database. Dr. Hamid R. Nemati

Oracle 11g is by far the most robust database software on the market

CHAPTER SIX DATA. Business Intelligence The McGraw-Hill Companies, All Rights Reserved

Chapter 1: Introduction

iservdb The database closest to you IDEAS Institute

Architectures for Big Data Analytics A database perspective

Tableau Visual Intelligence Platform Rapid Fire Analytics for Everyone Everywhere

Emerging Technologies Shaping the Future of Data Warehouses & Business Intelligence

Introduction: Database management system

Bussiness Intelligence and Data Warehouse. Tomas Bartos CIS 764, Kansas State University

DATA MINING AND WAREHOUSING CONCEPTS

Breadboard BI. Unlocking ERP Data Using Open Source Tools By Christopher Lavigne

SQL Server and MicroStrategy: Functional Overview Including Recommendations for Performance Optimization. MicroStrategy World 2016

INTEROPERABILITY OF SAP BUSINESS OBJECTS 4.0 WITH GREENPLUM DATABASE - AN INTEGRATION GUIDE FOR WINDOWS USERS (64 BIT)

In-Database Analytics

When to consider OLAP?

Profiling as a Service

Products CRM and Business Intelligence for DNA

ICOM 6005 Database Management Systems Design. Dr. Manuel Rodríguez Martínez Electrical and Computer Engineering Department Lecture 2 August 23, 2001

OBIEE 11g Analytics Using EMC Greenplum Database

From Spark to Ignition:

SQL Server 2012 PDW. Ryan Simpson Technical Solution Professional PDW Microsoft. Microsoft SQL Server 2012 Parallel Data Warehouse

University of Gaziantep, Department of Business Administration

Hexaware E-book on Predictive Analytics

Analytics in the Cloud. Peter Sirota, GM Elastic MapReduce

Big Data Data-intensive Computing Methods, Tools, and Applications (CMSC 34900)

hmetrix Revolutionizing Healthcare Analytics with Vertica & Tableau

WINDOWS AZURE DATA MANAGEMENT AND BUSINESS ANALYTICS

Business Intelligence & Product Analytics

An Oracle White Paper November Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics

Outline. BI and Enterprise-wide decisions BI in different Business Areas BI Strategy, Architecture, and Perspectives

CHAPTER 4: BUSINESS ANALYTICS

Real-time Big Data Analytics with Storm

Associate Professor, Department of CSE, Shri Vishnu Engineering College for Women, Andhra Pradesh, India 2

Client Overview. Engagement Situation. Key Requirements

Executive Summary... 2 Introduction Defining Big Data The Importance of Big Data... 4 Building a Big Data Platform...

IBM Cognos 10: Enhancing query processing performance for IBM Netezza appliances

The HP Neoview data warehousing platform for business intelligence Die clevere Alternative

Hadoop s Entry into the Traditional Analytical DBMS Market. Daniel Abadi Yale University August 3 rd, 2010

OLAP and Data Mining. Data Warehousing and End-User Access Tools. Introducing OLAP. Introducing OLAP

Transcription:

Data and database Aniel Nieves-González Fall 2015

Data I In the context of information systems, the following definitions are important: 1 Data refers simply to raw facts, i.e., facts obtained by measuring a property of an object. 2 Information is data presented in a context so that it can answer a question or support decision making.

Database I 1 A a database is an organized collection of data. In some instances can be seen as a collection of tables that are related to one another. 2 A database management system (DBMS) is the software that creates, maintains, and manipulates a database. 3 Some examples of DBMS are: MySQL, Microsoft SQL server, and Oracle. 4 A database model determines the logical structure of a database and the way data is going to be stored, organized, and manipulated. 5 Some examples of database model are:

Database II Relational: It was first proposed by E.F. Codd (1969). All data is represented by tuples and grouped into relations. It is based on type of mathematical logic that yields a method to specify data and queries. (Tuples, mathematically, are ordered lists). Hierarchical: In this model the data is organized in a tree-like structure that contains records. The records correspond to the tuples in relational model. Object: This model represents data as objects, in the sense of object-oriented programming. 6 Most DBMS are built around a specific model, but some support more than one model. 7 Structured query language (SQL) is the most common programming language for creating and manipulating relational databases.

Database III 8 Other important terminology is: 1 A table or file refers to a list of data. 2 A database can be a single table or a collection of related tables. 3 A column or field defines the data that a table can hold. 4 A row or record represents a single instance of whatever the table keeps track of. 5 A key is the field used to relate tables in a database. 6 In a relational database multiple tables are related based on common keys. 9 Let s see a very simple example of a relational database that makes use of a web page (xxx.xxx.xxx/phpmyadmin) to collect data and to administer the database (of course, using SQL).

Collecting data I 1 There are several ways to collect data in the context of a business. Transactions processing systems (TPS). Loyalty card systems. Other? 2 There are firms that collect data for resale. This are called data aggregators. Any concern about that? Any privacy concern? 3 Sometimes the data, abeit collected, cannot be properly used. Incompatible systems. This includes the legacy systems, which are older information systems not compatible with newer technologies. Transactional data that cannot be accessed for analysis.

Big Data I The term big data is widely used in many areas and it encompasses several things. Business Intelligence combines aspects of reporting, data exploration and ad hoc queries, and sophisticated data modeling and analysis. The term analytics describes the extensive use of data, statistical and quantitative analysis, explanatory and predictive models, and fact-based management to drive decisions and actions. On occassions data can be a true strategic asset. This happens when it is rare, valuable, imperfectly imitable, and lacking in substitutes.

Big Data II Advantages based on analytics and modeling are sustainable if they result in differentiation as opposed to just operational efficiency. Advantages based on capabilities that others can acquire will be short-lived. A data warehouse is a set of databases designed to support decision making in an organization. It is structured for fast online queries and exploration and may aggregate enormous amounts of data from many different operations. A data mart is a database focused on addressing the concerns of a specific problem or business unit. Hadoop: it is an open-source project created to analyze massive amounts of raw information better than traditional, highly-structured databases. Scalability, flexibility, cost, and fault tolerance are its main advantages.

Big Data III In essence Hadoop is a software framework for parallel computing and parallel storage in computer clusters. Data mining is the process of using computers to identify hidden patterns and to build models from large data sets. Data used for data mining must: be clean and consistent reflect current and future trends.

Discussion Walmart.

Discussion Walmart. Ceasar s Casinos.