# Introduction to the Mathematics of Big Data. Philippe B. Laval

Save this PDF as:

Size: px
Start display at page:

## Transcription

1 Introduction to the Mathematics of Big Data Philippe B. Laval Fall 2015

2 Introduction In recent years, Big Data has become more than just a buzz word. Every major field of science, engineering, business, and finance produces huge amount of data [4]. This data, has to be generated, acquired, stored, then repeatedly processed and analyzed. Because Big Data has affected so many disciplines, some even talk about the Big Data revolution. In fact, Big Data is mentioned not only in books, but also in various magazines [2], [3], [5], [6], [7], and [8]. But what is Big Data and how big is Big Data? This class will give a short overview of Big Data, will discuss the issues associated with Big Data, and will provide some answers. Big Data is a new area which covers many disciplines and technologies. This class will focus on scientific data and on the mathematical techniques used to address some of the issues which arise with Big Data. The computer science and statistical techniques used to address the same issues are equally important but will not be addressed in this class. vii

3 Part I Basic Concepts 1

4 Chapter 1 Mathematics and Big Data 1.1 Definitions and Overview of the Issues with Big Data What is Big Data? Big Data is still a new area, evolving quickly. Many of the definitions and techniques are still evolving rapidly and are therefore not formal. As of the writing of these notes, here is how big data is defined. Definition Big Data refers to a data set that is so large and/or complex that it cannot be perceived, acquired, managed, and processed by traditional Information Technology (IT) and software/hardware tools within a tolerable time [1]. Of course, this definition is not absolute. As technology improves, the amount of data that can be processed within a tolerable time also increases. With this definition, Big Data, ten years ago, would have been much smaller than today. In addition, the notion of tolerable time is also relative. Medical doctors are known to be impatient; they do not want to have to wait a lot between the time an image is captured and the time it is available for them to analyze. In contrast, scientists studying outer space wait years before receiving images taken and sent by some probe. Big Data can be characterized by the three V s. 1. Volume: refers to the huge volume of data that is generated and has to be stored and analyzed. 2. Variety: refers to the different types of data (text, images, sound). Data can also be structured or unstructured. The complexity of the data can also drastically change. 3. Velocity: refers to the fact that more and more data is being generated at a faster and faster pace. 3

5 4 CHAPTER 1. MATHEMATICS AND BIG DATA Some researchers have now started to add a fourth V, Value. It refers to the value that could be saved if Big Data techniques were creatively and effectively used to improve effi ciency and quality. Big Data has given rise to several new and related technologies. These would be studied in a class approaching this subject from the computer science point of view. These related technologies include: 1. Cloud computing. 2. Internet of Things (IoT). 3. Data Centers. 4. Hadoop How Big is Big Data? The smallest unit of measurement for data is a bit (b). A bit can be either 0 or 1 and is therefore not large enough to hold any data. A byte (B), which is 8 bits, is used as the fundamental unit of measurement for data. A byte can hold 2 8 = 256 different values, which is enough to represent the standard ASCII characters, such as letters, numbers and some basic symbols. Following the tradition of the metric system, terms to measure large quantities of data can be formed using SI prefixes as shown in the table below. These prefixes are often used for multiple of bytes. For example, a kilobyte is 10 3 bytes since kilo means Because Big Data is so large, the prefixes used for it are not known to the public. We review them in the table below. Prefix Unit Name Symbol SI Meaning kilo kilobyte kb or KB 10 3 mega megabyte MB 10 6 = ( 10 3) 2 giga gigabyte GB 10 9 = ( 10 3) 3 tera terabyte TB = ( 10 3) 4 peta petabyte PB = ( 10 3) 5 exa exabyte EB = ( 10 3) 6 zetta zettabyte ZB = ( 10 3) 7 yotta yottabyte YB = ( 10 3) 8 It is hard to measure how big Big Data is, as the size of Big Data is constantly increasing, and increasing at a faster and faster rate. More than a specific number, what is important is an order of magnitude. Here are some figures though. The site has some figures from Below are similar but more recent figures.

6 1.1. DEFINITIONS AND OVERVIEW OF THE ISSUES WITH BIG DATA5 Every day, humanity tweets 500 million times [2]. Every day, humanity shares 70 million photos on Instagram [2]. Every day, humanity watches 4 billion videos on Facebook [2]. Every minute, we upload 300 hours of new content on YouTube [2]. A 2014 study by the market-research firm IDC estimated that the world of digital data would grow by a factor of 10 from 2013 to 2020, to 44 zettabytes [2] Issues with Big Data With new technologies, as more and more objects are connected, more and more data is being produced every day. This data has to be generated, acquired, stored, transmitted, and analyzed. There are issues to deal with at each stage of this data pipeline. Acquisition/storage: the total data generated is larger than the total storage capacity. Hence, we have to find ways to store data more effi ciently. Transmission: the increase in the generation rate of this data is far greater than the increase in communication rate. The analysis of a data set can be very complex. Even if it is feasible, it may take a long time. There are additional issues to consider. Data can be noisy, hence it has to be processed to remove as much noise as possible, without modifying the data. Data is also often unstructured. Think of the data we get from social networks. It is a mixture of text, voice, images, videos. What someone is looking for in a data set is not likely to come from a table in which data is nicely organized, as it was the case in the past. Data can also be dynamic, hence it has to be processed in real time. Mathematics plays an important role in addressing these issues, as we will see. More specifically, mathematics......allows us to formalize both the data and the problem. In other words, it allows us to transform a complex problem into a mathematical problem on which we can use all the tools mathematics has at its disposal....provides a big "chest" of tools or methodologies. Mathematics does have a very large chest of tools.

7 6 CHAPTER 1. MATHEMATICS AND BIG DATA...allows validation of the methodologies (proof of functionality). Think, for example, about a noisy image that we clean to see what it has to show us. It may be a medical image. How do we know that the procedure used to remove noise is not going to insert features which were not in the original image? Exercises 1. Write a paragraph explaining what cloud computing is and how it relates to Big Data. 2. Write a paragraph explaining what IoT is and how it relates to Big Data. 3. Write a paragraph explaining what Data Centers are and how they relate to Big Data. 4. Write a paragraph explaining what Hadoop is and how it relates to Big Data. 5. Give examples in which Big Data plays and will play an important role.

8 Bibliography [1] M. C, S. M, Y. Z, V. C. L, Big Data: Related Technologies, Challenges and Future Prospects, Springer, [2] L. G, What s this all about?, [3] F. L R. I, Google le nouvel einstein, [4] K.-C. L, H. J, L. T. Y, A. C, Big Data: Algorithms, Analytics, and Applications, CRC Press, [5] T. M P, Will our data drown us, [6] T. P, Giving your body a "check engine" light, [7] L. S, Should you get paid for your data, [8] E. S, Their prescription: Big data,

### Introduction to Predictive Analytics. Dr. Ronen Meiri ronen@dmway.com

Introduction to Predictive Analytics Dr. Ronen Meiri Outline From big data to predictive analytics Predictive Analytics vs. BI Intelligent platforms What can we do with it. The modeling process. Example

### Doing Multidisciplinary Research in Data Science

Doing Multidisciplinary Research in Data Science Assoc.Prof. Abzetdin ADAMOV CeDAWI - Center for Data Analytics and Web Insights Qafqaz University aadamov@qu.edu.az http://ce.qu.edu.az/~aadamov 16 May

### Lecture.5. Number System:

Number System: Lecturer: Dr. Laith Abdullah Mohammed Lecture.5. The number system that we use in day-to-day life is called Decimal Number System. It makes use of 10 fundamental digits i.e. 0, 1, 2, 3,

### THE EMERGING GI ENVIRONMENT

THE EMERGING GI ENVIRONMENT BRUCE McCORMACK VICE PRESIDENT EUROGI (EUROPEAN UMBRELLA ORGANISATION FOR GEOGRAPHIC INFORMATION) ESTONIA GI ASSOCIATION ANNUAL CONFERENCE 23 October 2015 OUTLINE Questionnaire

### A Survey on Big Data Concepts and Tools

A Survey on Big Data Concepts and Tools D. Rajasekar 1, C. Dhanamani 2, S. K. Sandhya 3 1,3 PG Scholar, 2 Assistant Professor, Department of Computer Science and Engineering, Sri Krishna College of Engineering

### Cloud storage Megas, Gigas and Teras

Cloud storage Megas, Gigas and Teras I think that I need cloud storage. I have photos, videos, music and documents on my computer that I can only retrieve from my computer. If my computer got struck by

### The Big Deal about Big Data. Mike Skinner, CPA CISA CITP HORNE LLP

The Big Deal about Big Data Mike Skinner, CPA CISA CITP HORNE LLP Mike Skinner, CPA CISA CITP Senior Manager, IT Assurance & Risk Services HORNE LLP Focus areas: IT security & risk assessment IT governance,

### Surfing the Data Tsunami: A New Paradigm for Big Data Processing and Analytics

Surfing the Data Tsunami: A New Paradigm for Big Data Processing and Analytics Dr. Liangxiu Han Future Networks and Distributed Systems Group (FUNDS) School of Computing, Mathematics and Digital Technology,

SECURITY MEETS BIG DATA Achieve Effectiveness And Efficiency 1 IN 2010 THE DIGITAL UNIVERSE WAS 1.2 ZETTABYTES 1,000,000,000,000,000,000,000 Zetta Exa Peta Tera Giga Mega Kilo Byte Source: 2010 IDC Digital

### So Just What Is Big Data? James E. Tcheng, MD, FACC, FSCAI

So Just What Is Big Data? James E. Tcheng, MD, FACC, FSCAI Disclosures James E. Tcheng, MD, FACC, FSCAI Affiliations / Financial Relationships / Other RWI ACC Chair, Informatics and Health IT Task Force

### Applications for Business Intelligence, Predictive Analytics and Big Data

Finance, Management, & Operations Applications for Business Intelligence, Predictive Analytics and Big Data Patrick Bogan, Chief Information Officer, Fuzion Analytics Kyle Korzenowski, Chief Information

### Definition of Computers. INTRODUCTION to COMPUTERS. Historical Development ENIAC

Definition of Computers INTRODUCTION to COMPUTERS Bülent Ecevit University Department of Environmental Engineering A general-purpose machine that processes data according to a set of instructions that

### BIG DATA: ARE YOU READY? Andy Kyiet Demand Flow Intelligence May, 2013

BIG DATA: ARE YOU READY? Andy Kyiet Demand Flow Intelligence May, 2013 PERSONAL BACKGROUND Founder of the first specialist Service Management & Helpdesk System provider in Europe Past President of AFSMI

### Computer Science Terminology II

Computer Science 1000 Terminology II Storage a computer has two primary tasks store data operate on data a processor's primary job is to operate on data math operations move operations note that processors

### ว ชาโปรแกรมแกรมประย กต ด านการจ ดการ สาน กงานอ ตโนม ต

ว ชาโปรแกรมแกรมประย กต ด านการจ ดการ สาน กงานอ ตโนม ต บทท 3 บทบาทและการใช คอมพ วเตอร ใน สาน กงานอ ตโนม ต อ.รจนา วานนท 1 บทบาทของสาน กงานอ ตโนม ต ในการจ ดการสารสนเทศ 1. ล กษณะงานสาน กงานท วไป 1.1 งานร บข

### Computer Logic (2.2.3)

Computer Logic (2.2.3) Distinction between analogue and discrete processes and quantities. Conversion of analogue quantities to digital form. Using sampling techniques, use of 2-state electronic devices

### Age of Big data. Presented by: Mohammad Iqbal BCM -2014

Age of Presented by: Mohammad Iqbal BCM -2014 Agenda Big? Big evolution from Big? Name Symbol Value Kilobyte KB 10^3 BIG DATA Megabyte MB 10^6 Gigabyte GB 10^9 Terabyte TB 10^12 Petabyte PB 10^15 So large

### DIGITAL MARKETING STRATEGIES Leveraging The Back-End Tools

DIGITAL MARKETING STRATEGIES Leveraging The Back-End Tools Professional Background RACING INDUSTRY EXPERIENCE: First Job Out of Undergrad: - Arlington Park, Assistant to the VP of Marketing - Sponsorship

### Big Data: Study in Structured and Unstructured Data

Big Data: Study in Structured and Unstructured Data Motashim Rasool 1, Wasim Khan 2 mail2motashim@gmail.com, khanwasim051@gmail.com Abstract With the overlay of digital world, Information is available

### HP Vertica at MIT Sloan Sports Analytics Conference March 1, 2013 Will Cairns, Senior Data Scientist, HP Vertica

HP Vertica at MIT Sloan Sports Analytics Conference March 1, 2013 Will Cairns, Senior Data Scientist, HP Vertica So What s the market s definition of Big Data? Datasets whose volume, velocity, variety

### Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing

Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing Wayne W. Eckerson Director of Research, TechTarget Founder, BI Leadership Forum Business Analytics

### Two Recent LE Use Cases

Two Recent LE Use Cases Case Study I Have A Bomb On This Plane (Miami Airport) In January 2012, an airline passenger tweeted she had a bomb on a Jet Blue commercial aircraft at the Miami International

### SCALABLE FILE SHARING AND DATA MANAGEMENT FOR INTERNET OF THINGS

Sean Lee Solution Architect, SDI, IBM Systems SCALABLE FILE SHARING AND DATA MANAGEMENT FOR INTERNET OF THINGS Agenda Converging Technology Forces New Generation Applications Data Management Challenges

### The Components of the System Unit

Objectives Differentiate among various styles of system units Differentiate among the various types of memory The Components of the System Unit Identify chips, adapter cards, and other components of a

### Bits and Bytes. Computer Literacy Lecture 4 29/09/2008

Bits and Bytes Computer Literacy Lecture 4 29/09/2008 Lecture Overview Lecture Topics How computers encode information How to quantify information and memory How to represent and communicate binary data

### Algorithms and Methods for Distributed Storage Networks 7 File Systems Christian Schindelhauer

Algorithms and Methods for Distributed Storage Networks 7 File Systems Institut für Informatik Wintersemester 2007/08 Literature Storage Virtualization, Technologies for Simplifying Data Storage and Management,

### Introducing Big Data. Abstract. with Small Changes. Agenda. Big Data in the News. Bits and Bytes

Introducing Big Data in Stat 101 with Small Changes 17 Nov 2013 Introducing Big Data in Stat 101 with Small Changes John D. McKenzie, Jr. Babson College Babson Park, MA 02457 0310 mckenzie@babson.edu DSI

### Majed Al-Ghandour, PhD, PE, CPM Division of Planning and Programming NCDOT 2016 NCAMPO Conference- Greensboro, NC May 12, 2016

Big Data! Majed Al-Ghandour, PhD, PE, CPM Division of Planning and Programming NCDOT 2016 NCAMPO Conference- Greensboro, NC May 12, 2016 Big Data: Data Analytical Tools for Decision Support 2 Outline Introduce

### Big data and its transformational effects

Big data and its transformational effects Professor Fai Cheng Head of Research & Technology September 2015 Working together for a safer world Topics Lloyd s Register Big Data Data driven world Data driven

### CAP4773/CIS6930 Projects in Data Science, Fall 2014 [Review] Overview of Data Science

CAP4773/CIS6930 Projects in Data Science, Fall 2014 [Review] Overview of Data Science Dr. Daisy Zhe Wang CISE Department University of Florida August 25th 2014 20 Review Overview of Data Science Why Data

### Definitions in Informatics Susen Rabold

Definitions in Informatics 2009-10-01 Susen Rabold Question after question AND (hopefully) answers Informatics Literacy Bit? Html? Xml? CSS? Bytes? Computers encode information What is information? Wikipedia:

### Congrats to Game Winners. How can computation use data to solve problems? What topics have we covered in CS 202? Part 1: Completed!

CS 202: Introduction to Computation " UNIVERSITY of WISCONSIN-MADISON Computer Sciences Department Professor Andrea Arpaci-Dusseau How can computation use data to solve problems? Congrats to Game Winners

### DISCOVERING ediscovery

DISCOVERING ediscovery Purpose This paper is the first in a series that are designed to educate organisations and increase awareness in the area of ediscovery technology. What is ediscovery? Electronic

### Digital Earth: Big Data, Heritage and Social Science

Digital Earth: Big Data, Heritage and Social Science The impact on geographic information and GIS Geographic Information Systems Analysis for Decision Support Impact of Big Data Digital Earth Citizen Engagement

### Oracle Big Data for Dummies

Oracle Big Data for Dummies Sai Janakiram Penumuru WW Product Expert Cloud Platforms The Father of Microbiology First Microbiologist Antonie Philips van Leeuwenhoek 2 Sai Janakiram Penumuru o o o o o o

### Big Data: Public Sector Opportunities, Challenges, and Implications

Big Data: Public Sector Opportunities, Challenges, and Implications Timothy M. Persons, Ph.D. Chief Scientist U.S. Government Accountability Office personst@gao.gov / www.gao.gov / @GAOChfScientist Presentation

### BIG DATA CHALLENGES AND PERSPECTIVES

BIG DATA CHALLENGES AND PERSPECTIVES Meenakshi Sharma 1, Keshav Kishore 2 1 Student of Master of Technology, 2 Head of Department, Department of Computer Science and Engineering, A P Goyal Shimla University,

### Cloud beyond the obvious, an approach for innovation

Cloud beyond the obvious, an approach for innovation Christian Verstraete Chief Technologist Cloud Strategy Our World is Changing Living in the age of tectonic shifts, and welcome to the new style of IT

### Problems to store, transfer and process the Big Data 6/2/2016 GIANG TRAN - TTTGIANG2510@GMAIL.COM 1

Problems to store, transfer and process the Big Data COURSE: COMPUTING CLUSTERS, GRIDS, AND CLOUDS LECTURER: ANDREY SHEVEL ITMO UNIVERSITY SAINT PETERSBURG 6/2/2016 GIANG TRAN - TTTGIANG2510@GMAIL.COM

### CIS 4930/6930 Spring 2014 Introduction to Data Science Data Intensive Computing. University of Florida, CISE Department Prof.

CIS 4930/6930 Spring 2014 Introduction to Data Science Data Intensive Computing University of Florida, CISE Department Prof. Daisy Zhe Wang Data Science Overview Why, What, How, Who Outline Why Data Science?

### Chapter NO.4 Storage Devices

Chapter NO.4 Storage Devices 4.01 Complete the following statements. i) A byte is a group of bits. ii) is a volatile memory. iii) Storage capacity of a sector on floppy is a multiple of bytes. iv) SIMMs

### Analytical Tools: What Auditors Need to Know About Big Data

Analytical Tools: What Auditors Need to Know About Big Data Timothy M. Persons, Ph.D. Chief Scientist U.S. Government Accountability Office personst@gao.gov / www.gao.gov / @GAOChfScientist Presentation

### What s New About Big Data? Implications for the Accountability Community

What s New About Big Data? Implications for the Accountability Community Timothy M. Persons, Ph.D. Chief Scientist U.S. Government Accountability Office personst@gao.gov / www.gao.gov Presentation to the

### Big Data Streams. Analytics Challenges, Analysis, and Applications. Adel M. Alimi

Big Data Streams 1 Analytics Challenges, Analysis, and Applications Adel M. Alimi REGIM-Lab., University of Sfax, Tunisia http://adel.alimi.regim.org adel.alimi@ieee.org 2 Evolution of Technology 3 Nano,

### Peter Rakers De Verstoring van Big Data

Peter Rakers De Verstoring van Big Data Na globalisatie en digitalisatie komt dataficatie, het continue genereren van real time gegevens. Voor veel bedrijven werkt dit fenomeen eerder verstorend. Peter

### EXECUTIVE REPORT. Big Data and the 3 V s: Volume, Variety and Velocity

EXECUTIVE REPORT Big Data and the 3 V s: Volume, Variety and Velocity The three V s are the defining properties of big data. It is critical to understand what these elements mean. The main point of the

### Data Centric Computing Revisited

Piyush Chaudhary Technical Computing Solutions Data Centric Computing Revisited SPXXL/SCICOMP Summer 2013 Bottom line: It is a time of Powerful Information Data volume is on the rise Dimensions of data

### What Is Big Data? Craig C. Douglas University of Wyoming

What Is Big Data? Craig C. Douglas University of Wyoming What Is Big Data?... It Depends Unit Approximately 10 n Related to Kilobyte (KB) 1,000 bytes 3 Circa 1952 computer memory 32 KB Apollo 11 computer

### CSC590: Selected Topics BIG DATA & DATA MINING. Lecture 2 Feb 12, 2014 Dr. Esam A. Alwagait

CSC590: Selected Topics BIG DATA & DATA MINING Lecture 2 Feb 12, 2014 Dr. Esam A. Alwagait Agenda Introduction What is Big Data Why Big Data? Characteristics of Big Data Applications of Big Data Problems

### Data Management Nuts and Bolts. Don Johnson Scientific Computing and Visualization

Data Management Nuts and Bolts Don Johnson Scientific Computing and Visualization Overview Data Management Storing data Sharing data Moving data Tracking data (Client responsibility) Where can you obtain

### Big data challenges hll and opportunities in healthcare: application to detecting faint signals

Big data challenges hll and opportunities in healthcare: h application to detecting faint signals Dr. Greg Slabaugh City University London School of Informatics Data, data, and more data According to IBM,

### Introduction to Computer & Information Systems

Introduction to Computer & Information Systems Binnur Kurt kurt@ce.itu.edu.tr Istanbul Technical University Computer Engineering Department Copyleft 2005 1 Version 0.1 About the Lecturer BSc İTÜ, Computer

### Cloud Computing and Big Data That s Why! Ray Walshe 14 th March 2013

Cloud Computing and Big Data That s Why! Ray Walshe 14 th March 2013 Data Centre Growth We need more data centres??? (c) Ray Walshe 2013 2 By 2015 By 2015, about 24% of all new business software purchases

### Turning Big Data into Big Decisions Delivering on the High Demand for Data

Turning Big Data into Big Decisions Delivering on the High Demand for Data Michael Ho, Vice President of Professional Services Digital Government Institute s Government Big Data Conference, October 31,

### Big Data Analytics: Collecting, Analyzing and Decision Making

Big Data Analytics: Collecting, Analyzing and Decision Making Defining Big Data Jennifer Jones, Senior Indirect Sales Manager, CBTS Thought Leader Definitions Oracle - Derivation of value from traditional,

### UNDERSTANDING THE BIG DATA PROBLEMS AND THEIR SOLUTIONS USING HADOOP AND MAP-REDUCE

UNDERSTANDING THE BIG DATA PROBLEMS AND THEIR SOLUTIONS USING HADOOP AND MAP-REDUCE Mr. Swapnil A. Kale 1, Prof. Sangram S.Dandge 2 1 ME (CSE), First Year, Department of CSE, Prof. Ram Meghe Institute

### Of all the data in recorded human history, 90 percent has been created in the last two years. - Mark van Rĳmenam, Think Bigger, 2014

What is Big Data? Of all the data in recorded human history, 90 percent has been created in the last two years. - Mark van Rĳmenam, Think Bigger, 2014 Data in the Twentieth Century and before In 1663,

### Big Systems, Big Data

Big Systems, Big Data When considering Big Distributed Systems, it can be noted that a major concern is dealing with data, and in particular, Big Data Have general data issues (such as latency, availability,

### Now, Next and the Future: IT, Big Data and other Implications for RIM. Presented by Michael S. Smith / http://about.me/mikessmith

Now, Next and the Future: IT, Big Data and other Implications for RIM Agenda for This Afternoon Now: What trends are creating implications within the profession? Next: Why is IT now concerned about RIM?

### Bits, Bytes, and Codes

Bits, Bytes, and Codes Telecommunications 1 Peter Mathys Black and White Image Suppose we want to draw a B&W image on a computer screen. We first subdivide the screen into small rectangles or squares called

### Big Data Buzzwords From A to Z. By Rick Whiting, CRN 4:00 PM ET Wed. Nov. 28, 2012

Big Data Buzzwords From A to Z By Rick Whiting, CRN 4:00 PM ET Wed. Nov. 28, 2012 Big Data Buzzwords Big data is one of the, well, biggest trends in IT today, and it has spawned a whole new generation

### WHAT IS BIG DATA? David Bechtold

WHAT IS BIG DATA? David Bechtold Agenda 1. Introduction 2. What is Big Data? 3. Big Data a perspective 4. Characteristic of Big Data Three Vs 5. A Fourth V..? 6. Examples 7. How did we get here?... A historical

### The Next Wave of Data Management. Is Big Data The New Normal?

The Next Wave of Data Management Is Big Data The New Normal? Table of Contents Introduction 3 Separating Reality and Hype 3 Why Are Firms Making IT Investments In Big Data? 4 Trends In Data Management

### Data Representation. Topic 1. Contents. Prerequisite knowledge Before studying this topic you should be able to:

1 Topic 1 Data Representation Contents 1.1 Introduction...................................... 3 1.2 Numbers....................................... 3 1.2.1 The binary number system.........................

### AN INTRO TO DATA MANAGEMENT

AN INTRO TO DATA MANAGEMENT HOW TO USE DATA TO SUCCEED AT YOUR JOB by: TABLE OF CONTENTS Chapter 1: The Rise of Global Data Chapter 2: Taking Advantage of Big Data Chapter 3: Data Projects from Start to

### Lecture 2: Number Representation

Lecture 2: Number Representation CSE 30: Computer Organization and Systems Programming Summer Session II 2011 Dr. Ali Irturk Dept. of Computer Science and Engineering University of California, San Diego

### Clustering Big Data. Anil K. Jain. (with Radha Chitta and Rong Jin) Department of Computer Science Michigan State University November 29, 2012

Clustering Big Data Anil K. Jain (with Radha Chitta and Rong Jin) Department of Computer Science Michigan State University November 29, 2012 Outline Big Data How to extract information? Data clustering

### Gi-Joon Nam, IBM Research - Austin Sani R. Nassif, Radyalis. Opportunities in Power Distribution Network System Optimization (from EDA Perspective)

Gi-Joon Nam, IBM Research - Austin Sani R. Nassif, Radyalis Opportunities in Power Distribution Network System Optimization (from EDA Perspective) Outline! SmartGrid: What it is! Power Distribution Network

### Danny Wang, Ph.D. Vice President of Business Strategy and Risk Management Republic Bank

Danny Wang, Ph.D. Vice President of Business Strategy and Risk Management Republic Bank Agenda» Overview» What is Big Data?» Accelerates advances in computer & technologies» Revolutionizes data measurement»

### Lab 1: Information Representation I -- Number Systems

Unit 1: Computer Systems, pages 1 of 7 - Department of Computer and Mathematical Sciences CS 1410 Intro to Computer Science with C++ 1 Lab 1: Information Representation I -- Number Systems Objectives:

### Lab 1: Information Representation I -- Number Systems

Unit 1: Computer Systems, pages 1 of 7 - Department of Computer and Mathematical Sciences CS 1408 Intro to Computer Science with Visual Basic 1 Lab 1: Information Representation I -- Number Systems Objectives:

### DATA. SMALL WORD. COLOSSAL IMPLICATIONS. 312.756.1760 info@spr.com www.spr.com SPR Consulting

DATA. SMALL WORD. COLOSSAL IMPLICATIONS. 312.756.1760 info@spr.com www.spr.com SPR Consulting Data, not such a small word after all. Truth is, whether we call it big data or just data, the phenomenon of

### Chapter 1. Contrasting traditional and visual analytics approaches

Chapter 1 Understanding Big Data Analytics In This Chapter Defining Big Data Understanding Big Data Analytics Contrasting traditional and visual analytics approaches The era of Big Data is upon us. The

### BIG DATA FUNDAMENTALS

BIG DATA FUNDAMENTALS Timeframe Minimum of 30 hours Use the concepts of volume, velocity, variety, veracity and value to define big data Learning outcomes Critically evaluate the need for big data management

### After studying this lesson, you will gain a clear understanding on, Binary, Octal, Decimal and hexadecimal number systems

After studying this lesson, you will gain a clear understanding on, various number systems Binary, Octal, Decimal and hexadecimal number systems conversions among number systems units used in measuring

### LARGE, DISTRIBUTED COMPUTING INFRASTRUCTURES OPPORTUNITIES & CHALLENGES. Dominique A. Heger Ph.D. DHTechnologies, Data Nubes Austin, TX, USA

LARGE, DISTRIBUTED COMPUTING INFRASTRUCTURES OPPORTUNITIES & CHALLENGES Dominique A. Heger Ph.D. DHTechnologies, Data Nubes Austin, TX, USA Performance & Capacity Studies Availability & Reliability Studies

### Pervasive Location Analytics and A Billion Dollar Opportunity. Jitender Aswani, Portfolio Strategist, Business Analytics, SAP

[ Pervasive Location Analytics and A Billion Dollar Opportunity Jitender Aswani, Portfolio Strategist, Business Analytics, SAP Agenda 1. Big Data & Pervasive Nature of Location-Based Solutions 2. Location

### NewPoint IT Consulting BIG DATA WHITE PAPER. NewPoint Information Technology Consulting

NewPoint IT Consulting BIG DATA WHITE PAPER NewPoint Information Technology Consulting Content 1 Big Data: Challenges and Opportunities for Companies... 3 2 The Technical and Business Drivers of BIG DATA...

### Activity 1: Bits and Bytes

ICS3U (Java): Introduction to Computer Science, Grade 11, University Preparation Activity 1: Bits and Bytes The Binary Number System Computers use electrical circuits that include many transistors and

### So What s the Big Deal?

So What s the Big Deal? Presentation Agenda Introduction What is Big Data? So What is the Big Deal? Big Data Technologies Identifying Big Data Opportunities Conducting a Big Data Proof of Concept Big Data

### International Journal of Advancements in Research & Technology, Volume 3, Issue 5, May-2014 18 ISSN 2278-7763. BIG DATA: A New Technology

International Journal of Advancements in Research & Technology, Volume 3, Issue 5, May-2014 18 BIG DATA: A New Technology Farah DeebaHasan Student, M.Tech.(IT) Anshul Kumar Sharma Student, M.Tech.(IT)

### Data Representation. Data Representation, Storage, and Retrieval. Data Representation. Data Representation. Data Representation. Data Representation

, Storage, and Retrieval ULM/HHIM Summer Program Project 3, Day 3, Part 3 Digital computers convert the data they process into a digital value. Text Audio Images/Graphics Video Digitizing 00000000... 6/8/20

### Computers. Hardware. The Central Processing Unit (CPU) CMPT 125: Lecture 1: Understanding the Computer

Computers CMPT 125: Lecture 1: Understanding the Computer Tamara Smyth, tamaras@cs.sfu.ca School of Computing Science, Simon Fraser University January 3, 2009 A computer performs 2 basic functions: 1.

### BIG DATA & SOCIAL INNOVATION KENNETH THOMAS, CLIENT MANAGER

BIG DATA & SOCIAL INNOVATION KENNETH THOMAS, CLIENT MANAGER 1 MAKING THE RIGHT DECISSION AT THE RIGHT PLACE AT THE RIGHT TIME 2 THE DATA MULTIPLIER EFFECT AT WORK BUSINESS DRIVEN HUMAN DRIVEN MACHINE DRIVEN

### What happens when Big Data and Master Data come together?

What happens when Big Data and Master Data come together? Jeremy Pritchard Master Data Management fgdd 1 What is Master Data? Master data is data that is shared by multiple computer systems. The Information

### Von Social Media zum Social Business Ein Megatrend für die Geschäftswelt

Stephan Schneider Executive Technology Briefer 07/11/2013 Von Social Media zum Social Business Ein Megatrend für die Geschäftswelt Our experiences are changing in the new Social world How I Buy Interacting

### Architectures for massive data management

Architectures for massive data management Apache Kafka, Samza, Storm Albert Bifet albert.bifet@telecom-paristech.fr October 20, 2015 Stream Engine Motivation Digital Universe EMC Digital Universe with

### Introduction to Engineering Using Robotics Experiments Lecture 17 Big Data

Introduction to Engineering Using Robotics Experiments Lecture 17 Big Data Yinong Chen 2 Big Data Big Data Technologies Cloud Computing Service and Web-Based Computing Applications Industry Control Systems

### Right Sizing Big Data for Credit Unions. Filene and CUCC Research Symposium May 3, 2015

Right Sizing Big Data for Credit Unions Filene and CUCC Research Symposium May 3, 2015 Today s Presentation Share preliminary findings from the paper Challenge our thinking and understanding of what data

### 2585-29. Joint ICTP-TWAS School on Coherent State Transforms, Time- Frequency and Time-Scale Analysis, Applications.

2585-29 Joint ICTP-TWAS School on Coherent State Transforms, Time- Frequency and Time-Scale Analysis, Applications 2-20 June 2014 Sparsity for big data C. De Mol ULB, Brussels Belgium Sparsity for Big

Hadoop Big Data for Processing Data and Performing Workload Girish T B 1, Shadik Mohammed Ghouse 2, Dr. B. R. Prasad Babu 3 1 M Tech Student, 2 Assosiate professor, 3 Professor & Head (PG), of Computer

### Outline. mass storage hash functions. logical key values nested tables. storing information between executions using DBM files

Outline 1 Files and Databases mass storage hash functions 2 Dictionaries logical key values nested tables 3 Persistent Data storing information between executions using DBM files 4 Rule Based Programming

### Keynote: Studiedag Statistische Software Vlaamse Hogescholenraad Jeroen Derynck - ICT Director @ iminds

Keynote: Studiedag Statistische Software Vlaamse Hogescholenraad Jeroen Derynck - ICT Director @ iminds We see Information Technology as the driving force of innovation in many sectors of society We are

### 1.3 Data Representation

8628-28 r4 vs.fm Page 9 Thursday, January 2, 2 2:4 PM.3 Data Representation 9 appears at Level 3, uses short mnemonics such as ADD, SUB, and MOV, which are easily translated to the ISA level. Assembly

### Data Storage. 1s and 0s

Data Storage As mentioned, computer science involves the study of algorithms and getting machines to perform them before we dive into the algorithm part, let s study the machines that we use today to do

### Information Security, PII and Big Data

ITU Workshop on ICT Security Standardization for Developing Countries (Geneva, Switzerland, 15-16 September 2014) Information Security, PII and Big Data Edward (Ted) Humphreys ISO/IEC JTC 1/SC 27 (WG1

### BENCHMARKING CLOUD DATABASES CASE STUDY on HBASE, HADOOP and CASSANDRA USING YCSB

BENCHMARKING CLOUD DATABASES CASE STUDY on HBASE, HADOOP and CASSANDRA USING YCSB Planet Size Data!? Gartner s 10 key IT trends for 2012 unstructured data will grow some 80% over the course of the next

### Oracle Big Data for Dummies

Oracle Big Data for Dummies Sai Janakiram Penumuru WW Product Expert Cloud Platforms Hewlett-Packard, India The Father of Microbiology first microbiologist Antonie Philips van Leeuwenhoek 2 Sai Janakiram