What is Data? Also called samples, examples, instances, data points, objects, tuples.

Size: px
Start display at page:

Download "What is Data? Also called samples, examples, instances, data points, objects, tuples."

Transcription

1 What is Data? Data sets are made up of data objects. A data object represents an entity. Also called samples, examples, instances, data points, objects, tuples. Data objects are described by attributes. An attribute is a property or characteristic of an object Examples: eye color of a person, temperature, etc. Attribute is also known as variable, field, characteristic, or feature A collection of attributes describe an object Object is also known as record, point, case, sample, entity, or instance Data Mining 1

2 10 A Data Object Attributes Tid Refund Marital Status Taxable Income 1 Yes Single 125K No 2 No Married 100K No 3 No Single 70K No Cheat Database rows -> data objects; columns -> attributes. Objects 4 Yes Married 120K No 5 No Divorced 95K Yes 6 No Married 60K No 7 Yes Divorced 220K No 8 No Single 85K Yes 9 No Married 75K No 10 No Single 90K Yes Data Mining 2

3 Attributes Attribute (or dimensions, features, variables): a data field, representing a characteristic or feature of a data object. E.g., customer _ID, name, address Attribute values are numbers or symbols assigned to an attribute Distinction between attributes and attribute values Same attribute can be mapped to different attribute values Example: height can be measured in feet or meters Different attributes can be mapped to the same set of values Example: Attribute values for ID and age are integers But properties of attribute values can be different; ID has no limit but age has a maximum and minimum value Data Mining 3

4 Attribute Types There are different types of attributes Nominal: categories, states, or names of things Hair_color = {auburn, black, blond, brown, grey, red, white} marital status, occupation, ID numbers, zip codes Binary Nominal attribute with only 2 states (0 and 1) Ordinal Values have a meaningful order (ranking) but magnitude between successive values is not known. Size = {small, medium, large}, grades, army rankings Data Mining 4

5 Attribute Types: Numeric Attribute Types Quantity (integer or real-valued) Interval Measured on a scale of equal-sized units Values have order E.g., temperature in C or F, calendar dates No true zero-point Ratio Inherent zero-point We can speak of values as being an order of magnitude larger than the unit of measurement (10 K is twice as high as 5 K ). e.g., temperature in Kelvin, length, counts, monetary quantities Data Mining 5

6 Discrete vs. Continuous Attributes Discrete Attribute Has only a finite or countably infinite set of values E.g., zip codes, profession, or the set of words in a collection of documents Sometimes, represented as integer variables Note: Binary attributes are a special case of discrete attributes Continuous Attribute Has real numbers as attribute values E.g., temperature, height, or weight Practically, real values can only be measured and represented using a finite number of digits Continuous attributes are typically represented as floating-point variables Data Mining 6

7 Types of data sets Record Relational records Data matrix, e.g., numerical matrix, crosstabs Document data: text documents: term-frequency vector Transaction data Graph and network World Wide Web Social or information networks Molecular Structures Ordered Video data: sequence of images Temporal data: time-series Sequential Data: transaction sequences Genetic sequence data Spatial, image and multimedia: Spatial data: maps Image data: Video data: Data Mining 7

8 Important Characteristics of Structured Data Dimensionality Curse of dimensionality Sparsity Only presence counts Resolution Patterns depend on the scale Distribution Centrality and dispersion Data Mining 8

9 10 Record Data Data that consists of a collection of records, each of which consists of a fixed set of attributes Tid Refund Marital Status Taxable Income 1 Yes Single 125K No 2 No Married 100K No 3 No Single 70K No 4 Yes Married 120K No Cheat 5 No Divorced 95K Yes 6 No Married 60K No 7 Yes Divorced 220K No 8 No Single 85K Yes 9 No Married 75K No 10 No Single 90K Yes Data Mining 9

10 Data Matrix If data objects have the same fixed set of numeric attributes, then the data objects can be thought of as points in a multi-dimensional space, where each dimension represents a distinct attribute Such data set can be represented by an m by n matrix, where there are m rows, one for each object, and n columns, one for each attribute Projection of x Load Projection of y load Distance Load Thickness Data Mining 10

11 Document Data Each document becomes a `term' vector, each term is a component (attribute) of the vector, the value of each component is the number of times the corresponding term occurs in the document Data Mining 11

12 Transaction Data A special type of record data, where each record (transaction) involves a set of items. For example, consider a grocery store. The set of products purchased by a customer during one shopping trip constitute a transaction, while the individual products that were purchased are the items. TID Items 1 Bread, Coke, Milk 2 Beer, Bread 3 Beer, Coke, Diaper, Milk 4 Beer, Bread, Diaper, Milk 5 Coke, Diaper, Milk Data Mining 12

13 Graph Data Examples: Generic graph and HTML Links <a href="papers/papers.html#bbbb"> Data Mining </a> <li> <a href="papers/papers.html#aaaa"> Graph Partitioning </a> <li> <a href="papers/papers.html#aaaa"> Parallel Solution of Sparse Linear System of Equations </a> <li> <a href="papers/papers.html#ffff"> N-Body Computation and Dense Linear System Solvers 5 Data Mining 13

14 Chemical Data Benzene Molecule: C 6 H 6 Data Mining 14

15 Sequences of transactions Items/Events Ordered Data An element of the sequence Data Mining 15

16 Ordered Data Genomic sequence data GGTTCCGCCTTCAGCCCCGCGCC CGCAGGGCCCGCCCCGCGCCGTC GAGAAGGGCCCGCCTGGCGGGCG GGGGGAGGCGGGGCCGCCCGAGC CCAACCGAGTCCGACCAGGTGCC CCCTCTGCTCGGCCTAGACCTGA GCTCATTAGGCGGCAGCGGACAG GCCAAGTAGAACACGCGAAGCGC TGGGCTGCCTGCTGCGACCAGGG Data Mining 16

Lecture 6 - Data Mining Processes

Lecture 6 - Data Mining Processes Lecture 6 - Data Mining Processes Dr. Songsri Tangsripairoj Dr.Benjarath Pupacdi Faculty of ICT, Mahidol University 1 Cross-Industry Standard Process for Data Mining (CRISP-DM) Example Application: Telephone

More information

Knowledge Discovery and Data Mining. Structured vs. Non-Structured Data

Knowledge Discovery and Data Mining. Structured vs. Non-Structured Data Knowledge Discovery and Data Mining Unit # 2 1 Structured vs. Non-Structured Data Most business databases contain structured data consisting of well-defined fields with numeric or alphanumeric values.

More information

Data Mining 5. Cluster Analysis

Data Mining 5. Cluster Analysis Data Mining 5. Cluster Analysis 5.2 Fall 2009 Instructor: Dr. Masoud Yaghini Outline Data Structures Interval-Valued (Numeric) Variables Binary Variables Categorical Variables Ordinal Variables Variables

More information

Intro to GIS Winter 2011. Data Visualization Part I

Intro to GIS Winter 2011. Data Visualization Part I Intro to GIS Winter 2011 Data Visualization Part I Cartographer Code of Ethics Always have a straightforward agenda and have a defining purpose or goal for each map Always strive to know your audience

More information

Medical Information Management & Mining. You Chen Jan,15, 2013 You.chen@vanderbilt.edu

Medical Information Management & Mining. You Chen Jan,15, 2013 You.chen@vanderbilt.edu Medical Information Management & Mining You Chen Jan,15, 2013 You.chen@vanderbilt.edu 1 Trees Building Materials Trees cannot be used to build a house directly. How can we transform trees to building materials?

More information

Foundations of Artificial Intelligence. Introduction to Data Mining

Foundations of Artificial Intelligence. Introduction to Data Mining Foundations of Artificial Intelligence Introduction to Data Mining Objectives Data Mining Introduce a range of data mining techniques used in AI systems including : Neural networks Decision trees Present

More information

2+2 Just type and press enter and the answer comes up ans = 4

2+2 Just type and press enter and the answer comes up ans = 4 Demonstration Red text = commands entered in the command window Black text = Matlab responses Blue text = comments 2+2 Just type and press enter and the answer comes up 4 sin(4)^2.5728 The elementary functions

More information

Concepts of Variables. Levels of Measurement. The Four Levels of Measurement. Nominal Scale. Greg C Elvers, Ph.D.

Concepts of Variables. Levels of Measurement. The Four Levels of Measurement. Nominal Scale. Greg C Elvers, Ph.D. Concepts of Variables Greg C Elvers, Ph.D. 1 Levels of Measurement When we observe and record a variable, it has characteristics that influence the type of statistical analysis that we can perform on it

More information

Data Mining Apriori Algorithm

Data Mining Apriori Algorithm 10 Data Mining Apriori Algorithm Apriori principle Frequent itemsets generation Association rules generation Section 6 of course book TNM033: Introduction to Data Mining 1 Association Rule Mining (ARM)

More information

Statistics. Measurement. Scales of Measurement 7/18/2012

Statistics. Measurement. Scales of Measurement 7/18/2012 Statistics Measurement Measurement is defined as a set of rules for assigning numbers to represent objects, traits, attributes, or behaviors A variableis something that varies (eye color), a constant does

More information

Lecture 2: Types of Variables

Lecture 2: Types of Variables 2typesofvariables.pdf Michael Hallstone, Ph.D. hallston@hawaii.edu Lecture 2: Types of Variables Recap what we talked about last time Recall how we study social world using populations and samples. Recall

More information

Arithmetic and Algebra of Matrices

Arithmetic and Algebra of Matrices Arithmetic and Algebra of Matrices Math 572: Algebra for Middle School Teachers The University of Montana 1 The Real Numbers 2 Classroom Connection: Systems of Linear Equations 3 Rational Numbers 4 Irrational

More information

Optimization Modeling for Mining Engineers

Optimization Modeling for Mining Engineers Optimization Modeling for Mining Engineers Alexandra M. Newman Division of Economics and Business Slide 1 Colorado School of Mines Seminar Outline Linear Programming Integer Linear Programming Slide 2

More information

2-1 Position, Displacement, and Distance

2-1 Position, Displacement, and Distance 2-1 Position, Displacement, and Distance In describing an object s motion, we should first talk about position where is the object? A position is a vector because it has both a magnitude and a direction:

More information

Data Mining: Introduction. Lecture Notes for Chapter 1. Slides by Tan, Steinbach, Kumar adapted by Michael Hahsler

Data Mining: Introduction. Lecture Notes for Chapter 1. Slides by Tan, Steinbach, Kumar adapted by Michael Hahsler Data Mining: Introduction Lecture Notes for Chapter 1 Slides by Tan, Steinbach, Kumar adapted by Michael Hahsler Why Mine Data? Commercial Viewpoint Lots of data is being collected and warehoused - Web

More information

Chapter 6: Constructing and Interpreting Graphic Displays of Behavioral Data

Chapter 6: Constructing and Interpreting Graphic Displays of Behavioral Data Chapter 6: Constructing and Interpreting Graphic Displays of Behavioral Data Chapter Focus Questions What are the benefits of graphic display and visual analysis of behavioral data? What are the fundamental

More information

Microsoft Azure Machine learning Algorithms

Microsoft Azure Machine learning Algorithms Microsoft Azure Machine learning Algorithms Tomaž KAŠTRUN @tomaz_tsql Tomaz.kastrun@gmail.com http://tomaztsql.wordpress.com Our Sponsors Speaker info https://tomaztsql.wordpress.com Agenda Focus on explanation

More information

Algebra 2 Chapter 1 Vocabulary. identity - A statement that equates two equivalent expressions.

Algebra 2 Chapter 1 Vocabulary. identity - A statement that equates two equivalent expressions. Chapter 1 Vocabulary identity - A statement that equates two equivalent expressions. verbal model- A word equation that represents a real-life problem. algebraic expression - An expression with variables.

More information

Data Storage 3.1. Foundations of Computer Science Cengage Learning

Data Storage 3.1. Foundations of Computer Science Cengage Learning 3 Data Storage 3.1 Foundations of Computer Science Cengage Learning Objectives After studying this chapter, the student should be able to: List five different data types used in a computer. Describe how

More information

IBM SPSS Direct Marketing 23

IBM SPSS Direct Marketing 23 IBM SPSS Direct Marketing 23 Note Before using this information and the product it supports, read the information in Notices on page 25. Product Information This edition applies to version 23, release

More information

Math 215 HW #6 Solutions

Math 215 HW #6 Solutions Math 5 HW #6 Solutions Problem 34 Show that x y is orthogonal to x + y if and only if x = y Proof First, suppose x y is orthogonal to x + y Then since x, y = y, x In other words, = x y, x + y = (x y) T

More information

Quantitative vs. Categorical Data: A Difference Worth Knowing Stephen Few April 2005

Quantitative vs. Categorical Data: A Difference Worth Knowing Stephen Few April 2005 Quantitative vs. Categorical Data: A Difference Worth Knowing Stephen Few April 2005 When you create a graph, you step through a series of choices, including which type of graph you should use and several

More information

IBM SPSS Direct Marketing 22

IBM SPSS Direct Marketing 22 IBM SPSS Direct Marketing 22 Note Before using this information and the product it supports, read the information in Notices on page 25. Product Information This edition applies to version 22, release

More information

Introduction to Artificial Intelligence G51IAI. An Introduction to Data Mining

Introduction to Artificial Intelligence G51IAI. An Introduction to Data Mining Introduction to Artificial Intelligence G51IAI An Introduction to Data Mining Learning Objectives Introduce a range of data mining techniques used in AI systems including : Neural networks Decision trees

More information

CSE4334/5334 Data Mining Lecturer 2: Introduction to Data Mining. Chengkai Li University of Texas at Arlington Spring 2016

CSE4334/5334 Data Mining Lecturer 2: Introduction to Data Mining. Chengkai Li University of Texas at Arlington Spring 2016 CSE4334/5334 Data Mining Lecturer 2: Introduction to Data Mining Chengkai Li University of Texas at Arlington Spring 2016 Big Data http://dilbert.com/strip/2012-07-29 Big Data http://www.ibmbigdatahub.com/infographic/four-vs-big-data

More information

Elementary Statistics

Elementary Statistics Elementary Statistics Chapter 1 Dr. Ghamsary Page 1 Elementary Statistics M. Ghamsary, Ph.D. Chap 01 1 Elementary Statistics Chapter 1 Dr. Ghamsary Page 2 Statistics: Statistics is the science of collecting,

More information

Big Data Text Mining and Visualization. Anton Heijs

Big Data Text Mining and Visualization. Anton Heijs Copyright 2007 by Treparel Information Solutions BV. This report nor any part of it may be copied, circulated, quoted without prior written approval from Treparel7 Treparel Information Solutions BV Delftechpark

More information

FEGYVERNEKI SÁNDOR, PROBABILITY THEORY AND MATHEmATICAL

FEGYVERNEKI SÁNDOR, PROBABILITY THEORY AND MATHEmATICAL FEGYVERNEKI SÁNDOR, PROBABILITY THEORY AND MATHEmATICAL STATIsTICs 4 IV. RANDOm VECTORs 1. JOINTLY DIsTRIBUTED RANDOm VARIABLEs If are two rom variables defined on the same sample space we define the joint

More information

Exploratory Data Analysis

Exploratory Data Analysis Exploratory Data Analysis Johannes Schauer johannes.schauer@tugraz.at Institute of Statistics Graz University of Technology Steyrergasse 17/IV, 8010 Graz www.statistics.tugraz.at February 12, 2008 Introduction

More information

In this presentation, you will be introduced to data mining and the relationship with meaningful use.

In this presentation, you will be introduced to data mining and the relationship with meaningful use. In this presentation, you will be introduced to data mining and the relationship with meaningful use. Data mining refers to the art and science of intelligent data analysis. It is the application of machine

More information

Data Mining: Introduction

Data Mining: Introduction Data Mining: Introduction Introducing the course How the course is organized How students are evaluated Deadlines Data Mining [Chapt. 1 of course book] What is it about? The KDD process Relations to other

More information

Introduction to Data Mining

Introduction to Data Mining Introduction to Data Mining 1 Why Data Mining? Explosive Growth of Data Data collection and data availability Automated data collection tools, Internet, smartphones, Major sources of abundant data Business:

More information

Business Statistics: Intorduction

Business Statistics: Intorduction Business Statistics: Intorduction Donglei Du (ddu@unb.edu) Faculty of Business Administration, University of New Brunswick, NB Canada Fredericton E3B 9Y2 September 23, 2015 Donglei Du (UNB) AlgoTrading

More information

Data Storage. Chapter 3. Objectives. 3-1 Data Types. Data Inside the Computer. After studying this chapter, students should be able to:

Data Storage. Chapter 3. Objectives. 3-1 Data Types. Data Inside the Computer. After studying this chapter, students should be able to: Chapter 3 Data Storage Objectives After studying this chapter, students should be able to: List five different data types used in a computer. Describe how integers are stored in a computer. Describe how

More information

Chapter 1: The Nature of Probability and Statistics

Chapter 1: The Nature of Probability and Statistics Chapter 1: The Nature of Probability and Statistics Learning Objectives Upon successful completion of Chapter 1, you will have applicable knowledge of the following concepts: Statistics: An Overview and

More information

SOST 201 September 18-20, 2006. Measurement of Variables 2

SOST 201 September 18-20, 2006. Measurement of Variables 2 1 Social Studies 201 September 18-20, 2006 Measurement of variables See text, chapter 3, pp. 61-86. These notes and Chapter 3 of the text examine ways of measuring variables in order to describe members

More information

Physics Notes Class 11 CHAPTER 3 MOTION IN A STRAIGHT LINE

Physics Notes Class 11 CHAPTER 3 MOTION IN A STRAIGHT LINE 1 P a g e Motion Physics Notes Class 11 CHAPTER 3 MOTION IN A STRAIGHT LINE If an object changes its position with respect to its surroundings with time, then it is called in motion. Rest If an object

More information

MAT 242 Test 2 SOLUTIONS, FORM T

MAT 242 Test 2 SOLUTIONS, FORM T MAT 242 Test 2 SOLUTIONS, FORM T 5 3 5 3 3 3 3. Let v =, v 5 2 =, v 3 =, and v 5 4 =. 3 3 7 3 a. [ points] The set { v, v 2, v 3, v 4 } is linearly dependent. Find a nontrivial linear combination of these

More information

In order to describe motion you need to describe the following properties.

In order to describe motion you need to describe the following properties. Chapter 2 One Dimensional Kinematics How would you describe the following motion? Ex: random 1-D path speeding up and slowing down In order to describe motion you need to describe the following properties.

More information

Visualization Software

Visualization Software Visualization Software Maneesh Agrawala CS 294-10: Visualization Fall 2007 Assignment 1b: Deconstruction & Redesign Due before class on Sep 12, 2007 1 Assignment 2: Creating Visualizations Use existing

More information

Largest Fixed-Aspect, Axis-Aligned Rectangle

Largest Fixed-Aspect, Axis-Aligned Rectangle Largest Fixed-Aspect, Axis-Aligned Rectangle David Eberly Geometric Tools, LLC http://www.geometrictools.com/ Copyright c 1998-2016. All Rights Reserved. Created: February 21, 2004 Last Modified: February

More information

Data Exploration Data Visualization

Data Exploration Data Visualization Data Exploration Data Visualization What is data exploration? A preliminary exploration of the data to better understand its characteristics. Key motivations of data exploration include Helping to select

More information

Algebra 1 2008. Academic Content Standards Grade Eight and Grade Nine Ohio. Grade Eight. Number, Number Sense and Operations Standard

Algebra 1 2008. Academic Content Standards Grade Eight and Grade Nine Ohio. Grade Eight. Number, Number Sense and Operations Standard Academic Content Standards Grade Eight and Grade Nine Ohio Algebra 1 2008 Grade Eight STANDARDS Number, Number Sense and Operations Standard Number and Number Systems 1. Use scientific notation to express

More information

1 Determinants and the Solvability of Linear Systems

1 Determinants and the Solvability of Linear Systems 1 Determinants and the Solvability of Linear Systems In the last section we learned how to use Gaussian elimination to solve linear systems of n equations in n unknowns The section completely side-stepped

More information

Data Mining: An Overview. David Madigan http://www.stat.columbia.edu/~madigan

Data Mining: An Overview. David Madigan http://www.stat.columbia.edu/~madigan Data Mining: An Overview David Madigan http://www.stat.columbia.edu/~madigan Overview Brief Introduction to Data Mining Data Mining Algorithms Specific Eamples Algorithms: Disease Clusters Algorithms:

More information

Statistical research is always concerned with a group of research objects, called population or universe (populaatio/perusjoukko).

Statistical research is always concerned with a group of research objects, called population or universe (populaatio/perusjoukko). 2. Data and Measurement 2.1. Basic Concepts Statistical research is always concerned with a group of research objects, called population or universe (populaatio/perusjoukko). Determining the bounds of

More information

APPM4720/5720: Fast algorithms for big data. Gunnar Martinsson The University of Colorado at Boulder

APPM4720/5720: Fast algorithms for big data. Gunnar Martinsson The University of Colorado at Boulder APPM4720/5720: Fast algorithms for big data Gunnar Martinsson The University of Colorado at Boulder Course objectives: The purpose of this course is to teach efficient algorithms for processing very large

More information

Data Mining: Data Preprocessing. I211: Information infrastructure II

Data Mining: Data Preprocessing. I211: Information infrastructure II Data Mining: Data Preprocessing I211: Information infrastructure II 10 What is Data? Collection of data objects and their attributes Attributes An attribute is a property or characteristic of an object

More information

Framing Business Problems as Data Mining Problems

Framing Business Problems as Data Mining Problems Framing Business Problems as Data Mining Problems Asoka Diggs Data Scientist, Intel IT January 21, 2016 Legal Notices This presentation is for informational purposes only. INTEL MAKES NO WARRANTIES, EXPRESS

More information

Chapter 4 One Dimensional Kinematics

Chapter 4 One Dimensional Kinematics Chapter 4 One Dimensional Kinematics 41 Introduction 1 4 Position, Time Interval, Displacement 41 Position 4 Time Interval 43 Displacement 43 Velocity 3 431 Average Velocity 3 433 Instantaneous Velocity

More information

Measurement and Measurement Scales

Measurement and Measurement Scales Measurement and Measurement Scales Measurement is the foundation of any scientific investigation Everything we do begins with the measurement of whatever it is we want to study Definition: measurement

More information

Data, Measurements, Features

Data, Measurements, Features Data, Measurements, Features Middle East Technical University Dep. of Computer Engineering 2009 compiled by V. Atalay What do you think of when someone says Data? We might abstract the idea that data are

More information

6. Cholesky factorization

6. Cholesky factorization 6. Cholesky factorization EE103 (Fall 2011-12) triangular matrices forward and backward substitution the Cholesky factorization solving Ax = b with A positive definite inverse of a positive definite matrix

More information

Descriptive Statistics and Measurement Scales

Descriptive Statistics and Measurement Scales Descriptive Statistics 1 Descriptive Statistics and Measurement Scales Descriptive statistics are used to describe the basic features of the data in a study. They provide simple summaries about the sample

More information

MAT 200, Midterm Exam Solution. a. (5 points) Compute the determinant of the matrix A =

MAT 200, Midterm Exam Solution. a. (5 points) Compute the determinant of the matrix A = MAT 200, Midterm Exam Solution. (0 points total) a. (5 points) Compute the determinant of the matrix 2 2 0 A = 0 3 0 3 0 Answer: det A = 3. The most efficient way is to develop the determinant along the

More information

Name: Section Registered In:

Name: Section Registered In: Name: Section Registered In: Math 125 Exam 3 Version 1 April 24, 2006 60 total points possible 1. (5pts) Use Cramer s Rule to solve 3x + 4y = 30 x 2y = 8. Be sure to show enough detail that shows you are

More information

ALGEBRA. sequence, term, nth term, consecutive, rule, relationship, generate, predict, continue increase, decrease finite, infinite

ALGEBRA. sequence, term, nth term, consecutive, rule, relationship, generate, predict, continue increase, decrease finite, infinite ALGEBRA Pupils should be taught to: Generate and describe sequences As outcomes, Year 7 pupils should, for example: Use, read and write, spelling correctly: sequence, term, nth term, consecutive, rule,

More information

CHAPTER 3: DIGITAL IMAGING IN DIAGNOSTIC RADIOLOGY. 3.1 Basic Concepts of Digital Imaging

CHAPTER 3: DIGITAL IMAGING IN DIAGNOSTIC RADIOLOGY. 3.1 Basic Concepts of Digital Imaging Physics of Medical X-Ray Imaging (1) Chapter 3 CHAPTER 3: DIGITAL IMAGING IN DIAGNOSTIC RADIOLOGY 3.1 Basic Concepts of Digital Imaging Unlike conventional radiography that generates images on film through

More information

Statistics Review PSY379

Statistics Review PSY379 Statistics Review PSY379 Basic concepts Measurement scales Populations vs. samples Continuous vs. discrete variable Independent vs. dependent variable Descriptive vs. inferential stats Common analyses

More information

Introduction to Data Mining. Lijun Zhang zlj@nju.edu.cn http://cs.nju.edu.cn/zlj

Introduction to Data Mining. Lijun Zhang zlj@nju.edu.cn http://cs.nju.edu.cn/zlj Introduction to Data Mining Lijun Zhang zlj@nju.edu.cn http://cs.nju.edu.cn/zlj Outline Overview Introduction The Data Mining Process The Basic Data Types The Major Building Blocks Scalability and Streaming

More information

APPLICATIONS AND MODELING WITH QUADRATIC EQUATIONS

APPLICATIONS AND MODELING WITH QUADRATIC EQUATIONS APPLICATIONS AND MODELING WITH QUADRATIC EQUATIONS Now that we are starting to feel comfortable with the factoring process, the question becomes what do we use factoring to do? There are a variety of classic

More information

System Identification for Acoustic Comms.:

System Identification for Acoustic Comms.: System Identification for Acoustic Comms.: New Insights and Approaches for Tracking Sparse and Rapidly Fluctuating Channels Weichang Li and James Preisig Woods Hole Oceanographic Institution The demodulation

More information

Algebra 1 Course Information

Algebra 1 Course Information Course Information Course Description: Students will study patterns, relations, and functions, and focus on the use of mathematical models to understand and analyze quantitative relationships. Through

More information

Schedule of Accreditation issued by United Kingdom Accreditation Service 2 Pine Trees, Chertsey Lane, Staines-upon-Thames, TW18 3HR, UK

Schedule of Accreditation issued by United Kingdom Accreditation Service 2 Pine Trees, Chertsey Lane, Staines-upon-Thames, TW18 3HR, UK 2 Pine Trees, Chertsey Lane, Staines-upon-Thames, TW18 3HR, UK Unit 7 Contact: Mr S C Sparks Solent Industrial Estate Tel: +44 (0)1489 790296 Hedge End Fax: +44 (0)1489 790294 Southampton E-Mail: info@southcal.co.uk

More information

Math Content by Strand 1

Math Content by Strand 1 Patterns, Functions, and Change Math Content by Strand 1 Kindergarten Kindergarten students construct, describe, extend, and determine what comes next in repeating patterns. To identify and construct repeating

More information

Logistic Regression (1/24/13)

Logistic Regression (1/24/13) STA63/CBB540: Statistical methods in computational biology Logistic Regression (/24/3) Lecturer: Barbara Engelhardt Scribe: Dinesh Manandhar Introduction Logistic regression is model for regression used

More information

Introduction of Information Visualization and Visual Analytics. Chapter 4. Data Mining

Introduction of Information Visualization and Visual Analytics. Chapter 4. Data Mining Introduction of Information Visualization and Visual Analytics Chapter 4 Data Mining Books! P. N. Tan, M. Steinbach, V. Kumar: Introduction to Data Mining. First Edition, ISBN-13: 978-0321321367, 2005.

More information

Social Media Mining. Data Mining Essentials

Social Media Mining. Data Mining Essentials Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers

More information

6 Scalar, Stochastic, Discrete Dynamic Systems

6 Scalar, Stochastic, Discrete Dynamic Systems 47 6 Scalar, Stochastic, Discrete Dynamic Systems Consider modeling a population of sand-hill cranes in year n by the first-order, deterministic recurrence equation y(n + 1) = Ry(n) where R = 1 + r = 1

More information

Math 202-0 Quizzes Winter 2009

Math 202-0 Quizzes Winter 2009 Quiz : Basic Probability Ten Scrabble tiles are placed in a bag Four of the tiles have the letter printed on them, and there are two tiles each with the letters B, C and D on them (a) Suppose one tile

More information

with functions, expressions and equations which follow in units 3 and 4.

with functions, expressions and equations which follow in units 3 and 4. Grade 8 Overview View unit yearlong overview here The unit design was created in line with the areas of focus for grade 8 Mathematics as identified by the Common Core State Standards and the PARCC Model

More information

IBM SPSS Direct Marketing 19

IBM SPSS Direct Marketing 19 IBM SPSS Direct Marketing 19 Note: Before using this information and the product it supports, read the general information under Notices on p. 105. This document contains proprietary information of SPSS

More information

Answer Key for California State Standards: Algebra I

Answer Key for California State Standards: Algebra I Algebra I: Symbolic reasoning and calculations with symbols are central in algebra. Through the study of algebra, a student develops an understanding of the symbolic language of mathematics and the sciences.

More information

Data Mining: Exploring Data. Lecture Notes for Chapter 3. Introduction to Data Mining

Data Mining: Exploring Data. Lecture Notes for Chapter 3. Introduction to Data Mining Data Mining: Exploring Data Lecture Notes for Chapter 3 Introduction to Data Mining by Tan, Steinbach, Kumar What is data exploration? A preliminary exploration of the data to better understand its characteristics.

More information

Data Mining: Exploring Data. Lecture Notes for Chapter 3. Introduction to Data Mining

Data Mining: Exploring Data. Lecture Notes for Chapter 3. Introduction to Data Mining Data Mining: Exploring Data Lecture Notes for Chapter 3 Introduction to Data Mining by Tan, Steinbach, Kumar Tan,Steinbach, Kumar Introduction to Data Mining 8/05/2005 1 What is data exploration? A preliminary

More information

A Quick Guide to Constructing an SPSS Code Book Prepared by Amina Jabbar, Centre for Research on Inner City Health

A Quick Guide to Constructing an SPSS Code Book Prepared by Amina Jabbar, Centre for Research on Inner City Health A Quick Guide to Constructing an SPSS Code Book Prepared by Amina Jabbar, Centre for Research on Inner City Health 1. To begin, double click on SPSS icon. The icon will probably look something like this

More information

Ratios and Proportional Relationships Set 1: Ratio, Proportion, and Scale... 1 Set 2: Proportional Relationships... 8

Ratios and Proportional Relationships Set 1: Ratio, Proportion, and Scale... 1 Set 2: Proportional Relationships... 8 Table of Contents Introduction...v Standards Correlations... viii Materials List... x... 1 Set 2: Proportional Relationships... 8 The Number System Set 1: Integers and Absolute Value... 15 Set 2: Comparing

More information

Implementing GIS in Optical Fiber. Communication

Implementing GIS in Optical Fiber. Communication KING FAHD UNIVERSITY OF PETROLEUM AND MINERALS COLLEGE OF ENVIRONMENTAL DESIGN CITY & RIGINAL PLANNING DEPARTMENT TERM ROJECT Implementing GIS in Optical Fiber Communication By Ahmed Saeed Bagazi ID# 201102590

More information

System Modeling Introduction Rugby Meta-Model Finite State Machines Petri Nets Untimed Model of Computation Synchronous Model of Computation Timed Model of Computation Integration of Computational Models

More information

Digital Versus Analog Lesson 2 of 2

Digital Versus Analog Lesson 2 of 2 Digital Versus Analog Lesson 2 of 2 HDTV Grade Level: 9-12 Subject(s): Science, Technology Prep Time: < 10 minutes Activity Duration: 50 minutes Materials Category: General classroom National Education

More information

DATA WAREHOUSE E KNOWLEDGE DISCOVERY

DATA WAREHOUSE E KNOWLEDGE DISCOVERY DATA WAREHOUSE E KNOWLEDGE DISCOVERY Prof. Fabio A. Schreiber Dipartimento di Elettronica e Informazione Politecnico di Milano DATA WAREHOUSE (DW) A TECHNIQUE FOR CORRECTLY ASSEMBLING AND MANAGING DATA

More information

How To Understand The Scientific Theory Of Evolution

How To Understand The Scientific Theory Of Evolution Introduction to Statistics for the Life Sciences Fall 2014 Volunteer Definition A biased sample systematically overestimates or underestimates a characteristic of the population Paid subjects for drug

More information

EXPLORING & MODELING USING INTERACTIVE DECISION TREES IN SAS ENTERPRISE MINER. Copyr i g ht 2013, SAS Ins titut e Inc. All rights res er ve d.

EXPLORING & MODELING USING INTERACTIVE DECISION TREES IN SAS ENTERPRISE MINER. Copyr i g ht 2013, SAS Ins titut e Inc. All rights res er ve d. EXPLORING & MODELING USING INTERACTIVE DECISION TREES IN SAS ENTERPRISE MINER ANALYTICS LIFECYCLE Evaluate & Monitor Model Formulate Problem Data Preparation Deploy Model Data Exploration Validate Models

More information

Understanding Raster Data

Understanding Raster Data Introduction The following document is intended to provide a basic understanding of raster data. Raster data layers (commonly referred to as grids) are the essential data layers used in all tools developed

More information

Principles of Data Mining

Principles of Data Mining Principles of Data Mining Instructor: Sargur N. 1 University at Buffalo The State University of New York srihari@cedar.buffalo.edu Introduction: Topics 1. Introduction to Data Mining 2. Nature of Data

More information

GE Medical Systems Training in Partnership. Module 8: IQ: Acquisition Time

GE Medical Systems Training in Partnership. Module 8: IQ: Acquisition Time Module 8: IQ: Acquisition Time IQ : Acquisition Time Objectives...Describe types of data acquisition modes....compute acquisition times for 2D and 3D scans. 2D Acquisitions The 2D mode acquires and reconstructs

More information

South Carolina College- and Career-Ready (SCCCR) Pre-Calculus

South Carolina College- and Career-Ready (SCCCR) Pre-Calculus South Carolina College- and Career-Ready (SCCCR) Pre-Calculus Key Concepts Arithmetic with Polynomials and Rational Expressions PC.AAPR.2 PC.AAPR.3 PC.AAPR.4 PC.AAPR.5 PC.AAPR.6 PC.AAPR.7 Standards Know

More information

Example SECTION 13-1. X-AXIS - the horizontal number line. Y-AXIS - the vertical number line ORIGIN - the point where the x-axis and y-axis cross

Example SECTION 13-1. X-AXIS - the horizontal number line. Y-AXIS - the vertical number line ORIGIN - the point where the x-axis and y-axis cross CHAPTER 13 SECTION 13-1 Geometry and Algebra The Distance Formula COORDINATE PLANE consists of two perpendicular number lines, dividing the plane into four regions called quadrants X-AXIS - the horizontal

More information

COM CO P 5318 Da t Da a t Explora Explor t a ion and Analysis y Chapte Chapt r e 3

COM CO P 5318 Da t Da a t Explora Explor t a ion and Analysis y Chapte Chapt r e 3 COMP 5318 Data Exploration and Analysis Chapter 3 What is data exploration? A preliminary exploration of the data to better understand its characteristics. Key motivations of data exploration include Helping

More information

Representing Geography

Representing Geography 3 Representing Geography OVERVIEW This chapter introduces the concept of representation, or the construction of a digital model of some aspect of the Earth s surface. The geographic world is extremely

More information

Digital Imaging and Multimedia. Filters. Ahmed Elgammal Dept. of Computer Science Rutgers University

Digital Imaging and Multimedia. Filters. Ahmed Elgammal Dept. of Computer Science Rutgers University Digital Imaging and Multimedia Filters Ahmed Elgammal Dept. of Computer Science Rutgers University Outlines What are Filters Linear Filters Convolution operation Properties of Linear Filters Application

More information

Vectors. Objectives. Assessment. Assessment. Equations. Physics terms 5/15/14. State the definition and give examples of vector and scalar variables.

Vectors. Objectives. Assessment. Assessment. Equations. Physics terms 5/15/14. State the definition and give examples of vector and scalar variables. Vectors Objectives State the definition and give examples of vector and scalar variables. Analyze and describe position and movement in two dimensions using graphs and Cartesian coordinates. Organize and

More information

Core Components Data Type Catalogue Version 3.1 17 October 2011

Core Components Data Type Catalogue Version 3.1 17 October 2011 Core Components Data Type Catalogue Version 3.1 17 October 2011 Core Components Data Type Catalogue Version 3.1 Page 1 of 121 Abstract CCTS 3.0 defines the rules for developing Core Data Types and Business

More information

Measurement. How are variables measured?

Measurement. How are variables measured? Measurement Y520 Strategies for Educational Inquiry Robert S Michael Measurement-1 How are variables measured? First, variables are defined by conceptual definitions (constructs) that explain the concept

More information

NOTES ON LINEAR TRANSFORMATIONS

NOTES ON LINEAR TRANSFORMATIONS NOTES ON LINEAR TRANSFORMATIONS Definition 1. Let V and W be vector spaces. A function T : V W is a linear transformation from V to W if the following two properties hold. i T v + v = T v + T v for all

More information

Module 2: Introduction to Quantitative Data Analysis

Module 2: Introduction to Quantitative Data Analysis Module 2: Introduction to Quantitative Data Analysis Contents Antony Fielding 1 University of Birmingham & Centre for Multilevel Modelling Rebecca Pillinger Centre for Multilevel Modelling Introduction...

More information

Foundation of Quantitative Data Analysis

Foundation of Quantitative Data Analysis Foundation of Quantitative Data Analysis Part 1: Data manipulation and descriptive statistics with SPSS/Excel HSRS #10 - October 17, 2013 Reference : A. Aczel, Complete Business Statistics. Chapters 1

More information

Mathematics Course 111: Algebra I Part IV: Vector Spaces

Mathematics Course 111: Algebra I Part IV: Vector Spaces Mathematics Course 111: Algebra I Part IV: Vector Spaces D. R. Wilkins Academic Year 1996-7 9 Vector Spaces A vector space over some field K is an algebraic structure consisting of a set V on which are

More information

Solving Systems of Linear Equations Using Matrices

Solving Systems of Linear Equations Using Matrices Solving Systems of Linear Equations Using Matrices What is a Matrix? A matrix is a compact grid or array of numbers. It can be created from a system of equations and used to solve the system of equations.

More information

Jim Lambers MAT 169 Fall Semester 2009-10 Lecture 25 Notes

Jim Lambers MAT 169 Fall Semester 2009-10 Lecture 25 Notes Jim Lambers MAT 169 Fall Semester 009-10 Lecture 5 Notes These notes correspond to Section 10.5 in the text. Equations of Lines A line can be viewed, conceptually, as the set of all points in space that

More information