Foundations of Data Science 1


 Clarence Hancock
 2 years ago
 Views:
Transcription
1 Foundations of Data Science John Hopcroft Ravindran Kannan Version 2/8/204 These notes are a first draft of a book being written by Hopcroft and Kannan and in many places are incomplete. However, the notes are in good enough shape to prepare lectures for a modern theoretical course in computer science. Please do not put solutions to exercises online as it is important for students to work out solutions for themselves rather than copy them from the internet. Thanks JEH Copyright 20. All rights reserved
2 Contents Introduction 7 2 HighDimensional Space 0 2. Properties of HighDimensional Space The Law of Large Numbers The HighDimensional Sphere The Sphere and the Cube in High Dimensions Volume and Surface Area of the Unit Sphere The Volume is Near the Equator The Volume is in a Narrow Annulus The Surface Area is Near the Equator Volumes of Other Solids Generating Points Uniformly at Random on the Surface of a Sphere Gaussians in High Dimension Bounds on Tail Probability Applications of the tail bound Random Projection and JohnsonLindenstrauss Theorem Bibliographic Notes Exercises BestFit Subspaces and Singular Value Decomposition (SVD) Singular Vectors Singular Value Decomposition (SVD) Best Rank k Approximations Left Singular Vectors Power Method for Computing the Singular Value Decomposition Applications of Singular Value Decomposition Principal Component Analysis Clustering a Mixture of Spherical Gaussians Spectral Decomposition Singular Vectors and Ranking Documents An Application of SVD to a Discrete Optimization Problem Singular Vectors and Eigenvectors Bibliographic Notes Exercises Random Graphs The G(n, p) Model Degree Distribution Existence of Triangles in G(n, d/n) Phase Transitions The Giant Component
3 4.4 Branching Processes Cycles and Full Connectivity Emergence of Cycles Full Connectivity Threshold for O(ln n) Diameter Phase Transitions for Increasing Properties Phase Transitions for CNFsat Nonuniform and Growth Models of Random Graphs Nonuniform Models Giant Component in Random Graphs with Given Degree Distribution Growth Models Growth Model Without Preferential Attachment Growth Model With Preferential Attachment Small World Graphs Bibliographic Notes Exercises Random Walks and Markov Chains Stationary Distribution Electrical Networks and Random Walks Random Walks on Undirected Graphs with Unit Edge Weights Random Walks in Euclidean Space The Web as a Markov Chain Markov Chain Monte Carlo MetropolisHasting Algorithm Gibbs Sampling Areas and Volumes Convergence of Random Walks on Undirected Graphs Using Normalized Conductance to Prove Convergence Bibliographic Notes Exercises Learning and VCdimension Learning Linear Separators, the Perceptron Algorithm, and Margins Nonlinear Separators, Support Vector Machines, and Kernels Strong and Weak Learning  Boosting Number of Examples Needed for Prediction: VCDimension VapnikChervonenkis or VCDimension Examples of Set Systems and Their VCDimension The Shatter Function Shatter Function for Set Systems of Bounded VCDimension Intersection Systems
4 6.7 The VC Theorem Simple Learning Bibliographic Notes Exercises Algorithms for Massive Data Problems Frequency Moments of Data Streams Number of Distinct Elements in a Data Stream Counting the Number of Occurrences of a Given Element Counting Frequent Elements The Second Moment Matrix Algorithms Using Sampling Matrix Multiplication Using Sampling Sketch of a Large Matrix Sketches of Documents Exercises Clustering Some Clustering Examples A kmeans Clustering Algorithm A Greedy Algorithm for kcenter Criterion Clustering Spectral Clustering Recursive Clustering Based on Sparse Cuts Kernel Methods Agglomerative Clustering Dense Submatrices and Communities Flow Methods Finding a Local Cluster Without Examining the Whole Graph Axioms for Clustering An Impossibility Result A Satisfiable Set of Axioms Exercises Topic Models, Hidden Markov Process, Graphical Models, and Belief Propagation Topic Models Hidden Markov Model Graphical Models, and Belief Propagation Bayesian or Belief Networks Markov Random Fields Factor Graphs Tree Algorithms Message Passing in general Graphs Graphs with a Single Cycle
5 9.0 Belief Update in Networks with a Single Loop Maximum Weight Matching Warning Propagation Correlation Between Variables Exercises Other Topics Rankings Hare System for Voting Compressed Sensing and Sparse Vectors Unique Reconstruction of a Sparse Vector The Exact Reconstruction Property Restricted Isometry Property Applications Sparse Vector in Some Coordinate Basis A Representation Cannot be Sparse in Both Time and Frequency Domains Biological Finding Overlapping Cliques or Communities Low Rank Matrices Gradient Linear Programming The Ellipsoid Algorithm Integer Optimization SemiDefinite Programming Exercises Appendix 357. Asymptotic Notation Useful relations Useful Inequalities Probability Sample Space, Events, Independence Linearity of Expectation Union Bound Indicator Variables Variance Variance of the Sum of Independent Random Variables Median The Central Limit Theorem Probability Distributions Bayes Rule and Estimators Tail Bounds and Chernoff inequalities
6 .5 Eigenvalues and Eigenvectors Eigenvalues and Eigenvectors Symmetric Matrices Relationship between SVD and Eigen Decomposition Extremal Properties of Eigenvalues Eigenvalues of the Sum of Two Symmetric Matrices Norms Important Norms and Their Properties Linear Algebra Distance between subspaces Generating Functions Generating Functions for Sequences Defined by Recurrence Relationships The Exponential Generating Function and the Moment Generating Function Miscellaneous Lagrange multipliers Finite Fields Hash Functions Application of Mean Value Theorem Sperner s Lemma Prüfer Exercises Index 40 6
7 Foundations of Data Science John Hopcroft and Ravindran Kannan 2/8/204 Introduction Computer science as an academic discipline began in the 60 s. Emphasis was on programming languages, compilers, operating systems, and the mathematical theory that supported these areas. Courses in theoretical computer science covered finite automata, regular expressions, context free languages, and computability. In the 70 s, algorithms was added as an important component of theory. The emphasis was on making computers useful. Today, a fundamental change is taking place and the focus is more on applications. There are many reasons for this change. The merging of computing and communications has played an important role. The enhanced ability to observe, collect and store data in the natural sciences, in commerce, and in other fields calls for a change in our understanding of data and how to handle it in the modern setting. The emergence of the web and social networks, which are by far the largest such structures, presents both opportunities and challenges for theory. While traditional areas of computer science are still important and highly skilled individuals are needed in these areas, the majority of researchers will be involved with using computers to understand and make usable massive data arising in applications, not just how to make computers useful on specific welldefined problems. With this in mind we have written this book to cover the theory likely to be useful in the next 40 years, just as automata theory, algorithms and related topics gave students an advantage in the last 40 years. One of the major changes is the switch from discrete mathematics to more of an emphasis on probability, statistics, and numerical methods. Early drafts of the book have been used for both undergraduate and graduate courses. Background material needed for an undergraduate course has been put in the appendix. For this reason, the appendix has homework problems. This book starts with the treatment of high dimensional geometry. Modern data in diverse fields such as Information Processing, Search, Machine Learning, etc., is often Copyright 20. All rights reserved 7
8 represented advantageously as vectors with a large number of components. This is so even in cases when the vector representation is not the natural first choice. Our intuition from two or three dimensional space can be surprisingly off the mark when it comes to high dimensional space. Chapter 2 works out the fundamentals needed to understand the differences. The emphasis of the chapter, as well as the book in general, is to get across the mathematical foundations rather than dwell on particular applications that are only briefly described. The mathematical areas most relevant to dealing with highdimensional data are matrix algebra and algorithms. We focus on singular value decomposition, a central tool in this area. Chapter 4 gives a fromfirstprinciples description of this. Applications of singular value decomposition include principal component analysis, a widely used technique which we touch upon, as well as modern applications to statistical mixtures of probability densities, discrete optimization, etc., which are described in more detail. Central to our understanding of large structures, like the web and social networks, is building models to capture essential properties of these structures. The simplest model is that of a random graph formulated by Erdös and Renyi, which we study in detail proving that certain global phenomena, like a giant connected component, arise in such structures with only local choices. We also describe other models of random graphs. One of the surprises of computer science over the last two decades is that some domainindependent methods have been immensely successful in tackling problems from diverse areas. Machine learning is a striking example. We describe the foundations of machine learning, both learning from given training examples, as well as the theory of Vapnik Chervonenkis dimension, which tells us how many training examples suffice for learning. Another important domainindependent technique is based on Markov chains. The underlying mathematical theory, as well as the connections to electrical networks, forms the core of our chapter on Markov chains. The field of algorithms has traditionally assumed that the input data to a problem is presented in random access memory, which the algorithm can repeatedly access. This is not feasible for modern problems. The streaming model and other models have been formulated to better reflect this. In this setting, sampling plays a crucial role and, indeed, we have to sample on the fly. in Chapter?? we study how to draw good samples efficiently and how to estimate statistical, as well as linear algebra quantities, with such samples. One of the most important tools in the modern toolkit is clustering, dividing data into groups of similar objects. After describing some of the basic methods for clustering, such as the kmeans algorithm, we focus on modern developments in understanding these, as well as newer algorithms. The chapter ends with a study of clustering criteria. This book also covers graphical models and belief propagation, ranking and voting, 8
9 sparse vectors, and compressed sensing. The appendix includes a wealth of background material. A word about notation in the book. To help the student, we have adopted certain notations, and with a few exceptions, adhered to them. We use lower case letters for scaler variables and functions, bold face lower case for vectors, and upper case letters for matrices. Lower case near the beginning of the alphabet tend to be constants, in the middle of the alphabet, such as i, j, and k, are indices in summations, n and m for integer sizes, and x, y and z for variables. Where the literature traditionally uses a symbol for a quantity, we also used that symbol, even if it meant abandoning our convention. If we have a set of points in some vector space, and work with a subspace, we use n for the number of points, d for the dimension of the space, and k for the dimension of the subspace. The term almost surely means with probability one. We use ln n for the natural logarithm and log n for the base two logarithm. If we want base ten, we will use log 0. To simplify notation and to make it easier to read we use E 2 ( x) for ( E( x) ) 2 and E( x) 2 for E ( ( x) 2). 9
10 2 HighDimensional Space In many applications data is in the form of vectors. In other applications, data is not in the form of vectors, but could be usefully represented by vectors. The Vector Space Model [SWY75] is a good example. In the vector space model, a document is represented by a vector, each component of which corresponds to the number of occurrences of a particular term in the document. The English language has on the order of 25,000 words or terms, so each document is represented by a 25,000 dimensional vector. A collection of n documents is represented by a collection of n vectors, one vector per document. The vectors may be arranged as columns of a 25, 000 n matrix. See Figure 2.. A query is also represented by a vector in the same space. The component of the vector corresponding to a term in the query, specifies the importance of the term to the query. To find documents about cars that are not race cars, a query vector will have a large positive component for the word car and also for the words engine and perhaps door, and a negative component for the words race, betting, etc. One needs a measure of relevance or similarity of a query to a document. The dot product or cosine of the angle between the two vectors is an often used measure of similarity. To respond to a query, one computes the dot product or the cosine of the angle between the query vector and each document vector and returns the documents with the highest values of these quantities. While it is by no means clear that this approach will do well for the information retrieval problem, many empirical studies have established the effectiveness of this general approach. The vector space model is useful in ranking or ordering a large collection of documents in decreasing order of importance. For large collections, an approach based on human understanding of each document is not feasible. Instead, an automated procedure is needed that is able to rank documents with those central to the collection ranked highest. Each document is represented as a vector with the vectors forming the columns of a matrix A. The similarity of pairs of documents is defined by the dot product of the vectors. All pairwise similarities are contained in the matrix product A T A. If one assumes that the documents central to the collection are those with high similarity to other documents, then computing A T A enables one to create a ranking. Define the total similarity of document i to be the sum of the entries in the i th row of A T A and rank documents by their total similarity. It turns out that with the vector representation on hand, a better way of ranking is to first find the best fit direction. That is, the unit vector u, for which the sum of squared perpendicular distances of all the vectors to u is minimized. See Figure 2.2. Then, one ranks the vectors according to their dot product with u. The bestfit direction is a wellstudied notion in linear algebra. There is elegant theory and efficient algorithms presented in Chapter 3 that facilitate the ranking as well as applications in many other domains. In the vector space representation of data, properties of vectors such as dot products, 0
11 Figure 2.: A document and its termdocument vector along with a collection of documents represented by their termdocument vectors. distance between vectors, and orthogonality, often have natural interpretations and this is what makes the vector representation more important than just a book keeping device. For example, the squared distance between two 0 vectors representing links on web pages is the number of web pages linked to by only one of the pages. In Figure 2.3, pages 4 and 5 both have links to pages, 3, and 6, but only page 5 has a link to page 2. Thus, the squared distance between the two vectors is one. We have seen that dot products measure similarity. Orthogonality of two nonnegative vectors says that they are disjoint. Thus, if a document collection, e.g., all news articles of a particular year, contained documents on two or more disparate topics, vectors corresponding to documents from different topics would be nearly orthogonal. The dot product, cosine of the angle, distance, etc., are all measures of similarity or dissimilarity, but there are important mathematical and algorithmic differences between them. The random projection theorem presented in this chapter states that a collection of vectors can be projected to a lowerdimensional space approximately preserving all pairwise distances between vectors. Thus, the nearest neighbors of each vector in the collection can be computed in the projected lowerdimensional space. Such a savings in time is not possible for computing pairwise dot products using a simple projection. Our aim in this book is to present the reader with the mathematical foundations to deal with highdimensional data. There are two important parts of this foundation. The first is highdimensional geometry, along with vectors, matrices, and linear algebra. The second more modern aspect is the combination with probability. High dimensionality is a common characteristic in many models and for this reason much of this chapter is devoted to the geometry of highdimensional space, which is quite different from our intuitive understanding of two and three dimensions. We focus first on volumes and surface areas of highdimensional objects like hyperspheres. We will not present details of any one application, but rather present the fundamental theory useful to many applications. One reason probability comes in is that many computational problems are hard if our algorithms are required to be efficient on all possible data. In practical situations, domain knowledge often enables the expert to formulate stochastic models of data. In
12 best fit line Figure 2.2: The best fit line is the line that minimizes the sum of the squared perpendicular distances. (,0,,0,0,) web page 4 (,,,0,0,) web page 5 Figure 2.3: Two web pages as vectors. The squared distance between the two vectors is the number of web pages linked to by just one of the two web pages. customerproduct data, a common assumption is that the goods each customer buys are independent of what goods the others buy. One may also assume that the goods a customer buys satisfies a known probability law, like the Gaussian distribution. In keeping with the spirit of the book, we do not discuss specific stochastic models, but present the fundamentals. An important fundamental is the law of large numbers that states that under the assumption of independence of customers, the total consumption of each good is remarkably close to its mean value. The central limit theorem is of a similar flavor. Indeed, it turns out that picking random points from geometric objects like hyperspheres exhibits almost identical properties in high dimensions. One calls this phenomena the law of large dimensions. We will establish these geometric properties first before discussing Chernoff bounds and related theorems on aggregates of independent random variables. 2. Properties of HighDimensional Space Our intuition about space was formed in two and three dimensions and is often misleading in high dimensions. Consider placing 00 points uniformly at random in a unit square. Each coordinate is generated independently and uniformly at random from the interval [0, ]. Select a point and measure the distance to all other points and observe 2
13 the distribution of distances. Then increase the dimension and generate the points uniformly at random in a 00dimensional unit cube. The distribution of distances becomes concentrated about an average distance. The reason is easy to see. Let x and y be two such points in ddimensions. The distance between x and y is x y = d (x i y i ) 2. i= Since d i= (x i y i ) 2 is the summation of a number of independent random variables of bounded variance, by the law of large numbers the distribution of x y 2 is concentrated about its expected value. Contrast this with the situation where the dimension is two or three and the distribution of distances is spread out. For another example, consider the difference between picking a point uniformly at random from a unitradius circle and from a unitradius sphere in ddimensions. In d dimensions the distance from the point to the center of the sphere is very likely to be between c and, where c is a constant independent of d. This implies that most of d the mass is near the surface of the sphere. Furthermore, the first coordinate, x, of such a point is likely to be between c d and + c d, which we express by saying that most of the mass is near the equator. The equator perpendicular to the x axis is the set {x x = 0}. We will prove these results in this chapter, but first a review of some probability. 2.2 The Law of Large Numbers In the previous section, we claimed that points generated at random in high dimensions were all essentially the same distance apart. The reason is that if one averages n independent samples x, x 2,..., x n of a random variable x, the result will be close to the expected value of x. Specifically the probability that the average will differ from the expected value by more than ɛ is less than some value σ2. nɛ 2 ( ) x + x x n Prob E(x) n > ɛ σ2 nɛ. (2.) 2 Here the σ 2 in the numerator is the variance of x. The larger the variance of the random variable, the greater the probability that the error will exceed ɛ. The number of points n is in the denominator since the more values that are averaged, the smaller the probability that the difference will exceed ɛ. Similarly the larger ɛ is, the smaller the probability that the difference will exceed ɛ and hence ɛ is in the denominator. Notice that squaring ɛ makes the fraction a dimensionalless quantity. To prove the law of large numbers we use two inequalities. The first is Markov s inequality. One can bound the probability that a nonnegative random variable exceeds a by the expected value of the variable divided by a. 3
14 Theorem 2. (Markov s inequality) Let x be a nonnegative random variable. Then for a > 0, Prob(x a) E(x) a. Proof: We prove the theorem for continuous random variables. So we use integrals. The same proof works for discrete random variables with sums instead of integrals. E (x) = xp(x)dx = a xp(x)dx + xp(x)dx xp(x)dx 0 0 a ap(x)dx = a p(x)dx = ap(x a) a a a Thus, Prob(x a) E(x) a. Corollary 2.2 Prob (x ce(x)) c Proof: Substitute ce(x) for a. Markov s inequality bounds the tail of a distribution using only information about the mean. A tighter bound can be obtained by also using the variance. Theorem 2.3 (Chebyshev s inequality) Let x be a random variable with mean m and variance σ 2. Then Prob( x m aσ) a 2. Proof: Prob( x m aσ) = Prob ( (x m) 2 a 2 σ 2). Note that (x m) 2 is a nonnegative random variable, so Markov s inequality can be a applied giving: Prob ( (x m) 2 a 2 σ 2) E ( (x m) 2) = σ2 a 2 σ 2 a 2 σ = 2 a. 2 Thus, Prob ( x m aσ) a 2. The law of large numbers follows from Chebyshev s inequality. Recall that E(x + y) = E(x) + E(y), σ 2 (cx) = c 2 σ 2 (x), σ 2 (x m) = σ 2 (x), and if x and y are independent, then E(xy) = E(x)E(y) and σ 2 (x + y) = σ 2 (x) + σ 2 (y). To prove σ 2 (x + y) = σ 2 (x) + σ 2 (y) when x and y are independent, since σ 2 (x m) = σ 2 (x), one can assume E(x) = 0 and E(y) = 0. Thus, σ 2 (x + y) = E ( (x + y) 2) = E(x 2 ) + E(y 2 ) + 2E(xy) = E(x 2 ) + E(y 2 ) + 2E(x)E(y) = σ 2 (x) + σ 2 (y). Replacing E(xy) by E(x)E(y) required independence. 4
15 2 2 d Figure 2.4: Illustration of the relationship between the sphere and the cube in 2, 4, and ddimensions. Theorem 2.4 (Law of large numbers) Let x, x 2,..., x n be n samples of a random variable x. Then ( ) x + x x n Prob E(x) n > ɛ σ2 nɛ 2 Proof: By Chebychev s inequality ( ) x + x x n ( Prob E(x) n > ɛ σ2 x +x 2 + +x n ) n ɛ 2 n 2 ɛ 2 σ2 (x + x x n ) ( σ 2 (x n 2 ɛ 2 ) + σ 2 (x 2 ) + + σ 2 (x n ) ) σ2 (x) nɛ. 2 The law of large numbers bounds the difference of the sample average and the expected value. Note that the size of the sample for a given error bound is independent of the size of the population class. In the limit, when the sample size goes to infinity, the central limit theorem says that the distribution of the sample average is Gaussian provided the random variable has finite variance. Later, we will consider random variables that are the sum of random variables. That is, x = x + x x n. Chernoff bounds will tell us about the probability of x differing from its expected value. We will delay this until Section The HighDimensional Sphere One of the interesting facts about a unitradius sphere in high dimensions is that as the dimension increases, the volume of the sphere goes to zero. This has important 5
Foundations of Data Science 1
Foundations of Data Science John Hopcroft Ravindran Kannan Version /4/204 These notes are a first draft of a book being written by Hopcroft and Kannan and in many places are incomplete. However, the notes
More informationFoundations of Data Science 1
Foundations of Data Science John Hopcroft Ravindran Kannan Version 4/9/203 These notes are a first draft of a book being written by Hopcroft and Kannan and in many places are incomplete. However, the notes
More informationFoundations of Data Science 1
Foundations of Data Science 1 John Hopcroft Ravindran Kannan Version 3/03/2013 These notes are a first draft of a book being written by Hopcroft and Kannan and in many places are incomplete. However, the
More informationFoundations of Data Science
Foundations of Data Science Avrim Blum, John Hopcroft and Ravindran Kannan Thursday 9 th June, 206 Copyright 205. All rights reserved Contents Introduction 8 2 HighDimensional Space 2. Introduction...................................
More informationSenior Secondary Australian Curriculum
Senior Secondary Australian Curriculum Mathematical Methods Glossary Unit 1 Functions and graphs Asymptote A line is an asymptote to a curve if the distance between the line and the curve approaches zero
More informationSection 1.1. Introduction to R n
The Calculus of Functions of Several Variables Section. Introduction to R n Calculus is the study of functional relationships and how related quantities change with each other. In your first exposure to
More information1 Singular Value Decomposition (SVD)
Contents 1 Singular Value Decomposition (SVD) 2 1.1 Singular Vectors................................. 3 1.2 Singular Value Decomposition (SVD)..................... 7 1.3 Best Rank k Approximations.........................
More informationMATHEMATICS (CLASSES XI XII)
MATHEMATICS (CLASSES XI XII) General Guidelines (i) All concepts/identities must be illustrated by situational examples. (ii) The language of word problems must be clear, simple and unambiguous. (iii)
More informationAlgebra 2 Chapter 1 Vocabulary. identity  A statement that equates two equivalent expressions.
Chapter 1 Vocabulary identity  A statement that equates two equivalent expressions. verbal model A word equation that represents a reallife problem. algebraic expression  An expression with variables.
More informationWe call this set an ndimensional parallelogram (with one vertex 0). We also refer to the vectors x 1,..., x n as the edges of P.
Volumes of parallelograms 1 Chapter 8 Volumes of parallelograms In the present short chapter we are going to discuss the elementary geometrical objects which we call parallelograms. These are going to
More informationAlgebra 1 2008. Academic Content Standards Grade Eight and Grade Nine Ohio. Grade Eight. Number, Number Sense and Operations Standard
Academic Content Standards Grade Eight and Grade Nine Ohio Algebra 1 2008 Grade Eight STANDARDS Number, Number Sense and Operations Standard Number and Number Systems 1. Use scientific notation to express
More informationFoundation. Scheme of Work. Year 10 September 2016July 2017
Foundation Scheme of Work Year 10 September 016July 017 Foundation Tier Students will be assessed by completing two tests (topic) each Half Term. PERCENTAGES Use percentages in reallife situations VAT
More informationREVISED GCSE Scheme of Work Mathematics Higher Unit 6. For First Teaching September 2010 For First Examination Summer 2011 This Unit Summer 2012
REVISED GCSE Scheme of Work Mathematics Higher Unit 6 For First Teaching September 2010 For First Examination Summer 2011 This Unit Summer 2012 Version 1: 28 April 10 Version 1: 28 April 10 Unit T6 Unit
More informationSolving Simultaneous Equations and Matrices
Solving Simultaneous Equations and Matrices The following represents a systematic investigation for the steps used to solve two simultaneous linear equations in two unknowns. The motivation for considering
More informationPURE MATHEMATICS AM 27
AM SYLLABUS (013) PURE MATHEMATICS AM 7 SYLLABUS 1 Pure Mathematics AM 7 Syllabus (Available in September) Paper I(3hrs)+Paper II(3hrs) 1. AIMS To prepare students for further studies in Mathematics and
More informationPURE MATHEMATICS AM 27
AM Syllabus (015): Pure Mathematics AM SYLLABUS (015) PURE MATHEMATICS AM 7 SYLLABUS 1 AM Syllabus (015): Pure Mathematics Pure Mathematics AM 7 Syllabus (Available in September) Paper I(3hrs)+Paper II(3hrs)
More informationCalculus C/Multivariate Calculus Advanced Placement G/T Essential Curriculum
Calculus C/Multivariate Calculus Advanced Placement G/T Essential Curriculum UNIT I: The Hyperbolic Functions basic calculus concepts, including techniques for curve sketching, exponential and logarithmic
More informationNEW YORK STATE TEACHER CERTIFICATION EXAMINATIONS
NEW YORK STATE TEACHER CERTIFICATION EXAMINATIONS TEST DESIGN AND FRAMEWORK September 2014 Authorized for Distribution by the New York State Education Department This test design and framework document
More informationBiggar High School Mathematics Department. National 5 Learning Intentions & Success Criteria: Assessing My Progress
Biggar High School Mathematics Department National 5 Learning Intentions & Success Criteria: Assessing My Progress Expressions & Formulae Topic Learning Intention Success Criteria I understand this Approximation
More informationIntroduction. The Aims & Objectives of the Mathematical Portion of the IBA Entry Test
Introduction The career world is competitive. The competition and the opportunities in the career world become a serious problem for students if they do not do well in Mathematics, because then they are
More informationGCSE Maths Linear Higher Tier Grade Descriptors
GSE Maths Linear Higher Tier escriptors Fractions /* Find one quantity as a fraction of another Solve problems involving fractions dd and subtract fractions dd and subtract mixed numbers Multiply and divide
More informationIn mathematics, there are four attainment targets: using and applying mathematics; number and algebra; shape, space and measures, and handling data.
MATHEMATICS: THE LEVEL DESCRIPTIONS In mathematics, there are four attainment targets: using and applying mathematics; number and algebra; shape, space and measures, and handling data. Attainment target
More information15.062 Data Mining: Algorithms and Applications Matrix Math Review
.6 Data Mining: Algorithms and Applications Matrix Math Review The purpose of this document is to give a brief review of selected linear algebra concepts that will be useful for the course and to develop
More information11.1. Objectives. Component Form of a Vector. Component Form of a Vector. Component Form of a Vector. Vectors and the Geometry of Space
11 Vectors and the Geometry of Space 11.1 Vectors in the Plane Copyright Cengage Learning. All rights reserved. Copyright Cengage Learning. All rights reserved. 2 Objectives! Write the component form of
More informationProbability and Statistics
CHAPTER 2: RANDOM VARIABLES AND ASSOCIATED FUNCTIONS 2b  0 Probability and Statistics Kristel Van Steen, PhD 2 Montefiore Institute  Systems and Modeling GIGA  Bioinformatics ULg kristel.vansteen@ulg.ac.be
More informationAdvanced Algebra 2. I. Equations and Inequalities
Advanced Algebra 2 I. Equations and Inequalities A. Real Numbers and Number Operations 6.A.5, 6.B.5, 7.C.5 1) Graph numbers on a number line 2) Order real numbers 3) Identify properties of real numbers
More information3. INNER PRODUCT SPACES
. INNER PRODUCT SPACES.. Definition So far we have studied abstract vector spaces. These are a generalisation of the geometric spaces R and R. But these have more structure than just that of a vector space.
More informationKEANSBURG SCHOOL DISTRICT KEANSBURG HIGH SCHOOL Mathematics Department. HSPA 10 Curriculum. September 2007
KEANSBURG HIGH SCHOOL Mathematics Department HSPA 10 Curriculum September 2007 Written by: Karen Egan Mathematics Supervisor: Ann Gagliardi 7 days Sample and Display Data (Chapter 1 pp. 447) Surveys and
More informationMathematics Course 111: Algebra I Part IV: Vector Spaces
Mathematics Course 111: Algebra I Part IV: Vector Spaces D. R. Wilkins Academic Year 19967 9 Vector Spaces A vector space over some field K is an algebraic structure consisting of a set V on which are
More informationThnkwell s Homeschool Precalculus Course Lesson Plan: 36 weeks
Thnkwell s Homeschool Precalculus Course Lesson Plan: 36 weeks Welcome to Thinkwell s Homeschool Precalculus! We re thrilled that you ve decided to make us part of your homeschool curriculum. This lesson
More informationMath 1B, lecture 5: area and volume
Math B, lecture 5: area and volume Nathan Pflueger 6 September 2 Introduction This lecture and the next will be concerned with the computation of areas of regions in the plane, and volumes of regions in
More informationDATA ANALYSIS II. Matrix Algorithms
DATA ANALYSIS II Matrix Algorithms Similarity Matrix Given a dataset D = {x i }, i=1,..,n consisting of n points in R d, let A denote the n n symmetric similarity matrix between the points, given as where
More informationAlgebra Unpacked Content For the new Common Core standards that will be effective in all North Carolina schools in the 201213 school year.
This document is designed to help North Carolina educators teach the Common Core (Standard Course of Study). NCDPI staff are continually updating and improving these tools to better serve teachers. Algebra
More informationInteractive Math Glossary Terms and Definitions
Terms and Definitions Absolute Value the magnitude of a number, or the distance from 0 on a real number line Additive Property of Area the process of finding an the area of a shape by totaling the areas
More informationProbability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur
Probability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur Module No. #01 Lecture No. #15 Special DistributionsVI Today, I am going to introduce
More informationHigher Education Math Placement
Higher Education Math Placement Placement Assessment Problem Types 1. Whole Numbers, Fractions, and Decimals 1.1 Operations with Whole Numbers Addition with carry Subtraction with borrowing Multiplication
More informationNumerical Analysis Lecture Notes
Numerical Analysis Lecture Notes Peter J. Olver 5. Inner Products and Norms The norm of a vector is a measure of its size. Besides the familiar Euclidean norm based on the dot product, there are a number
More informationPrentice Hall Algebra 2 2011 Correlated to: Colorado P12 Academic Standards for High School Mathematics, Adopted 12/2009
Content Area: Mathematics Grade Level Expectations: High School Standard: Number Sense, Properties, and Operations Understand the structure and properties of our number system. At their most basic level
More informationPreAlgebra 2008. Academic Content Standards Grade Eight Ohio. Number, Number Sense and Operations Standard. Number and Number Systems
Academic Content Standards Grade Eight Ohio PreAlgebra 2008 STANDARDS Number, Number Sense and Operations Standard Number and Number Systems 1. Use scientific notation to express large numbers and small
More informationCAMI Education linked to CAPS: Mathematics
 1  TOPIC 1.1 Whole numbers _CAPS Curriculum TERM 1 CONTENT Properties of numbers Describe the real number system by recognizing, defining and distinguishing properties of: Natural numbers Whole numbers
More informationGRADES 7, 8, AND 9 BIG IDEAS
Table 1: Strand A: BIG IDEAS: MATH: NUMBER Introduce perfect squares, square roots, and all applications Introduce rational numbers (positive and negative) Introduce the meaning of negative exponents for
More informationWHICH LINEARFRACTIONAL TRANSFORMATIONS INDUCE ROTATIONS OF THE SPHERE?
WHICH LINEARFRACTIONAL TRANSFORMATIONS INDUCE ROTATIONS OF THE SPHERE? JOEL H. SHAPIRO Abstract. These notes supplement the discussion of linear fractional mappings presented in a beginning graduate course
More informationExpression. Variable Equation Polynomial Monomial Add. Area. Volume Surface Space Length Width. Probability. Chance Random Likely Possibility Odds
Isosceles Triangle Congruent Leg Side Expression Equation Polynomial Monomial Radical Square Root Check Times Itself Function Relation One Domain Range Area Volume Surface Space Length Width Quantitative
More informationMiddle Grades Mathematics 5 9
Middle Grades Mathematics 5 9 Section 25 1 Knowledge of mathematics through problem solving 1. Identify appropriate mathematical problems from realworld situations. 2. Apply problemsolving strategies
More informationPSS 27.2 The Electric Field of a Continuous Distribution of Charge
Chapter 27 Solutions PSS 27.2 The Electric Field of a Continuous Distribution of Charge Description: Knight ProblemSolving Strategy 27.2 The Electric Field of a Continuous Distribution of Charge is illustrated.
More informationCurrent Standard: Mathematical Concepts and Applications Shape, Space, and Measurement Primary
Shape, Space, and Measurement Primary A student shall apply concepts of shape, space, and measurement to solve problems involving two and threedimensional shapes by demonstrating an understanding of:
More informationMATHS LEVEL DESCRIPTORS
MATHS LEVEL DESCRIPTORS Number Level 3 Understand the place value of numbers up to thousands. Order numbers up to 9999. Round numbers to the nearest 10 or 100. Understand the number line below zero, and
More informationDiploma Plus in Certificate in Advanced Engineering
Diploma Plus in Certificate in Advanced Engineering Mathematics New Syllabus from April 2011 Ngee Ann Polytechnic / School of Interdisciplinary Studies 1 I. SYNOPSIS APPENDIX A This course of advanced
More informationThe Australian Curriculum Mathematics
The Australian Curriculum Mathematics Mathematics ACARA The Australian Curriculum Number Algebra Number place value Fractions decimals Real numbers Foundation Year Year 1 Year 2 Year 3 Year 4 Year 5 Year
More informationUnderstanding Basic Calculus
Understanding Basic Calculus S.K. Chung Dedicated to all the people who have helped me in my life. i Preface This book is a revised and expanded version of the lecture notes for Basic Calculus and other
More informationState of Stress at Point
State of Stress at Point Einstein Notation The basic idea of Einstein notation is that a covector and a vector can form a scalar: This is typically written as an explicit sum: According to this convention,
More informationCommon Core Unit Summary Grades 6 to 8
Common Core Unit Summary Grades 6 to 8 Grade 8: Unit 1: Congruence and Similarity 8G18G5 rotations reflections and translations,( RRT=congruence) understand congruence of 2 d figures after RRT Dilations
More informationPrentice Hall Mathematics Courses 13 Common Core Edition 2013
A Correlation of Prentice Hall Mathematics Courses 13 Common Core Edition 2013 to the Topics & Lessons of Pearson A Correlation of Courses 1, 2 and 3, Common Core Introduction This document demonstrates
More informationNonlinear Iterative Partial Least Squares Method
Numerical Methods for Determining Principal Component Analysis Abstract Factors Béchu, S., RichardPlouet, M., Fernandez, V., Walton, J., and Fairley, N. (2016) Developments in numerical treatments for
More informationTangent and normal lines to conics
4.B. Tangent and normal lines to conics Apollonius work on conics includes a study of tangent and normal lines to these curves. The purpose of this document is to relate his approaches to the modern viewpoints
More informationOrthogonal Projections
Orthogonal Projections and Reflections (with exercises) by D. Klain Version.. Corrections and comments are welcome! Orthogonal Projections Let X,..., X k be a family of linearly independent (column) vectors
More informationAble Enrichment Centre  Prep Level Curriculum
Able Enrichment Centre  Prep Level Curriculum Unit 1: Number Systems Number Line Converting expanded form into standard form or vice versa. Define: Prime Number, Natural Number, Integer, Rational Number,
More informationSouth Carolina College and CareerReady (SCCCR) PreCalculus
South Carolina College and CareerReady (SCCCR) PreCalculus Key Concepts Arithmetic with Polynomials and Rational Expressions PC.AAPR.2 PC.AAPR.3 PC.AAPR.4 PC.AAPR.5 PC.AAPR.6 PC.AAPR.7 Standards Know
More informationAdvanced Higher Mathematics Course Assessment Specification (C747 77)
Advanced Higher Mathematics Course Assessment Specification (C747 77) Valid from August 2015 This edition: April 2016, version 2.4 This specification may be reproduced in whole or in part for educational
More informationFigure 1.1 Vector A and Vector F
CHAPTER I VECTOR QUANTITIES Quantities are anything which can be measured, and stated with number. Quantities in physics are divided into two types; scalar and vector quantities. Scalar quantities have
More informationA Introduction to Matrix Algebra and Principal Components Analysis
A Introduction to Matrix Algebra and Principal Components Analysis Multivariate Methods in Education ERSH 8350 Lecture #2 August 24, 2011 ERSH 8350: Lecture 2 Today s Class An introduction to matrix algebra
More informationAQA Level 2 Certificate FURTHER MATHEMATICS
AQA Qualifications AQA Level 2 Certificate FURTHER MATHEMATICS Level 2 (8360) Our specification is published on our website (www.aqa.org.uk). We will let centres know in writing about any changes to the
More informationSection 2.1 Rectangular Coordinate Systems
P a g e 1 Section 2.1 Rectangular Coordinate Systems 1. Pythagorean Theorem In a right triangle, the lengths of the sides are related by the equation where a and b are the lengths of the legs and c is
More informationLINEAR ALGEBRA W W L CHEN
LINEAR ALGEBRA W W L CHEN c W W L Chen, 1997, 2008 This chapter is available free to all individuals, on understanding that it is not to be used for financial gain, and may be downloaded and/or photocopied,
More informationKS4 Curriculum Plan Maths FOUNDATION TIER Year 9 Autumn Term 1 Unit 1: Number
KS4 Curriculum Plan Maths FOUNDATION TIER Year 9 Autumn Term 1 Unit 1: Number 1.1 Calculations 1.2 Decimal Numbers 1.3 Place Value Use priority of operations with positive and negative numbers. Simplify
More information1.6 Powers of 10 and standard form Write a number in standard form. Calculate with numbers in standard form.
Unit/section title 1 Number Unit objectives (Edexcel Scheme of Work Unit 1: Powers, decimals, HCF and LCM, positive and negative, roots, rounding, reciprocals, standard form, indices and surds) 1.1 Number
More informationREVISED GCSE Scheme of Work Mathematics Higher Unit T3. For First Teaching September 2010 For First Examination Summer 2011
REVISED GCSE Scheme of Work Mathematics Higher Unit T3 For First Teaching September 2010 For First Examination Summer 2011 Version 1: 28 April 10 Version 1: 28 April 10 Unit T3 Unit T3 This is a working
More informationpp. 4 8: Examples 1 6 Quick Check 1 6 Exercises 1, 2, 20, 42, 43, 64
Semester 1 Text: Chapter 1: Tools of Algebra Lesson 11: Properties of Real Numbers Day 1 Part 1: Graphing and Ordering Real Numbers Part 1: Graphing and Ordering Real Numbers Lesson 12: Algebraic Expressions
More informationThe NotFormula Book for C1
Not The NotFormula Book for C1 Everything you need to know for Core 1 that won t be in the formula book Examination Board: AQA Brief This document is intended as an aid for revision. Although it includes
More informationNEW MEXICO Grade 6 MATHEMATICS STANDARDS
PROCESS STANDARDS To help New Mexico students achieve the Content Standards enumerated below, teachers are encouraged to base instruction on the following Process Standards: Problem Solving Build new mathematical
More informationWhole Numbers and Integers (44 topics, no due date)
Course Name: PreAlgebra into Algebra Summer Hwk Course Code: GHMKUKPMR9 ALEKS Course: PreAlgebra Instructor: Ms. Rhame Course Dates: Begin: 05/30/2015 End: 12/31/2015 Course Content: 302 topics Whole
More informationSummary of week 8 (Lectures 22, 23 and 24)
WEEK 8 Summary of week 8 (Lectures 22, 23 and 24) This week we completed our discussion of Chapter 5 of [VST] Recall that if V and W are inner product spaces then a linear map T : V W is called an isometry
More informationSECTION 0.11: SOLVING EQUATIONS. LEARNING OBJECTIVES Know how to solve linear, quadratic, rational, radical, and absolute value equations.
(Section 0.11: Solving Equations) 0.11.1 SECTION 0.11: SOLVING EQUATIONS LEARNING OBJECTIVES Know how to solve linear, quadratic, rational, radical, and absolute value equations. PART A: DISCUSSION Much
More informationMATH BOOK OF PROBLEMS SERIES. New from Pearson Custom Publishing!
MATH BOOK OF PROBLEMS SERIES New from Pearson Custom Publishing! The Math Book of Problems Series is a database of math problems for the following courses: Prealgebra Algebra Precalculus Calculus Statistics
More informationUtah Core Curriculum for Mathematics
Core Curriculum for Mathematics correlated to correlated to 2005 Chapter 1 (pp. 2 57) Variables, Expressions, and Integers Lesson 1.1 (pp. 5 9) Expressions and Variables 2.2.1 Evaluate algebraic expressions
More informationMathematics (MAT) MAT 061 Basic Euclidean Geometry 3 Hours. MAT 051 PreAlgebra 4 Hours
MAT 051 PreAlgebra Mathematics (MAT) MAT 051 is designed as a review of the basic operations of arithmetic and an introduction to algebra. The student must earn a grade of C or in order to enroll in MAT
More informationMATH 132: CALCULUS II SYLLABUS
MATH 32: CALCULUS II SYLLABUS Prerequisites: Successful completion of Math 3 (or its equivalent elsewhere). Math 27 is normally not a sufficient prerequisite for Math 32. Required Text: Calculus: Early
More informationBX in ( u, v) basis in two ways. On the one hand, AN = u+
1. Let f(x) = 1 x +1. Find f (6) () (the value of the sixth derivative of the function f(x) at zero). Answer: 7. We expand the given function into a Taylor series at the point x = : f(x) = 1 x + x 4 x
More informationApplied Algorithm Design Lecture 5
Applied Algorithm Design Lecture 5 Pietro Michiardi Eurecom Pietro Michiardi (Eurecom) Applied Algorithm Design Lecture 5 1 / 86 Approximation Algorithms Pietro Michiardi (Eurecom) Applied Algorithm Design
More informationInformation Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay
Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay Lecture  17 ShannonFanoElias Coding and Introduction to Arithmetic Coding
More informationISOMETRIES OF R n KEITH CONRAD
ISOMETRIES OF R n KEITH CONRAD 1. Introduction An isometry of R n is a function h: R n R n that preserves the distance between vectors: h(v) h(w) = v w for all v and w in R n, where (x 1,..., x n ) = x
More informationSolutions to Homework 10
Solutions to Homework 1 Section 7., exercise # 1 (b,d): (b) Compute the value of R f dv, where f(x, y) = y/x and R = [1, 3] [, 4]. Solution: Since f is continuous over R, f is integrable over R. Let x
More information3. Continuous Random Variables
3. Continuous Random Variables A continuous random variable is one which can take any value in an interval (or union of intervals) The values that can be taken by such a variable cannot be listed. Such
More informationTHREE DIMENSIONAL GEOMETRY
Chapter 8 THREE DIMENSIONAL GEOMETRY 8.1 Introduction In this chapter we present a vector algebra approach to three dimensional geometry. The aim is to present standard properties of lines and planes,
More informationNumber Sense and Operations
Number Sense and Operations representing as they: 6.N.1 6.N.2 6.N.3 6.N.4 6.N.5 6.N.6 6.N.7 6.N.8 6.N.9 6.N.10 6.N.11 6.N.12 6.N.13. 6.N.14 6.N.15 Demonstrate an understanding of positive integer exponents
More informationChapter 15 Introduction to Linear Programming
Chapter 15 Introduction to Linear Programming An Introduction to Optimization Spring, 2014 WeiTa Chu 1 Brief History of Linear Programming The goal of linear programming is to determine the values of
More informationLinear Threshold Units
Linear Threshold Units w x hx (... w n x n w We assume that each feature x j and each weight w j is a real number (we will relax this later) We will study three different algorithms for learning linear
More informationA Sublinear Bipartiteness Tester for Bounded Degree Graphs
A Sublinear Bipartiteness Tester for Bounded Degree Graphs Oded Goldreich Dana Ron February 5, 1998 Abstract We present a sublineartime algorithm for testing whether a bounded degree graph is bipartite
More informationMaster s Theory Exam Spring 2006
Spring 2006 This exam contains 7 questions. You should attempt them all. Each question is divided into parts to help lead you through the material. You should attempt to complete as much of each problem
More information4. Factor polynomials over complex numbers, describe geometrically, and apply to realworld situations. 5. Determine and apply relationships among syn
I The Real and Complex Number Systems 1. Identify subsets of complex numbers, and compare their structural characteristics. 2. Compare and contrast the properties of real numbers with the properties of
More informationMathematics Scope and Sequence: Foundation to Year 6
Number Algebra Number place value Fractions decimals Real numbers Foundation Year Year 1 Year 2 Year 3 Year 4 Year 5 Year 6 Establish understing of the language processes of counting by naming numbers
More informationCommon Core State Standard I Can Statements 8 th Grade Mathematics. The Number System (NS)
CCSS Key: The Number System (NS) Expressions & Equations (EE) Functions (F) Geometry (G) Statistics & Probability (SP) Common Core State Standard I Can Statements 8 th Grade Mathematics 8.NS.1. Understand
More informationMathematics Standards
1 Table of Contents Mathematics Standards Subject Pages Algebra 12 24 Algebra 34 56 AP Calculus AB and BC Standards 7 AP Statistics Standards 8 Consumer Math 9 Geometry 12 1011 Honors Differential
More informationLesson 17: Graphing the Logarithm Function
Lesson 17 Name Date Lesson 17: Graphing the Logarithm Function Exit Ticket Graph the function () = log () without using a calculator, and identify its key features. Lesson 17: Graphing the Logarithm Function
More informationIf A is divided by B the result is 2/3. If B is divided by C the result is 4/7. What is the result if A is divided by C?
Problem 3 If A is divided by B the result is 2/3. If B is divided by C the result is 4/7. What is the result if A is divided by C? Suggested Questions to ask students about Problem 3 The key to this question
More informationProbability & Statistics Primer Gregory J. Hakim University of Washington 2 January 2009 v2.0
Probability & Statistics Primer Gregory J. Hakim University of Washington 2 January 2009 v2.0 This primer provides an overview of basic concepts and definitions in probability and statistics. We shall
More informationYear 8  Maths Autumn Term
Year 8  Maths Autumn Term Whole Numbers and Decimals Order, add and subtract negative numbers. Recognise and use multiples and factors. Use divisibility tests. Recognise prime numbers. Find square numbers
More informationMATH 095, College Prep Mathematics: Unit Coverage Prealgebra topics (arithmetic skills) offered through BSE (Basic Skills Education)
MATH 095, College Prep Mathematics: Unit Coverage Prealgebra topics (arithmetic skills) offered through BSE (Basic Skills Education) Accurately add, subtract, multiply, and divide whole numbers, integers,
More informationFactoring Patterns in the Gaussian Plane
Factoring Patterns in the Gaussian Plane Steve Phelps Introduction This paper describes discoveries made at the Park City Mathematics Institute, 00, as well as some proofs. Before the summer I understood
More informationContinued Fractions and the Euclidean Algorithm
Continued Fractions and the Euclidean Algorithm Lecture notes prepared for MATH 326, Spring 997 Department of Mathematics and Statistics University at Albany William F Hammond Table of Contents Introduction
More information