Path Estimation from GPS Tracks
|
|
|
- Noah Garrison
- 9 years ago
- Views:
Transcription
1 Path Estimation from GPS Tracks Chris Brunsdon Department of Geography University of Leicester University Road, Leicester LE1 7RU Telephone: Fax: Introduction The widespread availability of hand-held GPS units has led to a proliferation in data on the tracks of individuals as they walk, drive or otherwise go about journeys. This data has been used in a number of ways - for example the OpenStreetMap project (The OpenStreetMap Foundation 2007). One characteristic of projects such as this is that there will often be several GPS tracks for the same stretch of road. In general, repeatedly measuring something and taking the average of measurements leads to a more accurate result. The question addressed here is is it possible to average GPS tracks and if so, does this lead to a better estimate of road location?. 2. Tracing Paths From GPS Data In this section, the key technique for identifying paths from GPS data will be introduced. This approach is called Principal Curve Analysis (Hastie and Stuetzle 1989). Here, the basic principal curve algorithm will be used, as well as some proposed modifications to address specific issues relating to the estimation of cartographic data. 2.1 Description of the Data The GPS data considered here is tracking data recorded in the GPX format. The track data coordinates were transformed from longitude and latitude to OS national grid coordinates 1 to allow comparison. Although the track data can be treated as line objects (with each line corresponding to a track), the technique outlined in the next section only requires the point information in each of the tracks. The points considered in this way will be referred to as a point cloud. The point cloud recorded by the author is shown in figure 1, and consists of 342 points. Using approximate bearings, and starting from the northernmost point, there is a short walk south-west (beside Waterloo Road), then a longer walk (south-east, along a footpath New Walk), then another long section (south west again, along University Road) and a final short walk south east into Leicester University s campus. 1 Proj4 string: +proj=tmerc +lat 0=49 +lon 0=-2 +k= x 0= y 0= ellps=airy +towgs84= , , ,0.1502,0.2470, , units=m +no defs 1
2 Figure 1: A point cloud showing recorded GPS tracks in Leicester. 2
3 2.2 Principal Curve Analysis The idea here is to find a curve running through the middle of the point cloud. One way of defining the middle curve is to say that it is the curve minimising the total squared distances to each point in the point cloud. If we consider the point cloud to be a list of n coordinate pairs {p i = (x i, y i ) : i = 1, n}, and our curve as a parametrised curve f(λ) = (f x (λ), f y (λ)), then the distance between p i and f is the closest point to p i on f: D(p i, f) = argmin λ p i f(λ) (1) and the middle curve ˆf mimimises the expression i D2 (f, p i ). Curves satisfying these requirements are known as principal curves. It is noted that the parametrisation of f is not unique f(λ) could be replaced by f(g(λ)) where g(.) is any monotone function. To resolve this ambiguity, it is specified that λ should be the distance travelled along the curve f in this case, λ is therefore the distance travelled along the middle curve of the GPS point cloud. In order to estimate f (Hastie and Stuetzle 1989) attempt to reconstruct the curve at a number of discrete points {f(λ i ) : i = 1...n} where i corresponds to the index for the points in the point cloud. Given an initial guess at f, the curve is reconstructed using the following method: 1. Find the nearest points to each p i on the guessed curve, and compute the distance along the curve to each point. This provides a set of estimates for the λ i s 2. The estimate of f is then updated by updating estimates for the two functions f x (λ), f y (λ) using a non-parametric smooth regression procedure (such as Cleveland (1979) or Green and Silverman (1994)) applied respectively to the (λ i, x i ) and (λ i, y i ) pairs. 3. Return to step 1 with the updated estimate of f The result of applying the algorithm to the point cloud data used here is shown in figure 2. The principal curve is shown in red - the black lines correspond to the distances of each individual point in the cloud to the line. Note a key difference between this technique and standard non-parametric regression here the distances to be minimised can be in any direction, depending on the line joining a GPS point and the closest point to it on the principal curve, while in standard regression techniques, distances are always measured in the same direction parallel to the y-axis 2.3 Adapting the Algorithm Figure 2 demonstrates the general accuracy of the principal curve algorithm, but also highlights one of its pitfalls. At times GPS tracks can exhibit systematic errors - at the north end of New Walk, there is clearly a rogue track which veers noticeably from the true location - it may be seen as a dog leg swinging away from New Walk, apparently crossing the railway line having passed through some buildings. This means that although in most places the curve provides a good estimate of the road or pathway, it swings out in locations near to the rogue track. A way of overcoming this is to devise a robust variant on the principal curve algorithm - the approach is outlined below: 1. Fit a principal curve using the standard approach. 2. Note the distances from each point to the curve - call these {d i } 3. Standardise these distances by dividing by their standard deviation - call these {d i }. 4. Compute a set of weights as a monotone decreasing function of the d i s - typically let w i = 1 if d i < 3, w i = 4 d i if 3 < d i < 4 and w i = 0 if d i > 4. 3
4 Figure 2: Principal curve (red) fitted to GPS point cloud data. 4
5 Figure 3: Robust principal curve (green) fitted to GPS point cloud data. 5
6 5. Re-run the principal curve algorithm, but use weights {w i } in the nonparametric regression stage. The result of applying this modified algorithm to the point cloud is shown in figure 3. From this, it is clear that the influence of the rogue track has been greatly reduced, and the estimated path now correctly follows the footbridge over the railway and main road. 3. Assessing the Quality of Principal Curves As stated earlier, two ways of assessing the quality of the estimated curves are proposed - in terms of accuracy and precision. Accuracy the ability of the curve to reproduce the ground truth of path location may be carried out visually using figures 2 and 3. Here, particularly with the robust modification, the results are encouraging. Precision is essentially a measure of the reliability of the estimated paths, given that the logged GPS tracks are samples of locations on the actual paths containing some random error. Here, it is proposed to use a bootstrapping approach (Efron 1981; Effron 1982) to estimate confidence bands around the paths. Briefly, this method estimates the sampling distribution of an arbitary statistic s from a data set {X i : i = 1...n} by randomly sampling n items from the data set with replacement a number of times. Effectively we estimate the true distribution of the X i s as a mass point distribution in which each individual value X i has a probability of 1 of occurring. n Here, s is not a number, but a line on a map. However, the bootstrap idea can still be applied. This is done here for both the standard and robust curves. The results are visualised by drawing each bootstrap sample of the principal curve on the same map, but using alpha blending (Porter and Duff 1984) when drawing the curves. In figure 4 the bootstrap paths for the standard method are shown in red, and those for the robust method are shown in blue. In general, the robust method has lower sampling variability. 4. Conclusions A method has been proposed to reconstruct road or footpaths from GPS point cloud data, based on the idea of principal curves. The method has been assessed visually for both accuracy (by comparing the fitted paths against OS Landline data) and for precision (by considering bootstrap samples of the point cloud data). The results are encouraging particularly for the robust curve estimation algorithm, suggesting that this may provide a viable method for automatic path detection from GPS data. This could be incorporated, for example, in the JOSM map data editing program (Scholz 2007) associated with the OpenStreetMap project. One characteristic of the algorithm is that there are as many estimated λ i values and f points as there are points in the point cloud - so that as the size of the point cloud increases, the principal curves contain an increasing number of points. For this reason, a final pass of the principal curve could be applied, to thin out the number of points. This could be achieved, for example, with the Douglas-Peuker algorithm (Douglas and Peuker 1973). An example of this is shown in figure 5, with the thinned-out principal curve shown in red, and 100 bootstrap samples of this curve shown in blue. A final issue to address is the comparison of the principal curve fitting algorithm used here with others, such as Einbeck, Tutz, and Evers (2005). For now, it seems that the algorithm in use at least satisfies the need to detect paths from GPS data, however further work may throw light on more efficient, precise or accurate approaches. 6
7 Figure 4: Bootstrap sampling variability visualisations for standard (red) and robust (blue) principal curves. 7
8 Figure 5: A Generalized principal curve (red) and its bootstrap sampling variability estimate (blue). 8
9 5. References Cleveland, W. S, 1979, Robust locally weighted regression and smoothing scatterplots. Journal of the American Statistical Association 74, Douglas, D and Peuker, T, 1973, Algorithms for the reduction of the number of points required to represent a digitised line or its caricature. The Canadian Cartographer 10, Effron, B, 1982, The Jacknife, the Bootstrap and Other Resampling Plans. Philadelphia, Pennsylvania: Society for Industrial and Applied Mathematics. Efron, B, 1981, Nonparametric estimates of standard error: the jackknife, the bootstrap and other methods. Biometrika 68, Einbeck, J, Tutz, G, and Evers, L, 2005, Local principal curves. Statistics and Computing 15, Green, P and Silverman, B, 1994, Nonparametric Regression and Generalized Linear Models: A Roughness Penalty Approach. London: Chapman and Hall. Hastie, T. J and Stuetzle, W, 1989, Principal curves. Journal of the American Statistical Association 84(406), Porter, T and Duff, T, 1984, Compositing digital images. Computer Graphics 18(3), Scholz, I, 2007, JOSM OpenStreetMap. Web Site. (Viewed April 3, 2007). The OpenStreetMap Foundation, 2007, OpenStreetMap. Web Site. openstreetmap.org (Viewed April 3, 2007). 9
Least Squares Estimation
Least Squares Estimation SARA A VAN DE GEER Volume 2, pp 1041 1045 in Encyclopedia of Statistics in Behavioral Science ISBN-13: 978-0-470-86080-9 ISBN-10: 0-470-86080-4 Editors Brian S Everitt & David
Introduction to nonparametric regression: Least squares vs. Nearest neighbors
Introduction to nonparametric regression: Least squares vs. Nearest neighbors Patrick Breheny October 30 Patrick Breheny STA 621: Nonparametric Statistics 1/16 Introduction For the remainder of the course,
Basic Statistics and Data Analysis for Health Researchers from Foreign Countries
Basic Statistics and Data Analysis for Health Researchers from Foreign Countries Volkert Siersma [email protected] The Research Unit for General Practice in Copenhagen Dias 1 Content Quantifying association
4 The Rhumb Line and the Great Circle in Navigation
4 The Rhumb Line and the Great Circle in Navigation 4.1 Details on Great Circles In fig. GN 4.1 two Great Circle/Rhumb Line cases are shown, one in each hemisphere. In each case the shorter distance between
Analysis of Variance ANOVA
Analysis of Variance ANOVA Overview We ve used the t -test to compare the means from two independent groups. Now we ve come to the final topic of the course: how to compare means from more than two populations.
Nonlinear Regression:
Zurich University of Applied Sciences School of Engineering IDP Institute of Data Analysis and Process Design Nonlinear Regression: A Powerful Tool With Considerable Complexity Half-Day : Improved Inference
Reflection and Refraction
Equipment Reflection and Refraction Acrylic block set, plane-concave-convex universal mirror, cork board, cork board stand, pins, flashlight, protractor, ruler, mirror worksheet, rectangular block worksheet,
GEOMETRIC MENSURATION
GEOMETRI MENSURTION Question 1 (**) 8 cm 6 cm θ 6 cm O The figure above shows a circular sector O, subtending an angle of θ radians at its centre O. The radius of the sector is 6 cm and the length of the
Predicting daily incoming solar energy from weather data
Predicting daily incoming solar energy from weather data ROMAIN JUBAN, PATRICK QUACH Stanford University - CS229 Machine Learning December 12, 2013 Being able to accurately predict the solar power hitting
M-Series (BAS) Geolocators Short Manual
M-Series (BAS) Geolocators Short Manual Communicating with M-SERIES geolocators...2 Starting recording...3 Pre- and Post-Calibration (Ground Truthing)...4 Deployment...5 Download...7 Appendix 1 - Linux
Mathematics (Project Maths Phase 1)
2011. S133S Coimisiún na Scrúduithe Stáit State Examinations Commission Junior Certificate Examination Sample Paper Mathematics (Project Maths Phase 1) Paper 2 Ordinary Level Time: 2 hours 300 marks Running
Visualizing of Berkeley Earth, NASA GISS, and Hadley CRU averaging techniques
Visualizing of Berkeley Earth, NASA GISS, and Hadley CRU averaging techniques Robert Rohde Lead Scientist, Berkeley Earth Surface Temperature 1/15/2013 Abstract This document will provide a simple illustration
Least-Squares Intersection of Lines
Least-Squares Intersection of Lines Johannes Traa - UIUC 2013 This write-up derives the least-squares solution for the intersection of lines. In the general case, a set of lines will not intersect at a
1 Review of Least Squares Solutions to Overdetermined Systems
cs4: introduction to numerical analysis /9/0 Lecture 7: Rectangular Systems and Numerical Integration Instructor: Professor Amos Ron Scribes: Mark Cowlishaw, Nathanael Fillmore Review of Least Squares
ALGEBRA. sequence, term, nth term, consecutive, rule, relationship, generate, predict, continue increase, decrease finite, infinite
ALGEBRA Pupils should be taught to: Generate and describe sequences As outcomes, Year 7 pupils should, for example: Use, read and write, spelling correctly: sequence, term, nth term, consecutive, rule,
Chapter 10. Key Ideas Correlation, Correlation Coefficient (r),
Chapter 0 Key Ideas Correlation, Correlation Coefficient (r), Section 0-: Overview We have already explored the basics of describing single variable data sets. However, when two quantitative variables
Arrangements And Duality
Arrangements And Duality 3.1 Introduction 3 Point configurations are tbe most basic structure we study in computational geometry. But what about configurations of more complicated shapes? For example,
Building risk prediction models - with a focus on Genome-Wide Association Studies. Charles Kooperberg
Building risk prediction models - with a focus on Genome-Wide Association Studies Risk prediction models Based on data: (D i, X i1,..., X ip ) i = 1,..., n we like to fit a model P(D = 1 X 1,..., X p )
EdExcel Decision Mathematics 1
EdExcel Decision Mathematics 1 Linear Programming Section 1: Formulating and solving graphically Notes and Examples These notes contain subsections on: Formulating LP problems Solving LP problems Minimisation
SP500 September 2011 Outlook
SP500 September 2011 Outlook This document is designed to provide the trader and investor of the Standard and Poor s 500 with an overview of the seasonal tendency as well as the current cyclic pattern
Map reading made easy
Map reading made easy What is a map? A map is simply a plan of the ground on paper. The plan is usually drawn as the land would be seen from directly above. A map will normally have the following features:
X X X a) perfect linear correlation b) no correlation c) positive correlation (r = 1) (r = 0) (0 < r < 1)
CORRELATION AND REGRESSION / 47 CHAPTER EIGHT CORRELATION AND REGRESSION Correlation and regression are statistical methods that are commonly used in the medical literature to compare two or more variables.
INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA)
INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA) As with other parametric statistics, we begin the one-way ANOVA with a test of the underlying assumptions. Our first assumption is the assumption of
Regression III: Advanced Methods
Lecture 16: Generalized Additive Models Regression III: Advanced Methods Bill Jacoby Michigan State University http://polisci.msu.edu/jacoby/icpsr/regress3 Goals of the Lecture Introduce Additive Models
Friday 24 May 2013 Morning
Friday 24 May 2013 Morning AS GCE MATHEMATICS 4732/01 Probability & Statistics 1 QUESTION PAPER * 4 7 1 5 5 2 0 6 1 3 * Candidates answer on the Printed Answer Book. OCR supplied materials: Printed Answer
CAB TRAVEL TIME PREDICTI - BASED ON HISTORICAL TRIP OBSERVATION
CAB TRAVEL TIME PREDICTI - BASED ON HISTORICAL TRIP OBSERVATION N PROBLEM DEFINITION Opportunity New Booking - Time of Arrival Shortest Route (Distance/Time) Taxi-Passenger Demand Distribution Value Accurate
A Comparison of Correlation Coefficients via a Three-Step Bootstrap Approach
ISSN: 1916-9795 Journal of Mathematics Research A Comparison of Correlation Coefficients via a Three-Step Bootstrap Approach Tahani A. Maturi (Corresponding author) Department of Mathematical Sciences,
Spring Force Constant Determination as a Learning Tool for Graphing and Modeling
NCSU PHYSICS 205 SECTION 11 LAB II 9 FEBRUARY 2002 Spring Force Constant Determination as a Learning Tool for Graphing and Modeling Newton, I. 1*, Galilei, G. 1, & Einstein, A. 1 (1. PY205_011 Group 4C;
The Crescent Primary School Calculation Policy
The Crescent Primary School Calculation Policy Examples of calculation methods for each year group and the progression between each method. January 2015 Our Calculation Policy This calculation policy has
WHITE PAPER: SALES & MARKETING. Seven levers of sales and marketing performance that drive sales growth and deliver sustainable competitive advantage
WHITE PAPER: SALES & MARKETING Seven levers of sales and marketing performance that drive sales growth and deliver sustainable competitive advantage Sales & Marketing 2 Seven levers of sales and marketing
The relationships between Argo Steric Height and AVISO Sea Surface Height
The relationships between Argo Steric Height and AVISO Sea Surface Height Phil Sutton 1 Dean Roemmich 2 1 National Institute of Water and Atmospheric Research, New Zealand 2 Scripps Institution of Oceanography,
AMARILLO BY MORNING: DATA VISUALIZATION IN GEOSTATISTICS
AMARILLO BY MORNING: DATA VISUALIZATION IN GEOSTATISTICS William V. Harper 1 and Isobel Clark 2 1 Otterbein College, United States of America 2 Alloa Business Centre, United Kingdom [email protected]
The Standard Normal distribution
The Standard Normal distribution 21.2 Introduction Mass-produced items should conform to a specification. Usually, a mean is aimed for but due to random errors in the production process we set a tolerance
Local classification and local likelihoods
Local classification and local likelihoods November 18 k-nearest neighbors The idea of local regression can be extended to classification as well The simplest way of doing so is called nearest neighbor
6.4 Normal Distribution
Contents 6.4 Normal Distribution....................... 381 6.4.1 Characteristics of the Normal Distribution....... 381 6.4.2 The Standardized Normal Distribution......... 385 6.4.3 Meaning of Areas under
Introduction to MATLAB IAP 2008
MIT OpenCourseWare http://ocw.mit.edu Introduction to MATLAB IAP 2008 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms. Introduction to Matlab Ideas for
A comparison of radio direction-finding technologies. Paul Denisowski, Applications Engineer Rohde & Schwarz
A comparison of radio direction-finding technologies Paul Denisowski, Applications Engineer Rohde & Schwarz Topics General introduction to radiolocation Manual DF techniques Doppler DF Time difference
Epipolar Geometry. Readings: See Sections 10.1 and 15.6 of Forsyth and Ponce. Right Image. Left Image. e(p ) Epipolar Lines. e(q ) q R.
Epipolar Geometry We consider two perspective images of a scene as taken from a stereo pair of cameras (or equivalently, assume the scene is rigid and imaged with a single camera from two different locations).
Data Mining Practical Machine Learning Tools and Techniques
Ensemble learning Data Mining Practical Machine Learning Tools and Techniques Slides for Chapter 8 of Data Mining by I. H. Witten, E. Frank and M. A. Hall Combining multiple models Bagging The basic idea
SENSITIVITY ANALYSIS AND INFERENCE. Lecture 12
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this
3.2. Solving quadratic equations. Introduction. Prerequisites. Learning Outcomes. Learning Style
Solving quadratic equations 3.2 Introduction A quadratic equation is one which can be written in the form ax 2 + bx + c = 0 where a, b and c are numbers and x is the unknown whose value(s) we wish to find.
Risk pricing for Australian Motor Insurance
Risk pricing for Australian Motor Insurance Dr Richard Brookes November 2012 Contents 1. Background Scope How many models? 2. Approach Data Variable filtering GLM Interactions Credibility overlay 3. Model
Several Views of Support Vector Machines
Several Views of Support Vector Machines Ryan M. Rifkin Honda Research Institute USA, Inc. Human Intention Understanding Group 2007 Tikhonov Regularization We are considering algorithms of the form min
Lecture 5 : The Poisson Distribution
Lecture 5 : The Poisson Distribution Jonathan Marchini November 10, 2008 1 Introduction Many experimental situations occur in which we observe the counts of events within a set unit of time, area, volume,
Describe the Create Profile dialog box. Discuss the Update Profile dialog box.examine the Annotate Profile dialog box.
A profile represents the ground surface along a specified path. A profile of the horizontal alignment showing the existing surface ground line is required before creating the vertical alignment, also known
AP Physics 1 and 2 Lab Investigations
AP Physics 1 and 2 Lab Investigations Student Guide to Data Analysis New York, NY. College Board, Advanced Placement, Advanced Placement Program, AP, AP Central, and the acorn logo are registered trademarks
Sample Size and Power in Clinical Trials
Sample Size and Power in Clinical Trials Version 1.0 May 011 1. Power of a Test. Factors affecting Power 3. Required Sample Size RELATED ISSUES 1. Effect Size. Test Statistics 3. Variation 4. Significance
Physics Lab Report Guidelines
Physics Lab Report Guidelines Summary The following is an outline of the requirements for a physics lab report. A. Experimental Description 1. Provide a statement of the physical theory or principle observed
MARS STUDENT IMAGING PROJECT
MARS STUDENT IMAGING PROJECT Data Analysis Practice Guide Mars Education Program Arizona State University Data Analysis Practice Guide This set of activities is designed to help you organize data you collect
Decision Trees from large Databases: SLIQ
Decision Trees from large Databases: SLIQ C4.5 often iterates over the training set How often? If the training set does not fit into main memory, swapping makes C4.5 unpractical! SLIQ: Sort the values
LANDSAT 8 Level 1 Product Performance
Réf: IDEAS-TN-10-QualityReport LANDSAT 8 Level 1 Product Performance Quality Report Month/Year: January 2016 Date: 26/01/2016 Issue/Rev:1/9 1. Scope of this document On May 30, 2013, data from the Landsat
Colour Image Segmentation Technique for Screen Printing
60 R.U. Hewage and D.U.J. Sonnadara Department of Physics, University of Colombo, Sri Lanka ABSTRACT Screen-printing is an industry with a large number of applications ranging from printing mobile phone
Regularized Logistic Regression for Mind Reading with Parallel Validation
Regularized Logistic Regression for Mind Reading with Parallel Validation Heikki Huttunen, Jukka-Pekka Kauppi, Jussi Tohka Tampere University of Technology Department of Signal Processing Tampere, Finland
Statistical Learning for Short-Term Photovoltaic Power Predictions
Statistical Learning for Short-Term Photovoltaic Power Predictions Björn Wolff 1, Elke Lorenz 2, Oliver Kramer 1 1 Department of Computing Science 2 Institute of Physics, Energy and Semiconductor Research
Elasticity. I. What is Elasticity?
Elasticity I. What is Elasticity? The purpose of this section is to develop some general rules about elasticity, which may them be applied to the four different specific types of elasticity discussed in
5 Correlation and Data Exploration
5 Correlation and Data Exploration Correlation In Unit 3, we did some correlation analyses of data from studies related to the acquisition order and acquisition difficulty of English morphemes by both
Lasso on Categorical Data
Lasso on Categorical Data Yunjin Choi, Rina Park, Michael Seo December 14, 2012 1 Introduction In social science studies, the variables of interest are often categorical, such as race, gender, and nationality.
Petrel TIPS&TRICKS from SCM
Petrel TIPS&TRICKS from SCM Knowledge Worth Sharing Histograms and SGS Modeling Histograms are used daily for interpretation, quality control, and modeling in Petrel. This TIPS&TRICKS document briefly
Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression
Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Objectives: To perform a hypothesis test concerning the slope of a least squares line To recognize that testing for a
Microsoft Azure Machine learning Algorithms
Microsoft Azure Machine learning Algorithms Tomaž KAŠTRUN @tomaz_tsql [email protected] http://tomaztsql.wordpress.com Our Sponsors Speaker info https://tomaztsql.wordpress.com Agenda Focus on explanation
Blue Ocean Strategy The tools and techniques behind Blue Ocean Strategy formulation 2-day workshop
Blue Ocean Strategy The tools and techniques behind Blue Ocean Strategy formulation 2-day workshop Blue Ocean Strategy challenges everything you thought you knew about strategy. Business Strategy Review,
The Dummy s Guide to Data Analysis Using SPSS
The Dummy s Guide to Data Analysis Using SPSS Mathematics 57 Scripps College Amy Gamble April, 2001 Amy Gamble 4/30/01 All Rights Rerserved TABLE OF CONTENTS PAGE Helpful Hints for All Tests...1 Tests
Predict the Popularity of YouTube Videos Using Early View Data
000 001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 050
Towards running complex models on big data
Towards running complex models on big data Working with all the genomes in the world without changing the model (too much) Daniel Lawson Heilbronn Institute, University of Bristol 2013 1 / 17 Motivation
Fitting Subject-specific Curves to Grouped Longitudinal Data
Fitting Subject-specific Curves to Grouped Longitudinal Data Djeundje, Viani Heriot-Watt University, Department of Actuarial Mathematics & Statistics Edinburgh, EH14 4AS, UK E-mail: [email protected] Currie,
Exploratory Data Analysis
Exploratory Data Analysis Johannes Schauer [email protected] Institute of Statistics Graz University of Technology Steyrergasse 17/IV, 8010 Graz www.statistics.tugraz.at February 12, 2008 Introduction
Hypothesis Testing for Beginners
Hypothesis Testing for Beginners Michele Piffer LSE August, 2011 Michele Piffer (LSE) Hypothesis Testing for Beginners August, 2011 1 / 53 One year ago a friend asked me to put down some easy-to-read notes
CONTENTS. Page 3 What is orienteering? Page 4 Activity: orienteering map bingo. Page 5 Activity: know your colours. Page 6 Choosing your compass
THE RIGHT DIRECTION SCOUT ORIENTEER ACTIVITY BADGE CONTENTS Page What is orienteering? Page 4 Activity: orienteering map bingo Page 5 Activity: know your colours Page 6 Choosing your compass Page 7 Activity:
Session 7 Bivariate Data and Analysis
Session 7 Bivariate Data and Analysis Key Terms for This Session Previously Introduced mean standard deviation New in This Session association bivariate analysis contingency table co-variation least squares
Translating between Fractions, Decimals and Percents
CONCEPT DEVELOPMENT Mathematics Assessment Project CLASSROOM CHALLENGES A Formative Assessment Lesson Translating between Fractions, Decimals and Percents Mathematics Assessment Resource Service University
3D Drawing. Single Point Perspective with Diminishing Spaces
3D Drawing Single Point Perspective with Diminishing Spaces The following document helps describe the basic process for generating a 3D representation of a simple 2D plan. For this exercise we will be
Measuring Line Edge Roughness: Fluctuations in Uncertainty
Tutor6.doc: Version 5/6/08 T h e L i t h o g r a p h y E x p e r t (August 008) Measuring Line Edge Roughness: Fluctuations in Uncertainty Line edge roughness () is the deviation of a feature edge (as
The Dot and Cross Products
The Dot and Cross Products Two common operations involving vectors are the dot product and the cross product. Let two vectors =,, and =,, be given. The Dot Product The dot product of and is written and
RANGER S.A.S 3D (Survey Analysis Software)
RANGER S.A.S 3D (Survey Analysis Software) QUICK START USER MANUAL INTRODUCTION This document is designed to provide a step by step guide showing how easy it is to import and manipulate raw survey data
4. Continuous Random Variables, the Pareto and Normal Distributions
4. Continuous Random Variables, the Pareto and Normal Distributions A continuous random variable X can take any value in a given range (e.g. height, weight, age). The distribution of a continuous random
FREE FALL. Introduction. Reference Young and Freedman, University Physics, 12 th Edition: Chapter 2, section 2.5
Physics 161 FREE FALL Introduction This experiment is designed to study the motion of an object that is accelerated by the force of gravity. It also serves as an introduction to the data analysis capabilities
What Do You Think? For You To Do GOALS
Activity 2 Newton s Law of Universal Gravitation GOALS In this activity you will: Explore the relationship between distance of a light source and intensity of light. Graph and analyze the relationship
We discuss 2 resampling methods in this chapter - cross-validation - the bootstrap
Statistical Learning: Chapter 5 Resampling methods (Cross-validation and bootstrap) (Note: prior to these notes, we'll discuss a modification of an earlier train/test experiment from Ch 2) We discuss 2
Linear Threshold Units
Linear Threshold Units w x hx (... w n x n w We assume that each feature x j and each weight w j is a real number (we will relax this later) We will study three different algorithms for learning linear
Chapter 6. The stacking ensemble approach
82 This chapter proposes the stacking ensemble approach for combining different data mining classifiers to get better performance. Other combination techniques like voting, bagging etc are also described
ST 371 (IV): Discrete Random Variables
ST 371 (IV): Discrete Random Variables 1 Random Variables A random variable (rv) is a function that is defined on the sample space of the experiment and that assigns a numerical variable to each possible
Statistical Process Control (SPC) Training Guide
Statistical Process Control (SPC) Training Guide Rev X05, 09/2013 What is data? Data is factual information (as measurements or statistics) used as a basic for reasoning, discussion or calculation. (Merriam-Webster
Describing, Exploring, and Comparing Data
24 Chapter 2. Describing, Exploring, and Comparing Data Chapter 2. Describing, Exploring, and Comparing Data There are many tools used in Statistics to visualize, summarize, and describe data. This chapter
Gamma Distribution Fitting
Chapter 552 Gamma Distribution Fitting Introduction This module fits the gamma probability distributions to a complete or censored set of individual or grouped data values. It outputs various statistics
How To Manage Project Management
CS/SWE 321 Sections -001 & -003 Software Project Management Copyright 2014 Hassan Gomaa All rights reserved. No part of this document may be reproduced in any form or by any means, without the prior written
Analytical Test Method Validation Report Template
Analytical Test Method Validation Report Template 1. Purpose The purpose of this Validation Summary Report is to summarize the finding of the validation of test method Determination of, following Validation
Machine Learning. Term 2012/2013 LSI - FIB. Javier Béjar cbea (LSI - FIB) Machine Learning Term 2012/2013 1 / 34
Machine Learning Javier Béjar cbea LSI - FIB Term 2012/2013 Javier Béjar cbea (LSI - FIB) Machine Learning Term 2012/2013 1 / 34 Outline 1 Introduction to Inductive learning 2 Search and inductive learning
Multiple Regression: What Is It?
Multiple Regression Multiple Regression: What Is It? Multiple regression is a collection of techniques in which there are multiple predictors of varying kinds and a single outcome We are interested in
BOOSTED REGRESSION TREES: A MODERN WAY TO ENHANCE ACTUARIAL MODELLING
BOOSTED REGRESSION TREES: A MODERN WAY TO ENHANCE ACTUARIAL MODELLING Xavier Conort [email protected] Session Number: TBR14 Insurance has always been a data business The industry has successfully
The Role of SPOT Satellite Images in Mapping Air Pollution Caused by Cement Factories
The Role of SPOT Satellite Images in Mapping Air Pollution Caused by Cement Factories Dr. Farrag Ali FARRAG Assistant Prof. at Civil Engineering Dept. Faculty of Engineering Assiut University Assiut, Egypt.
An Introduction to Number Theory Prime Numbers and Their Applications.
East Tennessee State University Digital Commons @ East Tennessee State University Electronic Theses and Dissertations 8-2006 An Introduction to Number Theory Prime Numbers and Their Applications. Crystal
The number of marks is given in brackets [ ] at the end of each question or part question. The total number of marks for this paper is 72.
ADVANCED SUBSIDIARY GCE UNIT 4736/01 MATHEMATICS Decision Mathematics 1 THURSDAY 14 JUNE 2007 Afternoon Additional Materials: Answer Booklet (8 pages) List of Formulae (MF1) Time: 1 hour 30 minutes INSTRUCTIONS
Factoring Patterns in the Gaussian Plane
Factoring Patterns in the Gaussian Plane Steve Phelps Introduction This paper describes discoveries made at the Park City Mathematics Institute, 00, as well as some proofs. Before the summer I understood
Section 14 Simple Linear Regression: Introduction to Least Squares Regression
Slide 1 Section 14 Simple Linear Regression: Introduction to Least Squares Regression There are several different measures of statistical association used for understanding the quantitative relationship
Geographically weighted visualization interactive graphics for scale-varying exploratory analysis
Geographically weighted visualization interactive graphics for scale-varying exploratory analysis Chris Brunsdon 1, Jason Dykes 2 1 Department of Geography Leicester University Leicester LE1 7RH [email protected]
