Sampling Biases in IP Topology Measurements
- Kellie Townsend
- 10 years ago
Transcription
1 Sampling Biases in IP Topology Measurements Anukool Lakhina with John Byers, Mark Crovella and Peng Xie Department of Boston University
2 Discovering the Internet topology. Goal: discover the Internet router graph, in which vertices represent routers and edges connect routers that are one IP hop apart. Measurement primitive: traceroute, which reports the IP path from a source A to a destination B, i.e., how IP paths are overlaid on the router graph.
3 Traceroute studies today. k sources: few active sources, strategically located. m destinations: many passive destinations, globally dispersed. The measured map is the union of many traceroute paths. We call this a (k,m)-traceroute study.
4 High variability in node degrees. The degree distribution of routers has been found to be highly variable (degrees span several orders of magnitude). Various studies have even concluded that the degree distribution has a power-law tail, $\Pr[X > d] \propto d^{-c}$ [FFF99, GT00, BC01]. [Figure: log(Pr[X>d]) versus log(degree); dataset from [PG98].]
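The quantity on the vertical axis of such plots is the empirical complementary CDF of the degrees. A minimal sketch of how it might be computed from a list of node degrees follows; the function name `empirical_ccdf` is an illustrative choice, not something taken from the cited studies.

```python
from collections import Counter

def empirical_ccdf(degrees):
    """Return sorted (d, Pr[X > d]) pairs for a list of node degrees."""
    n = len(degrees)
    counts = Counter(degrees)
    ccdf = []
    remaining = n
    for d in sorted(counts):
        remaining -= counts[d]  # nodes with degree strictly greater than d
        ccdf.append((d, remaining / n))
    return ccdf

# Hypothetical toy input: four nodes with degrees 1, 1, 2 and 4.
example_ccdf = empirical_ccdf([1, 1, 2, 4])
```

For a log-log plot, points whose probability is 0 (the largest degree) have to be dropped before taking logarithms. A power-law tail would then show up as an approximately straight line with slope -c.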
5 Our question. How reliable are (k,m)-traceroute methods in sampling graphs? We show that, as a tool for measuring degree distributions, (k,m)-traceroute methods exhibit significant bias.
6 A thought experiment. Idea: simulate topology measurements on a random graph.
1. Generate a sparse Erdős-Rényi random graph, G=(V,E), with each edge present independently with probability p. Assign weights w(e) = 1 + ε, where ε is in $[-1/|V|, 1/|V|]$.
2. Pick k unique source nodes, uniformly at random.
3. Pick m unique destination nodes, uniformly at random.
4. Simulate traceroute from the k sources to the m destinations, i.e., learn the shortest paths between the k sources and the m destinations.
5. Let Ĝ be the union of the shortest paths.
Ask: how does Ĝ compare with G?
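The five steps can be sketched in a few lines of standard-library Python. This is an illustrative sketch, not the authors' code: the function names, the quadratic-time graph generator, the choice of drawing ε uniformly from the interval, and the small parameters in the usage lines are all assumptions made so the example runs quickly.

```python
import heapq
import random

def er_graph(n, p, rng):
    """Erdős-Rényi G(n, p) with weights 1 + eps, eps uniform in [-1/n, 1/n] (assumed)."""
    adj = {v: {} for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                w = 1 + rng.uniform(-1 / n, 1 / n)
                adj[u][v] = w
                adj[v][u] = w
    return adj

def shortest_path_tree(adj, src):
    """Dijkstra from src; returns parent pointers of the shortest-path tree."""
    dist = {src: 0.0}
    parent = {src: None}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                parent[v] = u
                heapq.heappush(heap, (nd, v))
    return parent

def measure(adj, k, m, rng):
    """Union of shortest paths from k random sources to m random destinations."""
    nodes = list(adj)
    sources = rng.sample(nodes, k)
    dests = rng.sample(nodes, m)
    edges = set()
    for s in sources:
        parent = shortest_path_tree(adj, s)
        for t in dests:
            v = t
            # Walk back towards the source; unreachable destinations are skipped.
            while v in parent and parent[v] is not None:
                edges.add(frozenset((v, parent[v])))
                v = parent[v]
    return sources, edges

# Hypothetical small-scale run (the slides use a much larger graph).
rng = random.Random(0)
G = er_graph(300, 0.03, rng)
sources, measured = measure(G, k=3, m=100, rng=rng)
```

Comparing the degree of each node in `measured` with its degree in `G` is then a direct way to ask how Ĝ compares with G.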
7 [Figure: log(Pr[X>x]) versus log(degree) for the underlying random graph G and the measured graph Ĝ. Underlying graph: N=100000, p=…; measured graph: k=3, m=1000.] Ĝ is a biased sample of G with a dramatically different degree distribution. Can high variability be a measurement artifact?
8 Outline. Motivation and thought experiments. Understanding bias on simulated topologies. Detecting bias in simulated scenarios: statistical hypotheses to infer the presence of bias. Examining Internet maps.
9 Understanding bias. (k,m)-traceroute sampling of graphs is biased. An intuitive explanation: when traces are run from few sources to many destinations, some portions of the underlying graph are explored more than others. Edges incident to a node in Ĝ are sampled disproportionately.
10 Analyzing nonuniform edge sampling. Question: given some vertex in Ĝ that is h hops from the source, what fraction of its true edges are contained in Ĝ? Analysis reveals that as h increases, the fraction of edges discovered falls off sharply. [Figure: fraction of node edges discovered versus distance from source, for 100, 600 and 1000 destinations.]
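In a simulation, where the true graph is known, this fraction can be tabulated directly. The sketch below is one possible way to do so. It assumes the true graph is given as neighbour sets, the measured graph as a set of `frozenset` edges, and that hop distance is the BFS distance in the true graph; these are simplifications for illustration.

```python
from collections import defaultdict, deque

def discovered_fraction_by_hop(adj, measured_edges, source):
    """Mean fraction of a node's true edges that appear in the measured graph,
    grouped by hop distance from `source` (only nodes present in the measured graph)."""
    seen_deg = defaultdict(int)
    for e in measured_edges:
        for v in e:
            seen_deg[v] += 1
    # Breadth-first search gives hop distances in the true graph.
    hops = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in hops:
                hops[v] = hops[u] + 1
                queue.append(v)
    by_hop = defaultdict(list)
    for v, seen in seen_deg.items():
        if v in hops and adj[v]:
            by_hop[hops[v]].append(seen / len(adj[v]))
    return {h: sum(f) / len(f) for h, f in sorted(by_hop.items())}

# Hypothetical toy graph: a path 0-1-2 with extra edges 1-3, 2-4 and 2-5.
toy_adj = {0: {1}, 1: {0, 2, 3}, 2: {1, 4, 5}, 3: {1}, 4: {2}, 5: {2}}
toy_measured = {frozenset((0, 1)), frozenset((1, 2))}
fractions = discovered_fraction_by_hop(toy_adj, toy_measured, source=0)
```

In the toy example the source sees all of its edges, the node one hop away sees two of three, and the node two hops away sees one of three, which mirrors the fall-off described on the slide.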
11 What does this suggest? Edges close to the source are sampled more often than edges further away. Intuitive picture: the neighborhood near the sources is well explored, but this visibility falls with hop distance from the sources. [Figure: sources S1, S2 and S3 with traces fanning out towards the destinations.]
12 Inferring bias. Goal: given a measured Ĝ, is it a biased sample? Why this is difficult: we don't have the underlying graph, and we don't have criteria for checking bias. General approach: examine statistical properties as a function of distance from the nearest source. No change with distance suggests an unbiased sample; change suggests bias.
13 Towards detecting bias. Examine $\Pr[D = d \mid H = h]$, the conditional probability that a node has degree d, given that it is at distance h from the source. [Figure: log(Pr[X>x]) versus log(degree) for the degrees in Ĝ at H=2 and at H=3.] Two observations: 1. The highest-degree nodes are near the source. 2. The degree distribution of nodes near the source differs from that of nodes further away.
14 A statistical test for C1. C1: are the highest-degree nodes near the source? If so, this is consistent with bias. Null hypothesis $H_0^{C1}$: the 1% highest-degree nodes occur at random with respect to distance from the nearest source. Cut the vertex set in half, into N (near) and F (far), by distance from the nearest source. Let $\nu = (0.01)|V|$, and let $\kappa$ be the fraction of the $\nu$ highest-degree nodes that lie in N. The likelihood that $\kappa$ deviates from 1/2 can be bounded using Chernoff bounds:
$$\Pr\left[\kappa > \frac{1+\delta}{2}\right] < \left[\frac{e^{\delta}}{(1+\delta)^{(1+\delta)}}\right]^{\nu/2}$$
Reject the null hypothesis with confidence $1-\alpha$ if:
$$\alpha \ge \left[\frac{e^{\delta}}{(1+\delta)^{(1+\delta)}}\right]^{\nu/2}$$
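The C1 test reduces to a few lines of arithmetic. The sketch below takes the observed fraction of top-degree nodes that lie in N (`kappa`) and the number of top-degree nodes (`nu`), solves for δ, and compares the Chernoff bound with α. The function names and the numbers in the usage lines are hypothetical, chosen only to illustrate the calculation.

```python
import math

def chernoff_bound(delta, nu):
    """Upper bound on Pr[kappa > (1 + delta) / 2] under the null hypothesis."""
    return (math.exp(delta) / (1 + delta) ** (1 + delta)) ** (nu / 2)

def reject_c1(kappa, nu, alpha):
    """True if the null hypothesis for C1 is rejected with confidence 1 - alpha."""
    delta = 2 * kappa - 1  # solve kappa = (1 + delta) / 2 for delta
    if delta <= 0:
        return False  # no excess of high-degree nodes near the source
    return chernoff_bound(delta, nu) <= alpha

# Hypothetical numbers: 40 top-degree nodes, 90% of them in the near half.
bound = chernoff_bound(0.8, 40)
decision = reject_c1(kappa=0.9, nu=40, alpha=0.05)
```

With these illustrative inputs the bound is below 0.01, so the null hypothesis would be rejected at the 95% level. A fraction of exactly one half gives δ = 0 and no rejection.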
15 A statistical test for C2. C2: is the degree distribution of nodes near the source different from that of nodes further away? If so, this is consistent with bias. Null hypothesis $H_0^{C2}$: the degree distribution of nodes near the source is consistent with that of all nodes. Compare the degree distributions of the nodes in N and in Ĝ using the chi-square test:
$$\chi^2 = \sum_{i=1}^{l} \frac{(O_i - E_i)^2}{E_i}$$
where $O_i$ and $E_i$ are the observed and expected degree frequencies in bin i, and l is the number of histogram bins. Reject the hypothesis with confidence $1-\alpha$ if:
$$\chi^2 > \chi^2_{[\alpha,\, l-1]}$$
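A minimal sketch of the C2 statistic follows. It assumes the degree histograms have already been binned identically for the near set N and for all of Ĝ, and that the expected counts are obtained by scaling the all-nodes histogram to the size of N. The helper names and the toy histogram are hypothetical.

```python
def expected_counts(all_counts, n_near):
    """Scale the all-nodes degree histogram to the number of nodes in N."""
    total = sum(all_counts)
    return [c * n_near / total for c in all_counts]

def chi_square_statistic(observed, expected):
    """Sum over bins of (O_i - E_i)^2 / E_i; every E_i must be nonzero."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same number of bins")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical three-bin histograms: all nodes, and the 50 nodes in N.
all_hist = [60, 30, 10]
near_hist = [20, 20, 10]
expected = expected_counts(all_hist, n_near=sum(near_hist))
chi2 = chi_square_statistic(near_hist, expected)

# Critical value of the chi-square distribution for alpha = 0.05 and l - 1 = 2
# degrees of freedom, taken from a standard table.
CRITICAL_0_05_DF2 = 5.991
reject_c2 = chi2 > CRITICAL_0_05_DF2
```

Here the expected counts are 30, 15 and 5, the statistic is 10, and the hypothesis would be rejected at the 95% level. Bins with very small expected counts are usually merged before applying the test.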
16 Our definition of bias. Bias (definition): failure of a sampled graph to meet the statistical tests for randomness associated with C1 and C2. Disclaimer: the tests are binary and don't tell us how biased a dataset is. A dataset that fails both tests is a poor choice for making generalizations about the underlying graph.
17 Introducing datasets.
Dataset Name   Date  # Nodes  # Links  # Srcs  # Dsts  Reference
Pansiot-Grad   …     …,888    4,…      …       …       PG98
Mercator       …     …,…      …,149    1       NA      GT00
Skitter        …     …,202    11,…     …       …       BBBC01
[Figure: log(Pr[X>x]) versus log(degree) for Pansiot-Grad, Mercator and Skitter.]
18 Testing C1. $H_0^{C1}$: the 1% highest-degree nodes occur at random with respect to distance from the source. Pansiot-Grad: 93% of the highest-degree nodes are in N. Mercator: 90% of the highest-degree nodes are in N. Skitter: 84% of the highest-degree nodes are in N.
19 Testing C2. $H_0^{C2}$: the degree distribution of nodes near the source is consistent with that of all nodes. [Figure: log(Pr[X>x]) versus log(degree) for Pansiot-Grad, Mercator and Skitter, each with Near, All and Far curves.]
20 Summary of statistical tests. For all datasets, we reject both null hypotheses of no bias. We conclude that the true degree distribution of the sampled routers likely differs from what these datasets show.
21 Final remarks. Using (k,m)-traceroute methods to discover Internet topology yields biased samples. Rocketfuel [SMW:02] may avoid some pitfalls of (k,m)-traceroute studies but is limited in scale. One open question: how can the degree of a router be sampled at random?
