Introduction
What is Statistics? Statistics is about: collecting data, organizing data, analyzing data, and presenting data.
What is Statistics? Statistics is divided into two areas: descriptive statistics and inferential statistics. The descriptive part involves arranging, summarizing, and presenting a set of data. We describe individuals or items with graphical techniques and numerical descriptive measures. The inferential part is a collection of methods used to draw conclusions, or inferences, about the characteristics of populations based on sample data.
Writing a science report: the statistical perspective. Formulate a research problem. Define the population and plan a sample survey. Find relevant variables to measure. Compute descriptive statistics. Carry out inferential statistics. Write the report.
Define the population and perform a sample survey. Population: The group of individuals that we want to investigate. Total survey: All the units in the population are investigated. Sample survey: A subset of the population is chosen and investigated. Random sampling: The sample units are chosen by some random mechanism.
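As a minimal sketch of "chosen by some random mechanism", the example below draws a simple random sample without replacement from a sampling frame. The frame of student ID numbers, the sample size, and the function name simple_random_sample are made up for illustration; the slides do not prescribe a particular sampling design.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct units so that every subset of size n is equally likely."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(population, n)


# Hypothetical sampling frame: student ID numbers 1..1000
population = list(range(1, 1001))

# A total survey would investigate all 1000 units; a sample survey only these 50
sample = simple_random_sample(population, 50, seed=1)
```

Because the units are picked by the random number generator rather than by the researcher, no unit is favoured by where or when the researcher happens to collect data.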
Where it can go wrong! A study is carried out to understand the training habits of students at Umeå University. The researcher hands out questionnaires at the entrance of the Iksu sports centre. Result: Students at Umeå University train a lot more than expected. Where did it go wrong?
Where it can go wrong! A study is carried out to find out whether students at Umeå University prefer the campus pubs to the inner-city pubs. The researcher hands out questionnaires in the queue at a campus pub. Result: A majority of the students prefer campus pubs. Where did it go wrong?
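One way to see the problem in both studies: the place where the questionnaires are handed out makes people with the studied trait more likely to end up in the sample. The simulation below is only a sketch under made-up assumptions (an invented distribution of weekly training hours, and a chance of being met at the gym entrance that is proportional to hours trained); it is not data from any real study.

```python
import random

rng = random.Random(0)

# Hypothetical population: weekly training hours for 10,000 students
# (made-up exponential distribution with a mean of about 3 hours)
population = [rng.expovariate(1 / 3.0) for _ in range(10_000)]

# Random sample: every student has the same chance of being selected
random_sample = rng.sample(population, 200)

# Gym-entrance sample: the chance of being handed a questionnaire is
# assumed proportional to how many hours the student trains
gym_sample = rng.choices(population, weights=population, k=200)


def mean(values):
    return sum(values) / len(values)


print("population mean:   ", round(mean(population), 2))
print("random sample mean:", round(mean(random_sample), 2))
print("gym sample mean:   ", round(mean(gym_sample), 2))
```

Under these assumptions the random sample lands close to the population mean, while the gym-entrance sample overstates it, because frequent trainers are over-represented.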
Why a sample survey rather than a total survey? It is cheaper. It is faster. A total survey cannot be used when the population is very large or infinite. A total survey cannot be used in trials where the objects are used up or destroyed.
Why make a random sample? A random sample makes it possible to: Give objective measures of the precision of the survey results. Provide a theory for effective planning of a survey. Make objective comparisons between different sampling plans prior to the survey. Calculate how large a sample is needed to achieve a certain margin of error. Use probability theory to control the uncertainty.
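The point about sample size can be sketched with the common textbook formula n = (z * sigma / E)^2 for estimating a mean, where E is the desired margin of error. This is an illustration under stated assumptions, not a method given in the slides: simple random sampling, a large population, a normal approximation, and a standard deviation sigma that is known or guessed in advance. The function name required_sample_size and the numbers are hypothetical.

```python
import math
from statistics import NormalDist


def required_sample_size(sigma, margin, confidence=0.95):
    """Smallest n for which z * sigma / sqrt(n) does not exceed the margin."""
    # Two-sided critical value, about 1.96 for 95% confidence
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.ceil((z * sigma / margin) ** 2)


# Assumed standard deviation of 10 units, desired margin of error of 2 units
n = required_sample_size(sigma=10, margin=2)
print(n)  # 97
```

Halving the margin of error roughly quadruples the required sample size, since n grows with the square of 1/E.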
Find relevant variables to measure. Variable: A property connected with the units in a population. Measurement: An assignment of numbers to the subjects in a survey such that specific relationships between the subjects, with respect to some specific property, can be seen in the numbers.
Why do we measure? To describe, to compare, to evaluate. Examples of things we want to measure: length, stress, welfare, consumer satisfaction.
Data levels. Nominal data: classification. Ordinal data: classification and order. Interval data: classification, order, and equivalent distance. Ratio data: classification, order, equivalent distance, and absolute zero.
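The four levels are cumulative: each one keeps the properties of the level before it and adds one more. The sketch below encodes that, together with summary measures that are conventionally treated as meaningful from each level upward. The mapping of summaries to levels is a common textbook convention added here as an assumption, not something stated in the slides, and the function name allowed_summaries is hypothetical.

```python
LEVELS = ["nominal", "ordinal", "interval", "ratio"]

# Property that each level adds on top of the previous one (from the list above)
ADDED_PROPERTY = {
    "nominal": "classification",
    "ordinal": "order",
    "interval": "equivalent distance",
    "ratio": "absolute zero",
}

# Assumption: summaries conventionally regarded as meaningful from this level up
ADDED_SUMMARIES = {
    "nominal": ["frequencies", "mode"],
    "ordinal": ["median", "percentiles"],
    "interval": ["mean", "standard deviation"],
    "ratio": ["ratios between values"],
}


def properties(level):
    """All properties a level has, including those inherited from lower levels."""
    upto = LEVELS.index(level) + 1
    return [ADDED_PROPERTY[name] for name in LEVELS[:upto]]


def allowed_summaries(level):
    """All summaries meaningful at a level, including those from lower levels."""
    upto = LEVELS.index(level) + 1
    return [s for name in LEVELS[:upto] for s in ADDED_SUMMARIES[name]]


print(properties("interval"))
print(allowed_summaries("ordinal"))
```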
Which type of variable? (Help me!) Age; Age group (25-34, 35-44, 45-54, ...); Sex (male/female); Education (primary, secondary, university); Smoker (yes/no); BMI (23.45, 28.12, ...); Car model (Volvo, Saab, Fiat); Temperature (12C, -4C, 14C, ...).
Descriptive statistics. Measures of location: mean, median. Measures of spread: range (min-max), variance, standard deviation (SD), standard error of the mean (SEM), percentiles/quantiles (p25, p75, q1, q3, ...). Frequency tables. Graphs: bar chart/histogram, boxplot, scatterplot.
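The measures listed above can be computed with Python's standard library, as in the sketch below. The scores and grades are made-up numbers for illustration only. Note two conventions that are assumptions of this example: statistics.variance and statistics.stdev use the sample divisor n - 1, and statistics.quantiles uses its default "exclusive" method; other software may use slightly different quartile definitions.

```python
import math
import statistics
from collections import Counter

# Made-up exam scores and grades, for illustration only
scores = [2, 4, 4, 4, 5, 5, 7, 9]
grades = ["G", "VG", "G", "U", "G", "VG", "G", "G"]

# Measures of location
mean = statistics.mean(scores)
median = statistics.median(scores)

# Measures of spread
data_range = max(scores) - min(scores)
variance = statistics.variance(scores)  # sample variance, divisor n - 1
sd = statistics.stdev(scores)           # square root of the sample variance
sem = sd / math.sqrt(len(scores))       # standard error of the mean
q1, q2, q3 = statistics.quantiles(scores, n=4)  # quartiles

# Frequency table for a categorical variable
frequency_table = Counter(grades)

print("mean:", mean, "median:", median, "range:", data_range)
print("variance:", round(variance, 3), "SD:", round(sd, 3), "SEM:", round(sem, 3))
print("quartiles:", q1, q2, q3)
print("frequencies:", dict(frequency_table))
```

The SEM describes the uncertainty of the sample mean and shrinks as the sample grows, whereas the SD describes the spread of the individual observations.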
Center and spread. Answering a research question often comes down to comparing measures of center and spread between groups.
Workload and exam result investigation. Is there a difference in the study results between males and females? If so, what does the difference depend on? A sample of graphs and plots.
Exam results (scale)
Workload (scale)
Histogram of Exam Score (scale)
Bar chart of Grades (ordinal/nominal)
Pie chart of Grades (ordinal/nominal)
Boxplot of Exam Score, gender (scale vs nominal)
Bar chart of Grade, gender (nominal vs nominal)
Boxplot of Total Study Time, gender (scale vs nominal)
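A boxplot by group, like the ones above, is essentially a picture of each group's five-number summary (minimum, first quartile, median, third quartile, maximum). The sketch below computes those numbers per group. The scores are invented and are not the data from the workload and exam result investigation; the function name five_number_summary and the choice of the "inclusive" quartile method are assumptions, and plotting software may use a different quartile convention or draw outliers separately.

```python
import statistics

# Made-up exam scores per group, not the data from the investigation
scores_by_gender = {
    "female": [12, 15, 17, 18, 20, 22, 25],
    "male": [10, 13, 14, 18, 19, 21, 24],
}


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), the numbers a boxplot displays."""
    q1, med, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, med, q3, max(values)


for group, values in scores_by_gender.items():
    print(group, five_number_summary(values))
```

Comparing the medians shows a difference in center between the groups, while comparing the box widths (Q3 minus Q1) shows a difference in spread.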