Math 213: Applied Statistics, Gannon University MINITAB 15 Guide 1

November 15, 2007
Dr. Geoffrey Dietz

This guide contains instructions for most of the MINITAB commands used in the course. More commands may be added to this document as needed. To obtain an up-to-date copy, follow this link.

Data Entry
(1) Type your data in a single column, with the first entry being a label for the data.
(2) The label is called the variable name, and MINITAB uses it to process or plot your data.

General Statistics
(1) To produce sample statistics for a data set, choose Stat > Basic Statistics > Display Descriptive Statistics and select your data as the variable. Click the Statistics button to choose which values to display. The most useful for us will be Mean, SE of Mean, Standard Deviation, Variance, Minimum, Maximum, Range, N total, First quartile, Median, Third quartile, Interquartile range, and Mode.
(2) Under the Graph option in the Display Descriptive Statistics window, you may generate a histogram and/or boxplot for your data.
(3) To find all modes of your data, choose Stat > Tables > Tally Individual Variables and select your data as the variable. Make sure that Counts is checked. A list of the data elements and their frequencies will be produced. The element or elements with the largest count are the mode(s).
(4) Because MINITAB assumes that your data are a sample, you have to compute the population variance and population standard deviation separately. First obtain the Mean using the process above and enter it as its own column in your worksheet. Label a new column for the variance and/or standard deviation. Choose Calc > Calculator and store the result in the variance or standard deviation column as appropriate. For the population variance, enter SUM(( Data - Mean )**2) / COUNT( Data ) as the expression, where Data is the variable name for your data values. To produce the population standard deviation, either follow the steps above and then use the Calculator to compute SQRT( variance ), or enter the variance formula above with SQRT() surrounding it.

Histograms
(1) To create a histogram from a column of data, choose Graph > Histogram. Choose Simple. Double click the variable name and select OK.
(2) To make a relative frequency histogram, follow the directions above, but before selecting the final OK, select Scale and choose the tab Y-Scale Type. Then select Percent and continue as before. Alternatively, once a histogram is produced, you may double click the Y-Axis, choose the tab Type, and then select Percent.
(3) For either type of histogram, you may choose the midpoints of the classes or the cutpoints between classes. Double click the X-Axis and then select the tab Binning. Choose either Midpoint or Cutpoint as needed and then enter the desired points under Midpoint/Cutpoint positions. A shortcut is to enter Smallest Point:Largest Point/Increment, where the increment is the spacing between consecutive points. For instance, entering 3.5:35.5/8 specifies midpoints from 3.5 to 35.5 spaced 8 units apart (3.5, 11.5, 19.5, 27.5, 35.5).

Ogives (Cumulative Frequency Graphs)
(1) To create an ogive, you first need to generate a table of cumulative frequency data. Your table should have two columns. The first starts with the minimum data point in your set and then lists the upper class limits. The second column starts with 0 and then lists the cumulative frequency associated with each class. The first row is thus (min. data value, 0) and the other rows look like (ucl, cum. f).
(2) To graph, choose Graph > Scatterplot and select With Connect Line. Select your second column, cumulative frequency, as the Y-variable and the first column, upper class limit, as the X-variable.

Stem-and-Leaf Plots
(1) To create a stem-and-leaf plot, choose Graph > Stem-and-Leaf. Select the variable to plot, uncheck Trim Outliers, and enter 10 for the increment.
(2) To make a more refined plot, enter a smaller increment, such as 5.

Dotplots
(1) Choose Graph > Dotplot, select Simple, and then select the variable.

Scatterplots and Time Series Charts
(1) To create a scatterplot, you need two columns of paired data. Choose Graph > Scatterplot, select Simple, and then select one column as the Y-variable and the other column as the X-variable.
(2) If your paired data represent a time series, follow the directions above but make sure to enter the Time data column as the X-variable.

Boxplots
(1) To make a boxplot, choose Graph > Boxplot and select Simple.
Enter your data column as the variable and select Scale. Check the box Transpose value and category scales and generate the plot.

Generating Random Data
(1) To generate random numbers, choose Calc > Random Data > Integer. Enter the number of Rows of data to produce, a column to store the numbers, and the minimum and maximum integer values to produce.
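
As a rough cross-check outside MINITAB, the following Python sketch mimics the Integer random data command using only the standard library. The function name random_integer_column and its parameters are hypothetical names chosen for illustration, not MINITAB commands. It assumes that every integer from the minimum to the maximum (inclusive) is equally likely.

```python
import random


def random_integer_column(rows, minimum, maximum, seed=None):
    """Return `rows` integers drawn uniformly from minimum..maximum (inclusive).

    Hypothetical helper for illustration; not part of MINITAB.
    """
    rng = random.Random(seed)  # supplying a seed makes the column reproducible
    return [rng.randint(minimum, maximum) for _ in range(rows)]


if __name__ == "__main__":
    # Example: 10 simulated rolls of a six-sided die.
    print(random_integer_column(10, 1, 6, seed=213))
```

Fixing the seed is optional. It is used here only so that the same column can be regenerated when checking work.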

Binomial Distributions
(1) To compute probabilities for a binomial distribution, choose Calc > Probability Distributions > Binomial. Enter the number of trials, the value of p (probability of success), and the name of the column containing your values of x. To compute values of P(x), choose Probability in the window. To compute values of P(X ≤ x), choose Cumulative Probability.
(2) To graph a binomial distribution, create a column for x containing 0 through the number of trials and a column called P(x). Then follow the directions above to compute values of P(x), but select the P(x) column for Optional storage. Now choose Graph > Bar Chart and select Values From a Table and Simple from the next window. Select P(x) as the Graph variable and x as the Categorical variable. After the graph appears, double click on the X-Axis, uncheck the box next to Gap between clusters, and enter 0 in the field. Your graph will now look like a relative frequency histogram.

Normal Distributions
(1) To compute cumulative probabilities for normal distributions, choose Calc > Probability Distributions > Normal. Select Cumulative probability and then enter the Mean and Standard Deviation. Enter an x-value into Input constant to compute P(X < x).
(2) To find the x-value that has a certain proportion P of the distribution area to the left of x, choose Calc > Probability Distributions > Normal and select Inverse cumulative probability. Enter the Mean and Standard Deviation, and put the proportion P in as the Input constant.
(3) To compute with sampling distributions when the sample size is n ≥ 30 or when the population is normally distributed, use the given mean and compute the standard error of the mean, σ/√n, from the given standard deviation σ and sample size n. To compute cumulative or inverse cumulative probabilities, choose Calc > Probability Distributions > Normal. Select Cumulative probability or Inverse cumulative probability, enter the Mean, and then enter the standard error of the mean as the Standard Deviation. Finally, enter the appropriate Input constant.
(4) Alternatively, use Graph > Probability Distribution Plots. Select View Probability, choose Normal distribution, and enter the correct values for Mean and Standard Deviation under the Distribution tab. Then select the Shaded Area tab. Choose whether you are entering Probability values or X values and whether you want information about the Right Tail, Left Tail, Both Tails, or Middle. The result will display a shaded graph with the appropriate probabilities and X values marked.

Confidence Intervals for the Mean
(1) To find a confidence interval when the sample size is large (n ≥ 30), or when the data is normal and the exact value of σ is known, choose Stat > Basic Statistics > 1-Sample Z. Enter your data or summarized data, then enter the population standard deviation (or the sample standard deviation if σ is unknown).

(2) To find a confidence interval when the sample size is small (n < 30), the exact value of σ is unknown, and the data is approximately normal, choose Stat > Basic Statistics > 1-Sample t and enter your data or summary data.
(3) The default for MINITAB is a 95% confidence interval. To change the confidence level, select Options in the 1-Sample Z (or t) window and enter a new value.
(4) You should leave the Perform hypothesis test box unchecked. Make sure that, under Options, the Alternative is set to not equal.

Hypothesis Testing for the Mean
(1) If n ≥ 30, obtain the P-value by choosing Stat > Basic Statistics > 1-Sample Z. Check the box to Perform hypothesis test. Enter your data or summary data and a test mean. Select Options and enter 1 − α as the Confidence level and whether H_a is a ≠, <, or > statement.
(2) If n < 30, follow the directions above but choose Stat > Basic Statistics > 1-Sample t.
(3) If you have two independent samples, choose Stat > Basic Statistics > 2-Sample t. If you have data, it should be entered as two separate columns. Otherwise you should have summary statistics for each sample. Enter the appropriate information in the 2-Sample t window and enter the appropriate Confidence level, test difference (usually 0.0), and H_a type under Options. If the samples are small and the variances are equal, check the box Assume equal variances in the 2-Sample t window.
(4) If you have two dependent or paired samples, enter the paired data in 2 columns. Choose Stat > Basic Statistics > Paired t. Enter the column names or summary data. Under Options enter the appropriate Confidence level, test mean, and H_a type.

Confidence Intervals and Hypothesis Testing for a Proportion
(1) To find a confidence interval for a population proportion p, choose Stat > Basic Statistics > 1 Proportion. Under Summarized data, enter the sample size (Number of trials) and the number of successes (Number of events). Select Options to enter the desired Confidence level and check the box Use test and interval based on normal distribution. The Alternative should be set to not equal, and the box for Perform hypothesis test should be unchecked.
(2) To perform a hypothesis test for a proportion p, perform the steps above but also check the box to Perform hypothesis test, and enter the hypothesized value of p and whether H_a is a >, ≠, or < statement in Options.
(3) If you have two independent samples, compute the number of successes for each. Choose Stat > Basic Statistics > 2 Proportions and enter the sample sizes and numbers of successes. Under Options enter the Confidence level, Test difference, and H_a type. Check the box for Use pooled estimate of p for test.
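
To see what the pooled two-proportion test in item (3) computes, here is a minimal Python sketch of the textbook pooled z-test. It assumes a test difference of 0 and a not-equal alternative. The function name two_proportion_z_test and the sample counts in the example are hypothetical, and the sketch is not a description of MINITAB's internal routine.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled z-test of H0: p1 = p2 against Ha: p1 != p2.

    x1, x2 are the numbers of successes; n1, n2 are the sample sizes.
    Hypothetical helper for illustration only.
    """
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # pooled estimate of p under H0
    std_err = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided P-value
    return z, p_value


if __name__ == "__main__":
    # Made-up counts: 45 successes in 100 trials versus 30 in 100.
    z, p_value = two_proportion_z_test(45, 100, 30, 100)
    print(f"z = {z:.3f}, P-value = {p_value:.4f}")
```

For a one-sided alternative, the P-value would instead be the single tail area beyond z in the direction stated by H_a.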

Correlation and Linear Regression
(1) To find the correlation between two variables, enter the data in columns, choose Stat > Basic Statistics > Correlation, and enter the variable names. To test whether ρ is significant, check the box for Display P-values.
(2) To find the regression line for correlated data, enter the data in columns. Choose Stat > Regression > Fitted Line Plot, choose the correct x and y variables, and make sure the regression type is Linear. A scatterplot with the regression line and the linear formula will be displayed.
(3) To find the coefficient of determination (r^2) or the standard error of estimate (S or s_e), use the method above to find the regression line from your data. The r^2 value will be displayed as R-Sq and the standard error as S.
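
The quantities reported by the fitted line plot can be reproduced by hand. The sketch below uses the standard least-squares formulas in plain Python. The function name fit_line, the dictionary keys, and the example data are hypothetical, and the code assumes at least three paired observations with non-constant x and y.

```python
from math import sqrt


def fit_line(x, y):
    """Least-squares line y = b0 + b1*x with r, r^2, and standard error S.

    Hypothetical helper for illustration; x and y are equal-length lists.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                 # slope
    b0 = y_bar - b1 * x_bar        # intercept
    r = sxy / sqrt(sxx * syy)      # correlation coefficient
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s = sqrt(sse / (n - 2))        # standard error of estimate
    return {"intercept": b0, "slope": b1, "r": r, "r_sq": r ** 2, "S": s}


if __name__ == "__main__":
    # Made-up paired data for illustration.
    result = fit_line([1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 7.8, 10.1])
    for name, value in result.items():
        print(f"{name}: {value:.4f}")
```

Note that r_sq here is a proportion between 0 and 1, whereas MINITAB reports R-Sq as a percentage.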