R Commander Tutorial




Introduction

R is a powerful, freely available software package for analyzing and graphing data. However, for somebody who does not frequently use statistical software packages, the big drawback of R is that it is command-line based and thus not very intuitive to use. For such users, R Commander might be a good alternative. R Commander is a software package that allows running R from a graphical user interface. This makes analyzing and graphing your data in R a lot easier.

Objective

The objective of this tutorial is to give you a basic introduction to R Commander and how to use it to run basic statistics and create graphs.

1. Start the R Commander

Open R by either clicking on the R icon on your desktop or by navigating to R in your programs folder. Once you have opened R, go to Packages/Load Packages on the R menu bar and find Rcmdr in the R packages list (R packages are similar to software programs that have been written by different contributors for R). Highlight Rcmdr by clicking on it and click OK. R might give you a warning message. If so, just ignore it and click No. The R Commander console should now appear on your screen and you are ready to run some statistics and make some graphs in R.
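The menu steps above can also be done from the R Console. The following is a minimal sketch, assuming that the computer can reach CRAN if Rcmdr still has to be installed; the variable name rcmdr_available is only illustrative.

```r
# Check whether the Rcmdr package is already installed (TRUE or FALSE).
rcmdr_available <- "Rcmdr" %in% rownames(installed.packages())

# Loading Rcmdr opens the R Commander window, so only do this in an
# interactive R session.
if (interactive()) {
  if (!rcmdr_available) {
    install.packages("Rcmdr")  # one-time download from CRAN
  }
  library(Rcmdr)
}
```

If you close the R Commander window while R itself stays open, typing Commander() in the R Console should bring it back.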

2. Reading your data into R

After you come back from the field, your notebook shows the following data recordings:

Now you want to create a digital copy of your data. To do this, start your computer, type the data table into Notepad or another text editor of your choice, and save the data table on your hard drive (Important: Data have to be separated by commas as shown below). Make sure you remember where you save it so you can navigate to the data set later on.
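For orientation, a comma-separated file of this kind might look like the sketch below. The numbers are invented for illustration and are not the field data from the notebook; the column names follow the variable names used later in this tutorial. The sketch writes the lines to a temporary file and reads them back as a quick check that R can parse them.

```r
# Hypothetical field data: the values are invented for illustration.
csv_lines <- c(
  "location,cover,soil.moisture",
  "1,10,5",
  "2,25,9",
  "3,40,14",
  "4,60,18",
  "5,80,26"
)

# Write the lines to a temporary file, as a text editor would save them.
data_file <- tempfile(fileext = ".txt")
writeLines(csv_lines, data_file)

# Read the file back: first row is the header, fields are comma separated.
cover_moisture <- read.table(data_file, header = TRUE, sep = ",")
str(cover_moisture)
```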

On the R Commander menu bar, go to Data/Import data and select "from text file, clipboard, or URL", which should bring up the window below. Make the same selections as shown in the window below (e.g. name your data set cover_moisture and select Commas as your field separator, since we separated our data by commas when we entered them into our text editor earlier). Click OK and a window appears that allows you to navigate to your data file. Once you have navigated to your data file, highlight it by clicking on it and click Open. You can now view your data by clicking on View data set on the R Commander menu bar.

You can also enter your data directly into R by selecting Data from the R Commander menu bar and clicking on New data set. This will bring up the following window. The Data Editor window then appears, which allows you to enter your data directly into R. By clicking on a column header, you can change the variable name of each column (e.g. change var1 to location, var2 to cover, and var3 to soil.moisture). The variable editor also allows you to select the type of the variables you are entering. Since you are entering numeric values, select numeric under variable type. Type in your data as shown below.

3. Summary statistics

To get some summary statistics of your data, go to Statistics/Summaries and select Numerical summaries. Now you should see the following window:

Pick cover and soil.moisture (Note: to select more than one variable you have to hold down the Ctrl key) and click OK. A summary table will appear that shows the mean, standard deviation, and the 0, 0.25, 0.50, 0.75, and 1 quantiles of the cover and soil.moisture data.

4. Scatterplot

To see if there is a relationship between cover and soil moisture, it might be a good idea to look at a scatterplot of the data. To create a scatterplot, go to Graphs on the R Commander menu bar and select Scatterplot. This will bring up a table. Select cover as your x-variable and soil.moisture as your y-variable. Label your x- and y-axis Cover (%) and Soil Moisture (%), respectively. Next, click OK and a scatterplot will appear (Important: Make sure you highlight the R Console by clicking on it to be able to see the scatterplot).

You can save the scatterplot (or any other plot you create) by clicking on the plot (Important: if you do not select the plot you won't be able to save it), going to File/Save as/jpeg on the R menu bar (Note: the R menu bar and not the R Commander menu bar), and clicking on 100% quality. This will bring up a window that allows you to specify the location on your computer where you want to save the plot as a JPEG image.

5. Fitting a linear regression model

The scatterplot above shows us that there is a positive relationship between soil moisture and cover. However, the scatterplot does not tell us how strong the relationship is, whether the relationship is significant, etc. To get this information we have to fit a linear regression model. To fit a linear regression model, go to Statistics/Fit models on the R Commander menu bar and select Linear model. Select soil.moisture as your response variable (aka y-variable or dependent variable) and cover as your explanatory variable (aka x-variable or independent variable) and click OK.
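For readers curious about what happens behind the menus, sections 3 to 5 can be sketched in plain R as follows. This is only a sketch: the data values are hypothetical stand-ins for the field data, and the commands R Commander actually writes to its script window may differ in detail.

```r
# Hypothetical data standing in for the cover_moisture data set.
cover_moisture <- data.frame(
  location      = 1:5,
  cover         = c(10, 25, 40, 60, 80),
  soil.moisture = c(5, 9, 14, 18, 26)
)

# Section 3: mean, standard deviation, and quantiles for each variable.
vars <- cover_moisture[, c("cover", "soil.moisture")]
print(sapply(vars, mean))
print(sapply(vars, sd))
print(sapply(vars, quantile, probs = c(0, 0.25, 0.5, 0.75, 1)))

# Section 4: scatterplot, written straight to a JPEG file at 100% quality.
# The guard skips this step if this R build cannot write JPEG files.
plot_file <- file.path(tempdir(), "scatterplot.jpg")
if (capabilities("jpeg")) {
  jpeg(plot_file, quality = 100)
  plot(soil.moisture ~ cover, data = cover_moisture,
       xlab = "Cover (%)", ylab = "Soil Moisture (%)")
  dev.off()
}

# Section 5: linear regression with soil moisture as the response
# and cover as the explanatory variable.
fit <- lm(soil.moisture ~ cover, data = cover_moisture)
print(summary(fit))
```

The formula soil.moisture ~ cover reads "soil moisture as a function of cover", which is the same response/explanatory choice made in the Linear model window.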

The following output will appear in the Output Window of the R Commander:

We will talk in class about how to interpret the output table (e.g. what those numbers mean). To check the basic model diagnostics for the linear model you just fit, go to Models/Graphs on the R Commander menu bar and select Basic diagnostic plots. This brings up the following window (we will discuss in class how to interpret the model diagnostic plots):
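The diagnostic plots can also be drawn from the R Console by calling plot() on a fitted model. The sketch below uses hypothetical data values and an illustrative model name (fit); it arranges the four standard diagnostic plots on one page.

```r
# Hypothetical data; the values are invented for illustration.
cover_moisture <- data.frame(
  cover         = c(10, 20, 30, 40, 50, 60, 70, 80),
  soil.moisture = c(5, 8, 10, 15, 16, 19, 24, 25)
)

# Fit the simple linear regression from section 5.
fit <- lm(soil.moisture ~ cover, data = cover_moisture)

# Draw the four basic diagnostic plots in a 2 x 2 layout,
# then restore the previous graphics settings.
old_par <- par(mfrow = c(2, 2))
plot(fit)
par(old_par)
```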

6. Fitting multiple regression models

In this part of the tutorial you learn how to fit a multiple regression model. Your hypothesis is that air temperature, solar radiation, and wind speed are significant predictors of ozone. To test this hypothesis, you collected the data called airquality.txt that are available in the class Dropbox folder (C:\...\Dropbox\Jan Teaching Files\CSS 560\Data\R Commander\airquality.txt) (Note: The data were taken from Dalgaard, 2002). Let's import the data into R Commander and call the data set airquality (if you can't remember how to import data, please refer to section 2 of this tutorial). Let's take a look at the data to familiarize ourselves with them by selecting airquality from the Data set dropdown menu.

Next, let's plot the relationships between the different variables in the data set. To do this, make the R Console active by clicking on it and type the following command at the R Console command line prompt: pairs(airquality).
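If the class file is not at hand, the same kind of plot can be tried with the airquality data set that ships with R. That the class file matches this built-in data set is an assumption; the built-in version does contain the Ozone, Solar.R, Wind, and Temp columns used in this section.

```r
# Load R's built-in airquality data set (assumed to match the class file).
data(airquality)
print(head(airquality))

# Keep only the four variables discussed in this section and
# draw a scatterplot for every pair of them.
ozone_vars <- airquality[, c("Ozone", "Solar.R", "Wind", "Temp")]
pairs(ozone_vars)
```

Restricting the plot to four columns leaves out the Month and Day columns of the built-in data set, which keeps the panels larger and easier to read.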

Now you should see the following figure:

This is how you read the figure:

It looks like there is some sort of relationship between ozone and temperature and between ozone and wind. However, there seems to be no relationship between ozone and solar radiation.

OK - let's now fit a multiple regression model to test if solar radiation, wind, and temperature are significant predictors of ozone. To fit a multiple regression model let's go to Statistics/Fit models... on the R Commander menu bar and select Linear model.... A window appears that should be somewhat familiar to you from section 5 of this tutorial. The model you want to fit basically says that ozone is a function of solar radiation, air temperature, and wind. Mathematically, we can write this model as follows:

Ozone ~ Solar.R + Temp + Wind    [1]

After typing model [1] in the appropriate section of the linear model window (see above), click OK. You should now see the following output:
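Model [1] can also be fitted directly in the R Console with lm(). The sketch below uses R's built-in airquality data set, on the assumption that it matches the class file; fit_ozone is an illustrative name.

```r
# Fit model [1]: ozone as a function of solar radiation,
# temperature, and wind. Rows with missing values are dropped
# automatically by lm().
fit_ozone <- lm(Ozone ~ Solar.R + Temp + Wind, data = airquality)

# Coefficient table with standard errors, t values, and p-values.
print(summary(fit_ozone))
```

The built-in data set has missing ozone and solar radiation readings, so the model is fitted on fewer rows than the data set contains; the summary output reports how many observations were deleted.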

Let's also take a look at the model diagnostics:

We will discuss the interpretation of the model output as well as the interpretation of the model diagnostics in more detail in class.

7. Paired t-test

Next, we will conduct a paired t-test to see if there is a statistically significant difference in soil moisture before and after a rain event. The data for the paired t-test are in the class Dropbox folder (C:\Users\Jan\Dropbox\Jan Teaching Files\CSS 560\Data\R Commander\paired_t_test.txt). Import the data into R by following the steps you learned at the beginning of this tutorial and name the data set soil_moisture (Hint: Open the paired_t_test.txt file in a text editor. You will see that it is a tab-delimited file and not a comma-delimited file. You need that information to properly import the data into R).

Before conducting a paired t-test (or any other t-test) it might be a good idea to look at a boxplot of the data first. To do this you have to stack your data first (you are just rearranging the data so they are in a format that can be used by the computer to create a boxplot of your data) by going to Data/Active data set on the R Commander menu bar and clicking on Stack variables in active data set.

You should now see the Stack Variables window shown below. Select both the soil.moisture.after and soil.moisture.before variables and name the stacked data set stacked_soil_moisture. Keep the rest of the default settings as shown below and click OK.

Next, go to Graphs/Boxplots on the R Commander menu bar. In the window that pops up select Plot by groups, group your variables by factor, and click OK. Now you should see the following boxplot:

Based on the boxplot, do you think the soil moisture changed significantly after the rain event?

After visually looking at the data we are ready to run a paired t-test. To do this, let's go back to our original, unstacked data set by going to Data set on the R Commander menu bar and selecting soil_moisture. Click OK.
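The stacking and boxplot steps described above can be sketched in the R Console as follows. The soil moisture values are hypothetical; the column names variable and factor follow the names used elsewhere in this tutorial for stacked data.

```r
# Hypothetical paired measurements; the values are invented for illustration.
soil_moisture <- data.frame(
  soil.moisture.before = c(12, 15, 11, 18, 14, 16),
  soil.moisture.after  = c(19, 22, 17, 25, 20, 24)
)

# Stack the two columns into one long column of values plus a
# grouping column that records which original column each value came from.
stacked_soil_moisture <- stack(
  soil_moisture[, c("soil.moisture.after", "soil.moisture.before")]
)
names(stacked_soil_moisture) <- c("variable", "factor")

# One box per group: measurements after and before the rain event.
boxplot(variable ~ factor, data = stacked_soil_moisture,
        ylab = "variable", xlab = "factor")
```

Stacking turns a wide table (one column per condition) into a long table (one row per measurement), which is the layout boxplot() expects when plotting by groups.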

Next, go to Statistics/Means on the R Commander menu bar and select Paired t-test. Select soil.moisture.before as your first variable and soil.moisture.after as your second variable. Keep the rest at the default settings as shown below. After clicking OK you should get the following output. We will discuss in class how to interpret the output.

8. Two-sample t-test

In this section of the tutorial we will learn how to conduct a two-sample t-test. We want to test the following hypothesis: soil pH of the non-treated stand in the Ponderosa State Park is statistically significantly different from the soil pH in the treated part of the Park. The hypothetical data that were collected are available in the class Dropbox folder (C:\...\Dropbox\Jan Teaching Files\CSS 560\Data\R Commander\ph.txt). Let's import the data into the R Commander and create a boxplot of the data as we learned in section 7 of this tutorial (remember: you first have to stack the data in order to create the boxplot below. For more details please refer to section 7 of this tutorial).

OK - it looks like the soil pH in the non-treated part of the forest is lower than in the treated part. Let's now do a two-sample t-test to see if the soil pH values are statistically significantly different from each other. To do this, keep your stacked ph data set active, go to the R Commander menu bar, select Statistics/Means, and select Independent samples t-test... (if the Independent samples t-test... option is greyed out, make sure that i) you stacked the ph data set and ii) the stacked ph data set is the active data set).
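Both the paired test from section 7 and the two-sample test selected here correspond to R's t.test() function. The sketch below uses hypothetical values throughout, and the group labels non.treated and treated are placeholders, since the actual column names in ph.txt are not shown in this tutorial.

```r
# Section 7: paired t-test on hypothetical before/after measurements.
soil_moisture <- data.frame(
  soil.moisture.before = c(12, 15, 11, 18, 14, 16),
  soil.moisture.after  = c(19, 22, 17, 25, 20, 24)
)
paired_result <- with(
  soil_moisture,
  t.test(soil.moisture.before, soil.moisture.after, paired = TRUE)
)
print(paired_result)

# Section 8: two-sample t-test on hypothetical stacked pH data.
ph_stacked <- data.frame(
  variable = c(5.1, 5.3, 4.9, 5.0, 5.2, 5.8, 6.0, 5.7, 6.1, 5.9),
  factor   = factor(rep(c("non.treated", "treated"), each = 5))
)
# With default settings t.test() does not assume equal variances
# (Welch's t-test).
two_sample_result <- t.test(variable ~ factor, data = ph_stacked)
print(two_sample_result)
```

The paired test works on the unstacked data because it needs to know which before and after values belong together, whereas the two-sample test works on the stacked data because it only needs a value column and a grouping column.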

The window that now appears should look similar to the one below:

Keep the default settings and click OK. Now you should see the following output:

We will discuss in class how to interpret the output.

9. Customize your graphs

If you want to customize your figures, you have to do a little bit of programming. For example, the boxplot you created in section 8 of this tutorial is associated with the following line of code in your R Commander script window:

    boxplot(variable ~ factor, ylab = "ph", xlab = "factor", data = ph_stacked)

We can now modify this line of code to make the boxplot a little nicer. For example, we could type the following into the R Console:

    boxplot(variable ~ factor, ylab = "Soil ph", xlab = "", names = c("treated Forest", "Untreated Forest"), data = ph_stacked)

If you type the code above into the R Console and hit Enter, you should see the following boxplot:

It becomes clear that you need some R programming experience and knowledge to change the appearance of the figure beyond what the R Commander allows you to do. If you want to learn more about how to program in R, the R website (http://www.r-project.org/) is a good starting point, as is Peter Dalgaard's book "Introductory Statistics in R".

10. Closing R Commander and R

To close the R Commander and R, go to File/Exit and select From Commander and R.

Next, the R Commander will ask you if you want to exit the program. Click OK. Next it will ask you if you want to save the script file and the output file. Click No in both cases.

Congratulations - you successfully finished the R Commander tutorial.

Other resources

Getting started with the R Commander. You can find a PDF of this tutorial on our class website (http://ecosensing.org/teaching/css-560/digital-library/tutorials). If you want to learn more about the R Commander, I recommend working through this tutorial.

Literature cited

Dalgaard, Peter. 2002. Introductory Statistics in R. Springer Science and Business Media, Inc.

Important: If you used a MOSS computer for this tutorial, please make sure you delete all the files you created from the computer after you are done with the tutorial. Thanks!

Disclaimer

Always consult a trained statistician to validate the correctness of the statistical approach you are taking. Please e-mail any suggestions on how to improve this document to Jan Eitel (jeitel@uidaho.edu). Use of trade names does not constitute an official endorsement by the McCall Outdoor Science School.