Getting your data into R



1 Getting your data into R
How to load and export data
Dr Simon R. White, MRC Biostatistics Unit, Cambridge, 2012

2 Part I Case study: Reading data into R and getting output from R

3 Data formats
Not just applicable to R

Some pitfalls
- Many programs store information in special files which can only be opened by that program; this can cause problems.
- Following a few simple guidelines can save a lot of time; proper data entry protocols make analysis easier.
- Zero, non-response and missing are different.

Un-learn these spreadsheet ideas
- Mixing raw data and functions applied to data
- Mixing data and how it is displayed (ever inserted an empty column to make a sheet look pretty?)
- Combining raw data and graphics (in a workbook)

4 Reading SAS, Minitab, SPSS, Stata
The foreign package allows you to read data files from other statistical programmes. First you must load the package using:
    > library(foreign)
- read.spss()   Read an SPSS Data File
- read.dta()    Read Stata Binary Files
- read.mtp()    Read a Minitab Portable Worksheet
- read.xport()  Read a SAS XPORT Format Library
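As a rough illustration of the foreign package, the sketch below writes a small data frame to a Stata file with write.dta() and reads it back with read.dta(). The data frame toy and its columns are made up for this example and are not part of the course data; the file is written to a temporary location.

```r
library(foreign)  # a recommended package, normally installed alongside R

# Hypothetical example data (not from the course files)
toy <- data.frame(id = 1:3, score = c(2.5, 3.1, 4.8))

# Write a Stata binary file to a temporary location, then read it back
dta.file <- tempfile(fileext = ".dta")
write.dta(toy, dta.file)
toy.back <- read.dta(dta.file)

# Inspect what came back, as you would for any imported file
str(toy.back)
```

The same pattern (read the file, then check the result with str()) applies to read.spss(), read.mtp() and read.xport().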

5 What about Microsoft Excel?
- You cannot (reliably) read Excel files into R directly.
- The simplest method is to save each sheet as a Comma Separated Value (CSV) file.

Beware the defaults
- Excel saves the formatted version of values (which is very unhelpful): if you have set a column to 2 decimal places, only 2 decimal places will be saved in the csv file.
- Functions saved as text, rather than their calculated value.

Note
Excel, and spreadsheets in general, are not the place for statistical analysis or even data storage: they combine data, functions and graphics all together into workbooks, when the separation of these components is to be preferred.
www.burns-stat.com/pages/tutor/spreadsheet_addiction.html (a good read on spreadsheets)

6 Reading csv files
read.csv("filename.csv")
Reads the contents of a file into a data frame. There are many options to this function, but of particular note:
- header: is the first line the column names? (TRUE/FALSE)
- na.strings: specifies strings to be interpreted as NA

read.table("filename")
General function for reading data from a file.

Note: read.csv() calls the read.table() function with predefined parameters.
Remember to assign the output of these functions to an object.
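To make the header and na.strings options concrete, here is a minimal sketch. The file contents, the column names and the use of -99 and "missing" as missing-value codes are all assumptions invented for this example.

```r
# Hypothetical csv file written to a temporary location
csv.file <- tempfile(fileext = ".csv")
writeLines(c("id,weight,smoker",
             "1,70.5,yes",
             "2,-99,no",
             "3,,missing"), csv.file)

# Defaults: the first line is the header; only "NA" and blank numeric fields become NA
default.read <- read.csv(csv.file)

# Tell R which strings stand for missing values in this (made-up) coding scheme
coded.read <- read.csv(csv.file, header = TRUE,
                       na.strings = c("NA", "", "-99", "missing"))

# Compare the number of missing values per column under each reading
colSums(is.na(default.read))
colSums(is.na(coded.read))
```

With the defaults, -99 is read as a genuine weight, which is exactly the zero versus missing confusion described earlier.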

7 Reading our first data into R (A)
Exporting from a spreadsheet
You should have the following directories on your machine:
    C:/RCourse/
        data-original/
        data-exported/
        data/
In the data-original directory are several spreadsheet files. Open the Centre Information.xls file and export it as a comma separated value file into the data-exported directory.
Note: we will be using LibreOffice, an open source alternative to Microsoft Office.
Note for Windows users: R uses / instead of \ in file paths.
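The note on file paths can be sketched as follows. The directory names come from the slide; the variable names course.dir, exported.dir and windows.style are chosen only for this illustration.

```r
# Build a path from its parts; file.path() joins them with "/"
course.dir <- "C:/RCourse"
exported.dir <- file.path(course.dir, "data-exported")
print(exported.dir)

# If you paste a Windows-style path, each backslash must be doubled inside an R string
windows.style <- "C:\\RCourse\\data-exported"

# Converting the backslashes to forward slashes gives the same path as above
gsub("\\", "/", windows.style, fixed = TRUE) == exported.dir
```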

8 Reading our first data into R (B)
Reading a csv file
    > Data.Dir <- "C:/RCourse/data"
    > Centre.Info <- read.csv( file.path(Data.Dir,"centre_information.csv") )
    > str(Centre.Info)
What has R done? Are the assumed defaults sensible?
    > Centre.Info.2 <- read.csv( file.path(Data.Dir,"centre_information.csv"), colClasses=c(NA,NA,NA,NA,"character") )
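One reason to override the defaults with colClasses is that R guesses a type for each column. The sketch below uses a made-up four-column file (the real centre file is not reproduced here) in which a code column has leading zeros; NA in colClasses means "let R guess this column".

```r
# Hypothetical stand-in for a centre information file
csv.file <- tempfile(fileext = ".csv")
writeLines(c("Centre,Region,Coverage,Code",
             "Alpha,North,1200,0042",
             "Beta,South,950,0107"), csv.file)

# Default: Code is guessed to be a number, so the leading zeros are lost
guessed <- read.csv(csv.file)
str(guessed$Code)

# Force the fourth column to be read as text; leave the others to R
kept <- read.csv(csv.file, colClasses = c(NA, NA, NA, "character"))
str(kept$Code)
```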

9 Sharing R objects
R can export data into a format for sharing with other programs:
- write.table()   General function for writing data to a file. See also write.csv()

If sharing with another R user, one can save R objects directly:
- save( obj, file="obj.rdata" )   Saves obj in a file that can be loaded into another R session
- save.image( file="session.rdata" )   Saves an entire R session: all objects that are returned by ls(). Can load() the file to resume a working session
- load( file="obj.rdata" )   Loads objects from a file
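A minimal round trip with save() and load() might look like this. The object obj is invented for the example and the file goes to a temporary location rather than a course directory.

```r
# Hypothetical object to share with another R user
obj <- data.frame(x = 1:3, y = c("a", "b", "c"))

# Save it, keep a copy for comparison, then remove it from the workspace
rdata.file <- tempfile(fileext = ".RData")
save(obj, file = rdata.file)
original <- obj
rm(obj)

# load() restores the object under its original name
# and returns the names of everything it restored
loaded.names <- load(rdata.file)
print(loaded.names)
all.equal(obj, original)
```

Note that load() does not need to be assigned to recreate obj; the assignment here only captures the names of the restored objects.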

10 Getting data out of R
    > write.csv( Centre.Info )
    > write.csv( Centre.Info, file=file.path( Data.Dir, "Out_1.csv" ) )
What has R done? Are the assumed defaults sensible?
    > write.csv( Centre.Info, file=file.path( Data.Dir, "Out_1.csv" ), row.names=FALSE )
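The effect of row.names can be seen by writing the same data frame twice and reading both files back. The data frame toy below is a made-up stand-in for Centre.Info, and both files are temporary.

```r
# Hypothetical stand-in for Centre.Info
toy <- data.frame(Centre = c("Alpha", "Beta"), Coverage = c(1200, 950))

with.rows <- tempfile(fileext = ".csv")
without.rows <- tempfile(fileext = ".csv")

# Default: row names are written as an extra, unnamed first column
write.csv(toy, file = with.rows)

# row.names=FALSE writes only the data columns
write.csv(toy, file = without.rows, row.names = FALSE)

# Reading both back shows the extra column created by the default
names(read.csv(with.rows))
names(read.csv(without.rows))
```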

11 Getting output into a paper
    > Tab.1 <- aggregate( Coverage ~ Region, Centre.Info, sum )
    > write.csv( Tab.1 )
Importing a table into a paper
- Copy and paste content from the R window into the document
- Under the Table menu, select Convert Text to Table
- Adjust the table within the document
Could we make life easier?
    > write.csv( Tab.1, quote=FALSE, row.names=FALSE )
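The summary-table step can be tried without the course data. In this sketch the data frame centres and its values are invented; only the column names Region and Coverage follow the slide.

```r
# Hypothetical centre data with two centres per region
centres <- data.frame(
  Region   = c("North", "North", "South", "South"),
  Coverage = c(100, 250, 80, 120)
)

# Total coverage within each region
tab <- aggregate(Coverage ~ Region, centres, sum)

# With no file argument the csv text is printed to the console, ready to copy;
# quote=FALSE and row.names=FALSE leave less to tidy up in the document
write.csv(tab, quote = FALSE, row.names = FALSE)
```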

12 When a comma just will not work...
Extra tricks
    > write.table( Tab.1, quote=FALSE, row.names=FALSE, sep="\t" )
Change the order of rows and/or columns
    > write.table( Tab.1[order(Tab.1$Coverage),], quote=FALSE, row.names=FALSE, sep="\t" )
    > write.table( Tab.1[,c(2,1)], quote=FALSE, row.names=FALSE, sep=" " )
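The reordering tricks can be sketched on a small table. The values in tab are made up; by.coverage and swapped are names introduced only for this example.

```r
# Hypothetical summary table in the shape of Tab.1
tab <- data.frame(Region = c("North", "South", "East"),
                  Coverage = c(350, 200, 275))

# order() returns the row positions that sort Coverage from smallest to largest
by.coverage <- tab[order(tab$Coverage), ]
write.table(by.coverage, quote = FALSE, row.names = FALSE, sep = "\t")

# Column positions in c(2, 1) put Coverage before Region
swapped <- tab[, c(2, 1)]
write.table(swapped, quote = FALSE, row.names = FALSE, sep = "\t")
```

Tab-separated output is often easier to paste into a word processor when the values themselves may contain commas.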