R data import and export



Similar documents
Reading and writing files

Importing Data into R

Package RODBC. R topics documented: June 29, 2015

Dataframes. Lecture 8. Nicholas Christian BIOST 2094 Spring 2011

Data Storage STAT 133. Gaston Sanchez. Department of Statistics, UC Berkeley

Chapter 2 Getting Data into R

Getting your data into R

Using R for Windows and Macintosh

R Data Import/Export. Version ( ) R Development Core Team

The Saves Package. an approximate benchmark of performance issues while loading datasets. Gergely Daróczi

rm(list=ls()) library(sqldf) system.time({large = read.csv.sql("large.csv")}) # seconds, 4.23GB of memory used by R

Installing R and the psych package

SPSS for Windows importing and exporting data

How To Write A File System On A Microsoft Office (Windows) (Windows 2.3) (For Windows 2) (Minorode) (Orchestra) (Powerpoint) (Xls) (

NCSS Statistical Software

R FOR SAS AND SPSS USERS. Bob Muenchen

Tutorial 2: Reading and Manipulating Files Jason Pienaar and Tom Miller

Your Best Next Business Solution Big Data In R 24/3/2010

Getting Data from the Web with R

Resources You can find more resources for Sync & Save at our support site:

R Data Import/Export. Version ( ) R Development Core Team

R Tools Evaluation. A review by Global BI / Local & Regional Capabilities. Telefónica CCDO May 2015

A Handbook of Statistical Analyses Using R. Brian S. Everitt and Torsten Hothorn

Quick Introduction System Requirements Main features Getting Started Connecting to Active Directory... 4

Aspose.Cells Product Family

Jet Data Manager 2012 User Guide

Oracle Essbase Integration Services. Readme. Release

Introduction to RStudio

Create a New Database in Access 2010

Package sjdbc. R topics documented: February 20, 2015

Introduction to R software for statistical computing

Guidelines for Data Collection & Data Entry

Introduction to R June 2006

Hal E-Bank Foreign payments (Format of export/import files)

Participant Guide RP301: Ad Hoc Business Intelligence Reporting

Search help. More on Office.com: images templates

Ad Hoc Reporting: Data Export

Methodologies for Converting Microsoft Excel Spreadsheets to SAS datasets

EXCEL IMPORT user guide

Connect to MySQL or Microsoft SQL Server using R

CPM release notes

How to Link Tables Using. SQL Named Parameters

Navicat Premium è uno strumento di amministrazione per database con connessioni-multiple, consente di connettersi

Banana is a native application for Windows, Linux and Mac and includes functions that allow the user to manage different types of accounting files:

HOW-TO. Access Data using BCI. Brian Leach Consulting Limited.

Microsoft Office. Mail Merge in Microsoft Word

A Guide to Stat/Transfer File Transfer Utility, Version 10

Creating CSV Files. Recording Data in Spreadsheets

Package hoarder. June 30, 2015

Import Clients. Importing Clients Into STX

How To Create A Report In Excel

Package HadoopStreaming

Data Migration between Document-Oriented and Relational Databases

Package RSQLite. February 19, 2015

Package retrosheet. April 13, 2015

Exchanger XML Editor - Data Import

Data Management for Large Studies Robert R. Kelley, PhD. Thursday, September 27, 2012

Working with Financial Time Series Data in R

from Microsoft Office

Using R and the psych package to find ω

Import Data to Excel Start Excel.

Deal with big data in R using bigmemory package

DiskPulse DISK CHANGE MONITOR

Toad for Data Analysts, Tips n Tricks

Results CRM 2012 User Manual

Importing Data from a Dat or Text File into SPSS

Introduction to PASW Statistics

Vendor: Crystal Decisions Product: Crystal Reports and Crystal Enterprise

Setting Up ALERE with Client/Server Data

Importing and Exporting

E-FILE. Universal Service Administrative Company (USAC) Last Updated: September 2015

Echo360 Voluntary Product Accessibility Template

Time Clock Import Setup & Use

Apparo Fast Edit. Excel data import via 1 / 19

Migrating MS Access Database to MySQL database: A Process Documentation

Polynomial Neural Network Discovery Client User Guide

Table of Contents. Introduction: 2. Settings: 6. Archive 9. Search Browse Schedule Archiving: 18

Customer Support Tool. User s Manual XE-A207 XE-A23S. Before reading this file, please read Instruction Manual of XE-A207 and XE-A23S.

Generating Open For Business Reports with the BIRT RCP Designer

Importing Data into SAS

Chapter 2 Introduction to SPSS

A Short Guide to R with RStudio

Using and creating Crosstabs in Crystal Reports Juri Urbainczyk

Microsoft Access Rollup Procedure for Microsoft Office Click on Blank Database and name it something appropriate.

Coupling Microsoft Excel with NI Requirements Gateway

Customized Excel Output Using the Excel Libname Harry Droogendyk, Stratia Consulting Inc., Lynden, ON

EXCEL XML SPREADSHEET TUTORIAL

FEDEX DESKTOP CUSTOMER TOOLS USER GUIDE

Zoran Stankovic Bursa, 14 November 2014

Hands-on Exercise Using DataFerrett American Community Survey Public Use Microdata Sample (PUMS)

Package fimport. February 19, 2015

Instructions for Creating an Outlook Distribution List from an Excel File

Creating Online Surveys with Qualtrics Survey Tool

Transcription:

R data import and export A. Blejec andrej.blejec@nib.si October 27, 2013

Read R can read... plain text tables and files Excel files SPSS, SAS, Stata formated data databases (like MySQL) XML, HTML files

Write R can write... plain text tables and files Excel files - use tab delimited plain text SPSS, SAS, Stata formated data databases (like MySQL) XML, HTML files

Reading texts read.table() read.delim()... readlines() readline() read.clipboard() Read data formatted as a table read.table() variants Read lines from a file Reads a line from the Console Read text from clipboard

Reading text data Main function to read text data is read.table(). Data table: gender age weight height F 18 68.0 1.720 M 17 68.2 1.753 M 17 68.4 1.730 F 18 65.6 1.743 M 17 65.2 1.765 File: bmi.txt

Column names: header > data <- read.table("../data/bmi.txt",header=true) > head(data) gender age weight height 1 F 18 68.0 1.720 2 M 17 68.2 1.753 3 M 17 68.4 1.730 4 F 18 65.6 1.743 5 M 17 65.2 1.765 6 F 18 64.6 1.690

Other arguments > if(interactive()) +?read.table read.table(file, header = FALSE, sep = "", quote = "\"'", dec = ".", row.names, col.names, as.is =!stringsasfactors, na.strings = "NA", colclasses = NA, nrows = -1, skip = 0, check.names = TRUE, fill =!blank.line strip.white = FALSE, blank.lines.skip = TRUE, comment.char = "#", allowescapes = FALSE, flush = FALSE, stringsasfactors = default.stringsasfactors(), fileencoding = "", encoding = "unknown")

read.table() variants read.csv read.csv2 (file, header = TRUE, sep = ",", quote="\"", dec=".", fill = TRUE, comment.char="",...) (file, header = TRUE, sep = ";", quote="\"", dec=",", fill = TRUE, comment.char="",...) read.delim (file, header = TRUE, sep = "\t", quote="\"", dec=".", fill = TRUE, comment.char="",...) read.delim2(file, header = TRUE, sep = "\t", quote="\"", dec=",", fill = TRUE, comment.char="",...)

What next? If you want to use variables by names, attach your data! > dim(data) # dimension: n rows and columns [1] 45 4 > names(data) # variable names [1] "gender" "age" "weight" "height" > try(mean(weight)) # variables are not available yet > attach(data) # make variables usable > mean(weight) # variables are available [1] 61.00889 > table(gender,age) age gender 17 18 F 9 16 M 9 11

Reading files line by line > txt <- readlines("../data/bmicom.txt") > str(txt) chr [1:49] "# Data for BMI calculation\t\t\t\t"... # Data for BMI calculation # weight in kg # height in m gender age weight height F 18 1.720 # weight not known M 17 68.2 1.753 M 17 68.4 1.730 F 18 65.6 1.743 M 17 65.2 1.765 F 18 64.6 1.690

read.table() can skip comment lines > data <- read.table("../data/bmicom.txt",header=true,se > head(data) gender age weight height X 1 F 18 NA 1.720 NA 2 M 17 68.2 1.753 NA 3 M 17 68.4 1.730 NA 4 F 18 65.6 1.743 NA 5 M 17 65.2 1.765 NA 6 F 18 64.6 1.690 NA

Read lines from console/terminal > name <- "unknown" > if(interactive()) {name <- readline("who are you?") + cat("hello",name,"!") + }

Reading text from clipboard First four lines from file File: bmi.txt were copied to clipboard: > read.clipboard<- + function (header = TRUE, sep = "\t",...) { + read.table (file = "clipboard", header = header, s + } > if(interactive()) { + data <- read.clipboard(sep="\t") + data + } gender age weight height X 1 F 18 68.0 1.720 NA 2 M 17 68.2 1.753 NA 3 M 17 68.4 1.730 NA Empty cells are marked as NA

Reading Excel files function readclipboard() and read.table() variant package xlsreadwrite package RODBC (R Open DataBase Connectivity) package xlsx provides reading and writing of.xlsx files

read.clipboard Many times the fastest way to get data on the fly Grab first four lines from the file http://ablejec.nib.si/pub/i2r/dat/bmi.xls bmi.xls > if(interactive()) { + data <- read.clipboard() + data + } gender age weight height 1 F 18 68.0 1.720 2 M 17 68.2 1.753 3 M 17 68.4 1.730 Empty cells are marked as NA

xlsreadwrite Package xlsreadwrite provides function read.xls() to read.xls files. Package xlsx provides function read.xlsx() to read.xlsx files.

Package xlsx Reference by number > library(xlsx) > X <- read.xlsx("../data/bmi.xls",1) > head(x) gender age weight height 1 F 18 68.0 1.720 2 M 17 68.2 1.753 3 M 17 68.4 1.730 4 F 18 65.6 1.743 5 M 17 65.2 1.765 6 F 18 64.6 1.690 > #DataName <- latextranslate(paste(lfn,"/ Sheet",Sheet,

More on package xlsx Reference by name > lfn <- "../data/bmi.xls" > wb <- loadworkbook(lfn) > sheets <- getsheets(wb) > names(sheets)[1] [1] "bmi" > X <- read.xlsx(lfn,sheetname="bmi") > head(x) gender age weight height 1 F 18 68.0 1.720 2 M 17 68.2 1.753 3 M 17 68.4 1.730 4 F 18 65.6 1.743 5 M 17 65.2 1.765 6 F 18 64.6 1.690 > #DataName <- latextranslate(paste(lfn,"/ Sheet",Sheet,

Relacijske baze podatkov package RODBC ODBC Open Database Connectivity za R odbcconnect(dsn, uid =, pwd =,...) odbcconnectaccess(access.file, uid =, pwd =,...) odbcconnectaccess2007(access.file, uid =, pwd =,...) odbcconnectdbase(dbf.file,...) odbcconnectexcel(xls.file, readonly = TRUE,...) odbcconnectexcel2007(xls.file, readonly = TRUE,...)

sql* funkcije za dostop do baz podatkov [1] "sqlclear" "sqlcolumns" "sqlcopy" [4] "sqlcopytable" "sqldrop" "sqlfetch" [7] "sqlfetchmore" "sqlgetresults" "sqlprimarykeys" [10] "sqlquery" "sqlsave" "sqltables" [13] "sqltypeinfo" "sqlupdate"

Izvedeni paketi Za MySQL in Oracle sta razvita paketa: RMySQL ROracle

RODBC Note: works only for 32-bit Windows > library(rodbc) > read.xls <- function(file,..., sheet = "Sheet1", cch + comment = FALSE, rownames = TRUE, colnames = TRUE) { + z <- odbcconnectexcel(file) + myframe <- sqlfetch(z, sheet, rownames = rownames, + colnames =!colnames,...) + close(z) + if(rownames){ + rownames(myframe) <- myframe[, 1] + myframe <- myframe[, -1] + } + commentlines <- grep(paste("^", cch, sep = ""), rowna + if (!is.null(commentlines) & length(commentlines) > + 0 &!comment) + myframe <- myframe[-commentlines, ] + if (comment) + myframe <- rownames(myframe[commentlines, ])

read.xls > X <- read.xls("../data/bmi.xls",sheet="bmi",rownames=f > #head(x)

Reading SPSS files You can use read.spss() from package foreign > library(foreign) # you have to install it first! > X<-read.spss("../data/bmi.sav") Result X is a list with VARLABELS and VALUELABELS as attributes: > attr(x,"variable.labels") gender age "Gender" "Age at measurement" weight height "Weight (kg)" "Height (m)" You can convert the list to a data frame: > Y <- as.data.frame(x)

Reading SPSS files You can also use spss.get() from package Hmisc > library(hmisc) # you have to install it first! > X<-spss.get("../data/bmi.sav") Which produces a labeled data frame > head(x) gender age weight height 1 Male 17 64.2 1.770 2 Male 17 74.8 1.705 3 Male 17 55.8 1.770 4 Male 17 68.4 1.730 5 Male 17 68.2 1.753 6 Male 17 88.0 1.910

Reading SPSS files Labels are preserved > label(x$weight) weight "Weight (kg)"

Choosing files interactively > if(interactive()) + file.choose() For more than one file use choose.files()

Write tab delimited tables > data <- head(x) > write.table(data,"../data/results.txt",sep="\t") > write.table(data,"",sep="\t") "gender" "age" "weight" "height" "1" "Male" 17 64.2 1.77 "2" "Male" 17 74.8 1.705 "3" "Male" 17 55.8 1.77 "4" "Male" 17 68.4 1.73 "5" "Male" 17 68.2 1.753 "6" "Male" 17 88 1.91

Write tab delimited tables > write.table(data,"../data/results2.txt",sep="\t",col.n > write.table(data,"",sep="\t",col.names = NA) "" "gender" "age" "weight" "height" "1" "Male" 17 64.2 1.77 "2" "Male" 17 74.8 1.705 "3" "Male" 17 55.8 1.77 "4" "Male" 17 68.4 1.73 "5" "Male" 17 68.2 1.753 "6" "Male" 17 88 1.91

Writing Excel tables Package xlsx > lfn <- "./results2.xls" > if(file.exists(lfn)) + file.remove(lfn) [1] TRUE > dir(pattern="*.xls") character(0) > write.xlsx(data,"results2.xls") > dir(pattern="*.xls") [1] "results2.xls"

More... R Help/Manuals (in PDF) / R Data Import/Export http://ablejec.nib.si/r/xlsreadwrite.pdf http://ablejec.nib.si/r/i2r/doc/i2r.pdf