R for Beginners. Emmanuel Paradis. Institut des Sciences de l'Évolution, Université Montpellier II, F-34095 Montpellier cédex 05, France


I thank Julien Claude, Christophe Declercq, Élodie Gazave, Friedrich Leisch, Louis Luangkesorn, François Pinard, and Mathieu Ros for their comments and suggestions on earlier versions of this document. I am also grateful to all the members of the R Development Core Team for their considerable efforts in developing R and animating the discussion list "rhelp". Thanks also to the R users whose questions or comments helped me to write "R for Beginners". Special thanks to Jorge Ahumada for the Spanish translation.

© 2002, 2005, Emmanuel Paradis (12th September 2005)

Permission is granted to make and distribute copies, either in part or in full and in any language, of this document on any support provided the above copyright notice is included in all copies. Permission is granted to translate this document, either in part or in full, in any language provided the above copyright notice is included.

Contents

1 Preamble
2 A few concepts before starting
    2.1 How R works
    2.2 Creating, listing and deleting the objects in memory
    2.3 The on-line help
3 Data with R
    3.1 Objects
    3.2 Reading data in a file
    3.3 Saving data
    3.4 Generating data
        3.4.1 Regular sequences
        3.4.2 Random sequences
    3.5 Manipulating objects
        3.5.1 Creating objects
        3.5.2 Converting objects
        3.5.3 Operators
        3.5.4 Accessing the values of an object: the indexing system
        3.5.5 Accessing the values of an object with names
        3.5.6 The data editor
        3.5.7 Arithmetics and simple functions
        3.5.8 Matrix computation
4 Graphics with R
    4.1 Managing graphics
        4.1.1 Opening several graphical devices
        4.1.2 Partitioning a graphic
    4.2 Graphical functions
    4.3 Low-level plotting commands
    4.4 Graphical parameters
    4.5 A practical example
    4.6 The grid and lattice packages
5 Statistical analyses with R
    5.1 A simple example of analysis of variance
    5.2 Formulae
    5.3 Generic functions
    5.4 Packages
6 Programming with R in practice
    6.1 Loops and vectorization
    6.2 Writing a program in R
    6.3 Writing your own functions
7 Literature on R

1 Preamble

The goal of the present document is to give a starting point for people newly interested in R. I chose to emphasize the understanding of how R works, with the aim of a beginner, rather than expert, use. Given that the possibilities offered by R are vast, it is useful for a beginner to get some notions and concepts in order to progress easily. I tried to simplify the explanations as much as I could to make them understandable by all, while giving useful details, sometimes with tables.

R is a system for statistical analyses and graphics created by Ross Ihaka and Robert Gentleman [1]. R is both a software and a language considered as a dialect of the S language created by the AT&T Bell Laboratories. S is available as the software S-PLUS commercialized by Insightful. There are important differences in the designs of R and of S: those who want to know more on this point can read the paper by Ihaka & Gentleman (1996) or the R-FAQ, a copy of which is also distributed with R.

[1] Ihaka R. & Gentleman R. 1996. R: a language for data analysis and graphics. Journal of Computational and Graphical Statistics 5.

R is freely distributed under the terms of the GNU General Public Licence; its development and distribution are carried out by several statisticians known as the R Development Core Team.

R is available in several forms: the sources (written mainly in C and some routines in Fortran), essentially for Unix and Linux machines, or some pre-compiled binaries for Windows, Linux, and Macintosh. The files needed to install R, either from the sources or from the pre-compiled binaries, are distributed from the internet site of the Comprehensive R Archive Network (CRAN) where the instructions for the installation are also available. Regarding the distributions of Linux (Debian, ...), the binaries are generally available for the most recent versions; look at the CRAN site if necessary.

R has many functions for statistical analyses and graphics; the latter are visualized immediately in their own window and can be saved in various formats (jpg, png, bmp, ps, pdf, emf, pictex, xfig; the available formats may depend on the operating system). The results from a statistical analysis are displayed on the screen; some intermediate results (P-values, regression coefficients, residuals, ...) can be saved, written in a file, or used in subsequent analyses.

The R language allows the user, for instance, to program loops to successively analyse several data sets. It is also possible to combine in a single program different statistical functions to perform more complex analyses. R users may benefit from a large number of programs written for S and available on the internet; most of these programs can be used directly with R.

At first, R could seem too complex for a non-specialist. This may not be true actually. In fact, a prominent feature of R is its flexibility. Whereas a classical software displays immediately the results of an analysis, R stores these results in an "object", so that an analysis can be done with no result displayed. The user may be surprised by this, but such a feature is very useful. Indeed, the user can extract only the part of the results which is of interest. For example, if one runs a series of 20 regressions and wants to compare the different regression coefficients, R can display only the estimated coefficients: thus the results may take a single line, whereas a classical software could well open 20 results windows. We will see other examples illustrating the flexibility of a system such as R compared to traditional softwares.
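The idea of running 20 regressions and displaying only their coefficients can be sketched as follows. This is only an illustration under stated assumptions: the data are simulated, and the names datasets, fits, and coefs, as well as the use of lapply() and sapply(), are choices made here and are not part of the text above.

```r
# Simulated data (an assumption for this sketch): 20 small data sets,
# each with a predictor x and a response y
set.seed(1)
datasets <- lapply(1:20, function(i) {
  x <- rnorm(30)
  data.frame(x = x, y = 2 + 0.5 * x + rnorm(30))
})

# Fit one linear regression per data set; nothing is displayed here,
# the 20 results are simply stored as objects in the list 'fits'
fits <- lapply(datasets, function(d) lm(y ~ x, data = d))

# Extract only the estimated coefficients:
# a matrix with 2 rows (intercept, slope) and 20 columns
coefs <- sapply(fits, coef)

# Display only the 20 estimated slopes
round(coefs["x", ], 2)
```

Because each fitted model is itself an object, the same list could later be queried for residuals or P-values without refitting anything.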

2 A few concepts before starting

Once R is installed on your computer, the software is executed by launching the corresponding executable. The prompt, by default ">", indicates that R is waiting for your commands. Under Windows using the program Rgui.exe, some commands (accessing the on-line help, opening files, ...) can be executed via the pull-down menus. At this stage, a new user is likely to wonder "What do I do now?" It is indeed very useful to have a few ideas on how R works when it is used for the first time, and this is what we will see now.

We shall see first briefly how R works. Then, I will describe the "assign" operator which allows creating objects, how to manage objects in memory, and finally how to use the on-line help which is very useful when running R.

2.1 How R works

The fact that R is a language may deter some users who think "I can't program". This should not be the case for two reasons. First, R is an interpreted language, not a compiled one, meaning that all commands typed on the keyboard are directly executed without requiring to build a complete program like in most computer languages (C, Fortran, Pascal, ...).

Second, R's syntax is very simple and intuitive. For instance, a linear regression can be done with the command lm(y ~ x) which means "fitting a linear model with y as response and x as predictor". In R, in order to be executed, a function always needs to be written with parentheses, even if there is nothing within them (e.g., ls()). If one just types the name of a function without parentheses, R will display the content of the function. In this document, the names of the functions are generally written with parentheses in order to distinguish them from other objects, unless the text indicates clearly so.

When R is running, variables, data, functions, results, etc, are stored in the active memory of the computer in the form of objects which have a name. The user can do actions on these objects with operators (arithmetic, logical, comparison, ...) and functions (which are themselves objects).

The use of operators is relatively intuitive; we will see the details later (p. 25). An R function may be sketched as follows:

    arguments, options -> function (default arguments) -> result

The arguments can be objects ("data", formulae, expressions, ...), some of which could be defined by default in the function; these default values may be modified by the user by specifying options. An R function may require no argument: either all arguments are defined by default (and their values can be modified with the options), or no argument has been defined in the function. We will see later in more details how to use and build functions (p. 67). The present description is sufficient for the moment to understand how R works.

All the actions of R are done on objects stored in the active memory of the computer: no temporary files are used (Fig. 1). The readings and writings of files are used for input and output of data and results (graphics, ...). The user executes the functions via some commands. The results are displayed directly on the screen, stored in an object, or written on the disk (particularly for graphics). Since the results are themselves objects, they can be considered as data and analysed as such. Data files can be read from the local disk or from a remote server through internet.

Figure 1: A schematic view of how R works.

The functions available to the user are stored in a library localised on the disk in a directory called R_HOME/library (R_HOME is the directory where R is installed). This directory contains packages of functions, which are themselves structured in directories. The package named base is in a way the core of R and contains the basic functions of the language, particularly, for reading and manipulating data. Each package has a directory called R with a file named like the package (for instance, for the package base, this is the file R_HOME/library/base/R/base). This file contains all the functions of the package.

One of the simplest commands is to type the name of an object to display its content. For instance, if an object n contains the value 10:

> n
[1] 10

The digit 1 within brackets indicates that the display starts at the first element of n. This command is an implicit use of the function print and the above example is similar to print(n) (in some situations, the function print must be used explicitly, such as within a function or a loop).

The name of an object must start with a letter (A-Z and a-z) and can include letters, digits (0-9), dots (.), and underscores (_). R discriminates between uppercase letters and lowercase ones in the names of the objects, so that x and X can name two distinct objects (even under Windows).

2.2 Creating, listing and deleting the objects in memory

An object can be created with the "assign" operator which is written as an arrow with a minus sign and a bracket; this symbol can be oriented left-to-right or the reverse:

> n <- 15
> n
[1] 15
> 5 -> n
> n
[1] 5
> x <- 1
> X <- 10
> x
[1] 1
> X
[1] 10

If the object already exists, its previous value is erased (the modification affects only the objects in the active memory, not the data on the disk). The value assigned this way may be the result of an operation and/or a function:

> n <- 10 + 2
> n
[1] 12
> n <- 3 + rnorm(1)
> n

The function rnorm(1) generates a normal random variate with mean zero and variance unity (p. 17). Note that you can simply type an expression without assigning its value to an object; the result is thus displayed on the screen but is not stored in memory:

> (10 + 2) * 5
[1] 60
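The points above on assignment, case sensitivity, and explicit printing can be gathered in one short script. It is a minimal sketch; the object names m and total are introduced here for illustration only.

```r
n <- 15            # right-to-left assignment
5 -> m             # left-to-right assignment (m is a name chosen for this sketch)
x <- 1; X <- 10    # x and X are two distinct objects

# Inside a loop the implicit display does not happen,
# so print() has to be called explicitly
for (i in 1:3) print(i * n)

# Store the result of an expression instead of only displaying it
total <- (10 + 2) * 5
total
```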

The assignment will be omitted in the examples if not necessary for understanding.

The function ls lists simply the objects in memory: only the names of the objects are displayed.

> name <- "Carmen"; n1 <- 10; n2 <- 100; m <- 0.5
> ls()
[1] "m"    "n1"   "n2"   "name"

Note the use of the semi-colon to separate distinct commands on the same line. If we want to list only the objects which contain a given character in their name, the option pattern (which can be abbreviated with pat) can be used:

> ls(pat = "m")
[1] "m"    "name"

To restrict the list to the objects whose names start with this character:

> ls(pat = "^m")
[1] "m"

The function ls.str displays some details on the objects in memory:

> ls.str()
m : num 0.5
n1 : num 10
n2 : num 100
name : chr "Carmen"

The option pattern can be used in the same way as with ls. Another useful option of ls.str is max.level which specifies the level of detail for the display of composite objects. By default, ls.str displays the details of all objects in memory, including the columns of data frames, matrices and lists, which can result in a very long display. We can avoid displaying all these details with the option max.level = -1:

> M <- data.frame(n1, n2, m)
> ls.str(pat = "M")
M : 'data.frame': 1 obs. of 3 variables:
 $ n1: num 10
 $ n2: num 100
 $ m : num 0.5
> ls.str(pat = "M", max.level = -1)
M : 'data.frame': 1 obs. of 3 variables:

To delete objects in memory, we use the function rm: rm(x) deletes the object x, rm(x,y) deletes both the objects x and y, rm(list=ls()) deletes all the objects in memory; the same options mentioned for the function ls() can then be used to delete selectively some objects: rm(list=ls(pat="^m")).
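Since rm(list=ls()) wipes the whole workspace, it may be safer to try the listing and deleting functions in a separate environment first. The sketch below does so; the scratch environment e and the use of new.env(), assign(), and the envir argument are assumptions of this example and do not appear in the text above.

```r
# A scratch environment, so that the reader's own objects are left untouched
e <- new.env()
assign("name", "Carmen", envir = e)
assign("n1", 10, envir = e)
assign("n2", 100, envir = e)
assign("m", 0.5, envir = e)

with_m   <- ls(envir = e, pattern = "m")    # names containing "m"
starts_m <- ls(envir = e, pattern = "^m")   # names starting with "m"

# Selective deletion: remove every object whose name starts with "n"
rm(list = ls(envir = e, pattern = "^n"), envir = e)
left <- ls(envir = e)                       # what remains in the environment
```

The same calls without the envir argument act on the active memory, exactly as in the examples of this section.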

2.3 The on-line help

The on-line help of R gives very useful information on how to use the functions. Help is available directly for a given function, for instance:

> ?lm

will display, within R, the help page for the function lm() (linear model). The commands help(lm) and help("lm") have the same effect. The last one must be used to access help with non-conventional characters:

> ?*
Error: syntax error
> help("*")
Arithmetic              package:base              R Documentation
Arithmetic Operators
...

Calling help opens a page (this depends on the operating system) with general information on the first line such as the name of the package where the documented function(s) or operators are found. Then comes a title followed by sections which give detailed information.

Description: brief description.
Usage: for a function, gives the name with all its arguments and the possible options (with the corresponding default values); for an operator gives the typical use.
Arguments: for a function, details each of its arguments.
Details: detailed description.
Value: if applicable, the type of object returned by the function or the operator.
See Also: other help pages close or similar to the present one.
Examples: some examples which can generally be executed without opening the help with the function example.

For beginners, it is good to look at the section Examples. Generally, it is useful to read carefully the section Arguments. Other sections may be encountered, such as Note, References or Author(s).

By default, the function help only searches in the packages which are loaded in memory. The option try.all.packages, whose default is FALSE, allows searching in all packages if its value is TRUE:

> help("bs")
No documentation for 'bs' in specified packages and libraries:
you could try 'help.search("bs")'
> help("bs", try.all.packages = TRUE)
Help for topic 'bs' is not in any loaded package but
can be found in the following packages:

  Package   Library
  splines   /usr/lib/R/library

Note that in this case the help page of the function bs is not displayed. The user can display help pages from a package not loaded in memory using the option package:

> help("bs", package = "splines")
bs              package:splines              R Documentation
B-Spline Basis for Polynomial Splines
Description:
Generate the B-spline basis matrix for a polynomial spline.
...

The help in html format (read, e.g., with Netscape) is called by typing:

> help.start()

A search with keywords is possible with this html help. The section See Also has here hypertext links to other function help pages. The search with keywords is also possible in R with the function help.search. The latter looks for a specified topic, given as a character string, in the help pages of all installed packages. For instance, help.search("tree") will display a list of the functions whose help pages mention "tree". Note that if some packages have been recently installed, it may be useful to refresh the database used by help.search using the option rebuild (e.g., help.search("tree", rebuild = TRUE)).

The function apropos finds all functions whose name contains the character string given as argument; only the packages loaded in memory are searched:

> apropos("help")
[1] "help"         ".helpForCall" "help.search"
[4] "help.start"

3 Data with R

3.1 Objects

We have seen that R works with objects which are, of course, characterized by their names and their content, but also by attributes which specify the kind of data represented by an object. In order to understand the usefulness of these attributes, consider a variable that takes the value 1, 2, or 3: such a variable could be an integer variable (for instance, the number of eggs in a nest), or the coding of a categorical variable (for instance, sex in some populations of crustaceans: male, female, or hermaphrodite).

It is clear that the statistical analysis of this variable will not be the same in both cases: with R, the attributes of the object give the necessary information. More technically, and more generally, the action of a function on an object depends on the attributes of the latter.

All objects have two intrinsic attributes: mode and length. The mode is the basic type of the elements of the object; there are four main modes: numeric, character, complex (the mode complex will not be discussed in this document), and logical (FALSE or TRUE). Other modes exist but they do not represent data, for instance function or expression. The length is the number of elements of the object. To display the mode and the length of an object, one can use the functions mode and length, respectively:

> x <- 1
> mode(x)
[1] "numeric"
> length(x)
[1] 1
> A <- "Gomphotherium"; compar <- TRUE; z <- 1i
> mode(A); mode(compar); mode(z)
[1] "character"
[1] "logical"
[1] "complex"

Whatever the mode, missing data are represented by NA (not available). A very large numeric value can be specified with an exponential notation:

> N <- 2.1e23
> N
[1] 2.1e+23

R correctly represents non-finite numeric values, such as ±∞ with Inf and -Inf, or values which are not numbers with NaN (not a number).

> x <- 5/0
> x
[1] Inf
> exp(x)
[1] Inf
> exp(-x)
[1] 0
> x - x
[1] NaN

A value of mode character is input with double quotes ". It is possible to include this latter character in the value if it follows a backslash \. The two characters altogether \" will be treated in a specific way by some functions such as cat for display on screen, or write.table to write on the disk (p. 14, the option qmethod of this function).

> x <- "Double quotes \" delimitate R's strings."
> x
[1] "Double quotes \" delimitate R's strings."
> cat(x)
Double quotes " delimitate R's strings.

Alternatively, variables of mode character can be delimited with single quotes ('); in this case it is not necessary to escape double quotes with backslashes (but single quotes must be!):

> x <- 'Double quotes " delimitate R\'s strings.'
> x
[1] "Double quotes \" delimitate R's strings."

The following table gives an overview of the type of objects representing data.

object       modes                                     several modes possible in the same object?
vector       numeric, character, complex or logical    No
factor       numeric or character                      No
array        numeric, character, complex or logical    No
matrix       numeric, character, complex or logical    No
data frame   numeric, character, complex or logical    Yes
ts           numeric, character, complex or logical    No
list         numeric, character, complex, logical,     Yes
             function, expression, ...
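The intrinsic attributes, the special values, and the two quoting styles described in this section can be checked directly. The following is a small sketch; the object names v, big, s1, and s2 are chosen here for illustration, and is.na(), is.finite(), is.nan(), and identical() are standard functions not discussed in the text above.

```r
v <- c(1, NA, 3)         # a numeric vector with one missing value
mode(v)                  # the mode is still "numeric"
length(v)                # the NA counts as an element
is.na(v)                 # flags the missing value

big <- 5 / 0             # Inf
is.finite(big)           # FALSE
is.nan(big - big)        # Inf - Inf is not a number

# The same string written with double quotes and with single quotes
s1 <- "Double quotes \" delimitate R's strings."
s2 <- 'Double quotes " delimitate R\'s strings.'
identical(s1, s2)        # both notations give the same value
cat(s1, "\n")            # cat() displays the string without the backslash
```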

A vector is a variable in the commonly admitted meaning. A factor is a categorical variable. An array is a table with k dimensions, a matrix being a particular case of array with k = 2. Note that the elements of an array or of a matrix are all of the same mode. A data frame is a table composed of one or several vectors and/or factors all of the same length but possibly of different modes. A "ts" is a time series data set and so contains additional attributes such as frequency and dates. Finally, a list can contain any type of object, including lists!

For a vector, its mode and length are sufficient to describe the data. For other objects, other information is necessary and it is given by non-intrinsic attributes. Among these attributes, we can cite dim which corresponds to the dimensions of an object. For example, a matrix with 2 lines and 2 columns has for dim the pair of values [2, 2], but its length is 4.

3.2 Reading data in a file

For reading and writing in files, R uses the working directory. To find this directory, the command getwd() (get working directory) can be used, and the working directory can be changed with setwd("C:/data") or setwd("/home/paradis/R"). It is necessary to give the path to a file if it is not in the working directory. (Under Windows, it is useful to create a short-cut of Rgui.exe then edit its properties and change the directory in the field "Start in:" under the tab "Short-cut": this directory will then be the working directory if R is started from this short-cut.)

R can read data stored in text (ASCII) files with the following functions: read.table (which has several variants, see below), scan and read.fwf. R can also read files in other formats (Excel, SAS, SPSS, ...), and access SQL-type databases, but the functions needed for this are not in the package base. These functionalities are very useful for a more advanced use of R, but we will restrict here to reading files in ASCII format.

The function read.table has for effect to create a data frame, and so is the main way to read data in tabular form. For instance, if one has a file named data.dat, the command:

> mydata <- read.table("data.dat")

will create a data frame named mydata, and each variable will be named, by default, V1, V2, ... and can be accessed individually by mydata$V1, mydata$V2, ..., or by mydata["V1"], mydata["V2"], ..., or, still another solution, by mydata[, 1], mydata[, 2], ... (There is a difference: mydata$V1 and mydata[, 1] are vectors whereas mydata["V1"] is a data frame. We will see later (p. 18) some details on manipulating objects.) There are several options whose default values (i.e. those used by R if they are omitted by the user) are detailed in the following table:

read.table(file, header = FALSE, sep = "", quote = "\"'", dec = ".",
           row.names, col.names, as.is = FALSE, na.strings = "NA",
           colClasses = NA, nrows = -1,
           skip = 0, check.names = TRUE, fill = !blank.lines.skip,
           strip.white = FALSE, blank.lines.skip = TRUE,
           comment.char = "#")

file              the name of the file (within "" or a variable of mode character), possibly with its path (the symbol \ is not allowed and must be replaced by /, even under Windows), or a remote access to a file of type URL (http://...)
header            a logical (FALSE or TRUE) indicating if the file contains the names of the variables on its first line
sep               the field separator used in the file, for instance sep="\t" if it is a tabulation
quote             the characters used to cite the variables of mode character
dec               the character used for the decimal point
row.names         a vector with the names of the lines which can be either a vector of mode character, or the number (or the name) of a variable of the file (by default: 1, 2, 3, ...)
col.names         a vector with the names of the variables (by default: V1, V2, V3, ...)
as.is             controls the conversion of character variables as factors (if FALSE) or keeps them as characters (TRUE); as.is can be a logical, numeric or character vector specifying the variables to be kept as character
na.strings        the value given to missing data (converted as NA)
colClasses        a vector of mode character giving the classes to attribute to the columns
nrows             the maximum number of lines to read (negative values are ignored)
skip              the number of lines to be skipped before reading the data
check.names       if TRUE, checks that the variable names are valid for R
fill              if TRUE and all lines do not have the same number of variables, "blanks" are added
strip.white       (conditional to sep) if TRUE, deletes extra spaces before and after the character variables
blank.lines.skip  if TRUE, ignores "blank" lines
comment.char      a character defining comments in the data file, the rest of the line after this character is ignored (to disable this argument, use comment.char = "")

The variants of read.table are useful since they have different default values:

read.csv(file, header = TRUE, sep = ",", quote="\"", dec=".", fill = TRUE, ...)
read.csv2(file, header = TRUE, sep = ";", quote="\"", dec=",", fill = TRUE, ...)
read.delim(file, header = TRUE, sep = "\t", quote="\"", dec=".", fill = TRUE, ...)
read.delim2(file, header = TRUE, sep = "\t", quote="\"", dec=",", fill = TRUE, ...)
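To see how read.table and its variant read.csv relate, one can write a small file and read it back both ways. This is a sketch under stated assumptions: the file content and the names tmp, d1, and d2 are invented for the example, and tempfile(), writeLines(), and unlink() are used only to keep it self-contained.

```r
# Write a small comma-separated file in a temporary location
tmp <- tempfile(fileext = ".csv")
writeLines(c("species,mass,length",
             "A,1.5,10",
             "B,2.5,12",
             "C,NA,15"), tmp)

# read.csv() is read.table() with other defaults (header = TRUE, sep = ",", ...)
d1 <- read.csv(tmp)
d2 <- read.table(tmp, header = TRUE, sep = ",")

# The three ways of accessing a variable described above
v1 <- d1$mass        # a vector
v2 <- d1[, 2]        # the same vector, selected by column number
df <- d1["mass"]     # a data frame with a single column

unlink(tmp)          # remove the temporary file
```

On this small file the two calls return the same data frame; they could differ on files containing quotes or the character #, since the defaults for quote and comment.char are not the same.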

The function scan is more flexible than read.table. A difference is that it is possible to specify the mode of the variables, for example:

> mydata <- scan("data.dat", what = list("", 0, 0))

reads in the file data.dat three variables, the first is of mode character and the next two are of mode numeric. Another important distinction is that scan() can be used to create different objects, vectors, matrices, data frames, lists, ... In the above example, mydata is a list of three vectors. By default, that is if what is omitted, scan() creates a numeric vector. If the data read do not correspond to the mode(s) expected (either by default, or specified by what), an error message is returned. The options are the following.

scan(file = "", what = double(0), nmax = -1, n = -1, sep = "",
     quote = if (sep=="\n") "" else "'\"", dec = ".",
     skip = 0, nlines = 0, na.strings = "NA",
     flush = FALSE, fill = FALSE, strip.white = FALSE, quiet = FALSE,
     blank.lines.skip = TRUE, multi.line = TRUE, comment.char = "",
     allowEscapes = TRUE)

file              the name of the file (within ""), possibly with its path (the symbol \ is not allowed and must be replaced by /, even under Windows), or a remote access to a file of type URL (http://...); if file="", the data are entered with the keyboard (the entry is terminated by a blank line)
what              specifies the mode(s) of the data (numeric by default)
nmax              the number of data to read, or, if what is a list, the number of lines to read (by default, scan reads the data up to the end of file)
n                 the number of data to read (by default, no limit)
sep               the field separator used in the file
quote             the characters used to cite the variables of mode character
dec               the character used for the decimal point
skip              the number of lines to be skipped before reading the data
nlines            the number of lines to read
na.strings        the value given to missing data (converted as NA)
flush             a logical, if TRUE, scan goes to the next line once the number of columns has been reached (allows the user to add comments in the data file)
fill              if TRUE and all lines do not have the same number of variables, "blanks" are added
strip.white       (conditional to sep) if TRUE, deletes extra spaces before and after the character variables
quiet             a logical, if FALSE, scan displays a line showing which fields have been read
blank.lines.skip  if TRUE, ignores blank lines
multi.line        if what is a list, specifies if the variables of the same individual are on a single line in the file (FALSE)
comment.char      a character defining comments in the data file, the rest of the line after this character is ignored (the default is to have this disabled)
allowEscapes      specifies whether C-style escapes (e.g., \t) be processed (the default) or read as verbatim
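The behaviour of scan with and without the option what can be illustrated as follows. The file content is made up for this sketch, and the argument text (which lets scan read from a character string instead of a file) is not among the options listed above.

```r
# A small file with one character field and two numeric fields per line
tmp <- tempfile()
writeLines(c("a 1 10", "b 2 20", "c 3 30"), tmp)

# 'what' is a list: scan() returns a list of three vectors
mydata <- scan(tmp, what = list("", 0, 0), quiet = TRUE)

# 'what' omitted: scan() returns a numeric vector
v <- scan(text = "1 2 3.5", quiet = TRUE)

unlink(tmp)   # remove the temporary file
```

Here mydata[[1]] holds the character field and mydata[[2]] and mydata[[3]] the two numeric ones.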

The function read.fwf can be used to read in a file some data in fixed width format:

read.fwf(file, widths, header = FALSE, sep = "\t", as.is = FALSE, skip = 0,
         row.names, col.names, n = -1, buffersize = 2000, ...)

The options are the same as for read.table() except widths which specifies the width of the fields (buffersize is the maximum number of lines read simultaneously). For example, if a file named data.txt holds records made of three fields of 1, 4, and 3 characters, one can read the data with the following command:

> mydata <- read.fwf("data.txt", widths=c(1, 4, 3))

The resulting data frame mydata has three variables named V1, V2, and V3, one per field; in the example, V1 takes the values A, A, B, B, C, C.

3.3 Saving data

The function write.table writes in a file an object, typically a data frame but this could well be another kind of object (vector, matrix, ...). The arguments and options are:

write.table(x, file = "", append = FALSE, quote = TRUE, sep = " ",
            eol = "\n", na = "NA", dec = ".", row.names = TRUE,
            col.names = TRUE, qmethod = c("escape", "double"))

x           the name of the object to be written
file        the name of the file (by default the object is displayed on the screen)
append      if TRUE adds the data without erasing those possibly existing in the file
quote       a logical or a numeric vector: if TRUE the variables of mode character and the factors are written within "", otherwise the numeric vector indicates the numbers of the variables to write within "" (in both cases the names of the variables are written within "" but not if quote = FALSE)
sep         the field separator used in the file
eol         the character to be used at the end of each line ("\n" is a newline)
na          the character to be used for missing data
dec         the character used for the decimal point
row.names   a logical indicating whether the names of the lines are written in the file
col.names   id. for the names of the columns
qmethod     specifies, if quote=TRUE, how double quotes " included in variables of mode character are treated: if "escape" (or "e", the default) each " is replaced by \", if "d" each " is replaced by ""
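The two functions of this part can be tried on temporary files. This is a hedged sketch: the data frame M, the two fixed-width records, and the file names are hypothetical and chosen only for illustration.

```r
# write.table(): write a data frame as tab-separated text, then read it back
M <- data.frame(id = c("a", "b"), value = c(1.5, NA))
out <- tempfile(fileext = ".txt")
write.table(M, file = out, sep = "\t", quote = FALSE, row.names = FALSE)
written <- readLines(out)                        # the raw lines of the file
back <- read.table(out, header = TRUE, sep = "\t")

# read.fwf(): hypothetical records with fields of 1, 4, and 3 characters
fwf <- tempfile()
writeLines(c("A1.501.2", "B1.651.5"), fwf)
fixed <- read.fwf(fwf, widths = c(1, 4, 3))      # variables V1, V2, V3

unlink(c(out, fwf))                              # remove the temporary files
```

With quote = FALSE and row.names = FALSE the file contains only the column names and the values, and the missing value is written as NA, the default of the option na.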

To write in a simpler way an object in a file, the command write(x, file="data.txt") can be used, where x is the name of the object (which can be a vector, a matrix, or an array). There are two options: nc (or ncol) which defines the number of columns in the file (by default nc=1 if x is of mode character, nc=5 for the other modes), and append (a logical) to add the data without deleting those possibly already in the file (TRUE) or deleting them if the file already exists (FALSE, the default).

To record a group of objects of any type, we can use the command save(x, y, z, file = "xyz.RData"). To ease the transfer of data between different machines, the option ascii = TRUE can be used. The data (which are now called a workspace in R's jargon) can be loaded later in memory with load("xyz.RData"). The function save.image() is a short-cut for save(list = ls(all = TRUE), file = ".RData").

3.4 Generating data

3.4.1 Regular sequences

A regular sequence of integers, for example from 1 to 30, can be generated with:

> x <- 1:30

The resulting vector x has 30 elements. The operator ":" has priority on the arithmetic operators within an expression:

> 1:10-1
[1] 0 1 2 3 4 5 6 7 8 9
> 1:(10-1)
[1] 1 2 3 4 5 6 7 8 9

The function seq can generate sequences of real numbers as follows:

> seq(1, 5, 0.5)
[1] 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0

where the first number indicates the beginning of the sequence, the second one the end, and the third one the increment to be used to generate the sequence. One can use also:

> seq(length=9, from=1, to=5)
[1] 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0

One can also type directly the values using the function c:

> c(1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5)
[1] 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0
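The priority of ":" and the equivalence of the three ways of building the same sequence can be verified with a few lines. This is a minimal sketch; the object names are chosen here, and all.equal() is used for the comparison because the values are real numbers.

```r
a <- 1:10 - 1       # ':' is evaluated first: (1:10) - 1, that is 0 to 9
b <- 1:(10 - 1)     # parentheses force the subtraction first: 1 to 9

s1 <- seq(1, 5, 0.5)                            # start, end, increment
s2 <- seq(length = 9, from = 1, to = 5)         # start, end, number of values
s3 <- c(1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5)      # values typed directly

all.equal(s1, s2)   # the three vectors hold the same values
all.equal(s1, s3)
```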

It is also possible, if one wants to enter some data on the keyboard, to use the function scan with simply the default options:

> z <- scan()
1: 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0
10:
Read 9 items
> z
[1] 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0

The function rep creates a vector with all its elements identical:

> rep(1, 30)
[1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

The function sequence creates a series of sequences of integers each ending by the numbers given as arguments:

> sequence(4:5)
[1] 1 2 3 4 1 2 3 4 5
> sequence(c(10,5))
[1] 1 2 3 4 5 6 7 8 9 10 1 2 3 4 5

The function gl (generate levels) is very useful because it generates regular series of factors. The usage of this function is gl(k, n) where k is the number of levels (or classes), and n is the number of replications in each level. Two options may be used: length to specify the number of data produced, and labels to specify the names of the levels of the factor. Examples:

> gl(3, 5)
[1] 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3
Levels: 1 2 3
> gl(3, 5, length=30)
[1] 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3
Levels: 1 2 3
> gl(2, 6, label=c("Male", "Female"))
[1] Male   Male   Male   Male   Male   Male
[7] Female Female Female Female Female Female
Levels: Male Female
> gl(2, 10)
[1] 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2
Levels: 1 2
> gl(2, 1, length=20)
[1] 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2
Levels: 1 2
> gl(2, 2, length=20)
[1] 1 1 2 2 1 1 2 2 1 1 2 2 1 1 2 2 1 1 2 2
Levels: 1 2
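A typical use of these functions is to build the columns of a small experimental design. The sketch below is only an illustration; the design (two sexes, three replicates each) and the object names are assumptions made here.

```r
r <- rep(1, 5)             # five identical values
s <- sequence(c(3, 2))     # the sequences 1:3 and 1:2, one after the other

# A factor with 2 levels and 3 replications per level
sex <- gl(2, 3, labels = c("Male", "Female"))

# Alternating levels: 1 replication per level, repeated up to 6 values
block <- gl(2, 1, length = 6)

# The two factors can be cross-tabulated or combined in a data frame
design <- data.frame(sex, block)
table(design$sex)          # number of observations in each level
```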


A Beginner s Guide to Successfully Securing Grant Funding A Beginner s Guide t Successfully Securing Grant Funding Intrductin There is a wide range f supprt mechanisms ut there in the funding wrld, including grants, lans, equity investments, award schemes and

More information

Most Significant Change

Most Significant Change Click4it Wiki - Tlkit Mst Significant Change Step by Step Step 1: Starting and raising interest A. It may help t use ne f the fllwing metaphrs t explain the MSC: Newspaper: Newspapers are structured int

More information

Develop Agency SPF From SafetyAnalystWiki

Develop Agency SPF From SafetyAnalystWiki Develp Agency SPF Frm SafetyAnalystWiki Cntents [hide] 1 Safety Perfrmance Functins 1.1 What SPFs Are Needed 1.2 Functinal Frm f SPFs 1.3 Data Needs fr Develpment f SPFs 1.4 Statistical Assumptins and

More information

THE INTERNATIONAL FRAMEWORK

THE INTERNATIONAL <IR> FRAMEWORK THE INTERNATIONAL FRAMEWORK ABOUT THE IIRC The Internatinal Integrated Reprting Cuncil (IIRC) is a glbal calitin f regulatrs, investrs, cmpanies, standard setters, the accunting prfessin and NGOs.

More information

European Investment Bank. Guide to Procurement

European Investment Bank. Guide to Procurement GUIDE TO PROCUREMENT fr prjects financed by the EIB Updated versin f June 2011 TABLE OF CONTENTS Intrductin 1. General Aspects...4 1.1. The Bank s Plicy... 4 1.2. Eligibility f Cntractrs and Suppliers

More information

No Unsafe Lift. Workbook

No Unsafe Lift. Workbook N Unsafe Lift Wrkbk Cver and Sectin Break image prvided curtesy f Arj Canada Inc. Table Of Cntents Purpse f this wrkbk... 2 Hw t use this wrkbk...3 SECTION ONE A Brief Review f the Literature...5 SECTION

More information

MEASURING AND/OR ESTIMATING SOCIAL VALUE CREATION: Insights Into Eight Integrated Cost Approaches

MEASURING AND/OR ESTIMATING SOCIAL VALUE CREATION: Insights Into Eight Integrated Cost Approaches MEASURING AND/OR ESTIMATING SOCIAL VALUE CREATION: Insights Int Eight Integrated Cst Appraches Prepared fr Bill & Melinda Gates Fundatin Impact Planning and Imprvement Prepared by Melinda T. Tuan P.O.

More information

ns Rev. 0 (3.9.15) Reporting water MDL is allowable) Preparatory Method Analysis Method The MDL programs and by covered The LOD reporting?

ns Rev. 0 (3.9.15) Reporting water MDL is allowable) Preparatory Method Analysis Method The MDL programs and by covered The LOD reporting? NR149 LOD/ /LOQ Clarificatin Required frequency Annually an MDL study must be perfrmed fr each cmbinatin f the fllwing: Matrix (if the slid and aqueus matrix methds are identical, extraplatin frm the water

More information

How to Convert your Paper into a Presentation

How to Convert your Paper into a Presentation Hw t Cnvert yur Paper int a Presentatin During yur cllege career, yu may be asked t present yur academic wrk in the classrm, at cnferences, r at special events. Tw types f talks are cmmn in academia: presentatins

More information

Chapter 2 Getting Data into R

Chapter 2 Getting Data into R Chapter 2 Getting Data into R In the following chapter we address entering data into R and organising it as scalars (single values), vectors, matrices, data frames, or lists. We also demonstrate importing

More information

Electronic Communication

Electronic Communication Applicatin fr Tree Wrks: Wrks t Trees Subject t a Tree Preservatin Order (TPO) and/r Ntificatin f Prpsed Wrks t Trees in Cnservatin Areas (CA) Twn and Cuntry Planning Act 1990 Electrnic Cmmunicatin If

More information

Social Media Use by Governments

Social Media Use by Governments Please cite this paper as: Mickleit, A. (2014), Scial Media Use by Gvernments: A Plicy Primer t Discuss Trends, Identify Plicy Opprtunities and Guide Decisin Makers, OECD Wrking Papers n Public Gvernance,

More information

Report for the Food Standards Agency. Nutrition and Public Health Intervention Research Unit London School of Hygiene & Tropical Medicine

Report for the Food Standards Agency. Nutrition and Public Health Intervention Research Unit London School of Hygiene & Tropical Medicine Cmparisn f cmpsitin (nutrients and ther substances) f rganically and cnventinally prduced fdstuffs: a systematic review f the available literature Reprt fr the Fd Standards Agency Nutritin and Public Health

More information

RISING TO THE CHALLENGE. Re-Envisioning Public Libraries

RISING TO THE CHALLENGE. Re-Envisioning Public Libraries RISING TO THE CHALLENGE Re-Envisining Public Libraries RISING TO THE CHALLENGE Re-Envisining Public Libraries A reprt f the Aspen Institute Dialgue n Public Libraries by Amy K. Garmer Directr Aspen Institute

More information

Essendant Online Terms of Use

Essendant Online Terms of Use Essendant Online Terms f Use Thank yu fr visiting this website. These Terms f Use gvern yur use f any website wned by Essendant C. r any f its subsidiaries (including Essendant Industrial LLC), n which

More information

W. J. Owen Department of Mathematics and Computer Science University of Richmond

W. J. Owen Department of Mathematics and Computer Science University of Richmond The Guide Version 2.5 W. J. Owen Department of Mathematics and Computer Science University of Richmond Black Cherry Trees large residual Volume 10 20 30 40 50 60 70 65 70 75 80 85 Consider a log transform

More information

1 IS THERE A CONTRACT?

1 IS THERE A CONTRACT? 1 IS THERE A CONTRACT? MANIFESTATION OF MUTUAL ASSENT: There must be an bjective manifestatin f mutual assent t a K. Judged by what a reasnable persn wuld understand the parties actins t mean. - At stake

More information

The Data Center Management Elephant

The Data Center Management Elephant The Data Center Management Elephant By David Cle DATA CENTER SOLUTIONS Fr Mre Infrmatin: (866) 787-3271 Sales@PTSdcs.cm 2010 N Limits Sftware. All rights reserved. N part f this publicatin may be used,

More information