Hands on S4 Classes. Yohan Chalabi. R/Rmetrics Workshop Meielisalp June 2009. ITP ETH, Zurich Rmetrics Association, Zurich Finance Online, Zurich





Outline 1 Introduction 2 S3 Classes/Methods 3 S4 Classes/Methods


S3/S4 History
The appendix of Software for Data Analysis by Chambers [1] is a good place to learn more about the history of the S language.
The first discussions took place at Bell Labs in May 1976, about a new system to interface with a large Fortran library.
By the end of 1976, Rick Becker and John Chambers, with the help of co-workers, had a first implementation of S running locally on the Honeywell OS.
The language was later ported to UNIX systems and became S version 2.
About ten years after the first meeting, a new version was developed with concepts inspired by UNIX, a focus on functional programming, and self-describing objects. This is S version 3.
Around 1992, the concept of classes and methods was introduced, which led to what is known today as S4 classes.

Goal of the Tutorial
The goal of this tutorial is to introduce the concepts and methods of S4 classes in R. We start with a brief overview of S3 classes and introduce S4 classes by comparison with their S3 counterparts.
As an example, we implement a class that could represent a time series. The object holds a data part (a matrix), timestamps (from timeDate) and additional information in the form of data.frames that we call flags.
> library(timeDate)
> time <- as.character(timeSequence(length.out = 4))
> data <- matrix(round(rnorm(8), 3), ncol = 2)
> colnames(data) <- c("col1", "col2")
> flag <- data.frame(flag = sample(c("M", "F"), 4, replace = TRUE))


S3 classes
An S3 class is defined by the special attribute class, a character vector. In our case, the class is Bull. (Footnote: imagine bulls and cows mooing in the field next to the conference room.)
> bull <- data
> attr(bull, "class") <- "Bull"
> bull
       col1   col2
[1,] -2.123  1.126
[2,]  2.833 -0.721
[3,]  0.921 -1.976
[4,] -0.917 -1.425
attr(,"class")
[1] "Bull"
Note that the class() function can also be used to set the class:
> bull <- data
> class(bull) <- "Bull"
> bull
       col1   col2
[1,] -2.123  1.126
[2,]  2.833 -0.721
[3,]  0.921 -1.976
[4,] -0.917 -1.425
attr(,"class")
[1] "Bull"

S3 classes
Now we add new attributes for the timestamps and the additional information flag.
> attr(bull, "time") <- time
> attr(bull, "flag") <- flag
> bull
       col1   col2
[1,] -2.123  1.126
[2,]  2.833 -0.721
[3,]  0.921 -1.976
[4,] -0.917 -1.425
attr(,"class")
[1] "Bull"
attr(,"time")
[1] "2009-06-01 06:04:46" "2009-06-02 06:04:46"
[3] "2009-06-03 06:04:46" "2009-06-04 06:04:46"
attr(,"flag")
  flag
1    F
2    F
3    M
4    F

S3 methods
In the world of S3 classes, a method for a generic function is defined as a new function named according to the scheme <generic name>.<class>.
Here a generic function is a function that dispatches to the S3 method with UseMethod().
Many existing functions are S3 generics, for example print(), plot(), ...
Note that S3 methods only dispatch on the type of the first argument.
If no method is found, the default method is used (<generic name>.default).
> print
function (x, ...)
UseMethod("print")
<environment: namespace:base>
> head(methods(print)) #-> too many methods
[1] "print.acf"     "print.anova"   "print.aov"
[4] "print.aovlist" "print.ar"      "print.arima"

S3 methods
Let's define a print() method for our class "Bull". The method name must match the class name exactly, including case.
> print.Bull <- function(x, ...) {
      # rebuild a plain matrix and use the timestamps as row names
      y <- matrix(c(x), ncol = ncol(x))
      dimnames(y) <- list(as.character(attr(x, "time")), colnames(x))
      cat("Meielisalp\n")
      print(y)
      invisible(x)
  }
> bull
Meielisalp
                      col1   col2
2009-06-01 06:04:46 -2.123  1.126
2009-06-02 06:04:46  2.833 -0.721
2009-06-03 06:04:46  0.921 -1.976
2009-06-04 06:04:46 -0.917 -1.425

S3 generic
Let's define a new generic function with its default method.
> dinner <- function(x, ...) UseMethod("dinner")
> dinner.default <- function(x, ...) cat("A Swiss Fondue\n")
With our class this gives
> dinner(bull)
A Swiss Fondue
and, with a method defined for the class Bull,
> dinner.Bull <- function(x, ...) cat("Hay!!\n")
> dinner(bull)
Hay!!

S3 group generic
Group generic methods can be defined for four predefined groups of functions: Math, Ops, Summary and Complex.
> methods("Math")
[1] Math.data.frame Math.Date       Math.difftime
[4] Math.factor     Math.POSIXt
> getGroupMembers("Math")
 [1] "abs"      "sign"     "sqrt"     "ceiling"  "floor"
 [6] "trunc"    "cummax"   "cummin"   "cumprod"  "cumsum"
[11] "exp"      "expm1"    "log"      "log10"    "log2"
[16] "log1p"    "cos"      "cosh"     "sin"      "sinh"
[21] "tan"      "tanh"     "acos"     "acosh"    "asin"
[26] "asinh"    "atan"     "atanh"    "gamma"    "lgamma"
[31] "digamma"  "trigamma"
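As an illustration of how a single group method covers every member of a group, here is a minimal sketch. The class name "Stamped" and its "time" attribute are hypothetical and not part of the tutorial; the sketch relies only on the fact that .Generic holds the name of the function that was actually called.

```r
# Hypothetical S3 class "Stamped": a numeric vector with a 'time' attribute.
x <- structure(c(1.5, -2, 3),
               time = c("t1", "t2", "t3"),
               class = "Stamped")

# One method for the whole Math group: abs(), cumsum(), exp(), ...
Math.Stamped <- function(x, ...) {
  # .Generic is the name of the group member that was called, e.g. "abs"
  value <- get(.Generic)(unclass(x), ...)
  # restore the class and the 'time' attribute on the result
  attributes(value) <- attributes(x)
  value
}

y <- abs(x)      # dispatches to Math.Stamped with .Generic == "abs"
z <- cumsum(x)   # same method, .Generic == "cumsum"
```

Both results keep the class and the timestamps, although no method was written for abs() or cumsum() individually.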

S3 inheritance
An S3 object indirectly inherits the methods of its data part, because it is just an R object with attributes.
More than one string can be placed in the class attribute to share common properties between different classes. Good examples are the classes POSIXct, POSIXlt and POSIXt.
> class(Sys.time())
[1] "POSIXt"  "POSIXct"
> class(as.POSIXlt(Sys.time()))
[1] "POSIXt"  "POSIXlt"
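The following sketch shows how a class vector with several entries drives dispatch, and how NextMethod() hands control to the next class in that vector. The generic describe() and the class "Animal" are hypothetical names chosen for illustration.

```r
# Hypothetical generic with methods for two classes and a default.
describe <- function(x, ...) UseMethod("describe")
describe.default <- function(x, ...) "an R object"
describe.Animal  <- function(x, ...) paste("an animal, which is", NextMethod())
describe.Bull    <- function(x, ...) paste("a bull, which is", NextMethod())

# The class attribute lists the most specific class first.
b <- structure(list(), class = c("Bull", "Animal"))

# Dispatch walks the class vector: Bull, then Animal, then default.
msg <- describe(b)
```

Here msg is "a bull, which is an animal, which is an R object".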

S3 Classes - Key Functions
class()        Defines the class attribute
methods()      Lists S3 methods for a class
UseMethod()    Generic function mechanism
NextMethod()   Invokes the next method

Drawbacks of S3 Classes
The S3 system does not check the consistency of a class.
It has no control over inheritance.
S3 methods can only dispatch on the first argument.
By the time S4 classes were introduced, too much software had already been written in S3 style. We have to live with both worlds.
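To make the first drawback concrete, here is a small sketch with a hypothetical class "Circle" and generic area(). Nothing stops us from labelling an object that lacks the expected component, and the method then silently returns a meaningless result.

```r
# Hypothetical S3 generic and method; the method expects a 'radius' component.
area <- function(shape, ...) UseMethod("area")
area.Circle <- function(shape, ...) pi * shape$radius^2

good <- structure(list(radius = 2), class = "Circle")
bad  <- structure(list(side = 2),  class = "Circle")  # no 'radius', still accepted

ok_result  <- area(good)   # 4 * pi
bad_result <- area(bad)    # numeric(0): no error, no warning
```

The S4 mechanisms for declaring slot types and validity functions address exactly this kind of silent inconsistency.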


S4 Classes
A new class is created with the function setClass(). It defines metadata with information about the new class.
setClass() requires the type of every component of the class, which ensures the consistency of the class.
> setClass("Cow",
      representation(data = "matrix", time = "character", flag = "data.frame"))
[1] "Cow"
> # class metadata
> .__C__Cow
Class "Cow" [in ".GlobalEnv"]
Slots:
Name:        data       time       flag
Class:     matrix  character data.frame

S4 Classes
New instances of a class are created with the function new().
> cow <- new("Cow", data = data, time = time, flag = flag)
> cow
An object of class "Cow"
Slot "data":
       col1   col2
[1,] -2.123  1.126
[2,]  2.833 -0.721
[3,]  0.921 -1.976
[4,] -0.917 -1.425
Slot "time":
[1] "2009-06-01 06:04:46" "2009-06-02 06:04:46"
[3] "2009-06-03 06:04:46" "2009-06-04 06:04:46"
Slot "flag":
  flag
1    F
2    F
3    M
4    F

S4 Classes
The structure of an object can be inspected with the str() function.
> str(cow)
Formal class 'Cow' [package ".GlobalEnv"] with 3 slots
  ..@ data: num [1:4, 1:2] -2.123 2.833 0.921 -0.917 1.126 ...
  .. ..- attr(*, "dimnames")=List of 2
  .. .. ..$ : NULL
  .. .. ..$ : chr [1:2] "col1" "col2"
  ..@ time: chr [1:4] "2009-06-01 06:04:46" "2009-06-02 06:04:46" "2009-06-03 06
  ..@ flag:'data.frame': 4 obs. of 1 variable:
  .. ..$ flag: Factor w/ 2 levels "F","M": 1 1 2 1

S4 slots
A class representation is organized in slots, which can be read and assigned with the operator @:
> cow@data
       col1   col2
[1,] -2.123  1.126
[2,]  2.833 -0.721
[3,]  0.921 -1.976
[4,] -0.917 -1.425
> cow@data <- data
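Slots can also be accessed by name through slot() and listed with slotNames(), which is convenient when the slot name is only known at run time. The sketch below uses a hypothetical class "Herd" so that it runs without the tutorial's data.

```r
library(methods)

# Hypothetical class used only for this illustration.
setClass("Herd", representation(name = "character", size = "numeric"))
h <- new("Herd", name = "Meielisalp", size = 12)

slotNames(h)              # names of all slots of the object
slot(h, "size")           # same as h@size
slot(h, "size") <- 15     # same as h@size <- 15, with the same type check

# Programmatic access: the class of the content of every slot.
slot_classes <- sapply(slotNames(h), function(nm) class(slot(h, nm)))
```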

Restriction on the type of object in slots
When a slot is assigned, the object is automatically checked against the declared slot type.
For instance, if we try to assign a character vector to our @flag slot, which is of type data.frame, we get an error.
> cow@flag <- "bad"
Error in checkSlotAssignment(object, name, value) :
  assignment of an object of class "character" is not valid for slot "flag" in an object of class "Cow"; is(value, "data.frame") is not TRUE

Inheritance
In our definition of the Cow class, nothing is inherited from another class. Trying to apply a generic function such as + therefore throws an error.
> cow + 1
Error in cow + 1 : non-numeric argument to binary operator

Inheritance
But we could have defined the class with the contains argument of setClass(). Let's redefine our class such that it inherits from the class matrix.
> setClass("Cow",
      representation(time = "character", flag = "data.frame"),
      contains = "matrix")
[1] "Cow"
> cow <- new("Cow", data, time = time, flag = flag)
> cow + 1
An object of class "Cow"
       col1   col2
[1,] -1.123  2.126
[2,]  3.833  0.279
[3,]  1.921 -0.976
[4,]  0.083 -0.425
Slot "time":
[1] "2009-06-01 06:04:46" "2009-06-02 06:04:46"
[3] "2009-06-03 06:04:46" "2009-06-04 06:04:46"
Slot "flag":
  flag

Inheritance
Note that a class inheriting from another class has all the slots of its superclass and may define additional slots.
S4 classes cannot inherit from S3 classes unless these have been registered with the setOldClass() function.
> getClass("Cow")
Class "Cow" [in ".GlobalEnv"]
Slots:
Name:       .Data       time       flag
Class:     matrix  character data.frame
Extends:
Class "matrix", from data part
Class "array", by class "matrix", distance 2
Class "structure", by class "matrix", distance 3
Class "vector", by class "matrix", distance 4, with explicit coerce
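A minimal sketch of what registering an S3 class with setOldClass() makes possible: once the S3 class "Bull" is known to the S4 system, it can be used as a slot type and in method signatures. The class "Ranch" and the generic sound() are hypothetical names introduced for this illustration only.

```r
library(methods)

# Register the S3 class so that S4 definitions can refer to it.
setOldClass("Bull")

# Hypothetical S4 class with a slot that holds an S3 "Bull" object.
setClass("Ranch", representation(owner = "character", animal = "Bull"))

b <- structure(matrix(1:4, 2), class = "Bull")
r <- new("Ranch", owner = "Yohan", animal = b)

# Hypothetical S4 generic with a method for the registered S3 class.
setGeneric("sound", function(x, ...) standardGeneric("sound"))
setMethod("sound", "Bull", function(x, ...) "Moo")

s <- sound(r@animal)
```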

S4 Validity Checks
We can also define validity checks with setValidity(). The validity function returns TRUE for a valid object and a character string describing the problem otherwise.
> validitycow <- function(object) {
      if (nrow(object@flag) != nrow(object))
          return("length of '@flag' not equal to '@.Data' extent")
      TRUE
  }
> setValidity("Cow", validitycow)
Class "Cow" [in ".GlobalEnv"]
Slots:
Name:       .Data       time       flag
Class:     matrix  character data.frame
Extends:
Class "matrix", from data part
Class "array", by class "matrix", distance 2
Class "structure", by class "matrix", distance 3
Class "vector", by class "matrix", distance 4, with explicit coerce

S4 Validity Checks
Now we define our own initialize() method to ensure that objects created with new() are valid. The first argument must be named .Object, as in the generic.
> setMethod("initialize", "Cow", function(.Object, ...) {
      value <- callNextMethod()
      validObject(value)
      value
  })
[1] "initialize"
> new("Cow", data, flag = data.frame(flag[1:3,]))
Error in validObject(value) :
  invalid class "Cow" object: length of '@flag' not equal to '@.Data' extent
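One point worth illustrating: assigning to a slot with @ checks the slot type but does not run the validity function, so an object can become invalid after creation until validObject() is called explicitly. The sketch below uses a hypothetical class "Pair" and supplies the validity function through the validity argument of setClass() rather than through setValidity().

```r
library(methods)

# Hypothetical class: two numeric vectors that must have equal length.
setClass("Pair",
         representation(x = "numeric", y = "numeric"),
         validity = function(object) {
           if (length(object@x) != length(object@y))
             return("'x' and 'y' must have the same length")
           TRUE
         })

p <- new("Pair", x = c(1, 2, 3), y = c(2, 4, 6))   # valid at creation

# The slot type (numeric) is checked here, the validity function is not.
p@y <- c(1, 2)

# An explicit check reveals the inconsistency.
res <- tryCatch(validObject(p), error = function(e) conditionMessage(e))
```

After this, res holds the error message produced by validObject(), which includes the string returned by the validity function.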

S4 Methods
As you have just seen in the previous chunk, S4 methods are defined with setMethod(). Let's write a show() method for our class.
> setMethod("show", "Cow", function(object) {
      # data part with timestamps as row names, flags as an extra column
      value <- getDataPart(object)
      rownames(value) <- as.character(slot(object, "time"))
      flag <- as.matrix(slot(object, "flag"))
      colnames(flag) <- paste(colnames(flag), "*", sep = "")
      cat("Meielisalp\n")
      print(cbind(value, flag), right = TRUE, quote = FALSE)
  })
[1] "show"
> cow
Meielisalp
                      col1   col2 flag*
2009-06-01 06:04:46 -2.123  1.126     F
2009-06-02 06:04:46  2.833 -0.721     F
2009-06-03 06:04:46  0.921 -1.976     M
2009-06-04 06:04:46 -0.917 -1.425     F

S4 Generic
S4 generics are defined with setGeneric() and standardGeneric().
> setGeneric("cowseries",
      function(x, time, flag, ...) standardGeneric("cowseries"))
[1] "cowseries"
Unlike in S3, calling setMethod() on an existing function turns it into a generic, except for primitive functions. Dispatch for primitive functions is implemented at the C level, and most primitive functions in R support it.
One can also define group generics with setGroupGeneric() or use the predefined groups: Arith, Compare, Ops, Logic, Math, Math2, Summary, Complex.
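As a sketch of how a predefined group is used, the following defines one method for the Arith group, which then serves +, -, *, / and the other arithmetic operators. The class "Money" is hypothetical; callGeneric() re-invokes whichever operator was actually called on the underlying numeric values.

```r
library(methods)

# Hypothetical class: an amount with a currency label.
setClass("Money", representation(amount = "numeric", currency = "character"))

# One method for the whole Arith group (+, -, *, /, ^, %%, %/%).
setMethod("Arith", signature("Money", "Money"), function(e1, e2) {
  stopifnot(identical(e1@currency, e2@currency))
  # callGeneric() applies the operator that triggered the dispatch
  new("Money",
      amount = callGeneric(e1@amount, e2@amount),
      currency = e1@currency)
})

a <- new("Money", amount = 10,  currency = "CHF")
b <- new("Money", amount = 2.5, currency = "CHF")

total      <- a + b   # amount 12.5
difference <- a - b   # amount 7.5
```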

Multiple dispatch
S3 methods only dispatch on the first argument. You often need many if ... else ... branches in your code when you are dealing with different argument types.
> graphics:::plot.factor
function (x, y, legend.text = NULL, ...)
{
    if (missing(y) || is.factor(y)) {
        dargs <- list(...)
        axisnames <- if (!is.null(dargs$axes))
            dargs$axes
        else if (!is.null(dargs$xaxt))
            dargs$xaxt != "n"
        else TRUE
    }
    if (missing(y)) {
        barplot(table(x), axisnames = axisnames, ...)
    }
    else if (is.factor(y)) {
        if (is.null(legend.text))
            spineplot(x, y, ...)
        else {
            args <- c(list(x = x, y = y), list(...))
            args$yaxlabels <- legend.text

Multiple dispatch
With S4 methods you can specify the type of every argument in the signature, including the special types ANY and missing.
> setMethod("cowseries", signature("matrix", "character", "data.frame"),
      function(x, time, flag, ...) new("Cow", x, time = time, flag = flag))
[1] "cowseries"
> cowseries(data, time, flag)
Meielisalp
                      col1   col2 flag*
2009-06-01 06:04:46 -2.123  1.126     F
2009-06-02 06:04:46  2.833 -0.721     F
2009-06-03 06:04:46  0.921 -1.976     M
2009-06-04 06:04:46 -0.917 -1.425     F
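The special types ANY and missing are mentioned above without an example, so here is a small sketch. The generic describeArgs() is a hypothetical name; the methods only return a string saying which signature was selected.

```r
library(methods)

# Hypothetical generic dispatching on two arguments.
setGeneric("describeArgs", function(x, y, ...) standardGeneric("describeArgs"))

# Selected when 'y' is not supplied at all.
setMethod("describeArgs", signature("numeric", "missing"),
          function(x, y, ...) "numeric x, no y")

# Selected for an exact match on both arguments.
setMethod("describeArgs", signature("numeric", "character"),
          function(x, y, ...) "numeric x, character y")

# Fallback for every other combination.
setMethod("describeArgs", signature("ANY", "ANY"),
          function(x, y, ...) "fallback")

r1 <- describeArgs(1)          # "numeric x, no y"
r2 <- describeArgs(1, "a")     # "numeric x, character y"
r3 <- describeArgs("a", TRUE)  # "fallback"
```

Each branch that would have been an if ... else ... test on the argument types in S3 code becomes a separate method here.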

Multiple dispatch
A second method converts its arguments and then calls the generic again with callGeneric(), so that the first method does the actual work.
> setMethod("cowseries", signature("matrix", "POSIXct", "character"),
      function(x, time, flag, ...) {
          time <- as(time, "character")
          flag <- as.data.frame(flag)
          callGeneric(x, time, flag, ...)
      })
[1] "cowseries"
> timect <- seq(from = Sys.time(), to = (Sys.time() + 4*3600), length.out = 4)
> flagstr <- as.character(flag[[1]])
> cowseries(data, timect, flagstr)
Meielisalp
                      col1   col2 flag*
2009-06-30 08:04:47 -2.123  1.126     F
2009-06-30 09:24:47  2.833 -0.721     F
2009-06-30 10:44:47  0.921 -1.976     M
2009-06-30 12:04:47 -0.917 -1.425     F

Object Conversion
as() can be used to convert an object to another class
> as(cow, "matrix")
       col1   col2
[1,] -2.123  1.126
[2,]  2.833 -0.721
[3,]  0.921 -1.976
[4,] -0.917 -1.425
and one can define conversion methods with setAs(). Let's define a more appropriate as() method for our class:
> setAs("Cow", "matrix", function(from) {
      value <- getDataPart(from)
      rownames(value) <- as.character(slot(from, "time"))
      value
  })
[1] "coerce<-"
> as(cow, "matrix")
                      col1   col2
2009-06-01 06:04:46 -2.123  1.126
2009-06-02 06:04:46  2.833 -0.721
2009-06-03 06:04:46  0.921 -1.976
2009-06-04 06:04:46 -0.917 -1.425

What is an S4 class in R?
S4 slots are actually attributes and, at a low level, S4 objects have a special S4 bit.
> attrscow <- attributes(cow)
> madcow <- data
> attributes(madcow) <- attrscow
> asS4(madcow)
Meielisalp
                      col1   col2 flag*
2009-06-01 06:04:46 -2.123  1.126     F
2009-06-02 06:04:46  2.833 -0.721     F
2009-06-03 06:04:46  0.921 -1.976     M
2009-06-04 06:04:46 -0.917 -1.425     F
BUT! You have to promise that you will never use such a trick!


S4 Classes - Key functions
setClass()                                  define classes
new()                                       create objects
setGeneric()                                define generics
setMethod()                                 define methods
as() / setAs()                              convert objects
@ / slot()                                  access slots
setValidity() / validObject()               check object validity
getClass() / showMethods() / getMethod()    access registry

References
[1] J. M. Chambers. Software for Data Analysis: Programming with R. Springer, 2008.
[2] R Development Core Team. ?Classes and ?Methods manual pages. 2009.

> toLatex(sessionInfo())
R version 2.10.0 Under development (unstable) (2009-06-23 r48824), i686-pc-linux-gnu
Locale: LC_CTYPE=en_US.UTF-8, LC_NUMERI...
Base packages: base, datasets, graphics, grDevices, methods, stats, utils
Other packages: timeDate 2100.86
Loaded via a namespace (and not attached): tools 2.10.0
