R Language Fundamentals
|
|
|
- Marvin Evans
- 10 years ago
- Views:
Transcription
1 R Language Fundamentals Data Types and Basic Maniuplation Steven Buechler Department of Mathematics 276B Hurley Hall; Fall, 2007
2 Outline Where did R come from? Overview Atomic Vectors Subsetting Vectors Higher-order data types (slightly)
3 Programming with Data Began with S The S language has been developed since the late 1970s by John Chambers and colleagues at Bell Labs as a language for programming with data. The language combines ideas from a variety sources (awk, lisp, APL, e.g.) and provides an environment for quantitative computations and visualization. Provides an explicit and consistent structure for manipulating, analyzing statistically, and visualizing data.
4 Programming with Data Began with S The S language has been developed since the late 1970s by John Chambers and colleagues at Bell Labs as a language for programming with data. The language combines ideas from a variety sources (awk, lisp, APL, e.g.) and provides an environment for quantitative computations and visualization. Provides an explicit and consistent structure for manipulating, analyzing statistically, and visualizing data.
5 S Becomes Software S-Plus is a commercialization of the Bell Labs framework. It is S plus graphics. R is an independent open source implementation originally developed by Ross Ihaka and Robert Gentleman at the University of Auckland in the mid-1990s. R comes before S. R is currently distributed under the GNU open software license and developed by the user community.
6 S Becomes Software S-Plus is a commercialization of the Bell Labs framework. It is S plus graphics. R is an independent open source implementation originally developed by Ross Ihaka and Robert Gentleman at the University of Auckland in the mid-1990s. R comes before S. R is currently distributed under the GNU open software license and developed by the user community.
7 R is Infinitely Expandable Applications of R normally use a package; i.e., a library of special functions designed for a specific problem. Hundreds of packages are available, mostly written by users. A user normally only loads a handful of packages for a particular analysis (e.g., library(affy)). Standards determine how a package is structured, works well with other packages and creates new data types in an easily used manner. Standardization makes it easy for users to learn new packages.
8 Bioconductor is a Set of Packages Bioconductor is a set R packages particularly designed for biological data analysis. Bioconductor Project sets standards used across packages, identifies needed packages and organizes development cycles and responsible parties. Biconductor project is headed by Robert Gentleman and located at Fred Hutchinson in Seattle.
9 Bioconductor is a Set of Packages Bioconductor is a set R packages particularly designed for biological data analysis. Bioconductor Project sets standards used across packages, identifies needed packages and organizes development cycles and responsible parties. Biconductor project is headed by Robert Gentleman and located at Fred Hutchinson in Seattle.
10 Bioconductor is a Set of Packages Bioconductor is a set R packages particularly designed for biological data analysis. Bioconductor Project sets standards used across packages, identifies needed packages and organizes development cycles and responsible parties. Biconductor project is headed by Robert Gentleman and located at Fred Hutchinson in Seattle.
11 Outline Where did R come from? Overview Atomic Vectors Subsetting Vectors Higher-order data types (slightly)
12 Fundamental Data Objects vector - a sequence of numbers or characters, or higher-dimensional arrays like matrices list - a collection of objects that may themselves be complicated factor - a sequence assigning a category to each index data.frame - a table-like structure (experimental results often collected in this form) environment - hash table. A collection of key-value pairs Classes of objects, like expression data, are built from these. Most commands in R involve applying a function to an object.
13 A Variable is Typed by What it Contains Unlike C variables do not need to be declared and typed. Assigning a sequence of numbers to x forces x to be a numeric vector. Given x, executing class(x) reports the class. This indicates which functions can be used on x.
14 Outline Where did R come from? Overview Atomic Vectors Subsetting Vectors Higher-order data types (slightly)
15 Atomic Data Elements In R the base type is a vector, not a scalar. A vector is an indexed set of values that are all of the same type. The type of the entries determines the class of the vector. The possible vectors are: integer numeric character complex logical integer is a subclass of numeric Cannot combine vectors of different modes
16 Creating Vectors and Learning R Syntax Assignment of a value to a variable is done with <- (two symbols, no space). > v <- 1 > v [1] 1 > v <- c(1, 2, 3) > v [1] > s <- "a string" > class(s) [1] "character"
17 Creating Vectors and accessing attributes Vectors can only contain entries of the same type: numeric or character; you can t mix them. Note that characters should be surrounded by. The most basic way to create a vector is with c(x 1,..., x n ), and it works for characters and numbers alike. > x <- c("a", "b", "c") > length(x) [1] 3
18 Vector Names Entries in a vector can and normally should be named. It is a way of associating a numeric reading with a sample id, for example. > v <- c(s1 = 0.3, s2 = 0.1, s3 = 1.1) > v s1 s2 s > sort(v) s2 s1 s > names(v) [1] "s1" "s2" "s3" > v2 <- c(0.3, 0.1, 1.1) > names(v2) <- c("s1", "s2", "s3")
19 Vector Arithmetic Basic operations on numeric vectors Numeric vectors can be used in arithmetic expressions, in which case the operations are performed element by element to produce another vector. > x <- rnorm(4) > y <- rnorm(4) > x [1] > x + 1 [1] > v <- 2 * x + y + 1 > v [1] The elementary arithmetic operations are the usual +, -, *, /, ^. See also log, exp, log2, sqrt etc., again applied to each entry.
20 More Vector Arithmetic Statistical operations on numeric vectors In studying data you will make frequent use of sum, which gives the sum of the entries, max, min, mean, and var(x) > var(x) [1] which is the same thing as > sum((x - mean(x))^2)/(length(x) - 1) [1] A useful function for quickly getting properties of a vector: > summary(y) Min. 1st Qu. Median Mean 3rd Qu Max
21 Generating regular sequences c - concatenate seq, :, and rep vector, numeric, character, etc. seq(1,30) is the same thing as c(1, 2, 3,..., 29, 30); and this is the same as 1 : 30. Functions in R may have mutliple parameters that are set as arguments to the function. seq is an example. > x1 <- seq(-1, 0, by = 0.1) > x1 [1] [10] > rep(x1, times = 2) [1] [10] [19]
22 Generating regular sequences vector, etc. a <- character(n) creates a character vector a of length n, with each entry. integer(n) and numeric(n) create integer and numeric vectors of length n with entries 0. The first command is shorthand for > a <- vector(mode = "character", length = 10) vector has greater applicability than just creating these common vectors.
23 Logical Vectors Working with data often involves comparing numbers. Comparisons in R output logical values TRUE, FALSE or NA, for not available. Just as with arithmetic operations, logical comparisions with a vector are applied to each entry and output as a vector of truth values; i.e. a logical vector. > v <- seq(-3, 3) > tf <- v > 0 > tf [1] FALSE FALSE FALSE FALSE TRUE TRUE TRUE
24 Logical Comparisons Logical vectors are largely used to extract entries from a dataset satisfying certain conditions. Illustrated soon. The logical operators are: <, <=, >, >=, ==, for exact equality and!= for inequality. If c1 and c2 are logical expressions, then c1 & c2 is their intersection ( and ), c1 c2 is their union ( or ), and!c1 is the negation of c1.
25 Missing Values One smart feature of R is that it allows for missing values in vectors and datasets; it denotes them as NA. You don t have to coerce missing values to, say 0, in order to have a meaningful vector. Many functions, like cor(), have options for handling missing values without just exiting. How to find NA values in a vector? > x <- c(1:3, NA) > x == NA [1] NA NA NA NA > is.na(x) [1] FALSE FALSE FALSE TRUE
26 Missing Values One smart feature of R is that it allows for missing values in vectors and datasets; it denotes them as NA. You don t have to coerce missing values to, say 0, in order to have a meaningful vector. Many functions, like cor(), have options for handling missing values without just exiting. How to find NA values in a vector? > x <- c(1:3, NA) > x == NA [1] NA NA NA NA > is.na(x) [1] FALSE FALSE FALSE TRUE
27 Outline Where did R come from? Overview Atomic Vectors Subsetting Vectors Higher-order data types (slightly)
28 Extracting Subsequences of a Vector Getting elements of a vector with desired properties is extremely common, so there are robust tools for doing it. An element of a vector v is assigned an index by its position in the sequence, starting with 1. The basic function for subsetting is [ ]. v[1] is the first element, v[length(v)] is the last. The subsetting function takes input in many forms.
29 Subsetting with Positive Integral Sequences > v <- c("a", "b", "c", "d", "e") > J <- c(1, 3, 5) > v[j] [1] "a" "c" "e" > v[1:3] [1] "a" "b" "c" > v[2:length(v)] [1] "b" "c" "d" "e"
30 Subsetting with Positive Integral Sequences > v <- c("a", "b", "c", "d", "e") > J <- c(1, 3, 5) > v[j] [1] "a" "c" "e" > v[1:3] [1] "a" "b" "c" > v[2:length(v)] [1] "b" "c" "d" "e"
31 Subsetting with Positive Integral Sequences > v <- c("a", "b", "c", "d", "e") > J <- c(1, 3, 5) > v[j] [1] "a" "c" "e" > v[1:3] [1] "a" "b" "c" > v[2:length(v)] [1] "b" "c" "d" "e"
32 Subsetting with Negated Integral Sequences This is a tool for removing elements or subsequences from a vector. > v [1] "a" "b" "c" "d" "e" > J [1] > v[-j] [1] "b" "d" > v[-1] [1] "b" "c" "d" "e" > v[-length(v)] [1] "a" "b" "c" "d"
33 Subsetting with Negated Integral Sequences This is a tool for removing elements or subsequences from a vector. > v [1] "a" "b" "c" "d" "e" > J [1] > v[-j] [1] "b" "d" > v[-1] [1] "b" "c" "d" "e" > v[-length(v)] [1] "a" "b" "c" "d"
34 Subsetting with Logical Vector Important! Given a vector x and a logical vector L of the same length as x, x[l] is the vector of entries in x matching a TRUE in L. > L <- c(true, FALSE, TRUE, FALSE, TRUE) > v[l] [1] "a" "c" "e" > x <- seq(-3, 3) > x >= 0 [1] FALSE FALSE FALSE TRUE TRUE TRUE TRUE > x[x >= 0] [1]
35 Subsetting with Logical Vector Important! Given a vector x and a logical vector L of the same length as x, x[l] is the vector of entries in x matching a TRUE in L. > L <- c(true, FALSE, TRUE, FALSE, TRUE) > v[l] [1] "a" "c" "e" > x <- seq(-3, 3) > x >= 0 [1] FALSE FALSE FALSE TRUE TRUE TRUE TRUE > x[x >= 0] [1]
36 Subsetting with Logical Vector More examples > names(x) <- paste("n", 1:length(x), sep = "") > names(x)[x < 0] [1] "N1" "N2" "N3" > y <- c(x, NA, NA) > z <- y[!is.na(y)] > z N1 N2 N3 N4 N5 N6 N
37 Subsetting by Names If x is a vector with names and A is a subsequence of names(x), then x[a] is the corresponding subsequence of x. > x N1 N2 N3 N4 N5 N6 N > x[c("n1", "N3")] N1 N3-3 -1
38 Assignment to a Subset A subset expression can be on the receiving end of an assignment, in which case the assignment only applies the subset and leaves the rest of the vector alone. > z <- 1:4 > z[1] <- 0 > z [1] > z[z <= 2] <- -1 > z [1] > w <- c(1:3, NA, NA) > w[is.na(w)] <- 0 > w [1]
39 Outline Where did R come from? Overview Atomic Vectors Subsetting Vectors Higher-order data types (slightly)
40 Factors Represent Categorical Data Just the basics Typically in an experiment samples are classified into one of a set group of categories. In R such results are stored in a factor. A factor is a character vector augmented with information about the possible categories, called the levels of the factor. > d1 <- c("m", "F", "M", "F", "F", "F") > d2 <- factor(d1) > d2 [1] M F M F F F Levels: F M
41 Factors (Continued) Still just the basics > table(d2) d2 F M 4 2 > ht <- c(73, 68, 70, 69, 62, 64) > htmeans <- tapply(ht, d2, mean) The data contained in a factor can be coded in a character vector, but there are many additional functions that can apply to a factor. Factors are used in ANOVA.
42 Lists An ordered collection of objects Remember that a vector can only contain numbers, characters or logical values. Frequently, though, we want to create collections of vectors or other data objects of mixed type. In R this is done with a list. The objects in a list are known as its components. Lists are often created quite explicitly: > Lst <- list(name = "Joe", height = "182", + no.children = 3, child.ages = c(5, + 7, 10)) Components are always numbered and can be referenced as such. Lst[[1]] is the first component (namely "Joe"); etc. to > Lst[[4]] [1] Since the last component is a vector you can extract the first entry of it as Lst[[4]][1].
43 List Length and Components For Lst a list, length(lst) is the number of components; names(lst) is the character vector of component names. Often the ordering of components is artificial. We want simple ways of getting the value of a component using the name. There are two ways: > Lst[["height"]] [1] "182" More commonly: > Lst$height [1] "182" > Lst$name [1] "Joe"
44 List Subsetting Versus component extraction For a list LL with n components and s a subsequence of 1:n, LL[s] denotes the sublist with components corresponding to the indices in s. > s <- 1:2 > L1 <- Lst[s] > L1 $name [1] "Joe" $height [1] "182" NOTE: LL[[1]] is different from LL[1]. LL[[1]] is the value of the first component; LL[1] is the list with one component whose value is LL[[1]].
45 > Lst[[1]] [1] "Joe" > class(lst[[1]]) [1] "character" > Lst[1] $name [1] "Joe" > class(lst[1]) [1] "list" List Subsetting Example Versus component extraction Further note: Lst[[m]] only makes sense when m is a single integer. Lst[[1:2]] produces an error. Lists can be concatenated like vectors.
Getting Started with R and RStudio 1
Getting Started with R and RStudio 1 1 What is R? R is a system for statistical computation and graphics. It is the statistical system that is used in Mathematics 241, Engineering Statistics, for the following
R: A self-learn tutorial
R: A self-learn tutorial 1 Introduction R is a software language for carrying out complicated (and simple) statistical analyses. It includes routines for data summary and exploration, graphical presentation
R Language Fundamentals
R Language Fundamentals Data Frames Steven Buechler Department of Mathematics 276B Hurley Hall; 1-6233 Fall, 2007 Tabular Data Frequently, experimental data are held in tables, an Excel worksheet, for
The R Environment. A high-level overview. Deepayan Sarkar. 22 July 2013. Indian Statistical Institute, Delhi
The R Environment A high-level overview Deepayan Sarkar Indian Statistical Institute, Delhi 22 July 2013 The R language ˆ Tutorials in the Astrostatistics workshop will use R ˆ...in the form of silent
Time Series Analysis AMS 316
Time Series Analysis AMS 316 Programming language and software environment for data manipulation, calculation and graphical display. Originally created by Ross Ihaka and Robert Gentleman at University
Statistical Models in R
Statistical Models in R Some Examples Steven Buechler Department of Mathematics 276B Hurley Hall; 1-6233 Fall, 2007 Outline Statistical Models Linear Models in R Regression Regression analysis is the appropriate
Moving from CS 61A Scheme to CS 61B Java
Moving from CS 61A Scheme to CS 61B Java Introduction Java is an object-oriented language. This document describes some of the differences between object-oriented programming in Scheme (which we hope you
Oracle SQL. Course Summary. Duration. Objectives
Oracle SQL Course Summary Identify the major structural components of the Oracle Database 11g Create reports of aggregated data Write SELECT statements that include queries Retrieve row and column data
Chapter 15 Functional Programming Languages
Chapter 15 Functional Programming Languages Introduction - The design of the imperative languages is based directly on the von Neumann architecture Efficiency (at least at first) is the primary concern,
Oracle Database: SQL and PL/SQL Fundamentals NEW
Oracle University Contact Us: + 38516306373 Oracle Database: SQL and PL/SQL Fundamentals NEW Duration: 5 Days What you will learn This Oracle Database: SQL and PL/SQL Fundamentals training delivers the
Statistical Models in R
Statistical Models in R Some Examples Steven Buechler Department of Mathematics 276B Hurley Hall; 1-6233 Fall, 2007 Outline Statistical Models Structure of models in R Model Assessment (Part IA) Anova
Oracle Database: SQL and PL/SQL Fundamentals
Oracle University Contact Us: 1.800.529.0165 Oracle Database: SQL and PL/SQL Fundamentals Duration: 5 Days What you will learn This course is designed to deliver the fundamentals of SQL and PL/SQL along
An Introduction to R. W. N. Venables, D. M. Smith and the R Core Team
An Introduction to R Notes on R: A Programming Environment for Data Analysis and Graphics Version 3.1.1 (2014-07-10) W. N. Venables, D. M. Smith and the R Core Team This manual is for R, version 3.1.1
1 Introduction. 2 An Interpreter. 2.1 Handling Source Code
1 Introduction The purpose of this assignment is to write an interpreter for a small subset of the Lisp programming language. The interpreter should be able to perform simple arithmetic and comparisons
R Language Definition
R Language Definition Version 3.2.3 (2015-12-10) DRAFT R Core Team This manual is for R, version 3.2.3 (2015-12-10). Copyright c 2000 2015 R Core Team Permission is granted to make and distribute verbatim
Clustering through Decision Tree Construction in Geology
Nonlinear Analysis: Modelling and Control, 2001, v. 6, No. 2, 29-41 Clustering through Decision Tree Construction in Geology Received: 22.10.2001 Accepted: 31.10.2001 A. Juozapavičius, V. Rapševičius Faculty
CS 141: Introduction to (Java) Programming: Exam 1 Jenny Orr Willamette University Fall 2013
Oct 4, 2013, p 1 Name: CS 141: Introduction to (Java) Programming: Exam 1 Jenny Orr Willamette University Fall 2013 1. (max 18) 4. (max 16) 2. (max 12) 5. (max 12) 3. (max 24) 6. (max 18) Total: (max 100)
VISUAL GUIDE to. RX Scripting. for Roulette Xtreme - System Designer 2.0
VISUAL GUIDE to RX Scripting for Roulette Xtreme - System Designer 2.0 UX Software - 2009 TABLE OF CONTENTS INTRODUCTION... ii What is this book about?... iii How to use this book... iii Time to start...
An Introduction to R. W. N. Venables, D. M. Smith and the R Core Team
An Introduction to R Notes on R: A Programming Environment for Data Analysis and Graphics Version 3.2.2 (2015-08-14) W. N. Venables, D. M. Smith and the R Core Team This manual is for R, version 3.2.2
org.rn.eg.db December 16, 2015 org.rn.egaccnum is an R object that contains mappings between Entrez Gene identifiers and GenBank accession numbers.
org.rn.eg.db December 16, 2015 org.rn.egaccnum Map Entrez Gene identifiers to GenBank Accession Numbers org.rn.egaccnum is an R object that contains mappings between Entrez Gene identifiers and GenBank
Wave Analytics Data Integration
Wave Analytics Data Integration Salesforce, Spring 16 @salesforcedocs Last updated: April 28, 2016 Copyright 2000 2016 salesforce.com, inc. All rights reserved. Salesforce is a registered trademark of
Notes from Week 1: Algorithms for sequential prediction
CS 683 Learning, Games, and Electronic Markets Spring 2007 Notes from Week 1: Algorithms for sequential prediction Instructor: Robert Kleinberg 22-26 Jan 2007 1 Introduction In this course we will be looking
Access Queries (Office 2003)
Access Queries (Office 2003) Technical Support Services Office of Information Technology, West Virginia University OIT Help Desk 293-4444 x 1 oit.wvu.edu/support/training/classmat/db/ Instructor: Kathy
Introduction to the R Language
Introduction to the R Language Functions Biostatistics 140.776 Functions Functions are created using the function() directive and are stored as R objects just like anything else. In particular, they are
Math Review. for the Quantitative Reasoning Measure of the GRE revised General Test
Math Review for the Quantitative Reasoning Measure of the GRE revised General Test www.ets.org Overview This Math Review will familiarize you with the mathematical skills and concepts that are important
Basics of using the R software
Basics of using the R software Experimental and Statistical Methods in Biological Sciences I Juulia Suvilehto NBE 10.9.2015 Demo sessions Demo sessions (Thu 14.15 17.00) x 5 Demos, example code, and exercises
DRAFT. Algebra 1 EOC Item Specifications
DRAFT Algebra 1 EOC Item Specifications The draft Florida Standards Assessment (FSA) Test Item Specifications (Specifications) are based upon the Florida Standards and the Florida Course Descriptions as
Oracle Database: SQL and PL/SQL Fundamentals
Oracle University Contact Us: +966 12 739 894 Oracle Database: SQL and PL/SQL Fundamentals Duration: 5 Days What you will learn This Oracle Database: SQL and PL/SQL Fundamentals training is designed to
Introduction to R June 2006
Introduction to R Introduction...3 What is R?...3 Availability & Installation...3 Documentation and Learning Resources...3 The R Environment...4 The R Console...4 Understanding R Basics...5 Managing your
Mathematical Induction. Lecture 10-11
Mathematical Induction Lecture 10-11 Menu Mathematical Induction Strong Induction Recursive Definitions Structural Induction Climbing an Infinite Ladder Suppose we have an infinite ladder: 1. We can reach
Package retrosheet. April 13, 2015
Type Package Package retrosheet April 13, 2015 Title Import Professional Baseball Data from 'Retrosheet' Version 1.0.2 Date 2015-03-17 Maintainer Richard Scriven A collection of tools
Psychology 205: Research Methods in Psychology
Psychology 205: Research Methods in Psychology Using R to analyze the data for study 2 Department of Psychology Northwestern University Evanston, Illinois USA November, 2012 1 / 38 Outline 1 Getting ready
Chapter One Introduction to Programming
Chapter One Introduction to Programming 1-1 Algorithm and Flowchart Algorithm is a step-by-step procedure for calculation. More precisely, algorithm is an effective method expressed as a finite list of
Arena 9.0 Basic Modules based on Arena Online Help
Arena 9.0 Basic Modules based on Arena Online Help Create This module is intended as the starting point for entities in a simulation model. Entities are created using a schedule or based on a time between
I PUC - Computer Science. Practical s Syllabus. Contents
I PUC - Computer Science Practical s Syllabus Contents Topics 1 Overview Of a Computer 1.1 Introduction 1.2 Functional Components of a computer (Working of each unit) 1.3 Evolution Of Computers 1.4 Generations
Introduction to Matlab
Introduction to Matlab Social Science Research Lab American University, Washington, D.C. Web. www.american.edu/provost/ctrl/pclabs.cfm Tel. x3862 Email. [email protected] Course Objective This course provides
Performing Queries Using PROC SQL (1)
SAS SQL Contents Performing queries using PROC SQL Performing advanced queries using PROC SQL Combining tables horizontally using PROC SQL Combining tables vertically using PROC SQL 2 Performing Queries
Sources: On the Web: Slides will be available on:
C programming Introduction The basics of algorithms Structure of a C code, compilation step Constant, variable type, variable scope Expression and operators: assignment, arithmetic operators, comparison,
Prof. Nicolai Meinshausen Regression FS 2014. R Exercises
Prof. Nicolai Meinshausen Regression FS 2014 R Exercises 1. The goal of this exercise is to get acquainted with different abilities of the R statistical software. It is recommended to use the distributed
Florida Math for College Readiness
Core Florida Math for College Readiness Florida Math for College Readiness provides a fourth-year math curriculum focused on developing the mastery of skills identified as critical to postsecondary readiness
The Epsilon-Delta Limit Definition:
The Epsilon-Delta Limit Definition: A Few Examples Nick Rauh 1. Prove that lim x a x 2 = a 2. (Since we leave a arbitrary, this is the same as showing x 2 is continuous.) Proof: Let > 0. We wish to find
The Not Quite R (NQR) Project: Explorations Using the Parrot Virtual Machine
The Not Quite R (NQR) Project: Explorations Using the Parrot Virtual Machine Michael J. Kane 1 and John W. Emerson 2 1 Yale Center for Analytical Sciences, Yale University 2 Department of Statistics, Yale
CS 241 Data Organization Coding Standards
CS 241 Data Organization Coding Standards Brooke Chenoweth University of New Mexico Spring 2016 CS-241 Coding Standards All projects and labs must follow the great and hallowed CS-241 coding standards.
Package uptimerobot. October 22, 2015
Type Package Version 1.0.0 Title Access the UptimeRobot Ping API Package uptimerobot October 22, 2015 Provide a set of wrappers to call all the endpoints of UptimeRobot API which includes various kind
Lines & Planes. Packages: linalg, plots. Commands: evalm, spacecurve, plot3d, display, solve, implicitplot, dotprod, seq, implicitplot3d.
Lines & Planes Introduction and Goals: This lab is simply to give you some practice with plotting straight lines and planes and how to do some basic problem solving with them. So the exercises will be
Object systems available in R. Why use classes? Information hiding. Statistics 771. R Object Systems Managing R Projects Creating R Packages
Object systems available in R Statistics 771 R Object Systems Managing R Projects Creating R Packages Douglas Bates R has two object systems available, known informally as the S3 and the S4 systems. S3
OVERVIEW OF R SOFTWARE AND PRACTICAL EXERCISE
OVERVIEW OF R SOFTWARE AND PRACTICAL EXERCISE Hukum Chandra Indian Agricultural Statistics Research Institute, New Delhi-110012 1. INTRODUCTION R is a free software environment for statistical computing
COMP 5138 Relational Database Management Systems. Week 5 : Basic SQL. Today s Agenda. Overview. Basic SQL Queries. Joins Queries
COMP 5138 Relational Database Management Systems Week 5 : Basic COMP5138 "Relational Database Managment Systems" J. Davis 2006 5-1 Today s Agenda Overview Basic Queries Joins Queries Aggregate Functions
ECON 424/CFRM 462 Introduction to Computational Finance and Financial Econometrics
ECON 424/CFRM 462 Introduction to Computational Finance and Financial Econometrics Eric Zivot Savery 348, email:[email protected], phone 543-6715 http://faculty.washington.edu/ezivot OH: Th 3:30-4:30 TA: Ming
Microsoft Access 3: Understanding and Creating Queries
Microsoft Access 3: Understanding and Creating Queries In Access Level 2, we learned how to perform basic data retrievals by using Search & Replace functions and Sort & Filter functions. For more complex
For example, estimate the population of the United States as 3 times 10⁸ and the
CCSS: Mathematics The Number System CCSS: Grade 8 8.NS.A. Know that there are numbers that are not rational, and approximate them by rational numbers. 8.NS.A.1. Understand informally that every number
Variables, Constants, and Data Types
Variables, Constants, and Data Types Primitive Data Types Variables, Initialization, and Assignment Constants Characters Strings Reading for this class: L&L, 2.1-2.3, App C 1 Primitive Data There are eight
Analyzing Flow Cytometry Data with Bioconductor
Introduction Data Analysis Analyzing Flow Cytometry Data with Bioconductor Nolwenn Le Meur, Deepayan Sarkar, Errol Strain, Byron Ellis, Perry Haaland, Florian Hahne Fred Hutchinson Cancer Research Center
Introduction to SQL for Data Scientists
Introduction to SQL for Data Scientists Ben O. Smith College of Business Administration University of Nebraska at Omaha Learning Objectives By the end of this document you will learn: 1. How to perform
Rigorous Software Development CSCI-GA 3033-009
Rigorous Software Development CSCI-GA 3033-009 Instructor: Thomas Wies Spring 2013 Lecture 11 Semantics of Programming Languages Denotational Semantics Meaning of a program is defined as the mathematical
Object Oriented Software Design
Object Oriented Software Design Introduction to Java - II Giuseppe Lipari http://retis.sssup.it/~lipari Scuola Superiore Sant Anna Pisa September 14, 2011 G. Lipari (Scuola Superiore Sant Anna) Introduction
Financial Risk Models in R: Factor Models for Asset Returns. Workshop Overview
Financial Risk Models in R: Factor Models for Asset Returns and Interest Rate Models Scottish Financial Risk Academy, March 15, 2011 Eric Zivot Robert Richards Chaired Professor of Economics Adjunct Professor,
15.062 Data Mining: Algorithms and Applications Matrix Math Review
.6 Data Mining: Algorithms and Applications Matrix Math Review The purpose of this document is to give a brief review of selected linear algebra concepts that will be useful for the course and to develop
Answer Key for California State Standards: Algebra I
Algebra I: Symbolic reasoning and calculations with symbols are central in algebra. Through the study of algebra, a student develops an understanding of the symbolic language of mathematics and the sciences.
Programming Languages & Tools
4 Programming Languages & Tools Almost any programming language one is familiar with can be used for computational work (despite the fact that some people believe strongly that their own favorite programming
The F distribution and the basic principle behind ANOVAs. Situating ANOVAs in the world of statistical tests
Tutorial The F distribution and the basic principle behind ANOVAs Bodo Winter 1 Updates: September 21, 2011; January 23, 2014; April 24, 2014; March 2, 2015 This tutorial focuses on understanding rather
This unit will lay the groundwork for later units where the students will extend this knowledge to quadratic and exponential functions.
Algebra I Overview View unit yearlong overview here Many of the concepts presented in Algebra I are progressions of concepts that were introduced in grades 6 through 8. The content presented in this course
BIO503 - Lecture 1 Introduction to the R language
BIO503 - Lecture 1 Introduction to the R language Bio503 January 2008, Aedin Culhane. I R, S and S-plus What is R? ˆ R is an environment for data analysis and visualization ˆ R is an open source implementation
Instant SQL Programming
Instant SQL Programming Joe Celko Wrox Press Ltd. INSTANT Table of Contents Introduction 1 What Can SQL Do for Me? 2 Who Should Use This Book? 2 How To Use This Book 3 What You Should Know 3 Conventions
Exercise 1: Python Language Basics
Exercise 1: Python Language Basics In this exercise we will cover the basic principles of the Python language. All languages have a standard set of functionality including the ability to comment code,
Chapter 4. Probability and Probability Distributions
Chapter 4. robability and robability Distributions Importance of Knowing robability To know whether a sample is not identical to the population from which it was selected, it is necessary to assess the
Oracle Database 10g: Introduction to SQL
Oracle University Contact Us: 1.800.529.0165 Oracle Database 10g: Introduction to SQL Duration: 5 Days What you will learn This course offers students an introduction to Oracle Database 10g database technology.
The continuous and discrete Fourier transforms
FYSA21 Mathematical Tools in Science The continuous and discrete Fourier transforms Lennart Lindegren Lund Observatory (Department of Astronomy, Lund University) 1 The continuous Fourier transform 1.1
An Introduction to R. W. N. Venables, D. M. Smith and the R Core Team
An Introduction to R Notes on R: A Programming Environment for Data Analysis and Graphics Version 3.2.0 (2015-04-16) W. N. Venables, D. M. Smith and the R Core Team This manual is for R, version 3.2.0
Visual Basic Programming. An Introduction
Visual Basic Programming An Introduction Why Visual Basic? Programming for the Windows User Interface is extremely complicated. Other Graphical User Interfaces (GUI) are no better. Visual Basic provides
Cluster Analysis using R
Cluster analysis or clustering is the task of assigning a set of objects into groups (called clusters) so that the objects in the same cluster are more similar (in some sense or another) to each other
Introduction to R software for statistical computing
to R software for statistical computing JoAnn Rudd Alvarez, MA [email protected] biostat.mc.vanderbilt.edu/joannalvarez Department of Biostatistics Division of Cancer Biostatistics Center for
2+2 Just type and press enter and the answer comes up ans = 4
Demonstration Red text = commands entered in the command window Black text = Matlab responses Blue text = comments 2+2 Just type and press enter and the answer comes up 4 sin(4)^2.5728 The elementary functions
VHDL Test Bench Tutorial
University of Pennsylvania Department of Electrical and Systems Engineering ESE171 - Digital Design Laboratory VHDL Test Bench Tutorial Purpose The goal of this tutorial is to demonstrate how to automate
Tutorial 3: Graphics and Exploratory Data Analysis in R Jason Pienaar and Tom Miller
Tutorial 3: Graphics and Exploratory Data Analysis in R Jason Pienaar and Tom Miller Getting to know the data An important first step before performing any kind of statistical analysis is to familiarize
Introduction to Parallel Programming and MapReduce
Introduction to Parallel Programming and MapReduce Audience and Pre-Requisites This tutorial covers the basics of parallel programming and the MapReduce programming model. The pre-requisites are significant
Exploratory data analysis (Chapter 2) Fall 2011
Exploratory data analysis (Chapter 2) Fall 2011 Data Examples Example 1: Survey Data 1 Data collected from a Stat 371 class in Fall 2005 2 They answered questions about their: gender, major, year in school,
Lecture 1: Systems of Linear Equations
MTH Elementary Matrix Algebra Professor Chao Huang Department of Mathematics and Statistics Wright State University Lecture 1 Systems of Linear Equations ² Systems of two linear equations with two variables
WHAT ARE MATHEMATICAL PROOFS AND WHY THEY ARE IMPORTANT?
WHAT ARE MATHEMATICAL PROOFS AND WHY THEY ARE IMPORTANT? introduction Many students seem to have trouble with the notion of a mathematical proof. People that come to a course like Math 216, who certainly
MATLAB Basics MATLAB numbers and numeric formats
MATLAB Basics MATLAB numbers and numeric formats All numerical variables are stored in MATLAB in double precision floating-point form. (In fact it is possible to force some variables to be of other types
1. Classification problems
Neural and Evolutionary Computing. Lab 1: Classification problems Machine Learning test data repository Weka data mining platform Introduction Scilab 1. Classification problems The main aim of a classification
MATH2210 Notebook 1 Fall Semester 2016/2017. 1 MATH2210 Notebook 1 3. 1.1 Solving Systems of Linear Equations... 3
MATH0 Notebook Fall Semester 06/07 prepared by Professor Jenny Baglivo c Copyright 009 07 by Jenny A. Baglivo. All Rights Reserved. Contents MATH0 Notebook 3. Solving Systems of Linear Equations........................
How To Find The Sample Space Of A Random Experiment In R (Programming)
Probability 4.1 Sample Spaces For a random experiment E, the set of all possible outcomes of E is called the sample space and is denoted by the letter S. For the coin-toss experiment, S would be the results
Data Mining with R: learning by case studies
Data Mining with R: learning by case studies Luis Torgo LIACC-FEP, University of Porto R. Campo Alegre, 823-4150 Porto, Portugal email: [email protected] http://www.liacc.up.pt/ ltorgo May 22, 2003 Preface
Introduction to R and UNIX Working with microarray data in a multi-user environment
Microarray Data Analysis Workshop MedVetNet Workshop, DTU 2008 Introduction to R and UNIX Working with microarray data in a multi-user environment Carsten Friis Media glna tnra GlnA TnrA C2 glnr C3 C5
Mathematics for Computer Science/Software Engineering. Notes for the course MSM1F3 Dr. R. A. Wilson
Mathematics for Computer Science/Software Engineering Notes for the course MSM1F3 Dr. R. A. Wilson October 1996 Chapter 1 Logic Lecture no. 1. We introduce the concept of a proposition, which is a statement
Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization. Learning Goals. GENOME 560, Spring 2012
Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization GENOME 560, Spring 2012 Data are interesting because they help us understand the world Genomics: Massive Amounts
JavaScript: Introduction to Scripting. 2008 Pearson Education, Inc. All rights reserved.
1 6 JavaScript: Introduction to Scripting 2 Comment is free, but facts are sacred. C. P. Scott The creditor hath a better memory than the debtor. James Howell When faced with a decision, I always ask,
High-Level Programming Languages. Nell Dale & John Lewis (adaptation by Michael Goldwasser)
High-Level Programming Languages Nell Dale & John Lewis (adaptation by Michael Goldwasser) Low-Level Languages What are disadvantages of low-level languages? (e.g., machine code or assembly code) Programming
How To Write A Data Processing Pipeline In R
New features and old concepts for handling large and streaming data in practice Simon Urbanek R Foundation Overview Motivation Custom connections Data processing pipelines Parallel processing Back-end
Classes and Methods for Spatial Data: the sp Package
Classes and Methods for Spatial Data: the sp Package Edzer Pebesma Roger S. Bivand Feb 2005 Contents 1 Introduction 2 2 Spatial data classes 2 3 Manipulating spatial objects 3 3.1 Standard methods..........................
Analysing equity portfolios in R
Analysing equity portfolios in R Using the portfolio package by David Kane and Jeff Enos Introduction 1 R is used by major financial institutions around the world to manage billions of dollars in equity
