## Getting started with R

**R Console: Input**

Assignment operator: <-

    > x <- 5
    > x          ## auto-printing
    [1] 5
    > print(x)   ## explicit printing
    [1] 5

The [1] indicates that x is a vector and that 5 is its first element. The hash (#) is used for comments.

    > x <- 1:20  ## the : operator creates integer sequences
    > x
     [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20

x is now an integer vector of length 20.

**R Data Types: Objects and Attributes**

**Data Types:**

- atomic classes: numeric, logical, character, integer, complex
- vectors, lists
- factors
- missing values
- data frames
- names

Vectors:

The most basic object is a vector. A vector can only contain objects of the same class. Empty vectors can be created with the vector() function.
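As a small sketch of vector(), assuming its two-argument form (a mode and a length): the call returns a vector filled with that mode's default value. The variable names are only for illustration.

```r
# vector(mode, length) creates a vector filled with the mode's default value
v <- vector("numeric", length = 5)    # 0 0 0 0 0
b <- vector("logical", length = 3)    # FALSE FALSE FALSE
s <- vector("character", length = 2)  # "" ""

class(v)   # "numeric"
length(v)  # 5
```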

Numbers:

Numbers in R are generally treated as numeric objects (double-precision real numbers). If you explicitly want an integer, you need to specify the L suffix, e.g. 1L.

The special number Inf represents infinity. NaN ("not a number") represents an undefined value.
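A quick way to see these rules at the console (a minimal sketch; the particular expressions are just illustrations):

```r
class(1)    # "numeric": a plain number is a double
class(1L)   # "integer": the L suffix requests an integer

1 / 0       # Inf
-1 / 0      # -Inf
1 / Inf     # 0, Inf can be used in ordinary calculations
0 / 0       # NaN, an undefined value
```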

Attributes:

- names, dimnames
- dimensions (e.g. matrices, arrays)
- class
- length
- other user-defined attributes
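Attributes can be inspected with attributes() and set individually with attr(). The sketch below uses a small matrix; the attribute name "note" is a made-up example of a user-defined attribute, not anything built in.

```r
m <- matrix(1:6, nrow = 2, ncol = 3)

attributes(m)   # a list containing $dim
dim(m)          # 2 3
length(m)       # 6

# Attach a user-defined attribute ("note" is a hypothetical name)
attr(m, "note") <- "example matrix"
attributes(m)   # now lists both $dim and $note
```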

**Creating Vectors:**

- The c() function can be used to create vectors of objects.
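For instance, one vector of each atomic class might be built like this (variable names are illustrative only):

```r
num <- c(0.5, 0.6)        # numeric
lgl <- c(TRUE, FALSE)     # logical
chr <- c("a", "b", "c")   # character
int <- 9:12               # integer (created with the : operator)
cpx <- c(1+0i, 2+4i)      # complex

# c() also concatenates existing vectors
more <- c(num, 0.7)
```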

**Mixing Objects:**

    > y <- c(1.7,'a')   ## character
    > y
    [1] "1.7" "a"
    > y <- c(TRUE,'a')  ## character
    > y
    [1] "TRUE" "a"
    > y <- c(TRUE,2)    ## numeric
    > y
    [1] 1 2

When different objects are mixed in a vector, coercion occurs so that every element in the vector is of the same class.

**Explicit Coercion:**

- Objects can be explicitly coerced from one class to another using the as.* functions, if available.

Coercion that makes no sense produces NA:

    > x <- 0:6
    > x
    [1] 0 1 2 3 4 5 6
    > class(x)
    [1] "integer"
    > as.numeric(x)
    [1] 0 1 2 3 4 5 6
    > as.logical(x)
    [1] FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
    > as.character(x)
    [1] "0" "1" "2" "3" "4" "5" "6"
    > x <- c("a","b","c")
    > as.numeric(x)
    [1] NA NA NA
    Warning message:
    NAs introduced by coercion
    > as.logical(x)
    [1] NA NA NA
    > as.complex(x)
    [1] NA NA NA
    Warning message:
    NAs introduced by coercion

**Lists:**

- Lists are a special type of vector that can contain elements of different classes.

For example:

    > x <- list(1,"a",T,1+4i)
    > x
    [[1]]
    [1] 1

    [[2]]
    [1] "a"

    [[3]]
    [1] TRUE

    [[4]]
    [1] 1+4i

**Matrices:**

- Matrices are vectors with a dimension attribute.
- The dimension attribute is itself an integer vector of length 2 (nrow, ncol).

Matrices are filled column by column:

    > m <- matrix(nrow = 2, ncol = 3)
    > m
         [,1] [,2] [,3]
    [1,]   NA   NA   NA
    [2,]   NA   NA   NA
    > dim(m)
    [1] 2 3
    > attributes(m)
    $dim
    [1] 2 3

    > m <- matrix(1:6, nrow=2,ncol=3)
    > m
         [,1] [,2] [,3]
    [1,]    1    3    5
    [2,]    2    4    6
    > m <- 1:10
    > m
     [1]  1  2  3  4  5  6  7  8  9 10
    > dim(m) <- c(2,5)
    > m
         [,1] [,2] [,3] [,4] [,5]
    [1,]    1    3    5    7    9
    [2,]    2    4    6    8   10

cbind-ing and rbind-ing:

    > x <- 1:3
    > y <- 10:12
    > cbind(x,y)
         x  y
    [1,] 1 10
    [2,] 2 11
    [3,] 3 12
    > rbind(x,y)
      [,1] [,2] [,3]
    x    1    2    3
    y   10   11   12

**Factors:**

- Factors are used to represent categorical data.
- Factors can be unordered or ordered.
- One can think of a factor as an integer vector where each integer has a label.
- Factors are treated specially by modeling functions like lm() and glm().
- Using factors with labels is better than using integers because factors are self-describing; e.g. a variable with values "Male" and "Female" is better than one with values 1 and 2.

By default the levels are sorted alphabetically:

    > x <- factor(c("yes","yes","no","yes","no"))
    > x
    [1] yes yes no  yes no
    Levels: no yes
    > table(x)
    x
     no yes
      2   3
    > unclass(x)
    [1] 2 2 1 2 1
    attr(,"levels")
    [1] "no"  "yes"

The order of the levels can be set using the levels argument to factor():

    > x <- factor(c("yes","yes","no","yes","no"), levels = c("yes","no"))
    > x
    [1] yes yes no  yes no
    Levels: yes no
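The examples above are unordered factors. As a rough sketch of the ordered case, factor() accepts ordered = TRUE, after which the levels can be compared. The data values and the name sizes are invented for illustration.

```r
sizes <- factor(c("low", "high", "medium", "low"),
                levels = c("low", "medium", "high"),
                ordered = TRUE)

is.ordered(sizes)     # TRUE
levels(sizes)         # "low" "medium" "high"
as.integer(sizes)     # 1 3 2 1, the underlying integer codes
sizes[1] < sizes[2]   # TRUE, because "low" comes before "high"
```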

**Missing Values:**

- Missing values are denoted by NA, or by NaN for undefined mathematical operations.
- is.na() tests whether objects are NA.
- is.nan() tests for NaN.
- NA values have a class as well, so there are integer NA, character NA, etc.
- A NaN value is also NA, but the converse is not true.

For example:

    > x <- c(1,2,NA,10,3)
    > is.na(x)
    [1] FALSE FALSE  TRUE FALSE FALSE
    > is.nan(x)
    [1] FALSE FALSE FALSE FALSE FALSE
    > x <- c(1,2,NaN,NA,4)
    > is.na(x)
    [1] FALSE FALSE  TRUE  TRUE FALSE
    > is.nan(x)
    [1] FALSE FALSE  TRUE FALSE FALSE
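The point that NA values carry a class can be checked with R's typed NA constants. This is a short sketch, not an exhaustive list:

```r
class(NA)             # "logical", the default NA
class(NA_integer_)    # "integer"
class(NA_real_)       # "numeric"
class(NA_character_)  # "character"

is.na(NaN)   # TRUE:  a NaN counts as NA
is.nan(NA)   # FALSE: an NA is not NaN
```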

**Data Frames:**

- Data frames are used to store tabular data.
- They are represented as a special type of list where every element of the list has to have the same length.
- Each element of the list can be thought of as a column, and the length of each element is the number of rows.
- Unlike matrices, data frames can store a different class of object in each column (just like lists); every element of a matrix must be of the same class.
- Data frames also have a special attribute called row.names.
- Data frames are usually created by calling read.table() or read.csv().
- A data frame can be converted to a matrix by calling data.matrix().

For example:

    > x <- data.frame(foo = 1:4, bar = c(T,T,F,F))
    > x
      foo   bar
    1   1  TRUE
    2   2  TRUE
    3   3 FALSE
    4   4 FALSE
    > nrow(x)
    [1] 4
    > ncol(x)
    [1] 2
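To illustrate row.names and data.matrix() on the same kind of data frame, here is a small sketch. The names df and mat are illustrative, and the data frame is built directly rather than read from a file.

```r
df <- data.frame(foo = 1:4, bar = c(TRUE, TRUE, FALSE, FALSE))

row.names(df)       # "1" "2" "3" "4", the default row names
sapply(df, class)   # each column keeps its own class

# data.matrix() forces every column to a common numeric type,
# so the logical column becomes 1/0
mat <- data.matrix(df)
dim(mat)            # 4 2
mat[, "bar"]        # the values 1 1 0 0
```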

**Names:**

- R objects can also have names, which is very useful for writing readable code and self-describing objects.

For example:

    > x <- 1:3
    > names(x)
    NULL
    > names(x) <- c("foo","bar","nrof")
    > x
     foo  bar nrof
       1    2    3
    > names(x)
    [1] "foo"  "bar"  "nrof"

Lists can also have names:

    > x <- list(a=1,b=2,c=3)
    > x
    $a
    [1] 1

    $b
    [1] 2

    $c
    [1] 3

Matrices can have names too, set with dimnames():

    > m <- matrix(1:4,nrow=2,ncol=2)
    > dimnames(m) <- list(c("a","b"),c("c","d"))
    > m
      c d
    a 1 3
    b 2 4
