-
Notifications
You must be signed in to change notification settings - Fork 3
/
3-Functions.Rmd
162 lines (154 loc) · 3.59 KB
/
3-Functions.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
---
title: "3 - Functions"
author: "Joseph Rickert"
date: "September 28, 2015"
output: html_document
---
"A function is a group of instructions that takes inputs, uses them to compute other values, and returns the result" - Norm Matloff The Art of R Programming
## Get Some Data
```{r}
aq <- airquality[,1:4] # get the first four columns of the airquality data frame
dim(aq)
head(aq)
```
Make the aq column of airquality available in the global environment
```{r}
attach(aq)
```
Try to get a mean for Ozone.
```{r}
mean(Ozone) # Try to get a mean for Ozone
```
Eliminate NAs from the Ozone vetor and put the non-NA values into OZ
```{r}
Oz <- na.omit(Ozone)
length(Oz)
mean(Oz)
```
Get help with a function from the console
```{r}
help(mean) # getting help with an R function
#?mean # Same as the above
example(mean) # See an example of an R function
```
## Writing Functions
Let's write our own mean function
```{r}
jmean <- function(x){
m <- sum(x)/length(x) # where is m?
return(m)
}
jmean(Oz)
# R will through an error because m is not in the global environment
# m
```
Try again and improve the formatting
```{r}
jmean2 <- function(x){
m <- round(sum(x)/length(x),2)
return(m)
}
jmean2(Oz)
```
Why not give the user more control of the rounding process?
The magic 3 dots ... enable arguments to be passed to sub functions
```{r}
jmean3 <- function(x,...){
m <- round(sum(x)/length(x),...)
return(m)
}
#?round # look to see what parameters round is expecting
jmean3(Oz,3)
jmean3(Oz) # the default value for round is 0
jmean3(Oz,1)
```
## Functions calling Functions
How about giving the user a choice about which rounding function to use?
Look at the difference between round() and signif().
```{r}
pi
round(pi,4)
signif(pi,4)
```
This is the way to have one function call an other function
```{r}
jmean4 <- function(x,FUN,...){
m <- FUN(sum(x)/length(x),...)
return(m)
}
jmean4(Oz,round,4)
jmean4(Oz,signif,4)
```
## Functions with Defaults
One last try
- make round the default method of rounding
- make the default number of decimal digits 2
```{r}
jmean5 <- function(x,FUN = round,digits=3){
m <- FUN(sum(x)/length(x),digits)
return(m)
}
jmean5(Oz)
jmean5(Oz,FUN=signif)
jmean5(Oz,FUN=signif,digits=4)
```
## How to make a function return multiple values
A simple function that returns multiple values
```{r}
aq.noMV <- na.omit(aq)
mvsd <- function(x){
m <- sapply(x,mean)
v <- sapply(x,var)
s <- sapply(x,sd)
res <- list(m,v,s)
names(res) <- c("mean","variance","sd")
return(res)
}
mvsd(aq.noMV)
```
## Clean up the Global Environment
```{r}
detach(aq) # remove the aq data frame variables from the global environment
#
rm(aq) # remove aq from the global environment
#
#rm(list=ls()) # Get rid of everything. Be careful with this!!!
```
## Some Important, Helpful Functions
Missing Values
NA is the way to designate a missing value as we saw above.
```{r}
z <- c(1:3,NA)
z
```
But, the following logical expression is incomplete, and therefore undecideable, so everything is NA
```{r}
z == NA
```
Here is the proper way to search for NAs
```{r}
zm <- is.na(z)
zm
```
NaN is not a number
```{r}
z1 <- c(z,0/0) # NaN: not a number
z1
is.na(z1) # Finds NAs and NaNs
#
is.nan(z1) # Only finds NaNs
```
## Getting rid of NAs
na.omit removes the entire row containing an NA from a data frame
```{r}
a <- 1:10
b <- letters[1:10]
c <- LETTERS[11:20]
d <- 100:91
dF <- data.frame(a,b,c,d)
names(dF) <- c("v1","v2","v3","v4") # Give names to the variables in the data frame
dF$v1[3] <- NA
dF$v2[7] <- NA
dF
na.omit(dF)
```