if/else Statements
You’ve seen this in Python.
They work the same way in both languages.
Here’s their syntax in R.
if(<condition>) {
  ## do something
}
if(<condition>) {
  ## do something
} else {
  ## do something else
}
if(<condition1>) {
  ## do something
} else if(<condition2>) {
  ## do something different
} else {
  ## do something different
}
Generate a uniform random number:
x <- runif(1, 0, 10) # From the Uniform Distribution
if(x > 3) {
  y <- 10
} else {
  y <- 0
}
x
[1] 8.408442
y
[1] 10
You can assign the result of an if statement to a variable.
z <- if(x > 3) {
  10
} else {
  0
}
z
[1] 10
You can stack if blocks, too.
if(<condition1>) {
}
if(<condition2>) {
}
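As a small illustration of stacked if blocks, the sketch below checks two independent conditions on the same value. Unlike an else if chain, every condition is evaluated, so more than one block can run. The names score and labels are illustrative and do not come from the original text.

```r
score <- 85
labels <- character(0)

# Each if is evaluated on its own, so both blocks can run
if (score > 50) {
  labels <- c(labels, "pass")
}
if (score > 80) {
  labels <- c(labels, "distinction")
}

labels
```

With score set to 85 both conditions hold, so labels ends up with two entries. With an else if chain only the first matching branch would run.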
for Loops
For loops are straightforward. They take an iterator variable, e.g. i, and assign it successive values from a sequence or vector.
For loops are often used to iterate over the elements of an object (list, vector, etc.).
for(i in 1:10) {
print(i)
}
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
[1] 6
[1] 7
[1] 8
[1] 9
[1] 10
According to Hadley Wickham, for loops are pretty much the only looping construct that you will need in R.
The following three loops all have the same behavior.
x <- c("a", "b", "c", "d")
for (i in 1:4) {
  # Print out each element of 'x'
  print(x[i])
}
[1] "a"
[1] "b"
[1] "c"
[1] "d"
seq_along()
The seq_along() function is commonly used in conjunction with for loops in order to generate an integer sequence based on the length of an object (in this case, the object x).
x
[1] "a" "b" "c" "d"
Generate a sequence based on length of ‘x’:
for(i in seq_along(x)) {
print(x[i])
}
[1] "a"
[1] "b"
[1] "c"
[1] "d"
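One reason seq_along() is often preferred over 1:length(x) shows up with empty objects. The sketch below is an illustration under that assumption; the names empty, bad_idx, good_idx and n_iterations are made up for the example.

```r
empty <- character(0)

# 1:length(empty) is 1:0, which counts down: 1, 0
bad_idx <- 1:length(empty)

# seq_along(empty) is a zero-length sequence, so a loop body never runs
good_idx <- seq_along(empty)

n_iterations <- 0
for (i in good_idx) {
  n_iterations <- n_iterations + 1
}

bad_idx
good_idx
n_iterations
```

A loop over bad_idx would run twice on an empty vector, while the loop over good_idx does not run at all.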
It is not necessary to use an index-type variable.
for(letter in x) {
print(letter)
}
[1] "a"
[1] "b"
[1] "c"
[1] "d"
For one-line loops, the curly braces are not strictly necessary.
for(i in 1:4) print(x[i])
[1] "a"
[1] "b"
[1] "c"
[1] "d"
Nested for loops
For loops can be nested inside of each other.
x <- matrix(1:6, 2, 3)
for(i in seq_len(nrow(x))) {
  for(j in seq_len(ncol(x))) {
    print(x[i, j])
  }
}
[1] 1
[1] 3
[1] 5
[1] 2
[1] 4
[1] 6
Nested loops are commonly needed to traverse or build multidimensional or hierarchical data structures (e.g. matrices, lists).
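To make the point about building a multidimensional structure concrete, here is a minimal sketch that fills a small matrix cell by cell. The names n_rows, n_cols and products are illustrative only.

```r
n_rows <- 2
n_cols <- 3

# Pre-allocate the result, then fill it one cell at a time
products <- matrix(0, nrow = n_rows, ncol = n_cols)
for (i in seq_len(n_rows)) {
  for (j in seq_len(n_cols)) {
    products[i, j] <- i * j
  }
}

products
```

Each cell holds the product of its row and column index. The outer loop walks the rows and the inner loop walks the columns within each row.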
while Loops
As with Python, while loops start with a condition. The loop runs while the condition is true and stops when it becomes false.
Remember, a while loop can go on forever if its condition never becomes false.
count <- 0
while(count < 10) {
  print(count)
  count <- count + 1
}
[1] 0
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
[1] 6
[1] 7
[1] 8
[1] 9
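One common way to guard against a runaway while loop is to add an iteration cap to the condition. The sketch below assumes a cap of 1000, which is an arbitrary choice, and uses illustrative names (value, iterations, max_iterations).

```r
value <- 100
iterations <- 0
max_iterations <- 1000  # safety cap (an assumption; choose what suits the task)

# Halve 'value' until it drops below 1, but never loop more than max_iterations times
while (value >= 1 && iterations < max_iterations) {
  value <- value / 2
  iterations <- iterations + 1
}

iterations
value
```

Here the loop ends on its own after 7 halvings, so the cap never triggers. It only matters if the main condition never becomes false.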
repeat Loops
R also provides repeat loops. They initiate an infinite loop right from the start.
The only way to exit a repeat loop is to call break on an internal condition.
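A minimal sketch of a repeat loop with an internal exit condition might look like this. The running-total task and the names total and n are invented for illustration.

```r
total <- 0
n <- 0

repeat {
  n <- n + 1
  total <- total + n
  # Internal exit condition: stop once the running total passes 20
  if (total > 20) {
    break
  }
}

n
total
```

The loop adds 1, 2, 3, and so on, and exits on the sixth pass, when the total reaches 21. Without the break it would never stop.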
next and break
next is used to skip an iteration of a loop. It is the same as Python’s continue.
for (i in 1:100) {
  if (i <= 20) {
    # Skip the first 20 iterations
    next
  }
  # Do something here
}
break is used to exit a loop immediately.
for (i in 1:100) {
  print(i)
  if (i > 20) {
    # Stop the loop once i exceeds 20, so 21 is the last value printed
    break
  }
}
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
[1] 6
[1] 7
[1] 8
[1] 9
[1] 10
[1] 11
[1] 12
[1] 13
[1] 14
[1] 15
[1] 16
[1] 17
[1] 18
[1] 19
[1] 20
[1] 21
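The two keywords can also be combined in one loop. The following sketch, with the made-up name odd_values, collects odd numbers and stops once the iterator passes 10.

```r
odd_values <- integer(0)

for (i in 1:100) {
  if (i %% 2 == 0) {
    # Even number: skip the rest of this iteration
    next
  }
  if (i > 10) {
    # Leave the loop entirely once i passes 10
    break
  }
  odd_values <- c(odd_values, i)
}

odd_values
```

The loop keeps 1, 3, 5, 7 and 9, skips every even number with next, and exits with break when i reaches 11.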
Define some data
x <- 5
xx <- c(4, 6, 7, 8, 2, 11)
Now, define a function that does the following:
- takes a value and a vector of values as inputs
- normalizes the value against the vector by subtracting the vector mean from the value and dividing by the vector standard deviation
compute_zscore <- function(val, vec) {
  z <- (val - mean(vec)) / sd(vec)
  z
}
print(compute_zscore(x, xx))
[1] -0.4244764
If the vector contains identical values, sd is zero, so the z-score is undefined. Here R divides by zero and returns Inf.
print(compute_zscore(x, c(1, 1, 1, 1)))
[1] Inf
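If an Inf result is not wanted, one option is to check the standard deviation before dividing. The function compute_zscore_safe below is a hypothetical variant, not part of the original notes, and returning NA is only one possible design choice.

```r
# Hypothetical variant of compute_zscore that returns NA when sd is zero
compute_zscore_safe <- function(val, vec) {
  s <- sd(vec)
  # isTRUE() keeps the check from failing when sd() itself returns NA
  if (isTRUE(s == 0)) {
    return(NA_real_)
  }
  (val - mean(vec)) / s
}

compute_zscore_safe(5, c(1, 1, 1, 1))
compute_zscore_safe(5, c(4, 6, 7, 8, 2, 11))
```

The first call returns NA instead of Inf, and the second gives the same z-score as the original function.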
If the vector contains missing values, the result will be NA.
xx_na <- c(1, NA, 3, 5)
print(compute_zscore(x, xx_na))
[1] NA
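Both mean() and sd() accept an na.rm argument, so one way to handle missing values is to pass it through. The function compute_zscore_na is a hypothetical variant written for this illustration.

```r
# Hypothetical variant: na.rm is passed through to mean() and sd()
compute_zscore_na <- function(val, vec, na.rm = FALSE) {
  (val - mean(vec, na.rm = na.rm)) / sd(vec, na.rm = na.rm)
}

xx_na <- c(1, NA, 3, 5)
compute_zscore_na(5, xx_na)                # still NA by default
compute_zscore_na(5, xx_na, na.rm = TRUE)  # computed from 1, 3, 5 only
```

With na.rm = TRUE the mean is 3 and the standard deviation is 2, so the z-score of 5 is 1. Whether dropping missing values is appropriate depends on the analysis.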
A function that returns 1 if the passed value is odd, 0 if even:
is_odd <- function(x) {
  if (x %% 2 == 1) {
    return(1)
  } else {
    return(0)
  }
}
Call to test some cases:
is_odd(4)
[1] 0
is_odd(3)
[1] 1
Function arguments can use default values:
threshold_vals <- function(p, thresh = 0.5) {
  # for each element in p, returns TRUE if value > thresh, else FALSE
  p > thresh
}
threshold_vals(c(0.6, 0.4, 0.1, 1))
[1] TRUE FALSE FALSE TRUE
Now, pass a threshold:
threshold_vals(c(0.6, 0.4, 0.1, 1), 0.7)
[1] FALSE FALSE FALSE TRUE
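Arguments can also be matched by name, which makes calls easier to read when a function has defaults. The sketch below repeats the threshold_vals definition so it runs on its own; probs, by_position and by_name are illustrative names.

```r
threshold_vals <- function(p, thresh = 0.5) {
  p > thresh
}

probs <- c(0.6, 0.4, 0.1, 1)

by_position <- threshold_vals(probs, 0.7)
# Named arguments can be given in any order
by_name <- threshold_vals(thresh = 0.7, p = probs)

identical(by_position, by_name)
```

Both calls give the same result. Naming thresh makes it clear which value is the cutoff.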
Assert important preconditions
add_vectors <- function(x, y) {
  # assert the lengths of vectors x and y match
  # if they do, sum elementwise, else throw error with stop()
  if (length(x) != length(y)) {
    stop("x and y must be the same length", call. = FALSE)
  }
  x + y
}
add_vectors(c(1, 2, 3), c(3, 3, 3))
[1] 4 5 6
add_vectors(c(1, 2, 3), c(3, 3, 3, 3)) # breaks
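To see what stop() produces without halting a script, the failing call can be wrapped in tryCatch(). This is a minimal sketch. The add_vectors definition is repeated so the block is self-contained, and result is an illustrative name.

```r
add_vectors <- function(x, y) {
  if (length(x) != length(y)) {
    stop("x and y must be the same length", call. = FALSE)
  }
  x + y
}

# Catch the error and keep its message instead of halting the script
result <- tryCatch(
  add_vectors(c(1, 2, 3), c(3, 3, 3, 3)),
  error = function(e) conditionMessage(e)
)

result
```

Here result holds the text passed to stop(). In real code the handler would more likely log the problem or return a fallback value.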
z <- 4
test_fcn <- function(x) {
  x^z
}
Now look at this:
test_fcn(2)
[1] 16
If z isn’t defined in the function, how does this work?
R’s scoping rules are similar to Python’s.
Since z isn’t defined inside the function, R looks for it in the environment where the function was defined, which here is the global environment.
For more on scoping rules, see Chapter 15: Scoping Rules of R in Peng’s R Programming For Data Science.
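A short sketch can show that the lookup depends on where a function is defined, not on where it is called from. The wrapper function caller is hypothetical and exists only for this demonstration.

```r
z <- 4
test_fcn <- function(x) {
  x^z
}

caller <- function() {
  z <- 10  # local to caller(); test_fcn() does not see it
  test_fcn(2)
}

caller()
```

The call returns 16, not 1024. test_fcn() finds z in the environment where it was defined and ignores the z inside caller().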