NB: Control Structures and Functions

if/else Statements

You’ve seen these in Python.

They work the same way in both languages.

Here’s their syntax in R.

if(<condition>) {
        ## do something
} 

if(<condition>) {
        ## do something
} else {
        ## do something else
}

if(<condition1>) {
        ## do something
} else if(<condition2>)  {
        ## do something different
} else {
        ## do something different
}

Generate a uniform random number:

x <- runif(1, 0, 10) # From the Uniform Distribution

if(x > 3) {
  y <- 10
} else {
  y <- 0
}

x
[1] 8.408442
y
[1] 10

You can also assign the result of an if statement to a variable.

z <- if(x > 3) {
  10
} else { 
  0
}

z
[1] 10

You can stack if blocks, too.

if(<condition1>) {

}

if(<condition2>) {

}
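
A small sketch of how stacked blocks differ from an else if chain: each stacked if is tested on its own, so more than one body can run, while a chain runs only the first matching branch. The names score, labels, and grade are made up for illustration.

```r
score <- 85  # hypothetical value for illustration

# Stacked if blocks: every condition is checked independently
labels <- character(0)
if (score > 50) {
  labels <- c(labels, "above 50")
}
if (score > 80) {
  labels <- c(labels, "above 80")
}
print(labels)  # both blocks ran

# else if chain: only the first matching branch runs
grade <- if (score > 80) {
  "high"
} else if (score > 50) {
  "medium"
} else {
  "low"
}
print(grade)
```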

Control Structures

for Loops

For loops are straightforward. They take an iterator variable, e.g. i, and assign it successive values from a sequence or vector.

For loops are often used to iterate over the elements of an object (list, vector, etc.).

for(i in 1:10) {
  print(i)
}
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
[1] 6
[1] 7
[1] 8
[1] 9
[1] 10

According to Hadley Wickham, for loops are pretty much the only looping construct that you will need in R.

The following three loops all have the same behavior.

x <- c("a", "b", "c", "d")
for (i in 1:4) {
  # Print out each element of 'x'
  print(x[i])  
}
[1] "a"
[1] "b"
[1] "c"
[1] "d"

seq_along()

The seq_along() function is commonly used in conjunction with for loops in order to generate an integer sequence based on the length of an object (in this case, the object x).

x
[1] "a" "b" "c" "d"

Generate a sequence based on length of ‘x’:

for(i in seq_along(x)) {   
  print(x[i])
}
[1] "a"
[1] "b"
[1] "c"
[1] "d"
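
One reason seq_along() is often preferred over 1:length(x) is the zero-length case. The sketch below uses a made-up empty vector to show the difference.

```r
empty <- character(0)  # hypothetical zero-length vector

# 1:length(empty) is 1:0, which counts DOWN and yields c(1, 0)
bad_index <- 1:length(empty)
print(bad_index)

# seq_along(empty) is a zero-length integer vector
good_index <- seq_along(empty)
print(length(good_index))

# So a loop over seq_along(empty) never runs its body
n_iterations <- 0
for (i in seq_along(empty)) {
  n_iterations <- n_iterations + 1
}
print(n_iterations)
```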

It is not necessary to use an index-type variable.

for(letter in x) {
  print(letter)
}
[1] "a"
[1] "b"
[1] "c"
[1] "d"

For one line loops, the curly braces are not strictly necessary.

for(i in 1:4) print(x[i])
[1] "a"
[1] "b"
[1] "c"
[1] "d"

Nested for loops

For loops can be nested inside of each other.

x <- matrix(1:6, 2, 3)
for(i in seq_len(nrow(x))) {
  for(j in seq_len(ncol(x))) {
    print(x[i, j])
  }   
}
[1] 1
[1] 3
[1] 5
[1] 2
[1] 4
[1] 6

Nested loops are commonly used to traverse or generate multidimensional or hierarchical data structures (e.g. matrices, lists).
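
As a sketch of the generating case, the loops below fill a small multiplication table cell by cell. The names n_rows, n_cols, and mult_table are made up for illustration.

```r
n_rows <- 3
n_cols <- 4

# Pre-allocate the result, then fill it one cell at a time
mult_table <- matrix(0, nrow = n_rows, ncol = n_cols)
for (i in seq_len(n_rows)) {
  for (j in seq_len(n_cols)) {
    mult_table[i, j] <- i * j
  }
}
print(mult_table)
```

For this particular table, outer(1:3, 1:4) produces the same values without an explicit loop.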

while Loops

As with Python, while loops start with a condition. They loop while the condition is true and stop when it becomes false.

Remember, while loops can go on forever if the condition never becomes false.

count <- 0
while(count < 10) {
  print(count)
  count <- count + 1
}
[1] 0
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
[1] 6
[1] 7
[1] 8
[1] 9
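
One common guard against a runaway loop is to add an iteration cap to the condition. This is a sketch with made-up names (value, steps, max_steps), not a required pattern.

```r
value <- 100        # hypothetical starting value
steps <- 0
max_steps <- 1000   # safety cap so the loop cannot run forever

# Halve 'value' until it drops below 1, or until the cap is reached
while (value >= 1 && steps < max_steps) {
  value <- value / 2
  steps <- steps + 1
}
print(steps)
print(value)
```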

repeat Loops

R also has repeat loops. They initiate an infinite loop right from the start.

The usual way to exit a repeat loop is to call break on an internal condition.
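
A minimal sketch of a repeat loop with an internal exit condition. The names total and n are made up for illustration.

```r
total <- 0
n <- 0

repeat {
  n <- n + 1
  total <- total + n
  # Internal exit condition: leave once the running total passes 20
  if (total > 20) {
    break
  }
}
print(n)
print(total)
```

Without the if block, this loop would never stop.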

next and break

next skips the rest of the current iteration and moves on to the next one. It works like continue in Python.

for (i in 1:100) {
  if (i <= 20) {
    # Skip the first 20 iterations
    next                
  }
  # Do something here
}
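
A runnable sketch of the same idea, where the skipped iterations have a visible effect. The name odd_sum is made up for illustration.

```r
odd_sum <- 0
for (i in 1:10) {
  if (i %% 2 == 0) {
    # Skip even numbers; the line below is not reached for them
    next
  }
  odd_sum <- odd_sum + i
}
print(odd_sum)  # sum of 1, 3, 5, 7, 9
```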

break is used to exit a loop immediately.

for (i in 1:100) {
  print(i)
  if (i > 20) {
    # Stop once i exceeds 20 (21 has already been printed)
    break  
  }     
}
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
[1] 6
[1] 7
[1] 8
[1] 9
[1] 10
[1] 11
[1] 12
[1] 13
[1] 14
[1] 15
[1] 16
[1] 17
[1] 18
[1] 19
[1] 20
[1] 21

Functions

Define some data

x <- 5                
xx <- c(4, 6, 7, 8, 2, 11)

Now, define a function that does the following:

- takes a value and a vector of values as inputs
- normalizes the value against the vector by subtracting the vector mean from the value and dividing by the vector standard deviation

compute_zscore <- function(val, vec) {
  # The last expression is the return value
  (val - mean(vec)) / sd(vec)
}

print(compute_zscore(x, xx))
[1] -0.4244764

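If the vector contains identical values, sd is zero, so the z-score is undefined. R does not raise an error for division by zero; it returns Inf, -Inf, or NaN (when the value equals the mean).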

print(compute_zscore(x, c(1, 1, 1, 1)))
[1] Inf

If the vector contains missing values, the result will be NA.

xx_na <- c(1, NA, 3, 5) 
print(compute_zscore(x, xx_na))
[1] NA
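
If dropping the missing values is acceptable, mean() and sd() both accept na.rm = TRUE. The variant below is a sketch; the function name compute_zscore_na and its na.rm argument are assumptions added for illustration.

```r
# Hypothetical variant of compute_zscore() that can drop missing values
compute_zscore_na <- function(val, vec, na.rm = FALSE) {
  (val - mean(vec, na.rm = na.rm)) / sd(vec, na.rm = na.rm)
}

xx_na <- c(1, NA, 3, 5)
z_default <- compute_zscore_na(5, xx_na)                # still NA
z_dropped <- compute_zscore_na(5, xx_na, na.rm = TRUE)  # uses 1, 3, 5 only
print(z_default)
print(z_dropped)
```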

Using conditions in functions

The following function returns 1 if the passed value is odd and 0 if it is even.

%% is the modulo operator (it returns the remainder).

is_odd <- function(x) {
  if (x %% 2 == 1) {
    return(1) 
  } else { 
    return(0)
  } 
}

Call to test some cases:

is_odd(4)
[1] 0
is_odd(3)
[1] 1
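
Note that if() expects a single TRUE or FALSE, so is_odd() as written is meant for one value at a time. A sketch of a vectorised variant follows; the name is_odd_vec is made up for illustration.

```r
# Hypothetical vectorised variant: the comparison works elementwise,
# and as.integer() turns TRUE/FALSE into 1/0
is_odd_vec <- function(x) {
  as.integer(x %% 2 == 1)
}

odd_flags <- is_odd_vec(c(4, 3, 10, 7))
print(odd_flags)
```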

Function arguments can use default values:

threshold_vals <- function(p, thresh = 0.5) {
  # for each element in p, returns TRUE if value > thresh, else FALSE
  p > thresh
}
threshold_vals(c(0.6, 0.4, 0.1, 1))
[1]  TRUE FALSE FALSE  TRUE

Now, pass a threshold:

threshold_vals(c(0.6, 0.4, 0.1, 1), 0.7)
[1] FALSE FALSE FALSE  TRUE
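
Arguments can also be passed by name, which makes the call easier to read and allows any order. A short sketch, repeating the function so the block runs on its own; the name probs is made up for illustration.

```r
threshold_vals <- function(p, thresh = 0.5) {
  p > thresh
}

probs <- c(0.6, 0.4, 0.1, 1)

# Naming the argument documents what 0.7 means
by_name <- threshold_vals(probs, thresh = 0.7)

# Named arguments can be given in any order
reordered <- threshold_vals(thresh = 0.7, p = probs)

print(by_name)
print(identical(by_name, reordered))
```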

Assert important preconditions

add_vectors <- function(x, y) {
  # assert the lengths of vectors x and y match
  # if they do, sum elementwise, else throw error with stop()

  if (length(x) != length(y)) {
    stop("x and y must be the same length", call. = FALSE)
  }
  x + y
}
add_vectors(c(1, 2, 3), c(3, 3, 3))
[1] 4 5 6
add_vectors(c(1, 2, 3), c(3, 3, 3, 3)) # stops with an error: the lengths differ
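
When a failed precondition should not halt the whole script, the error can be intercepted with tryCatch(). This is a sketch; returning NULL as the fallback is an assumption chosen for illustration.

```r
add_vectors <- function(x, y) {
  if (length(x) != length(y)) {
    stop("x and y must be the same length", call. = FALSE)
  }
  x + y
}

# tryCatch() runs the error handler instead of stopping the script
result <- tryCatch(
  add_vectors(c(1, 2, 3), c(3, 3, 3, 3)),
  error = function(e) {
    message("Caught: ", conditionMessage(e))
    NULL  # hypothetical fallback value
  }
)
print(is.null(result))
```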

Scoping Rules

z <- 4
test_fcn <- function(x) {
  x^z
}

Now look at this:

test_fcn(2)
[1] 16

If z isn’t defined in the function, how does this work?

R’s scoping rules are similar to Python’s.

Since z isn’t defined inside the function, R looks for it in the environment where the function was defined. Here that is the global environment, where z is 4.
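
A short sketch of the lookup order. The function test_fcn_local is made up for illustration; it shows that a local z takes precedence over the global one, and that test_fcn() sees whatever the global z is at call time.

```r
z <- 4
test_fcn <- function(x) {
  x^z  # 'z' is not an argument and not defined locally
}
global_result <- test_fcn(2)  # uses the global z = 4

# A 'z' defined inside the function shadows the global one
test_fcn_local <- function(x) {
  z <- 3
  x^z
}
local_result <- test_fcn_local(2)

# The lookup happens when the function is called, not when it is defined
z <- 5
updated_result <- test_fcn(2)

print(c(global_result, local_result, updated_result))
```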

For more on scoping rules, see Chapter 15: Scoping Rules of R in Peng’s R Programming For Data Science.