compute_zscore <- function(val, vec) {
  (val - mean(vec)) / sd(vec)
}
NB: R Functions
Programming for Data Science
Functions are fundamental to R, as with most programming languages.
Syntactically, R functions are constructed by the function statement and assigned to a variable.
my.function <- function(<args>) {
  <body>
  return(<return_value>)
}
<args> are arguments that may take default values.
Defaults are assigned with the = operator, not <-.
<body> is code executed when the function is called.
<return_value> is the value returned by return().
Note that return() is optional.
If it is not called, R returns the value of the last expression evaluated in the body.
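To illustrate the optional return(), here is a minimal sketch with two hypothetical functions, square_explicit and square_implicit (names chosen for illustration only), which should produce the same result:

```r
# Hypothetical helpers, not part of the original notes.
square_explicit <- function(x) {
  return(x^2)   # value handed back explicitly
}

square_implicit <- function(x) {
  x^2           # last evaluated expression becomes the return value
}

square_explicit(3)
square_implicit(3)
identical(square_explicit(3), square_implicit(3))
```

Both styles are common; an explicit return() is mostly useful for exiting a function early.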
Let’s look at an example.
Here we define a function that computes Z-scores by doing the following:
First, it takes a value and a vector of values as inputs.
Second, it normalizes the value against the vector by subtracting the vector mean from the value and dividing by the vector’s standard deviation.
Let’s test it with some sample data.
x <- 5
xx <- c(4, 6, 7, 8, 2, 11)
compute_zscore(x, xx)
Note that if the vector contains identical values, sd() is zero, so the z-score is undefined because the formula divides by zero.
compute_zscore(x, c(1, 1, 1, 1))
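R does not raise an error in this situation. As a small sketch (the function is restated so the snippet runs on its own, and constant_vec is an illustrative name), division by a zero standard deviation yields Inf when the numerator is nonzero and NaN when it is also zero:

```r
# Restated here so the snippet is self-contained.
compute_zscore <- function(val, vec) {
  (val - mean(vec)) / sd(vec)
}

constant_vec <- c(1, 1, 1, 1)    # sd(constant_vec) is 0

compute_zscore(5, constant_vec)  # nonzero numerator divided by zero
compute_zscore(1, constant_vec)  # zero divided by zero

is.infinite(compute_zscore(5, constant_vec))
is.nan(compute_zscore(1, constant_vec))
```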
Also, if the vector contains missing values, the result will be NA.
xx_na <- c(1, NA, 3, 5)
compute_zscore(x, xx_na)
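One possible way to handle missing values is to pass na.rm = TRUE to mean() and sd(). The sketch below assumes a hypothetical variant, compute_zscore_narm, with an added na_rm argument; neither name comes from the original function:

```r
# Hypothetical variant: na_rm is an added argument for illustration.
compute_zscore_narm <- function(val, vec, na_rm = TRUE) {
  (val - mean(vec, na.rm = na_rm)) / sd(vec, na.rm = na_rm)
}

xx_na <- c(1, NA, 3, 5)

compute_zscore_narm(5, xx_na)                 # NAs dropped before computing
compute_zscore_narm(5, xx_na, na_rm = FALSE)  # NA propagates, as before
```

Whether dropping missing values is appropriate depends on the analysis, so this is a design choice rather than a fix.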
Here’s another example.
We write a function that returns \(1\) if the passed value is odd and \(0\) if it is even.
Recall that %% is the modulus operator, which returns the remainder of a division operation.
is_odd <- function(x) {
  if (x %% 2 == 1) {
    return(1)
  } else {
    return(0)
  }
}
Call to test some cases:
is_odd(4)
is_odd(3)
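Because if() evaluates a single condition, is_odd is meant for one value at a time. A vectorized counterpart could look like the sketch below; is_odd_vec is a hypothetical name, and it returns integers rather than the doubles returned by is_odd:

```r
# Hypothetical vectorized counterpart to is_odd.
is_odd_vec <- function(x) {
  as.integer(x %% 2 == 1)   # TRUE/FALSE converted to 1/0 element-wise
}

is_odd_vec(c(4, 3, 10, 7))
```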
Default Argument Values
Function arguments can use default values:
threshold_vals <- function(p, thresh = 0.5) {
  p > thresh
}
Here we use the default thresh.
threshold_vals(c(0.6, 0.4, 0.1, 1))
- TRUE
- FALSE
- FALSE
- TRUE
Now, pass a different threshold:
threshold_vals(c(0.6, 0.4, 0.1, 1), 0.7)
- FALSE
- FALSE
- FALSE
- TRUE
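Arguments can also be matched by name, in which case their order does not matter. A brief sketch, restating threshold_vals so it runs on its own (probs, by_position, and by_name are illustrative names):

```r
threshold_vals <- function(p, thresh = 0.5) {
  p > thresh
}

probs <- c(0.6, 0.4, 0.1, 1)

by_position <- threshold_vals(probs, 0.7)           # matched by position
by_name <- threshold_vals(thresh = 0.7, p = probs)  # matched by name, any order

identical(by_position, by_name)
```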
Error Trapping
You can assert important preconditions with stop().
Here, we assert that the lengths of vectors x and y match.
If they don’t, we throw an error with stop().
add_vectors <- function(x, y) {
  if (length(x) != length(y)) {
    stop("x and y must be the same length", call. = FALSE)
  }
  x + y
}
add_vectors(c(1, 2, 3), c(3, 3, 3))
- 4
- 5
- 6
Let’s see if it traps this error:
add_vectors(c(1, 2, 3), c(3, 3, 3, 3))
ERROR: Error: x and y must be the same length
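An error raised by stop() halts execution unless the caller handles it. One way to handle it is base R's tryCatch(). The sketch below assumes a hypothetical wrapper, safe_add, that returns NULL on a length mismatch; that fallback is an illustrative choice, not something the original prescribes:

```r
add_vectors <- function(x, y) {
  if (length(x) != length(y)) {
    stop("x and y must be the same length", call. = FALSE)
  }
  x + y
}

# Hypothetical wrapper: reports the error and returns NULL instead of halting.
safe_add <- function(x, y) {
  tryCatch(
    add_vectors(x, y),
    error = function(e) {
      message("Caught: ", conditionMessage(e))
      NULL
    }
  )
}

safe_add(c(1, 2, 3), c(3, 3, 3))
safe_add(c(1, 2, 3), c(3, 3, 3, 3))
```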
Scoping Rules
Scoping rules for functions are similar to those in Python.
R uses the tinted glass model discussed earlier.
z <- 4
test_fcn <- function(x) {
  x^z
}
test_fcn(2)
Since z isn’t defined inside the function, R looks for it in the environment where the function was defined, which here is the global environment.
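The lookup happens each time the function is called, not when it is defined. A short sketch of this, where first and second are illustrative names:

```r
z <- 4
test_fcn <- function(x) {
  x^z
}

first <- test_fcn(2)    # z is 4 at this call
z <- 3
second <- test_fcn(2)   # z is 3 at this call

c(first, second)
```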
Note that R handles potential scope conflicts differently to Python.
Recall that Python would not have allowed the following to run, since the function treats m as both global and local.
m <- 5
test_2 <- function(x) {
  print(m)
  m <- x**2
  print(m)
}
test_2(10)
[1] 5
[1] 100
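The assignment inside the function creates a new local m and leaves the global one alone. The sketch below restates the example with comments and checks the global value afterwards (x^2 is used here as the equivalent of x**2):

```r
m <- 5
test_2 <- function(x) {
  print(m)    # no local m exists yet, so the global m is found
  m <- x^2    # creates a local m; the global m is untouched
  print(m)    # now refers to the local m
}

test_2(10)
m             # the global value is unchanged
```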