NB: R Data Types and Operators

Programming for Data Science

This notebook covers four of R's basic data types: numeric, integer, logical, and character.

Let’s go over each briefly.

Numeric

Floating point numbers are called “numerics” in R.

Numeric is the default type for numbers.

If we assign a decimal value to a variable x, x will be of numeric type:

x <- 10.5      
x              
10.5

We can learn the type of data stored by a variable with the class() function.

class(x)      
'numeric'

Even if we assign a whole number to a variable k, it will still be saved as a numeric value.

k <- 1
k              
1
class(k)       
'numeric'

That k is not an integer can be confirmed with is.integer():

is.integer(k)  
FALSE
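
As a small illustrative sketch (the names k_type and k_class are ours), typeof() shows how the value is stored internally, while is.numeric() returns TRUE for numerics and integers alike.

```r
k <- 1
k_type  <- typeof(k)   # "double": the internal storage type
k_class <- class(k)    # "numeric"
is.numeric(k)          # TRUE
is.numeric(1L)         # TRUE as well: integers also pass is.numeric()
```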

Integers

To create an integer variable in R, we use as.integer().

y <- as.integer(3) 
y              
3
class(y)       
is.integer(y)  
'integer'
TRUE

We can also declare an integer by appending an L suffix.

y <- 3L 

To see if a variable is an integer, you can use is.integer():

is.integer(y) 
TRUE
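
The following sketch, with variable names a and b chosen only for illustration, suggests how the integer type behaves in arithmetic: it survives operations between integers but is promoted to numeric when mixed with a numeric value or divided with /.

```r
a <- 3L
b <- 2L
class(a + b)     # "integer"
class(a + 1)     # "numeric": mixing with a numeric promotes the result
class(a / b)     # "numeric": / always returns a numeric (here 1.5)
class(a %/% b)   # "integer": integer division of two integers
```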

We can also coerce, or cast, a numeric value into an integer with as.integer(). The fractional part is discarded rather than rounded.

as.integer(3.14)    
3
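
A brief sketch of the difference between truncating and rounding; the input values are arbitrary examples of ours.

```r
as.integer(3.99)          # 3: the fractional part is dropped
as.integer(-3.99)         # -3: truncation is toward zero, not downward
round(3.99)               # 4, but still of class "numeric"
as.integer(round(3.99))   # 4, now stored as an integer
```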

And we can parse a string for decimal values in much the same way.

as.integer("5.27")  
5

On the other hand, a string that does not represent a number cannot be parsed. The result is NA, along with a warning.

as.integer("Joe")   
Warning message in eval(expr, envir, enclos):
“NAs introduced by coercion”
<NA>
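
One possible way to deal with such failures, sketched here with a made-up input vector, is to silence the warning with suppressWarnings() and then look for the missing values with is.na().

```r
input  <- c("42", "Joe", "7")                    # hypothetical raw values
parsed <- suppressWarnings(as.integer(input))    # "Joe" becomes NA
parsed           # 42 NA 7
is.na(parsed)    # FALSE TRUE FALSE
```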

We can convert booleans to numbers this way, too.

as.integer(TRUE)    
as.integer(FALSE)   
1
0
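
Because TRUE counts as 1 and FALSE as 0, arithmetic functions can be applied to logical values directly. A minimal sketch, using a hypothetical vector named passed:

```r
passed <- c(TRUE, FALSE, TRUE, TRUE)   # hypothetical test results
sum(passed)     # 3: the number of TRUE values
mean(passed)    # 0.75: the proportion of TRUE values
```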

Math Operators

Numerics and integers are subject to the standard array of arithmetic operations.

Operator    Description
+           addition
-           subtraction
*           multiplication
/           division
^ or **     exponentiation
x %% y      modulus (x mod y); 5 %% 2 is 1
x %/% y     integer division; 5 %/% 2 is 2
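
A short sketch of these operators in use. The operands are arbitrary, and the last two lines show how R handles a negative operand: the modulus takes the sign of the divisor, and integer division rounds down.

```r
7 / 2         # 3.5
7 %/% 2       # 3
7 %% 2        # 1
2 ^ 10        # 1024
2 ** 10       # 1024: ** is a synonym for ^
(-7) %% 2     # 1: the result has the sign of the divisor
(-7) %/% 2    # -4: rounds toward negative infinity
```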

Logical (Boolean)

Boolean data are called “logical” in R.

They take the values TRUE and FALSE, or T and F for short.

A logical value is often produced by comparing two values.

x <- 1
y <- 2      
z <- x > y  
z           
FALSE
class(z) 
'logical'

Logical Operators

R supports the standard logical operations: & stands for “and”, | stands for “or”, and ! stands for negation.

u <- TRUE
v <- FALSE
u & v
FALSE
u | v
TRUE
!u
FALSE
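
These operators also work element by element on vectors. The sketch below uses two hypothetical vectors, p and q, and adds the base functions any(), all(), and xor() for comparison.

```r
p <- c(TRUE, TRUE, FALSE)
q <- c(TRUE, FALSE, FALSE)
p & q              # TRUE FALSE FALSE
p | q              # TRUE TRUE FALSE
!p                 # FALSE FALSE TRUE
any(p)             # TRUE: at least one element is TRUE
all(p)             # FALSE: not every element is TRUE
xor(TRUE, FALSE)   # TRUE: exactly one of the two is TRUE
```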

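Again, you may use T and F instead of TRUE and FALSE. Note, however, that T and F are ordinary variables that can be reassigned, whereas TRUE and FALSE are reserved words, so the long forms are safer.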

a <- T
b <- F
a & b
FALSE
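
The shorthand carries a risk: T and F are ordinary variables, not reserved words. A minimal sketch of what can go wrong (the names shadowed and restored are ours):

```r
T <- 0                   # legal: T is an ordinary variable name
shadowed <- isTRUE(T)    # FALSE, because T now holds 0
rm(T)                    # delete our variable; T refers to TRUE again
restored <- isTRUE(T)    # TRUE
```

By contrast, an assignment such as TRUE <- 0 is rejected with an error.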

Characters

Strings are called “character” objects in R.

This may be confusing if you are coming from a language, such as Java, where “character” means an individual character, such as A.

We may convert non-character objects into characters with the as.character() function:

x <- as.character(3.14) 
x
'3.14'
class(x)       # print the class name of x 
'character'
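
The conversion can be reversed with as.numeric(). A small sketch, with s and num as illustrative names:

```r
s <- as.character(3.14)
nchar(s)                # 4: the number of characters in "3.14"
num <- as.numeric(s)    # back to a number
class(num)              # "numeric"
num + 1                 # 4.14
```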

paste()

Two or more character values can be concatenated with the paste() function.

fname <- "Joe"
lname <- "Smith" 
paste(fname, lname) 
'Joe Smith'

By default, paste() separates its arguments with a single space. The sep argument changes the separator:

paste("A", "B", "C", sep="--")
'A--B--C'
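
Two related tools, sketched here with arbitrary inputs: paste0() joins its arguments with no separator, and the collapse argument joins the elements of a single vector into one string.

```r
paste0("A", "B", "C")                       # "ABC"
paste("A", "B", "C", sep = "")              # "ABC": same result as paste0()
paste(c("A", "B", "C"), collapse = "--")    # "A--B--C"
paste("id", 1:3, sep = "_")                 # "id_1" "id_2" "id_3"
```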

Note that R does not overload the + operator for string concatenation.

fname + lname
ERROR: Error in fname + lname: non-numeric argument to binary operator

sprintf()

It is often convenient to create a readable string with the sprintf() function, which uses C-style format specifiers such as %s for strings and %d for integers.

sprintf("%s has %d dollars", "Sam", 100) 
'Sam has 100 dollars'
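
A few more format specifiers, shown as a sketch with made-up values: %.2f fixes the number of decimal places, a width such as %5d pads the output, and %% prints a literal percent sign.

```r
sprintf("%.2f", 3.14159)                      # "3.14"
sprintf("%5d", 42L)                           # "   42": padded to width 5
sprintf("%05d", 42L)                          # "00042": padded with zeros
sprintf("%s scored %.1f%%", "Sam", 92.456)    # "Sam scored 92.5%"
# %d accepts only whole-valued numbers; sprintf("%d", 1.5) is an error.
```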

substr()

To extract a substring, we apply the substr() function.

Here is an example showing how to extract the substring from the third through the twelfth character of a string. Positions are counted from 1, and both ends are included.

substr("Mary has a little lamb.", start=3, stop=12) 
'ry has a l'
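
Combined with nchar(), substr() can also count from the end of a string, and it can be used on the left side of an assignment to overwrite characters in place. A sketch, with s and n as illustrative names:

```r
s <- "Mary has a little lamb."
n <- nchar(s)                 # 23
substr(s, 1, 4)               # "Mary"
substr(s, n - 4, n)           # "lamb.": the last five characters
substr(s, 1, 4) <- "Jane"     # overwrite positions 1 through 4
s                             # "Jane has a little lamb."
```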

sub()

To replace the first occurrence of the word “little” with the word “big” in the string, we apply the sub() function.

Its first argument is treated as a regular expression by default.

sub("little", "big", "Mary has a little lamb.") 
'Mary has a big lamb.'
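
The sketch below, using arbitrary example strings, contrasts sub() with gsub(), which replaces every occurrence, and shows the effect of fixed = TRUE, which turns off regular expression matching.

```r
sub("a", "A", "banana")                # "bAnana": first match only
gsub("a", "A", "banana")               # "bAnAnA": every match
sub(".", "!", "a.b")                   # "!.b": "." matches any character
sub(".", "!", "a.b", fixed = TRUE)     # "a!b": "." is taken literally
sub("l[a-z]+", "big", "Mary has a little lamb.")   # "Mary has a big lamb."
```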