NB: R Data Types and Operators
Programming for Data Science
R supports the following basic data types:
- Numeric
- Integer
- Complex
- Logical
- Character
Let’s go over each briefly.
Numeric
Floating point numbers are called “numerics” in R.
Numeric is the default data type for numbers.
If we assign a decimal value to a variable x, x will be of numeric type:
x <- 10.5
x
We can learn the type of data stored by a variable with the class() function.
class(x)
Even if we assign an integer to a variable k, it will still be saved as a numeric value.
k <- 1
k
class(k)
That k is not an integer can be confirmed with is.integer():
is.integer(k)
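The difference between a value that looks like an integer and one that is stored as an integer can be checked directly. The following is a minimal sketch using base R only; typeof(), which is not covered above, reports the internal storage mode and is included here as a supplementary check.

```r
# A whole number typed without a suffix is stored as a double ("numeric").
k <- 1
class_k  <- class(k)        # "numeric"
is_int_k <- is.integer(k)   # FALSE
is_num_k <- is.numeric(k)   # TRUE

# typeof() reports the internal storage mode of the value.
type_k <- typeof(k)         # "double"

print(c(class_k, type_k))
print(c(is_int_k, is_num_k))
```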
Integers
To create an integer variable in R, we use as.integer().
y <- as.integer(3)
y
class(y)
is.integer(y)
We can also declare an integer by appending an L suffix.
y <- 3L
To see if a variable is an integer, you can use is.integer():
is.integer(y)
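We can also coerce, or cast, a numeric value into an integer with as.integer(). The fractional part is dropped, not rounded.
as.integer(3.14)
And we can parse a string that contains a decimal number in much the same way.
as.integer("5.27")
On the other hand, a string that does not represent a number cannot be parsed. The result is NA, and R issues a warning.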
as.integer("Joe")
Warning message in eval(expr, envir, enclos):
“NAs introduced by coercion”
We can convert logical values to integers this way, too: TRUE becomes 1 and FALSE becomes 0.
as.integer(TRUE)
as.integer(FALSE)
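The coercion rules above can be collected into one short sketch. It uses base R only; suppressWarnings() does not appear in the text above and is used here just to keep the non-numeric case quiet.

```r
# Truncation is toward zero, so negative values are not rounded down.
trunc_pos <- as.integer(3.99)     # 3
trunc_neg <- as.integer(-3.99)    # -3

# A string holding a number is parsed first, then truncated.
from_string <- as.integer("5.27") # 5

# A non-numeric string yields NA; suppressWarnings() hides the warning.
bad <- suppressWarnings(as.integer("Joe"))
bad_is_na <- is.na(bad)           # TRUE

# Logical values map to 1 and 0.
from_true  <- as.integer(TRUE)    # 1
from_false <- as.integer(FALSE)   # 0

print(c(trunc_pos, trunc_neg, from_string, from_true, from_false))
print(bad_is_na)
```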
Math Operators
Numerics and integers are subject to the standard array of arithmetic operations.
Operator | Description |
---|---|
+ | addition |
- | subtraction |
* | multiplication |
/ | division |
^ or ** | exponentiation |
%% | modulus (x mod y): 5 %% 2 is 1 |
%/% | integer division: 5 %/% 2 is 2 |
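To make the table concrete, here is a small sketch that applies each operator to the same pair of values. The variable names are illustrative only. The last lines show two details the table does not state: dividing two integers with / gives a numeric result, and %/% and %% round toward negative infinity for negative operands.

```r
a <- 7
b <- 2

sum_ab  <- a + b     # 9
diff_ab <- a - b     # 5
prod_ab <- a * b     # 14
quot_ab <- a / b     # 3.5
pow_ab  <- a ^ b     # 49
pow2_ab <- a ** b    # 49, ** is a synonym for ^
mod_ab  <- a %% b    # 1
idiv_ab <- a %/% b   # 3

# / always returns a numeric; %/% keeps integers as integers.
class_div  <- class(7L / 2L)     # "numeric"
class_idiv <- class(7L %/% 2L)   # "integer"

# With a negative operand the quotient is floored, not truncated.
neg_idiv <- (-7) %/% 2   # -4
neg_mod  <- (-7) %% 2    # 1

print(c(sum_ab, diff_ab, prod_ab, quot_ab, pow_ab, mod_ab, idiv_ab))
```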
Logical (Boolean)
Boolean data are called “logical” in R.
They take the values TRUE and FALSE, or T and F for short.
A logical value is often produced from the comparison between values.
x <- 1
y <- 2
z <- x > y
z
class(z)
Logical Operators
R supports the standard logical operations: & stands for “and”, | stands for “or”, and ! stands for negation.
u <- TRUE
v <- FALSE
u & v
u | v
!u
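In practice, these operators usually combine the results of comparisons. The sketch below shows that pattern; the variable names (age, has_ticket, and so on) are hypothetical and chosen only for illustration.

```r
age <- 25
has_ticket <- TRUE

# Comparisons produce logical values.
is_adult  <- age >= 18    # TRUE
is_senior <- age >= 65    # FALSE

# Logical operators combine them.
can_enter     <- is_adult & has_ticket    # TRUE
gets_discount <- is_senior | !is_adult    # FALSE

print(c(can_enter, gets_discount))
class(can_enter)   # "logical"
```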
Again, you may use T and F instead of TRUE and FALSE.
a <- T
b <- F
a & b
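One caveat about the short forms: TRUE and FALSE are reserved words, whereas T and F are ordinary variables that happen to start out bound to TRUE and FALSE, so they can be reassigned. The sketch below demonstrates this inside local(), so the reassignment does not leak into the rest of the session. It assumes T has not already been redefined.

```r
# In a fresh session, T is just a variable holding TRUE.
t_is_true <- identical(T, TRUE)   # TRUE

# Inside local(), T is rebound without affecting the outer environment.
shadowed <- local({
  T <- FALSE    # legal, and silently changes what T means here
  T & TRUE      # evaluates to FALSE because T is now FALSE
})

print(shadowed)   # FALSE
print(T)          # still TRUE outside local()

# By contrast, assigning to TRUE itself (TRUE <- FALSE) is an error.
```

For this reason, spelling out TRUE and FALSE is generally the safer habit in code that others will read or reuse.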
Characters
Strings are called “character” objects in R.
This may be confusing if you are coming from a language, such as Java, where “character” means an individual character, such as A.
We may convert non-character objects into characters with the as.character() function:
x <- as.character(3.14)
x
class(x) # print the class name of x
paste()
Two character values can be concatenated with the paste() function.
fname <- "Joe"
lname <- "Smith"
paste(fname, lname)
paste() takes a sep argument that sets the separator placed between the values; the default is a single space:
paste("A", "B", "C", sep="--")
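The effect of sep is easiest to see side by side. The sketch below uses illustrative variable names and also shows paste0(), a base R function not mentioned above that behaves like paste() with sep = "".

```r
first <- "Ada"
last  <- "Lovelace"

# Default separator is a single space.
full <- paste(first, last)                     # "Ada Lovelace"

# A custom separator.
dashed <- paste("A", "B", "C", sep = "--")     # "A--B--C"

# An empty separator, written two ways.
tight  <- paste("A", "B", "C", sep = "")       # "ABC"
tight0 <- paste0("A", "B", "C")                # "ABC"

# Non-character arguments are converted to character first.
mixed <- paste("Item", 3)                      # "Item 3"

print(c(full, dashed, tight, tight0, mixed))
```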
Note that R does not overload the + operator for strings, so it cannot be used for concatenation.
fname + lname
ERROR: Error in fname + lname: non-numeric argument to binary operator
sprintf()
It is often convenient to create a readable string with the sprintf() function, which uses C-style format specifiers such as %s for strings and %d for integers.
sprintf("%s has %d dollars", "Sam", 100)
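A few more format specifiers are sketched below, assuming the standard C-style conventions that sprintf() follows. The variable names are illustrative.

```r
name    <- "Sam"
dollars <- 100L

# %s inserts a string, %d an integer.
msg <- sprintf("%s has %d dollars", name, dollars)   # "Sam has 100 dollars"

# %.2f formats a number with two digits after the decimal point.
price <- sprintf("%.2f", 3.14159)                    # "3.14"

# %05d pads an integer with leading zeros to width 5.
padded <- sprintf("%05d", 42L)                       # "00042"

# %% produces a literal percent sign.
pct <- sprintf("%d%%", 50L)                          # "50%"

print(c(msg, price, padded, pct))
```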
substr()
To extract a substring, we apply the substr() function.
Here is an example showing how to extract the substring from the third through the twelfth position of a string. Positions are counted from 1, and both ends are included.
substr("Mary has a little lamb.", start=3, stop=12)
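Because positions are 1-based and inclusive, it helps to check a few ranges against the length of the string. The sketch below uses nchar(), a base R function not introduced above, to count the characters.

```r
s <- "Mary has a little lamb."

# Total number of characters in the string.
len <- nchar(s)                           # 23

# Characters 3 through 12, inclusive.
piece <- substr(s, start = 3, stop = 12)  # "ry has a l"

# The word "little" occupies positions 12 through 17.
word <- substr(s, 12, 17)                 # "little"

# A stop value past the end of the string is clipped to the end.
tail_part <- substr(s, 19, 100)           # "lamb."

print(c(piece, word, tail_part))
```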
sub()
To replace the first occurrence of the word “little” with the word “big” in a string, we apply the sub() function.
The pattern is interpreted as a regular expression.
sub("little", "big", "Mary has a little lamb.")
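Since the pattern is a regular expression and only the first match is replaced, two follow-up cases are worth sketching. The example below uses gsub(), the base R counterpart that replaces every match, and the fixed = TRUE argument, which treats the pattern as literal text. Neither appears in the text above.

```r
s <- "Mary has a little lamb."

# sub() replaces only the first match; gsub() replaces all of them.
first_only <- sub("a", "A", s)    # "MAry has a little lamb."
all_hits   <- gsub("a", "A", s)   # "MAry hAs A little lAmb."

# A regular expression: the first run of lowercase letters starting with "l".
regex_hit <- sub("l[a-z]+", "big", s)        # "Mary has a big lamb."

# In a regular expression "." matches any character ...
regex_dot <- sub(".", "!", s)                # "!ary has a little lamb."

# ... while fixed = TRUE matches a literal period.
literal <- sub(".", "!", s, fixed = TRUE)    # "Mary has a little lamb!"

print(c(first_only, all_hits, regex_hit, regex_dot, literal))
```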