Programming for Data Science
Everything in Python is an object.
When we say ‘type,’ we mean object type.
Data types and data structures are both types of object.
Data types and structures are created by the way they are written.
When we write the raw values of a data type, we call these literals, meaning their literal value.
An integer is a sequence of one or more unquoted digits (Arabic numerals), optionally preceded by a sign.
100
100
643523453323
643523453323
-0
0
A float is a sequence of unquoted numerals with one and only one period.
3.14
3.14
Note that a period as a suffix or prefix will convert an integer to a float.
1, 1., .1
(1, 1.0, 0.1)
Note also that we are separating these numbers with commas.
Strings are sequences of quoted characters of any kind, i.e. numbers, letters, punctuation, etc.
Quotes may be single (') or double (").
The type of quote does not matter, but they must be straight quotes, not so-called “smart quotes” that some word processors use.
"foo"
'foo'
"1"
'1'
'foo'
'foo'
Note how Python’s internal representation of a string uses single quotes.
Note also that there is no explicit character type as in Java and other languages.
Some languages, such as Java and C, reserve a data type for single characters and create strings by putting these types into an array or some other data structure.
Boolean values are represented by the following unquoted reserved words:
True, False
(True, False)
Note that these are case-sensitive. The following will not work:
TRUE
NameError: name 'TRUE' is not defined
false
Python has a reserved data type for situations where there is no value to represent.
It evaluates to nothing!
None
print(None)
Complex numbers are created by combining imaginary numbers with other numbers.
Imaginary numbers are floats or integers with a j suffix.
5 + 0j
5.0 / 100 + .1j
5.0 ** .1j
You can always find out what object type you are working with by calling the type() function.
type(3.14)
type("foo")
type('foo')
type(True)
type(None)
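As a quick check, the following sketch applies type() to one literal of each kind introduced so far and collects the type names. The variable names literals and type_names are illustrative only.

```python
# One literal of each data type discussed above.
literals = [100, 3.14, "foo", True, None, 5 + 0j]

# type(value).__name__ gives the short name of the object type.
type_names = [type(value).__name__ for value in literals]
print(type_names)
```

Note that None has its own type, NoneType, and that True is a bool rather than a string.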
Data are assigned to variables using the assignment operator =, like so:
integerEx = 8
longIntEx = 22000000000000000000000
floatEx = 2.2
stringEx = "Hello"
booleanEx = True
noneEx = None
The variable name is always on the left, the value assigned to it on the right.
This is not the same as mathematical equality.
When assigned a value, a variable inherits the object type of the assigned value.
In other words, variables are assigned types dynamically.
This is in contrast to static typing, where you define variables by asserting what kind of data values they can hold.
Python figures out what type of data is being set to the variable and implicitly stores that info.
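A minimal sketch of dynamic typing: the same name can be bound first to an integer and then to a string, and type() follows the value each time. The names value, first_type, and second_type are illustrative.

```python
value = 8                  # value currently refers to an int
first_type = type(value)

value = "eight"            # rebinding the same name to a str is allowed
second_type = type(value)

print(first_type, second_type)
```

In a statically typed language the second assignment would usually be rejected at compile time.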
Variable names may contain any combination of letters, numerals, and underscores.
They may not begin with a numeral.
Note that type() returns the type of the value that a variable holds, not the type “variable”.
type(integerEx)
del()
When you assign a value to a variable, Python stores the variable name in memory along with its current value.
If you want to remove the variable from memory, you can delete it with the del statement, which may also be written with parentheses, as in del(x).
x = 101.25
x
del(x) # delete the variable x
x
Note that you cannot delete literal values!
del("foo")
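The sketch below shows one way to confirm that a deleted name is really gone: referring to it afterwards raises a NameError, which we catch here so the example runs to completion. The flag name_is_gone is an illustrative name.

```python
x = 101.25
del x                    # del is a statement; del(x) is equivalent

name_is_gone = False
try:
    x                    # the name x no longer exists
except NameError:
    name_is_gone = True

print(name_is_gone)
```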
id()
This function returns the identity of an object.
You can think of this number as the unique identifier of the variable in the table that Python uses to store variables and values in memory.
You can also think of it as the address of the object in memory.
This number is guaranteed to be unique and constant for the object during its lifetime, i.e. for as long as the object exists in the running program.
integer_example = 55
id(integer_example)
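To illustrate, here is a small sketch showing that a second name bound to the same object reports the same identity, and that the identity does not change between calls. The name alias is illustrative; the actual number printed will differ from run to run.

```python
integer_example = 55
alias = integer_example          # a second name for the same object

same_object = id(alias) == id(integer_example)
stable = id(integer_example) == id(integer_example)

print(id(integer_example), same_object, stable)
```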
It is possible to convert between types.
For example, you may want to convert a float into an integer to save memory.
The process of converting data of one kind into another is called casting.
Sometimes conversions are “lossy”: you lose information in the process.
int()
This converts a number or string into an integer where it makes sense to do so.
Float to Int
val = 3.8
val, type(val)
val_int = int(val)
val_int, type(val_int)
float()
This converts a string or integer into a float.
String to Float
val_str = '3.8'
val_str, type(val_str)
val_float = float(val_str)
val_float, type(val_float)
Note that converting a string that contains a decimal point directly to an integer will fail with a ValueError.
val_int = int(val_str)
val_int, type(val_int)
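One way around the failure, sketched here, is to convert the string to a float first and then to an integer. The example also shows that int() truncates toward zero rather than rounding. The names direct_failed, via_float, and negative are illustrative.

```python
val_str = "3.8"

# Direct conversion fails because "3.8" is not an integer literal.
try:
    int(val_str)
    direct_failed = False
except ValueError:
    direct_failed = True

via_float = int(float(val_str))   # 3: the fractional part is dropped
negative = int(-3.8)              # -3: truncation is toward zero

print(direct_failed, via_float, negative)
```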
ord()
This converts a character to its code point.
A code point is the internal number associated with each character in the character set used by Python.
The character set is called Unicode.
ord('a'), ord('A')
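The reverse conversion is done by the built-in chr() function, which is not covered above but pairs naturally with ord(). A short sketch of the round trip:

```python
code_lower = ord("a")        # 97
code_upper = ord("A")        # 65

# chr() maps a code point back to its character.
round_trip = chr(code_lower)

print(code_lower, code_upper, round_trip)
```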
If variables are nouns, and values meanings, then operators are verbs.
Each data type is associated with a set of operators that allow you to manipulate the data in a way that makes sense for its type.
For example, numeric data types are subject to mathematical operations, booleans to logical ones, and so forth.
The relationship between data types and operators is a microcosm of the relationship between data structures and algorithms.
Data structures imply algorithms and algorithms assume data structures.
Python supports all the basic operations:
x + y
Addition
x - y
Subtraction
x * y
Multiplication
x / y
Division
Here are some others:
//
Returns the quotient of a division rounded down to the nearest whole number (floor division).
5 // 2
-5 // 2
5.5 // 2
Note the data types of the returned values.
%
Returns the remainder of a division.
5 % 2
odd integers % 2 = 1
even integers % 2 = 0
Look at this …
5.5 / 2, 5.5 // 2, 5.5 % 2
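The following sketch ties // and % together. For any x and nonzero y, the identity x == (x // y) * y + (x % y) holds, which explains why -5 // 2 is -3 rather than -2. The variable names are illustrative.

```python
q, r = 5 // 2, 5 % 2          # 2 and 1
nq, nr = -5 // 2, -5 % 2      # -3 and 1: floor division rounds down
fq, fr = 5.5 // 2, 5.5 % 2    # 2.0 and 1.5: floats in, floats out

# The quotient and remainder always reconstruct the original value.
identity_holds = (-5 == nq * 2 + nr) and (5.5 == fq * 2 + fr)

print(q, r, nq, nr, fq, fr, identity_holds)
```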
**
Raises one number to the power of another.
5**3
+
When used with strings, the plus operator joins strings into larger strings.
The plus sign is an overloaded operator in Python.
my_string = 'This: '
my_2nd_string = my_string + ' Hello, world!'
my_2nd_string
*
This joins a string to itself as many times as specified.
print('-' * 80)
bart_S1E3 = 'I will not skateboard in the halls'
print((bart_S1E3 + '\n') * 5)
See them all :-)
=
We’ve used this already, but it too is an operator.
epoch = 20
print('epoch:', epoch)
Comparisons are True or False questions based on comparing two values.
==
0 == (10 % 5)
'Boo' == 'Hoo'
>, <, >=, and <=
10 < 5, 5. < 100 - 90
Note that we can compare strings, too:
'A' + 'B' == 'AB'
'A' < 'B'
We can compare relative magnitude because we are comparing their code points:
ord('A'), ord('B')
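Because the comparison is based on code points, uppercase letters sort before lowercase ones. A brief sketch, with illustrative variable names:

```python
codes = (ord("B"), ord("a"))            # (66, 97)

upper_before_lower = "B" < "a"          # True, since 66 < 97
case_matters = "apple" < "Banana"       # False, since ord("a") > ord("B")

print(codes, upper_before_lower, case_matters)
```

This is why a plain comparison of mixed-case strings may not match dictionary order.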
Watch out when comparing floats, though!
This works:
.5 == 1/2
But this does not:
x = 0.1 + 0.2
y = 0.3
x == y
This is because the two values are represented differently internally:
x, y
You can overcome this by rounding both values, like so:
round(x, 2) == round(y, 2)
But note this fails:
round(x, 20) == round(y, 20), round(x, 20), round(y, 20)
This is because of how Python (and computers in general) handle floating point numbers.
The best solution is to compare the values within a tolerance using math.isclose():
import math
math.isclose(x, y)
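A slightly fuller sketch of math.isclose() follows. By default it uses a relative tolerance (rel_tol=1e-09) and an absolute tolerance of zero, so comparisons against 0.0 need an explicit abs_tol. The variable names and the tolerance value 1e-9 are illustrative choices.

```python
import math

x = 0.1 + 0.2
y = 0.3

exact_match = x == y                    # False: the stored values differ slightly
close_match = math.isclose(x, y)        # True under the default relative tolerance

# Relative tolerance is of no help when one value is exactly zero.
near_zero_default = math.isclose(1e-10, 0.0)
near_zero_abs = math.isclose(1e-10, 0.0, abs_tol=1e-9)

print(exact_match, close_match, near_zero_default, near_zero_abs)
```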
!=
5/9 != 0.5555
Python uses words for logical operators where many other languages use symbols such as &&, ||, and !.
and, or, not
Note that we group comparisons with parentheses.
x = 10
(x % 10 == 0) or (x < -1)
(x % 10 == 0) and (x < -1)
not x == 5
is
The is keyword tests if two variables refer to the same object.
Depending on the object types involved, variables can sometimes point to the same object when we assign one variable to another.
The test returns True if the two objects are the same object and False if they are not.
Use the == operator to test if two variables store equal values.
is
x = 'foo'
y = 'foo'
z = 'bar'
x is y, x is z
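The difference between is and == is easier to see with lists, a data structure not yet introduced here, because two separately created lists are always distinct objects even when their contents are equal. A minimal sketch with illustrative names:

```python
a = [1, 2]
b = a            # b is a second name for the same list
c = [1, 2]       # c is a new list with equal contents

same_object = a is b         # True
equal_values = a == c        # True
distinct_objects = a is c    # False

print(same_object, equal_values, distinct_objects)
```

With short strings such as 'foo', Python may reuse a single object for equal literals, so is can return True there; that behavior is an implementation detail and should not be relied on.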
not
not flips the value of a boolean.
not True, not False, not 0, not 1, not 1000, not None
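When not is applied to a non-boolean value, Python first interprets the value as true or false: zero, None, and empty strings count as false, and other numbers count as true. The sketch below records the results of the expression above; the variable names are illustrative.

```python
results = (not True, not False, not 0, not 1, not 1000, not None)
empty_string_is_falsy = not ""

print(results, empty_string_is_falsy)
```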
is not tests if two variables do not point to the same object.
x is not y, x is not z
Python offers a shortcut for most operators. When updating a variable with an operation on that variable, such as:
my_var = my_var + 1 # Incrementing my_var
You can do this:
my_var += 1
Python supports many operators this way. Here are some:
a -= a
a += a
a /= a
a %= a
a *= a
a **= a
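Here is a short runnable sketch that applies several of these shortcut operators in sequence to one variable. The name total and the starting value are arbitrary.

```python
total = 10
total += 5     # 15
total -= 3     # 12
total *= 2     # 24
total /= 4     # 6.0: true division always returns a float
total **= 2    # 36.0
total %= 5     # 1.0

print(total)
```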
Variables, literal values, and operators are the building blocks of expressions.
For example, the following combines three operators and four literal values:
1 + 2 * 3 / 2
Python employs operator precedence when evaluating expressions:
P – Parentheses
E – Exponentiation
M – Multiplication
D – Division
A – Addition
S – Subtraction
You can use parentheses to group them to force the order of operations you want:
(1 + 2) * (3 / 2)
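The sketch below works through the two expressions above step by step. It also shows two details the mnemonic hides: multiplication and division share the same precedence and are evaluated left to right, and exponentiation binds more tightly than a leading minus sign. The variable names are illustrative.

```python
without_parens = 1 + 2 * 3 / 2       # 2 * 3 = 6, 6 / 2 = 3.0, 1 + 3.0 = 4.0
with_parens = (1 + 2) * (3 / 2)      # 3 * 1.5 = 4.5

left_to_right = 8 / 4 * 2            # (8 / 4) * 2 = 4.0, not 8 / (4 * 2)
power_first = -2 ** 2                # -(2 ** 2) = -4

print(without_parens, with_parens, left_to_right, power_first)
```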
Variables and literal values can be combined:
y = 5
m = 2.5
b = 10
y = m * 10 + b
y
y = m * 5 + b
y
Expressions can be very complex.
Expressions evaluate to a value, just as single variables do.
Therefore, they can be put anywhere a value is accepted.
int((y + 10) ** 8)