class Ferrari458:
    "This is a Ferrari 458 object"
    cylinders = 8
    def print_origin(self):
        "Returns a string"
        return 'I was built in Italy!'
NB: Introducing Classes
Programming for Data Science
Introduction
Classes are a way of organizing code into bundles of variables and functions called attributes and methods.
Each class models something: a thing in the world, a process, a model, or just some convenient way of grouping code.
For example, a logistic regression model would have attributes like:
- weights
- an optional intercept term
- the maximum number of iterations
These attributes help describe the object; they give the object’s state.
The logistic regression model would have functionality such as:
- the optimization routine used in training
- a prediction function
The behavior, or functionality, is supported by methods, which are functions included in the class.
Here are a couple of other ways to think of a class:
- It provides a template for creating an object and for working with the object.
- It constitutes a kind of definition of something in the world.
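To make the logistic regression example above concrete, here is a minimal sketch of how such a class might bundle state and behavior. The class name TinyLogisticRegression, the learning_rate and fit_intercept parameters, and the single-feature simplification are all assumptions made for illustration; this is not how any particular library implements the model.

```python
import math


class TinyLogisticRegression:
    """Hypothetical one-feature logistic regression, for illustration only."""

    def __init__(self, max_iter=1000, learning_rate=0.1, fit_intercept=True):
        # Attributes: together these make up the state of the model.
        self.weight = 0.0
        self.intercept = 0.0
        self.fit_intercept = fit_intercept   # the intercept term is optional
        self.max_iter = max_iter             # maximum number of iterations
        self.learning_rate = learning_rate   # assumed step size

    def _probability(self, x):
        # Sigmoid of the linear score.
        return 1.0 / (1.0 + math.exp(-(self.weight * x + self.intercept)))

    def fit(self, xs, ys):
        # Method: the optimization routine (plain gradient descent on the log loss).
        n = len(xs)
        for _ in range(self.max_iter):
            errors = [self._probability(x) - y for x, y in zip(xs, ys)]
            self.weight -= self.learning_rate * sum(e * x for e, x in zip(errors, xs)) / n
            if self.fit_intercept:
                self.intercept -= self.learning_rate * sum(errors) / n
        return self

    def predict(self, x):
        # Method: predict 1 when the estimated probability is at least 0.5.
        return 1 if self._probability(x) >= 0.5 else 0


model = TinyLogisticRegression(max_iter=500)
model.fit([-2.0, -1.0, 1.0, 2.0], [0, 0, 1, 1])
print(model.predict(-1.5), model.predict(1.5))   # 0 1
```

The attributes (weight, intercept, max_iter) describe the model's state, while fit() and predict() supply its behavior.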
A First Example
Ok, let’s look at examples, starting with a very small, simple class.
The class, defined at the top of this page, contains:
- a name, Ferrari458
- a docstring for a quick description
- an attribute, which is number of cylinders in the engine
- a method
You can learn about the class by printing the docstring:
Ferrari458.__doc__
'This is a Ferrari 458 object'
You can also get detailed help like this:
help(Ferrari458)
Help on class Ferrari458 in module __main__:

class Ferrari458(builtins.object)
 |  This is a Ferrari 458 object
 |
 |  Methods defined here:
 |
 |  print_origin(self)
 |      Returns a string
 |
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
 |
 |  __dict__
 |      dictionary for instance variables (if defined)
 |
 |  __weakref__
 |      list of weak references to the object (if defined)
 |
 |  ----------------------------------------------------------------------
 |  Data and other attributes defined here:
 |
 |  cylinders = 8
Next, we create an object from the class (also called an instance of the class).
To do this, we call the class as if it were a function.
The process is called instantiation.
myferrari = Ferrari458()
We can show the number of cylinders of myferrari by using the object.attribute format:
myferrari.cylinders
8
We can call its method .print_origin() to learn where the car was built:
myferrari.print_origin()
'I was built in Italy!'
Note that the method takes self as its first argument.
By doing this, the method can use the self.attribute and self.method() pattern to access the attributes and methods contained in other parts of the class.
Here is an example, with the method .get_cylinders():
class Ferrari458_v2:
    """This is a Ferrari 458 object"""
    cylinders = 8
    def print_origin(self):
        return 'I was built in Italy!'
    def get_cylinders(self):
        return self.cylinders

myferrari = Ferrari458_v2()
myferrari.get_cylinders()
8
For the method .get_cylinders() to see the attribute cylinders, it has to access the attribute as a property of self.
This is because the scope inside the class doesn’t behave like global scope in a module.
Each method can only see what is inside of it, or what is global to the code that defined the class.
Instead of having globals, it has the shared variable self that methods can use to share information.
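The scoping rule described above can be checked directly. The following is a small sketch; ScopeDemo and its two method names are hypothetical and exist only to contrast a bare name with a lookup through self.

```python
class ScopeDemo:
    """Hypothetical class used only to illustrate scoping."""
    cylinders = 8

    def without_self(self):
        # 'cylinders' is neither a local nor a global name here,
        # so this lookup raises NameError when the method runs.
        return cylinders

    def with_self(self):
        # Attribute lookup on self finds the class attribute.
        return self.cylinders


demo = ScopeDemo()
print(demo.with_self())   # 8

try:
    demo.without_self()
except NameError as err:
    print("NameError:", err)
```

Note that the error appears only when without_self() is called, not when the class is defined, because Python resolves names inside a function body at run time.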
The Meaning of self
self stands for the instantiated object itself.
It is a proxy in the template for an actual instance of the class.
So, to repeat what was said earlier, if you want your method to access the other attributes and methods of an object, you need to put self as its first argument.
Note that when you use the method with an instance, you don’t pass the object name as an argument:
myferrari.get_cylinders()
The object name myferrari is passed implicitly by Python.
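One way to see this implicit passing is to compare the usual call with the equivalent call made through the class, where the instance is supplied by hand. This sketch repeats a trimmed copy of Ferrari458_v2 so that it runs on its own; the variable names implicit and explicit are just for illustration.

```python
class Ferrari458_v2:
    """This is a Ferrari 458 object"""
    cylinders = 8

    def get_cylinders(self):
        return self.cylinders


myferrari = Ferrari458_v2()

# The usual call: Python supplies myferrari as self.
implicit = myferrari.get_cylinders()

# The equivalent explicit call through the class.
explicit = Ferrari458_v2.get_cylinders(myferrari)

print(implicit, explicit)   # 8 8
```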
You can use any valid name you want for this first parameter, but the convention is to use self.
Note that self is only used within the methods of a class, not outside of them in the rest of the class definition.
Attributes defined outside of methods but inside a class are class attributes: they are stored on the class, and methods reach them through self.
The self variable is the mechanism that allows methods to share data without having to pass and return a bunch of variables.
Think of self as a data structure that stores the state of the object.
The .__init__() method
There is a special method called .__init__() that will initialize the state of an object when you create it.
Use it to supply more context-dependent information about your instance.
Let’s look at another version of the class with .__init__().
class Ferrari458_v3:
    """this is a Ferrari 458 object"""
    cylinders = 8
    def __init__(self, color):
        self.color = color
    def print_origin(self):
        return 'I was built in Italy!'
    def get_color(self):
        return self.color
By adding the .__init__() method, we require that a color be passed whenever an object is created.
If we don’t pass this parameter, there will be an error.
This is because we did not define a default value for the color argument in our initialization method.
ferr1 = Ferrari458_v3()
TypeError: __init__() missing 1 required positional argument: 'color'
This works:
ferr1 = Ferrari458_v3("red")
We can access the initialized attribute using the dot operator, just as if it were declared at the top of the class:
ferr1.color
'red'
Or we can call the accessor method that we created.
ferr1.get_color()
'red'
Note that even though we initialized the car object with “red”, we can always change it:
ferr1.color = "Cobalt"
ferr1.get_color()
'Cobalt'
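The TypeError shown earlier arises because color has no default value. As a sketch of the alternative, the hypothetical class Ferrari458_v4 below gives the parameter a default; the choice of "red" as the default is an assumption made purely for illustration.

```python
class Ferrari458_v4:
    """Hypothetical variant of Ferrari458_v3 with a default color."""
    cylinders = 8

    def __init__(self, color="red"):
        # "red" is an assumed default, chosen only for illustration.
        self.color = color

    def get_color(self):
        return self.color


ferr_default = Ferrari458_v4()       # no argument: the default is used
ferr_blue = Ferrari458_v4("blue")    # an explicit argument overrides it

print(ferr_default.get_color())   # red
print(ferr_blue.get_color())      # blue
```

Whether a default makes sense depends on the model: it is convenient when most instances share a value, and misleading when every instance really needs its own.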
Instance vs Class Attributes
Notice the difference between the cylinders and the color attributes.
class Ferrari458_v3:
    """this is a Ferrari 458 object"""
    cylinders = 8
    def __init__(self, color):
        self.color = color
The first is a class attribute.
It is defined outside of any method.
Its value will apply to all instances of the class, unless the instance overrides it.
The second is an instance attribute.
It is defined inside of a method.
Its value belongs to one instance and can differ from one instance to the next.
Look what happens if we change the value of cylinders in the class:
Ferrari458_v3.cylinders = 12
ferr1.cylinders
12
The value changes for every instance created from the class that has not overridden it.
Now, if we assign to the attribute on an instance, Python creates an instance attribute that shadows the class attribute, and the class is unaffected.
ferr1.cylinders = 4
Ferrari458_v3.cylinders
12
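To see where each value actually lives, the sketch below repeats the two assignments above and inspects the instances with the built-in vars(). The second instance ferr2 and its color are additions made here for comparison only.

```python
class Ferrari458_v3:
    """this is a Ferrari 458 object"""
    cylinders = 8

    def __init__(self, color):
        self.color = color


ferr1 = Ferrari458_v3("red")
ferr2 = Ferrari458_v3("yellow")   # second instance, added for comparison

Ferrari458_v3.cylinders = 12      # change the class attribute
ferr1.cylinders = 4               # create an instance attribute on ferr1 only

print(vars(ferr1))   # {'color': 'red', 'cylinders': 4}
print(vars(ferr2))   # {'color': 'yellow'}
print(ferr1.cylinders, ferr2.cylinders, Ferrari458_v3.cylinders)   # 4 12 12

del ferr1.cylinders      # remove the instance attribute
print(ferr1.cylinders)   # 12
```

Once the instance attribute is deleted, lookup on ferr1 falls back to the class attribute again.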
Summary and Additional Info
An object is a self-contained bundle of methods and attributes.
- Methods are basically functions.
- Attributes are basically variables.
A class definition is a template for creating objects.
- Objects are class instances.
- Classes are object types.
Objects have their own scope, like functions.
When objects are first created, they are often expected to have data passed to them.
- This is called initializing the object.
- These data are handled internally by the .__init__() method.
- Data that are passed this way may be overridden by accessing the attributes they are assigned to.
The methods of a class begin with self as the first argument.
- This stands for the instance itself.
- All methods and attributes are available to all other methods in the object through the self object.
If a method does not have self as its first argument, it cannot access the internal state or methods of the object.
- The internal state is just the attributes and their current values.
- These are called static methods. In Python they are declared with the @staticmethod decorator.
- Static methods are useful in providing functions to the environment in which their containing object is instantiated.
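As a brief sketch of a static method, the hypothetical class Ferrari458_v5 below adds a unit-conversion helper using Python's built-in @staticmethod decorator. The helper name cc_to_liters and the values passed to it are assumptions chosen for illustration, not facts about the car.

```python
class Ferrari458_v5:
    """Hypothetical variant with a static method."""
    cylinders = 8

    def get_cylinders(self):
        # Regular method: receives the instance as self.
        return self.cylinders

    @staticmethod
    def cc_to_liters(cc):
        # No self parameter: this function cannot see instance state.
        return cc / 1000


# A static method can be called on the class, with no instance at all.
print(Ferrari458_v5.cc_to_liters(4500))   # 4.5

# It can also be called on an instance; the instance is not passed in.
car = Ferrari458_v5()
print(car.cc_to_liters(3000))             # 3.0
```

Without the decorator, calling car.cc_to_liters(3000) would pass car as the first argument and fail with a TypeError, which is why the decorator matters when self is omitted.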
There is a lot more to the subject, but this is good enough to get started!