NB: Namespaces and Paths

Programming for Data Science

What is a Namespace?

A Python module acts as a unit for organizing a collection of elements, or attributes:

  • functions
  • constants
  • classes
  • etc.

All of these things are accessed by the formula:

module.attribute

The module name in this formula defines the namespace for the attributes it contains.

So, a namespace holds a collection of currently defined names being used by a program.

You can think of it as something like a Python dictionary.

The keys are the names defined in the module or package and the values are the objects (attributes) those names refer to.

It’s a way of making sure variable and function names do not collide or get confused with each other.
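The dictionary analogy can be checked directly. The sketch below uses the standard library's math module purely as an illustration; the variable name math_namespace is our own.

```python
import math

# vars(module) returns the module's namespace: a dict that maps
# attribute names to the objects they refer to.
math_namespace = vars(math)

print(type(math_namespace))              # <class 'dict'>
print("pi" in math_namespace)            # True
print(math_namespace["pi"] is math.pi)   # True: both routes reach the same object
```

Looking up math.pi and looking up the key "pi" in the module's dictionary are two routes to the same object.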

So, two functions with the same name from different modules can be used in the same program if they are called with their namespaces:

module1.my_function()
module2.my_function()
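As a concrete case, the standard library ships two modules, math and cmath, that each define a function named sqrt. A minimal sketch of using both side by side (the variable names are our own):

```python
import math
import cmath

# Same function name, different namespaces, no collision.
real_root = math.sqrt(4)         # real-valued square root
complex_root = cmath.sqrt(-4)    # complex-valued square root

print(real_root)      # 2.0
print(complex_root)   # 2j
```

Because each call is qualified by its module name, Python never has to guess which sqrt is meant.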

Namespace Levels

Python has four namespace levels:

B: Built-In: Contains the names of all of Python’s built-in objects.

G: Global: Contains any names defined at the level of the main program.

  • A global namespace is also created for any module that your program imports.
  • In other words, global refers to the top-level namespace within a module or file.

E: Enclosing: Contains the names defined in an outer function, which are visible to any functions defined inside that function.

  • We saw this with nonlocal when going over functions and scope.

L: Local: Contains any names defined inside of a function.

To know the context in which a name has meaning, Python searches namespaces from the inside out.

L -> E -> G -> B
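The search order can be illustrated with a small sketch in which the same name, label, is defined at several levels. All function and variable names here are made up for the example.

```python
label = "global"

def outer():
    label = "enclosing"

    def inner_with_local():
        label = "local"
        return label          # found in L

    def inner_without_local():
        return label          # not in L, so found in E

    return inner_with_local(), inner_without_local()

def uses_global():
    return label              # not in L or E, so found in G

def uses_builtin():
    return len("abc")         # len is not in L, E, or G, so found in B

print(outer())          # ('local', 'enclosing')
print(uses_global())    # global
print(uses_builtin())   # 3
```

In each case Python stops at the first namespace that contains the name.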


Here is a demonstration of namespaces:

g = 100

def foo():
    x = y = z = 1
    print("Locals in foo:", locals())
    
    def bar():
        a = b = c = 2
        print("Locals in bar:", locals())
        print("Global g:", globals()['g'])
        
    bar()

foo()
Locals in foo: {'x': 1, 'y': 1, 'z': 1}
Locals in bar: {'a': 2, 'b': 2, 'c': 2}
Global g: 100

Notice how namespaces are related to scope.
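One way to see the relationship: a name assigned inside a function normally lands in that function's local namespace, while nonlocal redirects the assignment to the enclosing namespace. A minimal sketch, with hypothetical names make_counter and increment:

```python
def make_counter():
    count = 0                 # lives in make_counter's namespace

    def increment():
        nonlocal count        # rebind the enclosing name, not a new local
        count += 1
        return count

    return increment

counter = make_counter()
print(counter())   # 1
print(counter())   # 2
```

Without the nonlocal statement, count += 1 would raise UnboundLocalError, because the assignment would make count a local name of increment.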

How Python Finds Things

How does Python know where to find modules?

The interpreter keeps a list of places that it looks for modules or packages when you do an import.

You can access this list from the path attribute in the sys module.

import sys

sys.path
['/sfs/qumulo/qhome/rca2t/Documents/MSDS/DS5100/repo-book/notebooks/M09_PythonModules',
 '/apps/software/standard/core/jupyterlab/3.6.3-py3.11/lib/python311.zip',
 '/apps/software/standard/core/jupyterlab/3.6.3-py3.11/lib/python3.11',
 '/apps/software/standard/core/jupyterlab/3.6.3-py3.11/lib/python3.11/lib-dynload',
 '',
 '/home/rca2t/.local/lib/python3.11/site-packages',
 '/apps/software/standard/core/jupyterlab/3.6.3-py3.11/lib/python3.11/site-packages']
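To check where a particular module would be loaded from without importing it, the standard library offers importlib.util.find_spec. The sketch below looks up json as an arbitrary example; the module name no_such_module_xyz is deliberately made up and assumed not to be installed.

```python
import importlib.util

# Ask the import system where it would find the json package.
spec = importlib.util.find_spec("json")
print(spec.name)     # json
print(spec.origin)   # file the module would be loaded from (varies by system)

# For a top-level name that is not on the search path, find_spec returns None.
missing = importlib.util.find_spec("no_such_module_xyz")
print(missing)       # None
```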

You can edit that list to add or remove paths so that Python can find modules in a new place.

sys.path.append(some_local_dir)
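Here is a self-contained sketch of that idea. It assumes nothing about your file system: it creates a temporary directory to play the role of some_local_dir, writes a tiny module into it (hello_mod and GREETING are hypothetical names), and cleans up afterwards.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as some_local_dir:
    # Create a one-line module in a directory Python does not yet search.
    Path(some_local_dir, "hello_mod.py").write_text("GREETING = 'hi'\n")

    sys.path.append(some_local_dir)       # make the directory searchable
    importlib.invalidate_caches()         # let the import system notice the new file
    try:
        hello_mod = importlib.import_module("hello_mod")
        greeting = hello_mod.GREETING
    finally:
        # Undo the changes so the demo leaves no trace.
        sys.path.remove(some_local_dir)
        sys.modules.pop("hello_mod", None)

print(greeting)   # hi
```

Changes to sys.path last only for the current interpreter session, so this is best treated as a quick workaround rather than a way to install code.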

Relative vs Absolute Paths

You will sometimes see a dot . used in the import statements found in package initialization files.

It is used in the context of a from statement.

For example:

from . import funniest

or

from .funniest import joke

The dot is used to create relative path names to packages and modules.

This is in contrast to absolute path names, which is how Python accesses things by default.

With absolute path names, Python will interpret package and module paths from the perspective of the project directory.

The project directory contains the file that is importing the module, sometimes called the main file.

When you are just importing a module that is in your directory, the project directory is the directory of your script.

When you are importing a module that has been installed using setup.py, the project directory is the directory that contains the setup file.

We’ll look at setup files in another notebook.

With relative path names, the dot . in an import statement inside a package's __init__.py file refers to the package that contains that file, not to the location of the calling file.

For example, consider a package structure like this:

myproject/
    main.py
    mypackage/
        __init__.py
        module1.py
        module2.py

Imagine that main.py is your program, and inside of it you want to import the modules in mypackage to do some things.

Also imagine that __init__.py contains an import statement to pre-import module1 and module2:

from mypackage import module1, module2

So, you might do this from main.py:

import mypackage as mp

mp.module1.function1()

Now, when mypackage is imported into main.py, the path in __init__.py is interpreted from the perspective of the calling module, i.e. main.py.

This is why, even though __init__.py is inside of mypackage, its import statement still has to name mypackage, as if __init__.py were located in the directory that contains mypackage.

To override this behavior, and have __init__.py use paths relative to itself, you can use a dot . to stand for the current directory.

So, if you import module1 and module2 using a relative import with a dot ., it would look like this:

from . import module1, module2
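To try this without creating files by hand, the following sketch builds the mypackage layout from above in a temporary directory and imports it. The bodies of function1 and function2 are invented for the demo, and the temporary directory stands in for myproject.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as project_dir:
    pkg = Path(project_dir, "mypackage")
    pkg.mkdir()
    # __init__.py uses a relative import: the dot means "this package".
    (pkg / "__init__.py").write_text("from . import module1, module2\n")
    (pkg / "module1.py").write_text("def function1():\n    return 'one'\n")
    (pkg / "module2.py").write_text("def function2():\n    return 'two'\n")

    sys.path.insert(0, project_dir)       # play the role of main.py's directory
    importlib.invalidate_caches()
    try:
        mp = importlib.import_module("mypackage")
        result = mp.module1.function1()
        package_name = mp.module1.__package__
    finally:
        # Remove the demo package from the interpreter's state.
        sys.path.remove(project_dir)
        for mod_name in [n for n in sys.modules
                         if n == "mypackage" or n.startswith("mypackage.")]:
            del sys.modules[mod_name]

print(result)         # one
print(package_name)   # mypackage
```

The relative import keeps working if the package directory is renamed, since __init__.py never spells out the package's own name.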

This will probably not make much sense to you now.

We will learn more about paths when we go over creating your own packages.