NB: On Iterables and Iterators

Purpose
- Define iterables and iterators
- Using two methods, show how iterators can be used to return data from sets, lists, strings, tuples, and dicts:
  - for loops
  - iter() and next()

Specific Topics
- iterable objects or iterables
- iterators
- iteration
- sequence
- collection

Defining Iterables and Iterators

An iterable object, or iterable, is an object that can return its elements one at a time.

An iterator is an object that steps through an iterable, such as a set, list, tuple, dictionary, or string. It returns one element per request and remembers its position.

Iteration can be implemented:
- with a for loop
- with the built-in next() function
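The difference between an iterable and an iterator can be checked directly. The following is a minimal sketch that uses the Iterable and Iterator abstract base classes from the standard-library collections.abc module, which this notebook does not otherwise cover. The variable names are illustrative only.

```python
from collections.abc import Iterable, Iterator

words = ['alpha', 'beta']   # a list is an iterable ...
words_it = iter(words)      # ... and iter() returns an iterator over it

list_is_iterable = isinstance(words, Iterable)    # True
list_is_iterator = isinstance(words, Iterator)    # False: a list has no __next__
it_is_iterator = isinstance(words_it, Iterator)   # True
it_is_iterable = isinstance(words_it, Iterable)   # True: iterators are iterable too

print(list_is_iterable, list_is_iterator, it_is_iterator, it_is_iterable)
```

Every iterator is itself iterable, but not every iterable is an iterator.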

Next, we show examples for various iterables.

Lists

iterating using for

tokens = ['living room', 'was', 'quite', 'large']

for tok in tokens:
    print(tok)
living room
was
quite
large

iterating using iter() and next()

iter() returns an iterator over an iterable.

next() gets the next item from the iterator. Each call returns one value and advances the iterator.

tokens = ['living room','was','quite','large']
myit = iter(tokens)
print(next(myit)) 
print(next(myit)) 
print(next(myit)) 
print(next(myit)) 
living room
was
quite
large

Calling next() when the iterator has reached the end of the list produces an exception:

print(next(myit))
StopIteration: 

Next, look at the type of the iterator and at the documentation for next().

type(myit)
list_iterator
# help(myit)
help(next)
Help on built-in function next in module builtins:

next(...)
    next(iterator[, default])
    
    Return the next item from the iterator. If default is given and the iterator
    is exhausted, it is returned instead of raising StopIteration.
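As the help text says, passing a default to next() avoids the StopIteration exception. The sketch below reuses the tokens list from above and uses None as the default, which assumes None is never a legitimate element of the data.

```python
tokens = ['living room', 'was', 'quite', 'large']
myit = iter(tokens)

collected = []
while True:
    tok = next(myit, None)   # None comes back once the iterator is exhausted
    if tok is None:
        break
    collected.append(tok)

print(collected)
# The iterator stays exhausted, so the default is returned again
print(next(myit, 'no more tokens'))
```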

Note that for implicitly creates an iterator, calls next() on each loop iteration, and stops when StopIteration is raised. This is the best way to iterate through a list-like object.
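What for does behind the scenes can be written out by hand. This is a rough sketch of the equivalent while loop, not the interpreter's actual implementation. The names it and seen are illustrative.

```python
tokens = ['living room', 'was', 'quite', 'large']

seen = []
it = iter(tokens)        # done once, before the first pass
while True:
    try:
        tok = next(it)   # done at the start of each pass
    except StopIteration:
        break            # the loop ends quietly when the iterator is exhausted
    seen.append(tok)
    print(tok)
```

The for version is shorter and handles StopIteration for you, which is why it is preferred.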

Sequences and Collections

We iterated over a list. Next, we illustrate iteration for other iterables: str, tuple, set, and dict.

Lists, tuples, and strings are sequences. Sequences are designed so that elements come out of them in the same order they were put in.

Sets and dictionaries are not sequences: their elements cannot be accessed by integer position. In this notebook they are called collections. For sets, the ordering of the items is arbitrary.

NOTE: Dictionaries were originally unordered as well, but this changed in Python 3.7:
> the insertion-order preservation nature of dict objects has been declared to be an official part of the Python language spec.
(What’s New in Python 3.7)
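These differences show up when you try to index. The following small sketch assumes Python 3.7 or later for the dict ordering. The terms dict is made up for illustration.

```python
tokens = ['living room', 'was', 'quite', 'large']   # list: a sequence
princesses = {'belle', 'cinderella', 'rapunzel'}    # set: not a sequence

first_token = tokens[0]      # sequences support integer indexing

try:
    princesses[0]            # sets do not
    set_indexable = True
except TypeError:
    set_indexable = False

# Since Python 3.7, a dict iterates in insertion order
terms = {}
terms['spring'] = 1
terms['fall'] = 2
key_order = list(terms)

print(first_token, set_indexable, key_order)
```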

Sets

iterating using for

princesses = {'belle','cinderella','rapunzel'}

for princess in princesses:
    print(princess)
cinderella
belle
rapunzel

iterating using iter() and next()

princesses = {'belle','cinderella','rapunzel'}

myset = iter(princesses) # note: set has no notion of order
print(next(myset))
print(next(myset))
print(next(myset))
cinderella
belle
rapunzel
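Because set order is arbitrary, the output above may differ from run to run. If a reproducible order matters, one option is to sort the elements first. This is a sketch that assumes the elements can be compared with each other, as strings can.

```python
princesses = {'belle', 'cinderella', 'rapunzel'}

ordered = sorted(princesses)   # returns a new list; the set itself is unchanged
for princess in ordered:
    print(princess)
```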

Strings

iterating using for

strn = 'data'

for s in strn:
    print(s)
d
a
t
a

iterating using iter() and next()

st = iter(strn)

print(next(st))
print(next(st))
print(next(st))
print(next(st))
d
a
t
a

Tuples

iterating using for

metrics = ('auc','recall','precision','support')

for met in metrics:
    print(met)
auc
recall
precision
support

iterating using iter() and next()

metrics = ('auc','recall','precision','support')

tup_metrics = iter(metrics)
print(next(tup_metrics))
print(next(tup_metrics))
print(next(tup_metrics))
print(next(tup_metrics))
auc
recall
precision
support

Dictionaries

iterating using for

courses = {'fall':['regression','python'], 'spring':['capstone','pyspark','nlp']}
# iterate over keys
for k in courses:
    print(k)
fall
spring
# iterate over keys, using keys() method
for k in courses.keys():
    print(k)
fall
spring
# iterate over values
for v in courses.values():
    print(v)
['regression', 'python']
['capstone', 'pyspark', 'nlp']
# iterate over keys and values using `items()`
for k, v in courses.items():
    print(f"{k}:\t{', '.join(v)}")
fall:   regression, python
spring: capstone, pyspark, nlp

Alternatively, keys and values can be extracted from the dict by:
- looping over the keys
- extracting the value by indexing into the dict with the key

# iterate over keys and values using `keys()`
for k in courses.keys():
    print(f"{k}:\t{', '.join(courses[k])}") # index into the dict with the key
fall:   regression, python
spring: capstone, pyspark, nlp
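As with the other iterables, iter() and next() also work on a dict. This short sketch uses the same courses dict. The iterator variable names are illustrative.

```python
courses = {'fall': ['regression', 'python'], 'spring': ['capstone', 'pyspark', 'nlp']}

key_it = iter(courses)            # iterating a dict directly yields its keys
first_key = next(key_it)
second_key = next(key_it)

item_it = iter(courses.items())   # items() yields (key, value) tuples
first_item = next(item_it)

print(first_key, second_key)
print(first_item)
```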

Ranges

iterating using for

If you just want to iterate a known number of times, use range().

for i in range(10):
    print(str(i+1).zfill(2), (i+1)**2 * '|')
01 |
02 ||||
03 |||||||||
04 ||||||||||||||||
05 |||||||||||||||||||||||||
06 ||||||||||||||||||||||||||||||||||||
07 |||||||||||||||||||||||||||||||||||||||||||||||||
08 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
09 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
10 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
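A range is an iterable, not an iterator, so next() cannot be called on it directly. The sketch below shows this with a small range chosen for illustration.

```python
r = range(3)

try:
    next(r)                   # a range is not an iterator
    range_is_iterator = True
except TypeError:
    range_is_iterator = False

r_it = iter(r)                # iter() gives an iterator over the range
values = [next(r_it), next(r_it), next(r_it)]

# Unlike an iterator, the range itself can be iterated again
second_pass = list(r)

print(range_is_iterator, values, second_pass)
```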

Get iteration number with enumerate()

Very often you will want to know which iteration number you are on in a loop.

This can be used to name files or dict keys, for example.

enumerate() returns the index and the element for each iteration. When enumerating a dict, the element is the key.

courses
{'fall': ['regression', 'python'], 'spring': ['capstone', 'pyspark', 'nlp']}
for i, semester in enumerate(courses):
    course_name = f"{str(i).zfill(2)}_{semester}:\t{'-'.join(courses[semester])}"
    print(course_name)
00_fall:    regression-python
01_spring:  capstone-pyspark-nlp
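The count starts at 0 by default. If labels should start at 1, enumerate() accepts a start argument. A brief sketch with the same courses dict follows. The labels list is illustrative.

```python
courses = {'fall': ['regression', 'python'], 'spring': ['capstone', 'pyspark', 'nlp']}

labels = []
for i, semester in enumerate(courses, start=1):   # count from 1 instead of 0
    labels.append(f"{str(i).zfill(2)}_{semester}")

print(labels)
```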

Nested Loops

Iterations can be nested!

This works well with nested data structures, like dicts within dicts.

This is basically how the nested dicts and lists parsed from JSON files are handled, BTW.

Be careful, though – these can get deep and complicated.

for i, semester in enumerate(courses):
    print(f"{i+1}. {semester.upper()}:")
    for j, course in enumerate(courses[semester]):
        print(f"\t{i+1}.{j+1}. {course}")
1. FALL:
    1.1. regression
    1.2. python
2. SPRING:
    2.1. capstone
    2.2. pyspark
    2.3. nlp
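To illustrate the JSON remark, here is a minimal sketch that parses a JSON string with the standard-library json module and walks the result with the same nested loop. The raw string is made-up example data that mirrors the courses dict.

```python
import json

# Hypothetical JSON text with the same structure as the courses dict
raw = '{"fall": ["regression", "python"], "spring": ["capstone", "pyspark", "nlp"]}'
parsed = json.loads(raw)   # JSON objects become dicts, JSON arrays become lists

lines = []
for i, (semester, names) in enumerate(parsed.items(), start=1):
    lines.append(f"{i}. {semester.upper()}:")
    for j, course in enumerate(names, start=1):
        lines.append(f"\t{i}.{j}. {course}")

print("\n".join(lines))
```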

iterating using iter() and next()