Iterators

Iterators in Rust represent a venture into the functional programming paradigm. We've already used them before, it's what we go through in for-loops. They are objects which are used to iterate over collections in an abstract manner, so we're not dependent on the internal organisation of data when using a specific collection. But of course, we can go much further than that.

When trying to understand iterator methods, it needs a slight shift in your mental model compared to standard "procedural" loops. Java calls its equivalent the Stream API, and I think the name suits the mental model well. Instead of getting an item and then describing what we want to do to it, we can describe how the "stream" should manipulate any item that is coming through. Iterator methods provide a standardised structure for common patterns when iterating over items, and thus have a higher layer of abstraction. As a consequence, we need to define less structure (control flow and variables) ourselves, giving our code less space for errors to happen in. When learned, iterator methods are also easier to understand than parsing a whole loop and deducting what is going on. For me, especially selecting items bears a striking resemblance to designing SQL queries.

But as always, this was very theoretical and we want to see what it actually means. Taking the filter method for example, if you have a loop with an if statement that wraps the entire loop body, you can just replace it with filter like so:

// Example: Take a list of number, print even ones

fn example_procedural(values: &[usize]) {
  for &value in values {
    if value % 2 == 0 {
      println!("{}", value);
    }
  }
}

fn example_functional(values: &[usize]) {
  for &value in values
    .into_iter()
    .filter(|&&item| item % 2 == 0)
  {
    println!("{}", value);
  }
}

Note how we used a closure to tell the iterator method what we want to filter for. Also note that the example code is a special case: While we haven't called an iterator method on the first example function, it calls into_iter implicitly, which gives us an iterator consuming the value. As we're iterating over a reference (in form of a slice) and references implement the Copy trait, it uses copy semantics and the slice is still usable after the loop. We're then iterating over references of the inner elements though, so you have to account for that. Because the filter closure again expects a reference to the item that we're iterating over, we get the weird && notation.

If you try to iterate over an owned Vec instead, the vector will not be usable after the for-loop, but we will be iterating over the values itself, instead of references to it. Since usize has the copy trait, we can also transform the iterator from our example to use actual values instead, just like iterating over Vec would do. You can replace the head of our second for-loop with this, without any change in behaviour:

let data = vec![0, 1, 2, 3];
let values = &data[..];
for value in values.into_iter().copied().filter(|&item| item % 2 == 0)
{}

Note that iterators are lazy, that means they won't do anything until they yield elements or are consumed. Because of this, you should never use an iterator method purely for side effects.

There are also methods consuming an iterator to produce a single value, which can be used instead of iterating over the items using a for-loop. We will use some of them now.

To try and manifest the required mental model, I will now try to write my approach to a problem a friend of mine tried to solve: Given an extremely long number as a string (we will use &str), find the highest sum of 13 neighbouring digits.

First off, a string won't do for calculation, we need a collection of digits, so we want to convert the string into one. We look into the standard library documentation and find the str method chars, which returns us an iterator over each character. We will then try and convert them into an integer. For our intention, the map method looks just right. The char type also has a to_digit method for conversion. Finally, we want to collect our result into a vector. The collect method can collect items into any type implementing FromIterator, but type inference wouldn't work because of this, so we need to manually specify the collection we want.

After trying around a bit, we come up with this:

fn highest_sum_of_neighbouring_digits(input: &str) {
  // the element type can be inferred
  let digits: Vec<_> = input
    // iterator over chars
    .chars()
    // converts each char to a u32
    // can unwrap because we guarantee our string
    // to be a decimal number
    .map(|character| character.to_digit(10).unwrap())
    // create a new vector with the resulting values
    .collect();
}

Now we have our digits. For the second bit, we try to slice the requirement into multiple steps:

  1. We need all combinations of 13 neighbouring digits
  2. We need to calculate the sum for each combination
  3. We need to select the maximum of those sums

We consult the standard library again, and we find the windows method on slices. Since a vector can also act like a slice, we can also use this method. It returns an iterator over each possible collection of neighbouring values, completely fulfilling our first step.

The second step consists of condensing each combination down to the sum of each digit inside it. We are already iterating over them, so we can convert them using the map method again. As for how to calculate the sum, after consulting the standard library yet again, it provides us with the sum method as a specialised way of handling this. It's very handy since it consumes the iterator and spits out a single value for further processing. But at this point, we have to pay attention: Since each item returned by the Windows iterator consists of an entire window of 13 digits, we have to iterate over them inside the map method, so we're having an iterator inside an iterator!

let digits: Vec<u32> = vec![];
// step 1 and 2
// needs an element type because the inner `sum` can't infer it
let sums: Vec<u32> = digits
  // collections of neighbouring digits
  .windows(13)
  // map each collection to its sum
  .map(|neighbouring_digits| {
    // using copied because iterating over a slice
    // yields references instead of values
    neighbouring_digits.into_iter().copied().sum()
  })
  // create a new vector with the resulting values
  .collect();

Now for the final step, finding the maximum of those sums. Of course, the standard library gives us just the right iterator method for it again: max.

let sums: Vec<u32> = vec![0, 1, 2, 3];
// step 3
// can unwrap because we guarantee the string to contain
// at least one window
let highest_sum = sums.iter().copied().max().unwrap();

We now can finally put our function together! If we're especially fancy, we can even combine our last three steps into one, arriving at this final solution:

fn highest_sum_of_neighbouring_digits(input: &str) -> u32 {
  let digits: Vec<_> = input
    .chars()
    .map(|character| character.to_digit(10).unwrap())
    .collect();

  // last expression inside the function, therefore
  // evaluates to its return value
  digits
    .windows(13)
    .map(|neighbouring_digits| {
      neighbouring_digits.into_iter().copied().sum()
    })
    // directly looking for the maximum of the mapped sums
    .max()
    .unwrap()
}