What’s in a Scala For Comprehension?

When I first started learning Scala the “For Comprehension” threw me for a bit of a loop (pun intended). On the surface it just seems like your every day for-each statement, however it does much more than that.  A For Comprehension not only offers the ability to iterate over something like a List, it also provides filtering options and the ability to generate new Lists1.

Not just looping…

For Comprehensions are able to actually “yield” results if so desired.  For example:

val sqlStatements = List[String]("select * from...", "selec...")
val resultSets = for (sql <- sqlStatements) yield new ResultSet(sql)

If you read my last article, you already know about the “map” function. As it turns out, Scala will “boil down” a For Comprehension containing a yield statement into a map function.  I know this because scalac, the Scala compiler, has a -Xprint option which shows you how the syntactic sugar used in every-day-Scala code “boils away”.

Using the command:

scalac -Xprint:typer <filename>.scala

[2]
scalac will show you that the For Comprehension in our example boils down to roughly[3]:

val resultSets: List[ResultSet] = 
   sqlStatements.map[ResultSet, List[ResultSet]](
      (sql: String) => new ResultSet(sql)
   )

As you can see, scalac did two things here:

  1. Added in all the type information you so lazily omitted
  2. Converted the body of the yield statement into a function it can pass to the higher-order map function

Ah hah!

It was at this point in my learning of Scala that I started to grok the power of Functional Programming.  Simple concepts like the map function can form the fundamental building blocks underlying the broader syntax which is Scala.

Back to our example…
What about For Comprehensions that don’t yield results?

for (sql <- sqlStatements) {
   println(sql)
}

One might wonder how the above statement “boils down” to code which uses map as map returns a value and our example renders no such value in absence of a yield statement. In this case, scalac will produce code which uses the foreach function instead:

sqlStatements.foreach[Unit](
  (sql: String) => scala.this.Predef.println(sql)
)

Filtering

Another trick For Comprehensions can do is filtering of data.  For instance:

val empList = List(
        Employee("Craig", 1000000),
        Employee("Amir", 50001),
        Employee("John", 40000)
    )

    /* Find all employees with a salary over 50,000        */
    val layoffs = for (emp <- empList
        if (emp.salary > 50000)) yield emp

In this case Craig and Amir wind up in the layoffs List and scalac boils down the statement to4:

val layoffs = empList.withFilter(
   (emp) => emp.salary>(50000)
).map(...)

What you are witnessing here is Scala unwinding the if statement in the For Comprehension and feeding the condition to a method called “withFilter”.  Essentially withFilter will produce a List which contains only the elements that have passed the condition.  It is this “filtered” List that map is then executed against.

But why didn’t Scala use the “filter()” method on List instead? The short answer is: several if statements within a For Comprehension (yes you can have more than one) will generate several chained calls to the filter function.  That sounds inefficient because you’ll wind up with statements being generated like so:

empList.filter(cond1).filter(cond2)...filter(condn).map(...)

The first call to filter will iterate the entire empList, the second call will iterate the list generated by the first call to filter, the third will iterate the list generated by the second call to filter, etc…

Scala uses withFilter instead because it delays evaluation of the filter conditional until absolutely necessary.  The List that withFilter generates and passes on down the chain is often referred to as a “View” a “Projection” or a “Non-Strict” List.  Instead of withFilter iterating all the elements in the List it is to act on and applying the conditional to each element, it simply returns a type-compatible wrapper around the List it was passed. Anytime something tries to iterate the wrapped list, the filter is applied to each element to see if it “exists” based on the conditional that withFilter was originally passed. Very clever indeed.

So while a chain of m filter() methods executing over a list of n elements would execute up to ntimes, in contrast a chain of withFilter methods executes n times and only at the time when you need to iterate the List; in our case when map is called.

Conclusion

To really understand why Scala is considered a Functional Programming language it’s well worth it to glimpse how Scala reasons about high level concepts in a functional manner.  Give the -Xprint:typer option a spin, just keep in mind you might have to factor out a lot of the “cruft” which is injected into your Scala code to help the compiler better reason about your code and it’s high-level syntax.

 

Footnotes

  1. For Comprehensions not only work for Lists, but in the context of an “Intro to For Comprehensions” we won’t use its generalized form
  2. For a list of phases you can use with -Xprint, type: scalac -Xshow-phases. typer is a common phase to use when you want to boil away most of the syntactic sugar
  3. I say “roughly” because there is a lot of cruft which gets added to the code to help the compiler out, I’ve omitted some of it and focus more at the core concept of map.
  4. Omitting the types here as they distract from the point at hand

Author: craig

Share This Post On
  • Pingback: This week in #Scala (04/11/2011) « Cake Solutions Team Blog

  • Vaughan

    Great article! I’ve been looking for a simple explanation of the Scala `for comprehension` for ages. The `-Xprint:typer` is a great tip too – I had no idea.

    I’ll be back for more Scala posts, keep em coming!

    Nice blog style too.

  • :::Blair Davidson:::

    Agree with Vaughn I just wanted a detailed explanation which highlights how it works. Great Article.