Thursday, September 4, 2014

Python Idiom: First Occurrence

Finding the first occurrence in a collection of data is a common problem. 

# Non-idiomatic
found_line = None
for line in logfile:
    if regex.match(line):
        found_line = line
        break  # stop at the first match; without this the last match wins
return found_line

Compared to

# Idiomatic
return next((line for line in logfile if regex.match(line)), None)


# Idiomatic (thanks to Suresh V)
from itertools import dropwhile
return next(dropwhile(lambda x: not regex.match(x), logfile), None)

The idiomatic solution is not only more compact but also reads better.
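As a rough, self-contained sketch of the idiom, the snippet below wraps both variants in functions and runs them on sample data. The names `log_lines`, `error_re`, `first_match`, and `first_match_dropwhile` are made up for illustration and stand in for the `logfile` and `regex` used above.

```python
import re
from itertools import dropwhile

# Hypothetical sample data standing in for `logfile` and `regex`.
log_lines = ["INFO start", "ERROR disk full", "INFO retry", "ERROR timeout"]
error_re = re.compile(r"ERROR")


def first_match(lines, pattern):
    """Return the first line that matches pattern, or None if none does."""
    # The generator expression gets its own parentheses because next()
    # also receives a second argument (the default value).
    return next((line for line in lines if pattern.match(line)), None)


def first_match_dropwhile(lines, pattern):
    """Same result, skipping non-matching lines with itertools.dropwhile."""
    return next(dropwhile(lambda line: not pattern.match(line), lines), None)


first = first_match(log_lines, error_re)
missing = first_match(log_lines, re.compile(r"FATAL"))
```

Passing `None` as the second argument to `next()` is what keeps the call from raising `StopIteration` when nothing matches.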



  1. Idiomatic doesn't mean "pack as much into one expression as you can". I'd say instead that fixing your first example to avoid the unnecessary pieces:

    for line in logfile:
        if regex.match(line):
            return line
    return None

    makes it very nearly as idiomatic as your second version.

    1. I am not "packing as much into one expression as" I can. It is an often-used and readable expression.

      Changing the example to use returns presumes that we don't want to use the iterator afterward. It also seems unnecessary to extract this logic into its own function unless there is a significant number of additional operations occurring in the function.

  2. Even better, use itertools.dropwhile.

  3. Your 'idiomatic' example does not work unless you wrap the generator expression in an extra set of parentheses.
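Two points from the comments can be checked with a small sketch: that a generator expression passed to `next()` alongside a default must be parenthesized, and that the underlying iterator stays usable after the first match is taken. The names `unparenthesized`, `compiles`, `data`, `first`, and `rest` are illustrative only.

```python
# The unparenthesized form is rejected when the code is compiled, so it is
# tested here through compile() on a string instead of being written inline.
unparenthesized = "next(x for x in data if x > 1, None)"
try:
    compile(unparenthesized, "<example>", "eval")
    compiles = True
except SyntaxError:
    compiles = False

# With the extra parentheses, next() stops consuming the iterator as soon as
# the first match is found, so the remaining items are still available.
data = iter([1, 2, 3, 4])
first = next((x for x in data if x > 1), None)
rest = list(data)
```

Here `data` is an explicit iterator, which is roughly how an open file object behaves; a plain list would simply be iterated again from the start.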