Showing posts with label nested-lists. Show all posts
Showing posts with label nested-lists. Show all posts

Wednesday, March 19, 2014

Flatten a list of lists with a list comprehension

By Vasudev Ram

Recently, I saw this comment thread on Hacker News:

Python Language Features and Tricks (sahandsaba.com), which was about this post:

30 Python Language Features and Tricks You May Not Know About

One of the comments on the HN thread was this:

Except from it:

"I don't understand how this one to flatten lists works:
a = [[1, 2], [3, 4], [5, 6]]
[x for l in a for x in l]
Can somebody explain what the order of operations is here and what the variables refer to in the various stages of evaluation?"

Another HN user answered the question, and I also commented on it.

But the interesting part was where another user posted below, in the same thread, this comment:

"So that flattening can also be written as:
x = [[1,2], [3,4], [5,6]]
    [x for x in x for x in x]
"
The interesting part here (to me) was not just the use of a list comprehension to flatten the list, but the fact that the same variable is used throughout the list comprehension; in fact it is used 5 times in the expression.

I tried it out, and it works.

Then I thought of checking whether the same kind of code would work for a triply-nested list, and found that it did work:

# Flatten a list that is nested to three levels with a list comprehension.
x = [[[1,2], [3,4]], [[5,6], [7,8]]]
print "before flatten, x =", x
y = [x for x in x for x in x for x in x]
print "after flatten, y =", y

That raises the question of how the same variable name (x) can be used multiple times in the same scope to mean different things, as in the above two code fragments (for the doubly- and triply-nested lists). The explanation I can think of is that the different uses of x in the list comprehension are actually in different scopes, even though part of the same expression. Not sure if that is the right answer or not.

Update: I tried to check my explanation more by writing this - a normal for loop, not a list comprehension, which also uses the name x multiple times:
x = range(5)
for x in x:
    print x
And it worked. So it looks like my explanation (that there are different scopes) may be right, or at least close.

The HN user (jamesdutc) who posted the code with "[ x for x in x for x in x ]" has a blog here with more Python posts that look interesting:

seriously.dontusethiscode.com

Read other Python posts on my blog.

- Vasudev Ram - Dancing Bison Enterprises

Contact Page