The inner beauty of tree traversals

Posted by Tom Moertel Fri, 27 Jan 2012 04:24:00 GMT

This has been done a million times before, but if you haven’t seen it, it’s pretty neat. Let’s say you have a simple recursive data structure like a binary tree:

module Tree where

data Tree a = Empty
            | Node a (Tree a) (Tree a)
            deriving (Eq, Show)

-- some binary trees

t0 = Empty
t1 = Node 1 Empty Empty
t3 = Node 2 (Node 1 Empty Empty) (Node 3 Empty Empty)

Now say you want to traverse the nodes, in “preorder.” So you write a traversal function. It folds a binary operator f through the values in the tree, starting with an initial accumulator value of z. First, it visits the current node, then the left subtree, and finally the right subtree, combining each value it encounters with the current accumulator value to get the next accumulator value. The final accumulator value is the result.

Spelling out the steps, it looks like this:

preorder_traversal f z tree = go tree z
  where
    go Empty        z = z
    go (Node v l r) z = let z'   = f v z     -- current node
                            z''  = go l z'   -- left subtree
                            z''' = go r z''  -- right subtree
                        in z'''

But if you wanted an “inorder” traversal instead, you’d write a slightly different function, visiting the left subtree before the current node:

inorder_traversal f z tree = go tree z
  where
    go Empty        z = z
    go (Node v l r) z = let z'   = go l z    -- left subtree
                            z''  = f v z'    -- current node
                            z''' = go r z''  -- right subtree
                        in z'''

To see how the two traversal functions work, let’s use both to flatten the 3-node tree t3 we defined earlier.

flatten traversal = reverse . traversal (:) []

test0i = flatten inorder_traversal t3   -- [1,2,3]
test0p = flatten preorder_traversal t3  -- [2,1,3]

Great.

But now you start thinking about post-order traversal. Do you want to write a third traversal function that’s almost the same as the first two?

So you go back to the inorder traversal and start staring real hard at that let statement. Can you rewrite it? Yes:

inorder_traversal2 f z tree = go tree z
  where
    go Empty        z = z
    go (Node v l r) z = go r . f v . go l $ z

And now you’ve got it. Just by reordering the applications of (go l), (f v), and (go r), you can control the traversal.

Which leads to a general traversal function that lets some other function control that ordering:

traverse step f z tree = go tree z
  where
    go Empty        z = z
    go (Node v l r) z = step (f v) (go l) (go r) z

Doesn’t that look beautiful?

Now you can just spell out your desired traversals:

preorder   = traverse $ \n l r -> r . l . n
inorder    = traverse $ \n l r -> r . n . l
postorder  = traverse $ \n l r -> n . r . l

A test drive:

test1p = flatten preorder t3   -- [2,1,3]
test1i = flatten inorder t3    -- [1,2,3]
test1o = flatten postorder t3  -- [1,3,2]

And you’re happy.

But can you go further? What if your trees are binary search trees and you just want to find the minimum or maximum value? Can you just traverse to the left or right?

Yes:

leftorder  = traverse $ \n l r -> l . n
rightorder = traverse $ \n l r -> r . n

treemin = leftorder min maxBound
treemax = rightorder max minBound

test2l = treemin t3 :: Int  -- 1
test2r = treemax t3 :: Int  -- 3

Isn’t that neat?

Posted in
Tags , , , ,
5 comments
no trackbacks
Reddit Delicious

Fun with statistics: estimating blog readership (a do-it-yourself recipe)

Posted by Tom Moertel Thu, 23 Aug 2007 01:34:00 GMT

As everybody knows, statistics is fun. Is there anything cooler than crushing a heap of seemingly uninteresting numbers into gleaming jewels of meaning? Of course not! Models, data-visualization plots, and fat data sets are way cool. So, let’s find an excuse to play with them.

Here’s an excuse – I mean, an important and highly relevant question that many of us share: How many people actually read our blogs? To answer the question, we will need to use statistics, data, and cool plots. Further, if you’ve got the raw data for your blog, you can follow along with your own analysis. Even more fun!

We’ll start with a simple inspection of common web-log data, using command-line tools. After developing a rough understanding of what useful information we can extract, we’ll analyze the raw data using a series of successively more sophisticated techniques. In the end, we will derive a simple formula for estimating readership from easily obtainable data.

Sound good? Then let’s get rocking.

But first, a preemptive strike on would-be poo-pooers: I know all about FeedBurner. I know they will track my blog’s subscribers and use their mystical powers to infer the number of “real” subscribers I have. I know it’s all so easy. But easy isn’t the point. I want to understand what’s going on. Just taking somebody’s word for it isn’t nearly as satisfying as figuring it out yourself – nor as fun.

OK. For real this time, let’s get rocking.

Read more...

Posted in
Tags , , , ,
5 comments
no trackbacks
Reddit Delicious

I'm going to be a published photographer!

Posted by Tom Moertel Thu, 31 May 2007 04:01:00 GMT

Earlier today I received an email from the editor of a Pennsylvania-based magazine. (I won’t mention the name of the magazine in case what I’m about to write next amounts to a spoiler.) He asked if I would allow the magazine to publish one of my photographs of British soldier lichen in an upcoming issue.

Of course, I said yes. (I’m always looking for ways to spread the word about British soldier lichen.)

My fee? I asked for a free issue of the magazine when it goes to press. They said it will be in my mailbox.

Cool.

Posted in
Tags , , ,
no comments
no trackbacks
Reddit Delicious