Posted by: leppie | March 5, 2009

Tree traversal extension methods

public static class TreeExtensions
{
  public static IEnumerable<R> TraverseDepthFirst<T, R>(
    this T t,
    Func<T, R> valueselect,
    Func<T, IEnumerable<T>> childselect)
  {
    return t.TraverseDepthFirstWithParent(valueselect, childselect).Select(x => x.Key);
  }

  public static IEnumerable<KeyValuePair<R, T>> TraverseDepthFirstWithParent<T, R>(
    this T t,
    Func<T, R> valueselect,
    Func<T, IEnumerable<T>> childselect)
  {
    return t.TraverseDepthFirstWithParent(default(T), valueselect, childselect);
  }

  static IEnumerable<KeyValuePair<R, T>> TraverseDepthFirstWithParent<T, R>(
    this T t,
    T parent,
    Func<T, R> valueselect,
    Func<T, IEnumerable<T>> childselect)
  {
    yield return new KeyValuePair<R, T>(valueselect(t), parent);

    foreach (var i in childselect(t))
    {
      foreach (var item in i.TraverseDepthFirstWithParent(t, valueselect, childselect))
      {
        yield return item;
      }
    }
  }

  public static IEnumerable<R> TraverseBreadthFirst<T, R>(
    this T t,
    Func<T, R> valueselect,
    Func<T, IEnumerable<T>> childselect)
  {
    return t.TraverseBreadthFirstWithParent(valueselect, childselect).Select(x => x.Key);
  }

  public static IEnumerable<KeyValuePair<R, T>> TraverseBreadthFirstWithParent<T, R>(
    this T t,
    Func<T, R> valueselect,
    Func<T, IEnumerable<T>> childselect)
  {
    yield return new KeyValuePair<R, T>(valueselect(t), default(T));

    List<T> children = new List<T>();

    foreach (var e in childselect(t))
    {
      children.Add(e);
      yield return new KeyValuePair<R, T>(valueselect(e), t);
    }

    while (children.Count > 0)
    {
      foreach (var e in new List<T>(children))
      {
        children.Remove(e);
        foreach (var c in childselect(e))
        {
          children.Add(c);
          yield return new KeyValuePair<R, T>(valueselect(c), e);
        }
      }
    }
  }
}

// usage
interface INode<T>
{
  T Value { get; }
  IEnumerable<INode> Children { get; }
}

INode<int> root = ...

foreach (var i in root.TraverseDepthFirst(x => x.Value, x => x.Children))
{
  Console.WriteLine(i);
}

Responses

  1. What license is this under? Would you mind licensing under MIT/X11? I would like to add something like this to Mono.Rocks, which is MIT/X11.

  2. It’s free for all, public domain, BSD, whatever πŸ™‚

    Just stick my nick or name on top, all I need to be happy πŸ™‚

    I have developed a slightly better version, that I’ll update tomorrow, that supports having the ‘parent’ node passed into the IEnumerable too as a KeyValuePair for now.

  3. If you’re still up for improving it, I would suggest two further improvements:

    1. Use a “Select” prefix instead of “Traverse”, as this is more consistent with the framework.

    2. Make it an extension method on IEnumerable, not T. This is mostly for VB.NET support (which can’t invoke extension methods on object, which I assume means it can’t call extension methods on T). This would also better fit with a “Select” prefix.

  4. 1. I could do that πŸ™‚ Or you, I dont mind.

    2. I have a problem with that. A tree only has one root node, hence T. This also allows one to provide very abstract usability. It implies T can be anything, which it can and should be.

    Actually based on you last comment, I must say I prefer Traverse over Select πŸ™‚ You could always provide additional IEnumerable extensions that will follow better ‘Select’ semantics.

  5. Who should I attribute the author to? leppie? xacc.ide?

    • leppie is fine πŸ™‚ Sorry for delay been a bit busy at work.

  6. Update post, with extra methods, and fixed up the missing generics.

  7. Why do TraverseDepthFirstWithParent() and TraverseBreadthFirstWithParent() return a KeyValuePair (Result as Key, Parent as Value) instead of a KeyValuePair (Parent as Key, Result as Value)? Or is there a reason?

  8. Heh, no reason, coin flip, it’s not really a KeyValuePair. Just a pair of values, and I didnt feel like defining yet another class for that, and the order does not matter.

    I guess best will be to make a NodePair struct with Value and Parent properties.

    Feel free to make any changes. πŸ™‚

  9. The tree traversal methods have been added to Mono.Rocks:

    http://anonsvn.mono-project.com/viewvc/trunk/rocks/Mono.Rocks/Object.cs?view=markup

    The only major changes were using a KeyValuePair instead of a KeyValuePair, parameter validation, and the introduction of Create*Iterator() methods (so that the `yield’ statements are separate from the parameter validation logic, otherwise parameter validation would have delayed execution semantics).

  10. Silly question time: what are the semantics supposed to be of the Traverse*WithParent() methods? Specifically, what is the ‘parent’ suposed to be — the node that yielded the resulting value, or the parent of the node that yielded the resulting value.

    Because these two different definitions seem to be used interchangably: in TraverseBreadthFirstWithParent() you have:

    yield return new KeyValuePair(valueselect(e), t);

    in which the parent is the node that the value came from, and later in the same method you have:

    yield return new KeyValuePair(valueselect(c), e);

    in which the parent is the parent of the node that the value comes from.

    This seems quite inconsistent, and makes it difficult to reasonably document the functionality of these methods (how is the user supposed to know which set of semantics is in play, especially when it changes mid-method?).

    It makes me think that the *WithParent() methods should really be *WithInfo() methods, and return and IEnumerable, with:

    class TreeNodeTraversalInfo
    {
    public readonly TResult Value;
    public readonly TSource Node;
    public readonly TSource[] Path;
    }

    but maintaining Path is something that I’m not sure about — it may be too much overhead, as it’ll be a required array allocation for each returned node (yech).

    Though Path could instead be an IList, which would allow some sharing of the underlying array (but would still require a unique instance per item returned).

  11. The ‘parent’ is the node that contains node that the value is selected from.

    The easiest way to see what I mean is to simply use the identity function for the valueselector.

    In most trees, the child node will have no reference to it’s parent node. The with parent methods are simply to gather this information during the enumeration.

    I am not sure how you mean the semantics change mid-method. Had a quick look, but could not see anything wrong. Will have another look tomorrow. πŸ™‚

  12. I think the problem was a sudden inability to read code on my part. I’m no longer sure what I was thinking. 😦

    /me shakes his head to clear his thinking…

  13. πŸ™‚

    I did find a bug though with the Depth first traversal.

    i.TraverseDepthFirstWithParent(i, valueselect, childselect)

    should be

    i.TraverseDepthFirstWithParent(t, valueselect, childselect)

    Wrong parent! Maybe that caused some confusion, sorry if it did! πŸ™‚

  14. Fixed in code now too.


Leave a reply to leppie Cancel reply

Categories