As far back as the design of C# 3.0, the design team contemplated a possible language feature to solve the problem of quadratic iterators (the yield foreach syntax). It was never implemented, for a variety of reasons, and was put aside. While the community has discussed it on and off quite a bit since, it has never gathered enough traction to move forward within the design team. I took it upon myself during my holiday break (2021) to investigate this problem area and its possible solutions more deeply. I present my journey here in my typical rambling fashion.
Iterators help me quickly write methods that represent sequences of values for data that I don’t already have in a single collection, without requiring me to write a complicated state machine in the form of an IEnumerator implementation.
I just sprinkle a few yield return statements into my method and the compiler generates that state machine for me.
I’ve written myself a humdinger of a binary tree class where each node in the tree has a possible reference to a left and right node, etc. I’ve supplied public properties for both Left and Right, but I feel compelled for some inexplicable reason to define a property called LeftAndRight that is a collection of both left and right property values. So, I painlessly write an iterator for it using the existing C# feature, and everything looks right and works right.
public IEnumerable<Node> LeftAndRight
{
    get
    {
        if (Left != null)
            yield return Left;
        if (Right != null)
            yield return Right;
    }
}
However, the next day my demanding teammates come to me and say, that’s nice, but what we really could use is a DescendantsAndSelf property that iterates the entire sub-tree in infix order. No problem, I say, that should be just as easy. So I do the only thing I can, and write a quick-and-dirty recursive iterator like this:
public IEnumerable<Node> DescendantsAndSelf
{
    get
    {
        if (Left != null)
        {
            foreach (var ld in Left.DescendantsAndSelf)
                yield return ld;
        }
        yield return this;
        if (Right != null)
        {
            foreach (var rd in Right.DescendantsAndSelf)
                yield return rd;
        }
    }
}
The teammates are happy, all the tests pass, we ship the product, and all is right in the world. Until a customer with an unexpectedly large dataset shows up with a concern about the product’s poor performance because it appears to freeze for minutes every time they press the mission critical do-it-now button.
What went wrong?
The recursive iterator that seemed so simple and obvious to write is the problem. Each node’s DescendantsAndSelf property re-yields the same descendant nodes at every level on the way back up the tree. The recursive iterator is operating with non-linear complexity. In the worst case, the complexity is O(n^2). It has become quadratic.
Everything I’ve ever learned about computer science is telling me this is a very bad place to be. But there is no easy way to write this kind of iterator without implementing my own IEnumerator state machine, a daunting task (or maybe a nice challenge when I have a lot more time).
Note: A balanced binary tree would have performance more like O(n log n) and most real world scenarios would not actually be truly quadratic. What we are really fighting against is algorithms that are non-linear, when they could otherwise be.
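For comparison, here is roughly what that hand-written alternative looks like: an in-order traversal driven by an explicit stack instead of recursion, so each node is yielded exactly once and the whole walk is linear. This is a sketch of the general technique, not code from the original investigation; it assumes the Node class above with its Left and Right properties, and the property name is mine.

```csharp
// Sketch: explicit-stack in-order traversal; yields each node once, O(n).
public IEnumerable<Node> DescendantsAndSelfIterative
{
    get
    {
        var stack = new Stack<Node>();
        var node = this;
        while (stack.Count > 0 || node != null)
        {
            if (node != null)
            {
                stack.Push(node);   // remember the path on the way down
                node = node.Left;
            }
            else
            {
                node = stack.Pop();
                yield return node;  // visit the node itself,
                node = node.Right;  // then its right sub-tree
            }
        }
    }
}
```

Writing this is not hard for a binary tree, but it is exactly the kind of bookkeeping the iterator feature was supposed to save me from.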
Erik Meijer did a lot of thinking about this problem back during the development of C# 3.0, and at the time he was advocating for the team to solve this problem with a new syntax that would have allowed me to yield entire collections at a time instead of repeatedly iterating them at each level.
He eventually co-wrote a paper on iterators where he describes and solves the recursive iterator problem: specsharp-iterators.pdf. His solution, as described in the document, requires a different translation to IL than the one the C# compiler uses for a typical non-recursive iterator.
As far as I can recall, this non-trivial change to the code generator was the major stumbling block for moving forward with this feature. It was considered to be a large undertaking and the benefit seemed marginal at best. People writing special-cased iterators for recursive data structures like binary trees were few, and they tended to be the sort of developer that might not have minded the challenge of writing a stack-based non-recursive iterator or even a custom state machine IEnumerator to get the best performance.
Yet as it turns out, it was not all that difficult to find myself in a position of wanting to write a recursive iterator, because it’s not too uncommon to need a data model for an everyday scenario that ends up being recursive in nature.
For example, imagine I am building an HR application that needs to represent employees in a typical hierarchy that I have fetched from a database. I have defined an Employee class that contains a DirectReports property, a collection of all the employees that directly report to this one.
public class Employee
{
    public string Name { get; }
    public List<Employee> DirectReports { get; }
}
I can easily navigate the whole tree of reports for any managing employee using the DirectReports property, but to do this I need to write recursive functions, something fictional me dreads doing.
I would rather give the Employee class an AllReports property that returns the set of all employees that report up through to this one, allowing me to just use a foreach statement for straightforward iteration instead of recursion.
Yet, I am now faced with the exact same problem that I had with the binary tree. The only difference is that I at least have a collection to start with, but it’s not the collection I want the property to return, so I write this instead:
public IEnumerable<Employee> AllReports
{
    get
    {
        foreach (var r in DirectReports)
        {
            yield return r;
            foreach (var rr in r.AllReports)
            {
                yield return rr;
            }
        }
    }
}
This causes the same quadratic problem as before, but now I may not be at liberty to expend the effort to write a performant solution for just this one property. After all, I get paid to get projects done, not navel gaze.
If C# had Erik’s solution, I would have been able to write this to get the better performance and it would have been more concise to boot.
public IEnumerable<Employee> AllReports
{
    get
    {
        foreach (var r in this.DirectReports)
        {
            yield return r;
            yield foreach r.AllReports;
        }
    }
}
The compiler would have figured it out and generated the iterator logic so as not to be so brain-dead quadratic. Do what I want, not what I say!
Before I started this investigation, I imagined I would begin by taking a stab at modifying the compiler to actually generate the logic that Erik’s paper suggests. Yet when I sat down to actually do it, it seemed a bit daunting. I’m not shy about making large code changes, but the thought of committing my entire holiday break to it seemed a bit excessive. I posited that exploring alternatives first would probably be a much better use of my time, in case there were possible solutions that did not even need a compiler change. Library methods are always a good place to start.
For a simple contrived example like the Employee class, where I have a nice single collection exposed on the class, I imagined a function existing, like a new kind of recursive LINQ operator, called Flatten, that would flatten a hierarchy if I told it how to do so.
So, I made one, and used it in the Employee class like this:
public IEnumerable<Employee> AllReports =>
    this.DirectReports.Flatten(r => r.DirectReports);
It works like a recursive SelectMany. It walks the tree for me and is linear (non-quadratic). It is the state machine I no longer need to write each time, because I did it once and can reuse it. If I packaged up a function like this for general consumption, then no one would ever need to have the compiler do anything special in order to get a non-quadratic iterator, because I did not even need to write an iterator method at all.
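For the curious, such a Flatten could be sketched like this: a depth-first walk driven by an explicit stack of enumerators, so each item is yielded exactly once. This is my illustrative version under that assumption, not necessarily the exact code that was written.

```csharp
using System;
using System.Collections.Generic;

public static partial class FlattenExtensions
{
    // Sketch: depth-first, pre-order flattening via a stack of enumerators.
    // Each item is yielded once, so the walk is linear in the item count.
    public static IEnumerable<T> Flatten<T>(
        this IEnumerable<T> source,
        Func<T, IEnumerable<T>> children)
    {
        var stack = new Stack<IEnumerator<T>>();
        stack.Push(source.GetEnumerator());
        try
        {
            while (stack.Count > 0)
            {
                var e = stack.Peek();
                if (e.MoveNext())
                {
                    yield return e.Current;               // yield the item once,
                    var kids = children(e.Current);
                    if (kids != null)
                        stack.Push(kids.GetEnumerator()); // then descend into it
                }
                else
                {
                    stack.Pop();
                    e.Dispose();
                }
            }
        }
        finally
        {
            while (stack.Count > 0)
                stack.Pop().Dispose();                    // clean up on early exit
        }
    }
}
```

The stack plays the role of the call stack in the recursive version, but without any re-yielding on the way back up.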
Or so it would seem, as long as the recursion is via just this one nested collection.
Yet what if, instead of a single collection of employees, I had two or more? Maybe there are different kinds of reports; some are regular staff, others are contingent, and for some reason I chose to place them into separate collections. Do I need the compiler to help me now?
No, not really. I was also able to imagine an advanced form of the Flatten method that takes multiple selector functions, where each one returns a collection of employees that is then recursed upon.
Of course, I made one just to be sure, and used it like this:
public IEnumerable<Employee> AllReports =>
    this.DirectReports.Flatten(r => r.DirectReports, r => r.ContingentReports);
But why stop there?
Surely, not all iterators I’ve ever written boil down to just a set of collections and items that can be enumerated in a straight sequence. If the iterator has such advanced logic as loops, or even just an occasional if-then-else, then a simple library function like the one I’ve defined would be quite insufficient.
Given any particular iterator method, I might be able to imagine a special case Flatten function that would do the job, for it alone, or possibly for a well-defined class of methods. But, of course, that would mean writing custom Flatten functions for each exotic iterator, which would be the equivalent of writing custom iterators, which I am trying to avoid.
It was beginning to seem like there was no way to keep putting off experimenting with modifying the compiler. Which I still did not want to do. So, instead, I went looking for more insight.
Returning to Erik’s paper and painstakingly picking it apart, I concluded that what it attempts to do is not as complicated as it first seemed, and definitely not as complicated as the suggested change to compiler-generated logic would imply. Each kind of yield (yield return or yield foreach) simply yields a single thing: in my example, either an employee or a collection of employees. But what if, instead of new code gen, I just used the existing code gen together with a library function?
Instead of writing an iterator of employees, what if I just wrote an iterator of a combined type that can be either an employee or a sequence of employees. That would let me yield both using the existing yield return syntax.
Alas, C# does not have union types yet, so I had to make do with my own poor man’s union type that I defined to hold either of those two things. I called it Chunk<T> for reasons.
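A minimal sketch of such a Chunk<T> might look like the following. The member names and the Of factory are my own illustration; one wrinkle is that C# disallows user-defined conversions from interface types, so in this sketch the collection case goes through a factory method (a yield of a collection would read something like yield return Chunk<Employee>.Of(r.AllReports); a real implementation would presumably smooth this over, for example with conversions from concrete collection types).

```csharp
using System.Collections.Generic;

// Sketch: a poor man's union holding either one item or a sequence of items.
public struct Chunk<T>
{
    public T Item { get; }
    public IEnumerable<T> Items { get; }

    private Chunk(T item, IEnumerable<T> items)
    {
        Item = item;
        Items = items;
    }

    // A single item converts implicitly, so `yield return r;` just works.
    public static implicit operator Chunk<T>(T item) => new Chunk<T>(item, null);

    // C# forbids implicit conversions from interface types such as
    // IEnumerable<T>, so the sequence case uses a factory method here.
    public static Chunk<T> Of(IEnumerable<T> items) => new Chunk<T>(default, items);
}
```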
With this, I was able to declare an iterator that returns chunks containing either a single employee or a collection of employees.
IEnumerable<Chunk<Employee>> ChunkyAllReports
{
    get
    {
        foreach (var r in this.DirectReports)
        {
            yield return r;
            yield return r.AllReports;
        }
    }
}
And then after implementing a new version of Flatten that operates on sequences of chunks, I exposed the AllReports property like this:
public IEnumerable<Employee> AllReports =>
    ChunkyAllReports.Flatten();
The Flatten method can now peel apart the nested sequences inside the chunks that are returned without them being iterated an additional time inside the ChunkyAllReports method.
Maybe instead of Flatten, I should have called the method Smooth.
I was now able to use the iterator method feature of C# to write any iterator and not have to invent the complicated part that enables it to be non-quadratic each time. It can just be dependent on a library function and the compiler does not need to add any new features.
Or so I thought.
Unfortunately, there is still a tiny problem. It’s still quadratic. Can you spot it? I’ll give you a minute.
You see, this chunking trick, as currently stated, only works one level deep. Once the Flatten method iterates the collection stored in a chunk, that collection itself will recursively start a new iteration, which leads to a new call to Flatten, and so on. There are still unnecessary levels of re-iteration of the nested collections. I saved myself maybe half of them, but it is still n-squared.
What I really needed to do instead was to make the chunk capable of holding a sequence of chunks, not a sequence of T’s. That way, when the first Flatten function is called, it can operate over the whole tree, unpacking each chunked collection as it is encountered, causing the deeper collections to bubble upward without additional yielding.
So, I changed the definition of Chunk<T> to allow for T’s, IEnumerable<T>’s, and IEnumerable<Chunk<T>>’s. And now I can write the iterator like this:
public IEnumerable<Employee> AllReports =>
    ChunkyAllReports.Flatten();

IEnumerable<Chunk<Employee>> ChunkyAllReports
{
    get
    {
        foreach (var r in this.DirectReports)
        {
            yield return r;
            yield return r.ChunkyAllReports;
        }
    }
}
Which is much better. It is finally non-quadratic and can encompass any iterator logic I want to write, without needing to write complicated stacks or state machines.
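The heart of that new Flatten could be sketched as follows: an explicit stack of chunk enumerators, where nested chunky sequences are descended into in place, leaf collections are yielded item by item, and nothing is ever re-yielded on the way back up. The Chunk<T> property names here (Item, Items, Chunks, one per case of the expanded union) are illustrative.

```csharp
using System.Collections.Generic;

public static partial class FlattenExtensions
{
    // Sketch: chunk-aware flattening. Assumes Chunk<T> exposes Chunks
    // (IEnumerable<Chunk<T>>), Items (IEnumerable<T>), and Item (T),
    // with the unused cases null/default.
    public static IEnumerable<T> Flatten<T>(this IEnumerable<Chunk<T>> source)
    {
        var stack = new Stack<IEnumerator<Chunk<T>>>();
        stack.Push(source.GetEnumerator());
        try
        {
            while (stack.Count > 0)
            {
                var e = stack.Peek();
                if (!e.MoveNext())
                {
                    stack.Pop().Dispose();                        // done with this level
                }
                else if (e.Current.Chunks != null)
                {
                    stack.Push(e.Current.Chunks.GetEnumerator()); // descend in place
                }
                else if (e.Current.Items != null)
                {
                    foreach (var item in e.Current.Items)         // a plain leaf collection
                        yield return item;
                }
                else
                {
                    yield return e.Current.Item;                  // a single item
                }
            }
        }
        finally
        {
            while (stack.Count > 0)
                stack.Pop().Dispose();
        }
    }
}
```

Each employee passes through exactly one yield return here, regardless of how deep its chunky sequence was nested.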
Not so fast, you say. My Flatten method solution works great as long as I always have access to the other collections in their chunky forms within the iterator method, which is true when the iterator only ever recurses down upon the same type that it itself is a member of. Yet what if my iterator needs to drill down through a completely different type that also exposes the same kinds of collections, through some symbiotic multi-type recursive hierarchy: A’s hold B’s and B’s hold A’s, etc.?
I don’t want to have to expose the chunky member as part of the public model. It should just be an implementation detail, just my preferred way to have my automatic iterator code-gen and eat my non-quadratic cake too.
What would be better is if the type returned from the iterator could quietly expose the fact that it can be iterated as chunks, so I don’t need access to an additional method or property that exposes them. That way, the consuming iterator logic would not need to know anything. It could just yield the collection like I tried in the unfortunately-still-quadratic example, and the Flatten function could, via the interface, find and use the chunks instead.
I defined an IChunkEnumerable<T> interface with a GetChunkEnumerator() method, and then a FlattenedEnumerable<T> class that wraps my chunky sequence and pretends it’s a normal sequence, with an automatic call to Flatten when you enumerate it.
public interface IChunkEnumerable<T>
{
    IEnumerator<Chunk<T>> GetChunkEnumerator();
}

public class FlattenedEnumerable<T> : IEnumerable<T>, IChunkEnumerable<T>
{
    private readonly IEnumerable<Chunk<T>> chunky;

    public FlattenedEnumerable(IEnumerable<Chunk<T>> chunky) => this.chunky = chunky;

    public IEnumerator<T> GetEnumerator() =>
        chunky.Flatten().GetEnumerator();

    IEnumerator<Chunk<T>> IChunkEnumerable<T>.GetChunkEnumerator() =>
        chunky.GetEnumerator();

    System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator() =>
        GetEnumerator();
}
And now my iterator looks like this:
public IEnumerable<Employee> AllReports =>
    new FlattenedEnumerable<Employee>(ChunkyAllReports);

IEnumerable<Chunk<Employee>> ChunkyAllReports
{
    get
    {
        foreach (var r in this.DirectReports)
        {
            yield return r;
            yield return r.AllReports;
        }
    }
}
Of course, if I actually wrote out the example with A’s and B’s, it might be more obvious why being able to write something like r.AllReports instead of r.ChunkyAllReports is useful. It allows the iterator to yield references to any kind of collection, and if that collection can be iterated as chunks, then the overall solution is non-quadratic without the logic of the iterator that I write needing to know the difference.
It is now what you might call opportunistically non-quadratic.
You might think that I’ve made a compelling argument not to have a compiler feature at all. A library with a few helper methods and types makes it possible to write non-quadratic iterators with ease without any language changes. I’ve solved it. We can all go home now.
However, a case for a compiler feature can still be made.
The biggest, most compelling reason to still have the compiler feature is that, with the existence of the library, the change to the code gen is now very minor. Instead of just an iterator method (and the types and methods that get generated for it), the compiler now just needs to generate an additional property or method with a trivial body. It’s basically just syntax sugar.
Wouldn’t you rather write this:
public IEnumerable<Employee> AllReports
{
    get
    {
        foreach (var r in this.DirectReports)
        {
            yield return r;
            yield foreach r.AllReports;
        }
    }
}
Instead of this:
public IEnumerable<Employee> AllReports =>
    new FlattenedEnumerable<Employee>(ChunkyAllReports);

IEnumerable<Chunk<Employee>> ChunkyAllReports
{
    get
    {
        foreach (var r in this.DirectReports)
        {
            yield return r;
            yield return r.AllReports;
        }
    }
}
It might just be syntax sugar, but it’s syntax sugar that lets the compiler offer a solution to the quadratic iteration problem. I don’t need to write two members each time and know about chunkiness. I just need to write an iterator and use the shorthand of the yield foreach whenever I want to yield out the values of a collection. I don’t even have to know I’m doing it to avoid a quadratic explosion in a recursive scenario. I can do it for all my iterators that need to yield items stored in a collection, even if it’s not recursive and I don’t have a performance problem. The solution is opportunistically non-quadratic. So, while I can solve the problem using a library, it’s a better solution to have a compiler feature that uses the library for me.
The discussion over quadratic behavior for an iterator revolves around the number of times any particular item is yielded. The quadratic part comes into existence when, after an item is yielded from a collection deep down in a hierarchy, it is yielded again and again on its way back up to the top due to the recursive nature of iterators invoking other iterators, etc. We call this repeated yielding quadratic when the depth of the hierarchy itself is a function of N, the number of items, giving us the dreaded O(N^2). If the depth of the hierarchy were not dependent on the number of items in it, then it would not be considered quadratic. It would still be plagued with unfortunate redundant yields, but it would be linear.
Yet, the proposed non-quadratic solution has another kind of yielding going on. The collections themselves are also being yielded. So, truly, the total number of yields happening in any iteration is more like N + C, where C is the number of collection yields. In this example, the number of collection yields is exactly the same as the number of item yields, since each item has a single collection. So there are actually 2N yields happening, which is still linear. Even in a hypothesized case where each item has multiple different collections, there would be more collection yields than item yields, but still only a fixed multiple of them, and so still linear. Lastly, if the number of collection yields has nothing to do with the number of items, then it is just an unrelated quantity and does not affect the linearity either.
So, all in all, collection yields do not influence the linearity of the solution. They just add a bit of overhead that might be avoided in a custom solution.
While I did my due diligence by implementing tests for a variety of structures and captured interesting data like yield counts and allocations, I did not actually measure performance. I assumed that the performance would be relative to the other counts. It might be worthwhile to verify if this is true.
While unrelated to the quadratic or non-quadratic nature of an iterator, unnecessary allocations can still play a significant role in performance. Whenever someone iterates a collection, typically an enumerator is allocated. This can be avoided with custom built struct-based enumerators. Iterator methods are not any better, since they require allocation of an enumerable object and enumerator. A performance optimization allows these to be the same object, but there will always be an allocation.
This is generally not a problem unless the collection or iterator is being enumerated often in a performance critical section of code. However, this problem is exacerbated by recursive iteration over an entire hierarchical data structure where that one iteration is actually iterating all the nested collections as well, even if the iteration algorithm is non-quadratic. If each item contains a collection that must also be iterated, each time you iterate over the items in the structure you are allocating N enumerators, if not more. So, even in a not-so performance critical section of code, this can still be a problem with large amounts of data.
However, this allocation problem is not new. The original quadratic iterator example also had the same problem, since it directly re-iterated each nested collection. So, with respect to solving the quadratic yielding problem, this is not really a consideration. But it is still worth looking at.
Of course, a custom enumerator designed specifically to walk over a given data structure, especially one with public collections with indexers, could probably be written to avoid any extra allocations. But the problem space does involve iterator methods that can have arbitrary logic that cannot be so trivially indexed, even if a compiler generated solution could be designed otherwise.
I did investigate ways to potentially pool and reuse enumerators, by simulating a code-generated iterator that pools its enumerators and reuses them when asked to construct one. Given that enumerators already have a Dispose method, I could adapt the Flatten function to actually respect that and call Dispose, which would return the enumerator back to the pool. This works because calls to GetEnumerator are expected to return new enumerator instances, so the caller is free to call Dispose on the enumerator as long as it can reasonably guarantee that the reference does not escape its control. Using this technique, I was able to eliminate the enumerator allocations entirely, except for the few that went into the pool.
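The shape of that experiment can be sketched as a hand-written enumerator that returns itself to a simple thread-local pool on Dispose; a cooperating Flatten that disposes every enumerator it opens would then reuse instances instead of allocating new ones. All names here are hypothetical, and this is my illustration of the idea rather than the actual experiment code.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Sketch: a list enumerator that is rented from and returned to a pool.
public class PooledListEnumerator<T> : IEnumerator<T>
{
    [ThreadStatic] private static Stack<PooledListEnumerator<T>> pool;

    private IReadOnlyList<T> list;
    private int index;

    public static PooledListEnumerator<T> Rent(IReadOnlyList<T> list)
    {
        var e = (pool != null && pool.Count > 0)
            ? pool.Pop()
            : new PooledListEnumerator<T>();
        e.list = list;
        e.index = -1;
        return e;
    }

    public T Current => list[index];
    object IEnumerator.Current => Current;
    public bool MoveNext() => ++index < list.Count;
    public void Reset() => index = -1;

    public void Dispose()
    {
        // Safe only because the disposer promises the reference does not
        // escape its control after this call.
        list = null;
        (pool ??= new Stack<PooledListEnumerator<T>>()).Push(this);
    }
}
```

The thread-local pool avoids any locking, at the cost of one pool per thread.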
However, there is no dispose-like convention in use for IEnumerable, because the interface is typically implemented by a collection that is expected to outlive its enumeration. No one expects it to stop working as a dispenser for enumerators just because it has already been enumerated once. And if that’s the case, then it’s definitely not expected to become a dispenser for an entirely different collection after its first use, which is what would happen if it were pooled and reused. Unfortunately, iterator methods return IEnumerable instances. If they had been designed to return IEnumerator instances instead, there might be some wiggle room to allow pooling.
Note, it may appear that enumerators also have this same pooling problem via the Reset method, which is defined to allow the enumerator to be enumerated over and over. However, this feature has been disavowed since nearly the beginning of .NET and is unlikely to be relied upon. Instead, consuming code generally operates an enumerator exactly once and then forgets about it. Better yet, only code that actually calls the Dispose method would be in danger of an enumerator being reused out from under it. No code should expect a disposable item to be usable after a call to Dispose.
The IEnumerable instances do not have this saving grace. At least not in a solution that uses a helper method like Flatten. Since the Flatten method only discovers IEnumerables as it is iterating other enumerators, it cannot safely know the extent or lifetime of those IEnumerables and so could not, even if the IEnumerable is also IDisposable, be allowed to dispose them. The only code that would be allowed to make the decision to dispose of the IEnumerable, would be the code that caused it to be allocated in the first place.
Maybe there is some pooling solution out there for IEnumerable, but I have not been able to find it. I did build a variation of the helpers that assumed pooling was possible and did so, and that removed almost all allocations. But as it stands, it is not safe, so it is a dead end.