Troubleshooters.Com and Python Patrol present:
Recursion in Python
Copyright (C) 2016 by Steve Litt -- Legal
See the Troubleshooters.Com Bookstore.
CONTENTS:
A few days ago, I facilitated a discussion on "Recursion in Python" at the Orlando DIYPython group. It was well received, and the suggestion was made to put my sample code on the Internet. Well, if the sample code is there for people who weren't at the discussion, there needs to be some explanation too. This document is written to achieve both.
All code in this document is written and tested using Python 3. At this point in time, there's not a reason to use Python 2 except to maintain old code. The shebang line for each program looks like the following:
#!/usr/bin/python3
If you don't have an executable called python3, make one using a symlink or shellscript or whatever. If you don't have Python 3 installed on your computer, install it. It runs just fine in parallel with Python 2. Python 2 will stop being supported in 2020, according to PEP 373. At this point, any code you write should be built to survive more than three years, so all new construction should be done in Python 3.
All the code, as opposed to the narrative, on this web page document is licensed via the Expat license at http://directory.fsf.org/wiki/License:Expat. Feel free to copy it into your code, modify it as you want, and use it how you will, and distribute it within the very lax and permissive Expat license.
All the narrative is free-beer, not behind a paywall, proprietary copyrighted text. You can read it, link to it, suggest it to others, but please don't copy it elsewhere.
The most general definition of recursion is a subroutine calling itself. It can call itself directly, or it can call another subroutine that then calls it. Most people think of recursion as a program directly calling itself, and all the examples in this document use direct self-calls.
Unlike many languages, Python is not optimized for recursion, either in syntax or performance. This means that when writing a Python program, you should not use recursion unless it gives you a significant benefit. For example, if you want a function to count upwards from 1 to its argument, don't do the following:
def countup(num): if num < 1: return num countup(num - 1) print(num) return num countup(10)
The preceding code is a substitution of recursion for iteration, and although that's the way you do things in Scheme and some other languages, it's not what you do in Python. In Python you just do the following:
def countup(num): for i in range(1, num+1): print(i) countup(10)
There's one exception to the rule I stated above: By all means substitute recursion for iteration if you're doing it only for the purpose of learning. Recursion is a mindbending activity when you start, so do the simplest exercises you can, even if such exercises would be silly in the real world.
Here's a rule of thumb: If you can see a clear path to completing the task using iteration, non-recursive function calls, objects and the like, stay away from recursion. If iteration, function calls, objects and the like start looking like an endless mess of details, explore the possibility of recursion. And if you ever find yourself needing to put a stack in your program, that's a sure sign that you just might benefit from recursion, so explore the possibility. Stacks are how you iteratively imitate recursion. As a matter of fact, behind the scenes, Python's subroutine stack is keeping track of recursion so you don't have to.
Beyond the rule of thumb, there are a few problem domains that suggest recursive solutions. Read the next subsection...
Sometimes, using recursion, you can simply write the definition of the problem, instead of being forced to provide instructions for your computer to follow. This happens a lot with mathematical concepts, but it's not restricted to them. As an example, consider the mathematical definition of a factorial number:
Now put that definition in a recursive function called get_fact():
#!/usr/bin/python3 def get_fact(N): if N == 1: return 1 return N * get_fact(N-1) print(get_fact(4)) # Should be 24
Copy, paste, and run the preceding code, and you'll see that it works. You defined "factorial", and recursion did the rest.
Such pure "paste in the definition" problems are rare, but often you'll find a programming problem that seems to spiral off into dozens of minute details. When that happens, consider using recursion and defining the obvious problem in the recursive function.
Hierarchies surround us:
When the underlying problem involves a hierarchy, it's best to at least consider a recursive solution. With hierarchies, recursion is often the easiest and simplest technique.
When programming any kind of hierarchy, a tree-structured data store is usually the best advice. Whenever you have a tree-structured data store, recursion is almost always the best way to navigate it, often the best way to modify it, and sometimes the best way to create it. What I often do is capture the incoming data in a tree, any tree, and then run successive tree-walking recursive subroutines to shape the tree to the exact needs of the situation. As the originator of both the VimOutliner project and the UMENU project, as well as someone who has written programs to convert XHTML to full eBooks, I work with trees often.
Recursive program is often used to reduce the scope of a monumental task down to a manageable scope, particularly if the monumental task is more a ton of details than a complex concept. Often you can use recursion to solve the human-recognizeable problem, and kick the details down the road by recursing to the same function, with different inputs, to do the rest. This is how the Quicksort is done:
process quicksort (A, lo, hi) #Array, element 0, and element len(A) - 1 If lo < hi Pick a random element to be the "pivot" Partition the array so all left of the pivot is less than pivot, and all right is more. quicksort(A, lo, pivot) quicksort(A, pivot+1, hi)
What's going on is that you keep splitting the array, not necessarily in half, and with the pivot function making it so everything on one side of the random pivot is less than the pivot, and everything on the other side is bigger than the pivot. This output of a pivot run is by no means sorted, but as lo and hi get next to each other, those two are sorted, and once all of those are done, the whole array is sorted. The human-easy part is the pivot, and then you make the recursive quicksort() function keep splitting sections and subsections of the array. This would be a monster if you had to approach it iteratively.
In most successful uses of recursion, you'll see a reduction of scope, where you solve the human-easy problem for part of the problem, and then pass the complications on to further recursed calls.
Trial and error happens a lot in brute-force computerization of simple games, and can play somewhat of a part in more complex games. For instance, the following is the pseudo-code of a recursive function to run a maze:
function move(direction) if position == destination return success else if position == wall return failure else if position == start return failure for dir in (left, straight, right) result = move(dir) if result == success record_this_location return result
Once again, notice the reduction in scope of the problem. All you need to program into the program is that if you hit a wall or the start position you failed, if you hit the destination you succeeded, and if neither of those, no problem, just try left, try straight and try right, secure in the knowledge that eventually, in a nested subroutine call, something will hit one of those and report back.
Contemplate what you'd need to do to run a big maze, via trial and error, without recursion.
Recursive subroutines always have two features:
def countup(num): if num < 1: return num countup(num - 1) print(num) return num countup(10)
In the preceding too-trivial-for-recursion code, the base case occurs when num is lower than 1, and the rule is achieved in the call to countup() with argument num-1. If you're wondering why the preceding code doesn't count down from the argument to 1, instead of up from 1 to the number, read on...
The word "subroutine" is used throughout this document. Why?
Throughout the coding world, there are entities of source code whose caller line number is stored before they execute, and when their execution is finished, this source code entity transfers control back to the line after the line that made the call to the entity. These entities have many names: subroutines, functions and procedures being a few. Different computer languages call them different things.
C calls all such entities, even when the entity returns nothing and its entire purpose is to create side effects like printing something, writing a database, making a sound, or changing a global variable. The same is true for Python.
And yet according to mathematics, a function returns a result based only on the information passed to it, and the function has no side effects. Examples include sine, cosine, log, etc. Moreover, in functional languages, an entity is a function only if it returns a result based only on the information passed it, and it causes no side effects. Try programming in Scheme and you'll see what I mean.
The Pascal language is a bit more discriminating: If the entity passes back a return value, it's called a function, but if it doesn't pass a return result, it's a procedure. However, Pascal remains mute when a function returning a value also creates huge levels of side effect.
It's important to know that recursion can happen either with or without a passed back value, either with or without side effects. Therefore, I use the word "subroutine" to mean either.
Contemplate a yo-yo. You throw it downward or outward and it unwinds until it can't go down any farther, and then it comes back up. If you put seven stripes on the string, starting with your finger loop and ending at the completely extended yo-yo, and those stripes (starting at the finger) were Red, Orange, Yellow, Green, Blue, Indigo and Violet (ROYGBIV for short), then as the yo-yo goes out, they're revealed in order ROYGBIV. But when the yo-yo comes back up, they're hit by the yo-yo in opposite order, VIBGYOR. Going out: ROYGBIV. Coming back, VIBGYOR. Give a couple minutes consideration to the action of the yo-yo.
Now imagine the countup function, with a starting argument of 3:
def countup(num): if num < 1: return num countup(num - 1) print(num) return num countup(3)
The base case on the preceding is num<1. Imagine the base case as the point where the yo-yo runs out of string and starts coming back. Now notice also that the print(num) statement comes after the recursive call. In other words, it's on the way back, not on the way out. That's why it counts in the direction it does, even though you're subtracting. See the following ascii art to see how it works:
calls calls calls countup(3)-------->countup(2)-------->countup(1)-------->countup(0)-----, | | Yo-yo goes out --------> | <-------- Yo-yo comes back | | | returns to returns to returns to | countup(3)<--------countup(2)<--------countup(1)<-----------------------' | | | prints 3 prints 2 prints 1
If you wanted to have it count up instead of down, you'd put the print(num) statement before the recursive call instead of after it, so the print is done on the way out rather than the way back. It would look like the following:
def countdown(num): if num < 1: return num print(num) countdown(num - 1) return num
The sole difference is whether the work is done on the way out or the way back.
You'll hear people say that for performance sake you should do everything on the way out and nothing on the way back. They'll speak of something called "tail recursion." They're telling the truth, but not for Python, because Python doesn't optimize tail calls. So in Python, don't prefer doing the work on the way out to the way back or vice versa. Do whichever is easier. When you code in Scheme, you can worry about tail recursion. Tail recursion is beyond the scope of this document, but you can read about tail calls and tail recursion at https://en.wikipedia.org/wiki/Tail_call, but please remember that Python doesn't optimize tail calls, so you don't need to know about them for your Python work.
This section is titled "The Magic Yo-Yo." Yo-yos have certainly been discussed here, but what about the magic? The magic comes when you design a simple algorithm for the situation right in front of you, and by recursion that simple algorithm manages to solve the whole problem. The more recursive code you write, the more you'll understand. Suffice it to say that when you do recursion right on the kind of problem handled well by recursion, you'll be amazed how such simple algorithms solve massive problems.
The mathematical definition of factorial numbers is the following:
You can drop that definition right into Python, as follows:
#!/usr/bin/python3 def factorial(N): if N == 1: return 1 return N * factorial(N - 1) print(factorial(5))
This factorial function is an extreme case of programming by definition, but it's surprising how much recursive programming boils down to stating the rules and then recursing.
Look at any introduction to recursive programming, and you'll see recursive code to calculate a Fibonacci number. The mathematical definition of Fibonacci Numbers is the following:
Dropping the preceding definition into Python, we get the following code:
#!/usr/bin/python3 def fibonacci(N): if N < 2: return 1 return fibonacci(N-2) + fibonacci(N-1) for i in range(1, 10): print('{}: {}'.format(str(i), str(fibonacci(i))))
Can't get much cleaner and simpler than that. The fibonacci() function is the poster-child for recursion. Umm, except for one thing: It's a fork bomb. Each call makes two more calls, each of which makes two more, onward and onward until the computer slows to a crawl and then, perhaps hours later, the program runs out of stack space. If you give the preceding fibonacci() function an N of 40, it will likely take hours to complete, if it completes at all.
As aesthetically beautiful as the preceding function is, to get a function that works at high values requires more than four lines of code. One way is to cache former results in a dictionary, as follows:
def fib_dict(N, dic): if N < 2: return 1 if N - 2 in dic: n2 = dic[N-2] else: n2 = fib_dict(N - 2, dic) if N - 1 in dic: n1 = dic[N-1] else: n1 = fib_dict(N - 1, dic) newfib = n2 + n1 dic[N] = newfib return newfib print(fib_dict(7, {})
The preceding records every returned Fibonacci number for lookup in a fast-lookup dictiionary so recursive calls aren't made to already looked up values. This algorithm is fast.
Another speedy Fibonacci algorithm is the brute force algorithm, which counts up instead of counting down, passes former values as arguments, and for all practical purposes wraps an iterative algorithm in a recursive package:
def fib_bruteforce(N, a2=1, a1=1): if N > 2: return fib_bruteforce(N - 1, a1, a2 + a1) else: return a2 + a1 print(fib_bruteforce(7)) #Note the default values of a2 and a1
But as was mentioned earlier in this document, iteration shouldn't be needlessly replaced in Python. The real way to calculate Fibonacci numbers in Python is good old iteration:
def fib_iter(N): n2=1 n1=1 for i in range(1,N): n0 = n2 + n1 n2 = n1 n1 = n0 return(n0) print(fib_iter(7))
For illustrating the utility of recursion, a Fibonacci calculator is priceless. To do actual calculations, it's needless.
This section has code you wouldn't want to write any way except recursively. When you're dealing with a hierarchical tree structure that not only descends but also branches horizontally, recursion is the only thing that make sense. Sure, you could use a stack and calculate levels and loop through the thing, but it would be an unholy mess.
This example is a simple, no frills binary tree sorter. It's not an efficient algorithm: There are no provisions for balancing the tree. A sorted list of 100 numbers would yield a linked list 100 nodes long. But it does work, and works well enough to use in working code for number sets with less than 1000 elements, and it's simple enough to understand. Here's the entire program:
#!/usr/bin/python3 import json def pushnode(num, node): # Acquire and order nodes made from incoming numbers if not 'num' in node: # Tree's head node = {'num': num} elif num < node['num']: if not 'l' in node: node['l'] = {'num': num} else: pushnode(num, node['l']) elif num > node['num']: if not 'r' in node: node['r'] = {'num': num} else: pushnode(num, node['r']) else: # same number as node's num if not 'r' in node: node['r'] = {'num': num} elif not 'l' in node: node['l'] = {'num': num} else: pushnode(num, node['l']) return node def readtree(node): # Walk tree down-left to center to down-right, yielding sort if 'l' in node: readtree(node['l']) print(node['num']) if 'r' in node: readtree(node['r']) inputt = [44, 22, 77, 33, 99, 41, 23, 90, 50] head = {} for n in inputt: head=pushnode(n, head) print(json.dumps(head, indent=3, sort_keys=True)) readtree(head)
The program has two recursive subroutines: One that accepts a single number and stores it in a tree, and another that walks that tree, printing out the numbers as they occur from bottom left to center to bottom right at rising levels. The main routine defines an array to be sorted, called inputt, defines the tree's head as an empty dictionary, and iterates through the array, calling pushnode() to properly store each number.
The main routine then uses JSON to pretty-print the tree, and then calls readtree() to walk the tree and print the numbers in sorted order.
Each node in the tree is a dictionary with up to three elements: One called 'num' that contains the number, one called 'l' that contains a node with a lower (more to the left) number, and one called 'r' that contains a node with a higher number, to the right of this node. 'r' and 'l' might not exist for particular nodes, but all the nodes have 'num', except for before the first number is added.
Let's explain readtree() first, because it's much simpler. If the current node has an 'l' node, readtree() is called on its 'l' node. When that function call returns (remember the yo-yo), or if this node has no 'l' node, it prints this node's 'num' value. Finally, if there's a 'r' node, it calls readtree() on the 'r' node. Thus, it reads left to center to right, slowly climbing toward the top of the tree.
The pushnode() function is a little more complicated. Arguments are a number to store (num), and a node to try to put it in. If there's no 'num' element, that means no numbers have been placed in the tree yet and this node is the top of the tree, so the num element is assigned to the node's 'num' element. That happens only once at the beginning.
From then on, if the num argument is less than the node's 'num' value, if this node currently has no 'l' element, create a new node whose 'num' is the argument num, and place it in this node's 'l' position. If this node already has an 'l' element, call pushnode() on that 'l' element. So the base case occurs when there's no 'l' element, otherwise, recursion occurs.
If the num argument is greater than the current node's 'num' element, the analogous process takes place, where a new node is added if there's currently no 'r' on the node, or else pushnode() is called on the 'r' element.
There's also some code to handle cases where the num argument is equal to the current node's 'num' value. This extra code is an attempt at better balance than just making a num/'num' equality always go to the right, using a >= operator.
Run this code, examine it, feel it, make it your own. This is a valid use of recursion.
When working with tree structures, there's a genre of recursive subroutines called "walkers." The main activity of a walker function is to visit every node in the tree, and once there, do something. I like to call the visits of every node "traversal", and what it does at each node as "the payload." The payload could be to output something. The payload can also, depending on criteria, move the current node somewhere else, or put the current node in a list or dictionary. The main thing to remember about walkers is that they have two functions: To traverse the tree, and to execute their payload on every node. Here's an example from the last section:
def readtree(node): # Walk tree down-left to center to down-right, yielding sort if 'l' in node: readtree(node['l']) print(node['num']) if 'r' in node: readtree(node['r'])
In the preceding code, readtree() walks from lower left to center to right, stopping at center to execute its payload: Printing its value. This is called a Left Center Right walk, and is graphically shown below:
Left Center Right walks are uncommon. Most walks turn out to be Center Left Right, as shown graphically below:
Interestingly, these two walks are remarkably similar, because, unless the nodes have parent pointers that are used, they both traverse the tree in Center Left Right order. The difference is that the Center Left Right executes its payload immediately, which means on the Center node, whereas the Left Center Right delays its payload until its left traversal has been completed. Remember the yo-yo? It's printed on the way back from the left traversal. This means that although nodes are traversed in the same order, payloads are executed in a very different order.
Walkers are very important. Simple walkers that do one thing and do it right are trivially easy to write. A series of simple walkers can make a series of modifications to the tree to craft the exact tree you want, and then a final output walker can output the tree as desired. A series of simple walkers sounds inefficient, but in fact usually you're not short enough on clock cycles to worry about it. Unless the bottleneck of the whole thing is faster than typing speed, or unless the tree is huge (perhaps over 5000 nodes), a series of ten walkers will be just fine. Once the system is designed with a series of simple walkers, if performance is an issue, that's the time to see what walkers you can combine or what other efficiency tactics you can use.
Outlines are an innately human way to express hierarchies. They were used to design books long before there were computers. I doubt anybody can point out a time before outlines were used for task planning. A human glances at a tab-indented outline and sees a hierarchy.
But a computer looks at that same outline and sees only a sequence of strings, and oh-by-the-way the strings have varying numbers of tabs at their beginnings. It turns out that converting a tab-indented outline to a hierarchical tree structure is more difficult than any human would suspect.
As the originator of both the Vimoutliner project (an outline processor) and the UMENU project (a menu system whose input is an outline), you can imagine I've had to do this conversion many times. I've always done it with a stack and lots of iteration logic. But for this document, I did it recursively. It took me ten hours to figure out how to do it recursively, but when I had it done, it was actually very simple and made a lot of sense.
In turning a generic outline into a tree, every line on the outline becomes a node. Each node has zero to several children. On a generic outline, all lines are equal: None has a special meaning. Therefore, a great way to represent a node in the tree is something like this:
{ 'txt': <line text without initial tabs>, 'children': [], 'level': <number of tabs preceding this text line in outline>, 'rawss': <line number in original outline, for error messages>, }
There are other ways of representing it, but the preceding, especially having the 'children' array, makes things easy and preserves the outline's original order.
The following is a program to turn a tab-indented outline, fed in via stdin, into a node tree, and then walk that node tree to once again produce an outline (but this time space-indented):
#!/usr/bin/python3 import re import sys def tabcount(line): arr = list(line) tabs=0 for c in arr: if c == '\t': tabs += 1 else: break return tabs def nextline(f): EOFflag = False rawlineno = 0 line = '' while line.strip() == '': rawlineno += 1 line = f.readline() if not line: EOFflag = True break if EOFflag: level = -10 line = 'EOFEOFEOF' else: level = tabcount(line) return line.strip(), level def buildtree(f): parent = {'children': [], 'isparent': 'isparent'} line, level = nextline(f) pushnode(line, level, parent, f) return(parent['children'][0]) def pushnode(line, level, parent, f): # Make this node from incoming line and level node = {'data': line, 'level': level, 'parent': parent, 'children': []} parent['children'].append(node) prevlevel = level # This node's read line, level = nextline(f) # Loop thru this node's children while level > prevlevel: if level > prevlevel + 1: print('FATAL Outliner error: Double indent, aborting') sys.exit(1) else: line, level = pushnode(line, level, node, f) return line, level def walk(node): spc = (2 * node['level']) * ' ' print(spc + node['data']) for c in node['children']: walk(c) def main(): f = sys.stdin head = buildtree(f) f.close() walk(head) if __name__ == '__main__': main()
A lot of the preceding program is just infrastructure. Function nextline() with its assistant tabcount() simply deliver the next non-blank line, and that line's level in the outline. The buildtree() function is just scaffolding for the pushnode() recursive function, providing the necessary initialization and finalization to the top level call to pushnode().
After about 10 hours trying to design pushnode(), I was surprised how simple it turned out to be. It turned out to be a great example of "just program what's right in front of you, and let recursion take care of all the details."
What didn't appear at all complex was the walker function in this program, called walk(). It does just what you'd expect: For each of the current node's children in the 'children' array, it calls itself. It also has logic for space indentation based on the current node's level. If nodes hadn't contained a 'level' component, this still could have been done by having a level argument to the function, which would be sent as level+1 upon each recursion.
One of the undemonstrated beauties of this program is that once you have your outline in a node tree, whose nodes each has an array of children, is that you can now have your way with that tree, using simple walker functions. If certain line texts have special meanings, they can be handled by a walker. For instance, in the UMENU2 outline to directory tree converter, lines whose text is "data" mark the head of a special tree that isn't a child at all: It's metadata for the current node. No problem: That subtree is just copied to node['data'] with appropriate processing, and its subscript in node['children'] is recorded for later deletion, because deleting within a loop can do strange things. After all of node['children'] has been iterated, the subscript for the former data child is deleted.
This is an example that complex tree processing is often made easy by running successive simple walkers. Although this might be less time-efficient than having one walker do all the work, design of the simple walkers is often one or two orders of magnitude quicker to design and simpler to understand than would be a do-everything walker. So the first implementation is best done with a succession of simple walkers. Then, if and only if the time consumption of the successive walkers is shown to be a problem (and it usually won't be), you can refactor to do more with each walker.
Copy this section's code and run it. Test it out with sets of numbers big and small, coming in through STDIN. Try improving it, try breaking it, experiment with it, and really make it yours.
I got my you-know-what handed to me while trying to design my recursive outline to node tree program. Three or four dead ends. Confusion trying to keep all the details in my mind. Frustration, disappointment, and doubt. Would I be able to do it at all?
Take a look at the last section's pushnode() function:
def pushnode(line, level, parent, f): # Make this node from incoming line and level node = {'data': line, 'level': level, 'parent': parent, 'children': []} parent['children'].append(node) prevlevel = level # This node's read line, level = nextline(f) # Loop thru this node's children while level > prevlevel: if level > prevlevel + 1: print('FATAL Outliner error: Double indent, aborting') sys.exit(1) else: line, level = pushnode(line, level, node, f) return line, level
Are you kidding me? That simple solution took me ten hours of frustration, disappointment, and doubt to create? I know a couple people who could have come up with that recursive solution, or a better one, all designed and coded in 20 minutes. But those people are very used to thinking recursively. I know bunches and bunches of people who would have taken ten hours, just like I did.
The problem is the seemingly endless details of this conversion. Every line on the outline gets exactly one node. The function is a combination of iteration (through all the children) and recursion (to each child). Through out all of this, every node must get exactly one line read from the outline file. Not easy. Worse yet, you can't decide what to do with a line until you compare its level to the one that came before. How do you keep everything in sync, so extra lines aren't read and discarded, or too few lines are read and the thing infinite loops on you? It's obvious that you recurse to each child, but what do you do with a sibling? At what point do you append a child node to its parent's node['children'] array? What do you do with a higher level line indicating that a new parent is being installed? It can drive you nuts. It drove me nuts for hours, and when I finally solved it, it was much simpler than I thought possible.
When you first contemplate the problem, it seems easy. A human can look at an outline and intuitively see a tree. You basically rotate it 90 degrees clockwise. What could be hard?
My first attempt was to loop through all the current node's children and siblings, with a read at the bottom of the loop. Any lines higher in level number than the current node were recursed, and when they returned their node, their node was appended to this node's children. Sounds like a likely winner, right? Bzzzzt! It worked as long as I kept recursing to lower level (higher level number) nodes, but the first time the loop actually continued, everything went wrong. I started putting all sorts of logic around the various reads, put reads in different places, took siblings out of the loop, put them back, but no matter what I did, it failed.
Slowly, the key distinction began to occur to me as a fuzzy feeling. If I had a read inside the loop, and each recursed child had its own read, it was almost impossible to keep reads and nodes synchronized. I worked on such synchronization for a couple hours, but doing so bent my mind. I was trying to operate beyond my IQ, and it wasn't working at all. I took a shower, and while taking my shower, the key distinction got clearer. Every single call to pushnode() already contained a read on the outline. Within the loop, lines were already retrieved by recursive pushnode() calls, but the line text and level of those reads weren't available in the loop. So the trick was, I needed to make the last-read line available in the loop and not do a read within the loop. That's easy, just have pushnode() return the line and level. As far as appending the child to the parent's node['children'], it would be simpler to pass the current node as an argument to pushnode() so that the child could make itself, and then append itself to its parent. My last important realization was that my loop didn't need to worry about the current node's siblings: Return from recursion would take care of that. So the loop could be constructed to only last as long as the level number (indentation) was greater than the current node's level number.
I took my time in the shower, going over all the details. Everything fit: Everything worked. Previously, my problem had been that the last read line wasn't available in the loop if the loop had spawned recursion. With the return of line and level by pushnode(), that problem was gone. I wasn't sure, but I had a feeling that any exdents (opposite of indents) or even multiple exdents in the outline would be magically handled by return from recursion.
I finally emerged from the shower, confident and exited, and refactored pushnode(). 45 minutes later it worked.
Those of you who have read my troubleshooting books know that after solving every problem, I take pride in my solution, and part of that taking pride is asking myself how I could do it even better next time. As I took pride for the next hour, an answer appeared: Next time I'm building a tree from a file and the lines and nodes have a 1 to 1 relationship, I'll be sure that from the very beginning I match reads to nodes. That would have saved me hours.
As I write this article, it's 24 hours after I solved that problem, and the situation has gained some clarity. The answer to "how can I do it even better next time" has gotten a little more general. While building a recursive program, it's my responsibility to make sure that the current invocation of the function has the identical situation to the invocation I recurse to. I must have every piece of information I need to do the current invocation's job, whether it's from arguments to the current invocation, or function return of a recursion, and I must make sure such information is available when it's needed. To the best of my ability, I must solve every problem of the current location in the current invocation. To the best of my ability, for simplicity, I must consider the current invocation to be a unified state. For instance, in my working pushnode(), everything in the current invocation pertains to the current node, with the very conscious exceptions that I 1) let the recursed child append itself to the current node's 'children' array by passing the recursed child the current invocation's node, and 2) receive any further lines and levels from the recursed child's return. Other than that, observe how things work: The current node is constructed from this invocation's line and level arguments. The current child's loop checks that incoming lines are at a higher level number (more indented), and if so, recurses them. When the loop terminates, the line and level the caller needs are returned.
This thought leads to a series of questions I will ask myself the next time I write a recursive function:
Recursion occurs when a subroutine calls itself. Python is not optimised for recursion, so in Python, recursion should be used only when it makes writing the program significantly easier. Such significant gains tend to happen in data trees, hierarchical situations, trial and error, and reducing scope.
As a subroutine keeps calling itself over and over again, it starts to seem like a yo-yo, with tasks performed by invocations executed in call order on the way out, and in the opposite order on the way back in. Tasks executed after the recursion call happen on the way back in.
When you use recursion on the right problem, in the right way, the simplicity of the solution often seems like magic. Although it isn't used often in Python, it can save a fortune in time on those tasks that should be done with Python.