The following program is an example in Scheme:[8] This is not written in a tail-recursive style, because the multiplication function ("*") is in the tail position.
This allows an interpreter or compiler to reorganize the execution which would ordinarily look like this:[8] into the more efficient variant, in terms of both space and time: This reorganization saves space because no state except for the calling function's address needs to be saved, either on the stack or on the heap, and the call stack frame for fact-iter is reused for the intermediate results storage.
This also means that the programmer need not worry about running out of stack or heap space for extremely deep recursions.
Tail recursion modulo cons is a generalization of tail-recursion optimization introduced by David H. D. Warren[9] in the context of compilation of Prolog, seen as an explicitly set once language.
The following Prolog fragment illustrates the concept: Thus in tail-recursive translation such a call is transformed into first creating a new list node and setting its first field, and then making the tail call with the pointer to the node's rest field as argument, to be filled recursively.
The tail-recursive implementation can now be converted into an explicitly iterative implementation, as an accumulating loop: In a paper delivered to the ACM conference in Seattle in 1977, Guy L. Steele summarized the debate over the GOTO and structured programming, and observed that procedure calls in the tail position of a procedure can be best treated as a direct transfer of control to the called procedure, typically eliminating unnecessary stack manipulation operations.
In Scheme, a Lisp dialect developed by Steele with Gerald Jay Sussman, tail-call elimination is guaranteed to be implemented in any interpreter.
The language specification of Scheme requires that tail calls are to be optimized so as not to grow the stack.
Tail calls can be made explicitly in Perl, with a variant of the "goto" statement that takes a function name: goto &NAME;[12] However, for language implementations which store function arguments and local variables on a call stack (which is the default implementation for many languages, at least on systems with a hardware stack, such as the x86), implementing generalized tail-call optimization (including mutual tail recursion) presents an issue: if the size of the callee's activation record is different from that of the caller, then additional cleanup or resizing of the stack frame may be required.
[15][16][17] Though the given language syntax may not explicitly support it, the compiler can make this optimization whenever it can determine that the return types for the caller and callee are equivalent, and that the argument types passed to both function are either the same, or require the same amount of total storage space on the call stack.
[19] For compilers generating assembly directly, tail-call elimination is easy: it suffices to replace a call opcode with a jump one, after fixing parameters on the stack.
From a compiler's perspective, the first example above is initially translated into pseudo-assembly language (in fact, this is valid x86 assembly): Tail-call elimination replaces the last two lines with a single jump instruction: After subroutine A completes, it will then return directly to the return address of foo, omitting the unnecessary ret statement.
The generated code thus needs to make sure that the call frame for A is properly set up before jumping to the tail-called subroutine.
Many implementations achieve this by using a device known as a trampoline, a piece of code that repeatedly calls functions.
It is possible to implement trampolines using higher-order functions in languages that support them, such as Groovy, Visual Basic .NET and C#.
Following this, the stack is unwound ("popped") and the program resumes from the state saved just before the garbage collection.
Baker says "Appel's method avoids making a large number of small trampoline bounces by occasionally jumping off the Empire State Building.
However, this approach requires that no C function call ever returns, since there is no guarantee that its caller's stack frame still exists; therefore, it involves a much more dramatic internal rewriting of the program code: continuation-passing style.