Inline expansion

Giving the programmer this degree of control allows application-specific knowledge to be used in choosing which functions to inline.

Ordinarily, when a function is invoked, control is transferred to its definition by a branch or call instruction.

With inlining, control drops through directly to the code for the function, without a branch or call instruction.

In the context of functional programming languages, inline expansion is usually followed by the beta-reduction transformation.

A programmer might inline a function manually through copy and paste programming, as a one-time operation on the source code.

The direct effect of this optimization is to improve time performance (by eliminating call overhead), at the cost of worsening space usage[a] (due to duplicating the function body).

In lower-level imperative languages such as C and Fortran, inlining typically yields a 10–20% speed boost with only a minor impact on code size, while in more abstract languages it can be significantly more important, due to the number of layers of abstraction it removes; an extreme example is Self, where one compiler saw improvement factors of 4 to 55 from inlining.[6]

The code-size cost is most significant if, prior to expansion, the working set of the program (or a hot section of code) fit in one level of the memory hierarchy (e.g., L1 cache), but after expansion it no longer fits, resulting in frequent cache misses at that level.

At the highest level this can result in increased page faults, catastrophic performance degradation due to thrashing, or the program failing to run at all.

This last is rare in common desktop and server applications, where code size is small relative to available memory, but it can be an issue in resource-constrained environments such as embedded systems.[6]

Inlining hurting performance is primarily a problem for large functions used in many places. The break-even point beyond which inlining reduces performance is difficult to determine and in general depends on the precise load, so it can be subject to manual optimization or profile-guided optimization.

One of the parameters might be an option to instead generate the sequence once as a separate subroutine, with each expansion then replaced by a call to that subroutine.

Commonly, inliners use profiling information about the frequency of execution of different code paths to estimate the benefits.[11] In addition to profiling information, newer just-in-time compilers apply several more advanced heuristics.[4]

Inline expansion itself is an optimization, since it eliminates overhead from calls, but it is much more important as an enabling transformation.

That is, once the compiler expands a function body in the context of its call site—often with arguments that may be fixed constants—it may be able to do a variety of transformations that were not possible before.

Bjarne Stroustrup, the designer of C++, likes to emphasize that macros should be avoided wherever possible, and advocates extensive use of inline functions.

Although it can lead to larger executables, aggressive inlining has nevertheless become more and more desirable as memory capacity has increased faster than CPU speed.[4]

Other languages provide constructs for explicit hints, generally as compiler directives (pragmas).