Self-modifying code

The method is frequently used for conditionally invoking test/debugging code without requiring additional computational overhead for every input/output cycle.

In later operating systems, programs residing in protected storage could not use this technique, so the pointer to the subroutine was changed instead.

The pointer would reside in dynamic storage and could be altered at will after the first pass to bypass the OPEN. Having to load a pointer first, instead of branching and linking directly to the subroutine, would add N instructions to the path length, but there would be a corresponding reduction of N for the unconditional branch that would no longer be required.
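The pointer-swapping approach can be sketched in Python as follows. This is only an illustration: the class `Reader`, its methods and the log strings are hypothetical names chosen for the example, and a bound-method attribute stands in for the pointer held in dynamic storage.

```python
class Reader:
    """Illustrative only: `handler` plays the role of the subroutine pointer."""

    def __init__(self):
        self.log = []
        # Initially the pointer targets the routine that performs the OPEN.
        self.handler = self._open_then_read

    def _open_then_read(self, record):
        self.log.append("OPEN")
        # Alter the pointer after the first pass so that later calls bypass the OPEN.
        self.handler = self._read
        self._read(record)

    def _read(self, record):
        self.log.append(f"READ {record}")


reader = Reader()
for record in ("r1", "r2", "r3"):
    reader.handler(record)  # indirect call through the pointer

print(reader.log)
```

The code itself is never rewritten; only the data item holding the target address changes, which is why the approach remains usable when the program resides in protected storage.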

Suppose a DOS script (or "batch") file MENU.BAT contains the following:[4][nb 1] Upon initiation of MENU.BAT from the command line, SHOWMENU presents an on-screen menu, with possible help information, example usages and so forth.

[6] In the early days of computers, self-modifying code was often used to reduce use of limited memory, or improve performance, or both.

[9] Self-modifying code can be used for various purposes:

Pseudocode example:

Self-modifying code, in this case, would simply be a matter of rewriting the loop like this:

Note that two-state replacement of the opcode can be easily written as 'xor var at address with the value "opcodeOf(Inc) xor opcodeOf(Dec)"'.
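The XOR trick for two-state opcode replacement can be simulated with a one-byte "program" held in a `bytearray`. The opcode values `INC` and `DEC` below are made up for the example, and the function `run` is a hypothetical stand-in for the processor; real machine code is not modified here.

```python
INC, DEC = 0x01, 0x02   # made-up opcode values, for illustration only
TOGGLE = INC ^ DEC      # opcodeOf(Inc) xor opcodeOf(Dec)


def run(code, value, steps):
    """Execute the instruction stored in code[0] `steps` times,
    flipping it between INC and DEC after every execution."""
    trace = []
    for _ in range(steps):
        opcode = code[0]
        if opcode == INC:
            value += 1
        elif opcode == DEC:
            value -= 1
        else:
            raise ValueError(f"unknown opcode {opcode:#x}")
        trace.append(value)
        # The self-modification: XOR the stored opcode with the toggle value.
        code[0] ^= TOGGLE
    return trace


code = bytearray([INC])
trace = run(code, 10, 4)
print(trace)
```

Because `INC ^ TOGGLE == DEC` and `DEC ^ TOGGLE == INC`, a single XOR switches the stored instruction in either direction without testing which state it is currently in.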

Self-modifying code was used to hide copy protection instructions in 1980s disk-based programs for systems such as IBM PC compatibles and Apple II.

They avoid the danger of catastrophic self-rewrites by making sure that self-modifications will survive only if they are useful according to a user-given fitness, error or reward function.
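The idea of letting a modification survive only if it is useful can be sketched as a simple acceptance gate. This is a heavily simplified illustration, not the mechanism of any particular system: the "program" is reduced to a list of numbers, and `fitness`, `try_modification`, the target values and the proposal list are all assumptions introduced for the example.

```python
def fitness(params):
    """User-given fitness function (assumed here): higher is better."""
    target = [3, -1, 4]
    return -sum((p - t) ** 2 for p, t in zip(params, target))


def try_modification(params, index, delta):
    """Apply a proposed change to a copy and keep it only if fitness improves."""
    candidate = list(params)
    candidate[index] += delta
    if fitness(candidate) > fitness(params):
        return candidate, True   # the modification survives
    return params, False         # the modification is discarded


params = [0, 0, 0]
proposals = [(0, 1), (1, 1), (2, 5), (1, -1), (0, -4)]
kept = []
for index, delta in proposals:
    params, accepted = try_modification(params, index, delta)
    kept.append(accepted)

print(params, kept)
```

A harmful change, such as the last proposal, is evaluated on a copy and never replaces the running state, which is the property that guards against a catastrophic self-rewrite.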

[14] The Linux kernel notably makes wide use of self-modifying code; it does so to be able to distribute a single binary image for each major architecture (e.g. IA-32, x86-64, 32-bit ARM, ARM64...) while adapting the kernel code in memory during boot depending on the specific CPU model detected, e.g. to be able to take advantage of new CPU instructions or to work around hardware bugs.

[15][16] To a lesser extent, the DR-DOS kernel also optimizes speed-critical sections of itself at loadtime depending on the underlying processor generation.

[10][11][nb 2] Regardless, at a meta-level, programs can still modify their own behavior by changing data stored elsewhere (see metaprogramming) or via use of polymorphism.

Generating code for specific tasks allows the Synthesis kernel to (as a JIT interpreter might) apply a number of optimizations such as constant folding or common subexpression elimination.
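Task-specific code generation with constant folding can be illustrated in Python using the built-in `compile` and `exec`. This sketch is not how the Synthesis kernel works internally; the function `make_scaler` and the generated function `scale` are hypothetical names, and the example only shows a general routine being replaced by a specialized one once some inputs are known.

```python
def make_scaler(factor, offset):
    """Generate a function computing (x + offset) * factor for fixed
    integer `factor` and `offset`, with the constant part folded in advance."""
    folded = factor * offset  # constant folding: computed once, at generation time
    source = (
        "def scale(x):\n"
        f"    return x * {factor!r} + {folded!r}\n"
    )
    namespace = {}
    exec(compile(source, "<generated>", "exec"), namespace)
    return namespace["scale"], source


scale, source = make_scaler(3, 4)
print(source)
print(scale(5))
```

The generated function contains only literals, so the addition of the offset and the multiplication of the offset by the factor no longer happen on every call.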

The resulting lack of portability has prevented Massalin's optimization ideas from being adopted by any production kernel.

Paul Haeberli and Bruce Karsh have objected to the "marginalization" of self-modifying code, and optimization in general, in favor of reduced development costs.

In some cases, short sections of self-modifying code execute more slowly on modern processors.

The cache invalidation issue on modern processors usually means that self-modifying code is still faster only when the modification occurs rarely, such as a state switch inside an inner loop.

PC processors must handle self-modifying code correctly for backwards compatibility, but they are far from efficient at doing so.

Because of the security implications of self-modifying code, all of the major operating systems are careful to remove such vulnerabilities as they become known.

One mechanism for preventing malicious code modification is an operating system feature called W^X (for "write xor execute").

Other systems provide a 'back door' of sorts, allowing multiple mappings of a page of memory to have different permissions.
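The notion of two mappings of the same page carrying different permissions can be demonstrated with Python's standard `mmap` module. This sketch covers only read and write permission on a file-backed page; it does not involve execute permission and is not an implementation of W^X or of any operating system's mechanism. The variable names and the byte values written are arbitrary.

```python
import mmap
import tempfile

with tempfile.TemporaryFile() as backing:
    # One page of zero bytes to serve as the shared backing store.
    backing.write(b"\x00" * mmap.PAGESIZE)
    backing.flush()

    # Two mappings of the same page with different permissions.
    writable = mmap.mmap(backing.fileno(), mmap.PAGESIZE, access=mmap.ACCESS_WRITE)
    readonly = mmap.mmap(backing.fileno(), mmap.PAGESIZE, access=mmap.ACCESS_READ)

    # A write through the writable mapping is visible through the read-only one.
    writable[0:4] = b"\x90\x90\x90\xc3"
    seen = bytes(readonly[0:4])

    # A write attempted through the read-only mapping is rejected.
    try:
        readonly[0] = 0
        rejected = False
    except TypeError:
        rejected = True

    readonly.close()
    writable.close()

print(seen, rejected)
```

The same contents are reachable through both mappings, but only one of them permits modification, which is the property that such a 'back door' relies on.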