Ershov numbers, named after Andrey Petrovich Yershov,[1] are used in code optimization to minimize the amount of register allocations. Ershov numbers can be used in methods to optimally select registers when there is only one expression in a code block. Given an expression E = E1 op E2 the goal is to generate code so as to either minimize the number of registers used, or, if an insufficient number of registers is available, to minimize the number of nonregister temporaries required.
Definition
The Ershov number n of a node in given expression tree is defined as follows:[2][3]
- Every leaf has n = 1.
- For a node with one child, n is the same as child's.
- For a node with two children, n is defined as:
The Ershov number of a node represents the minimum number of registers required to evaluate the subexpression whose root is that node. The idea is that we evaluate the child with the larger Ershov number first, then the other child, then perform the operation at the root.
Example
Suppose we have an expression tree with a '+' operation at the root, and the left and right subtrees have Ershov numbers of 3 and 4, respectively. The Ershov number of this node is 4, so we should be able to generate code for the expression using 4 registers.
- Generate code to evaluate the right child using registers r1, r2, r3, and r4. Place the result in r1.
- Generate code to evaluate the left child using registers r2, r3, and r4. Place the result in r2.
- Issue the instruction
ADD r1, r1, r2
, which adds r1 and r2 and stores the result in r1.
Code generation
The general procedure for generating code using a minimal number of loads and stores from memory is as follows:
- Generate code for the child with the largest Ershov number first
- Issue an instruction to store the result in a temporary register, or, if none is available, a temporary location in memory
- Generate code for the child with the smaller Ershov number
- Issue an instruction to load the temporary variable back into a register
- Issue an instruction to perform the operation at the root
In the ideal case, if there are n registers and the first subexpression requires n registers and the next subexpression requires n - 1 registers, a single register can be used to store the result of the first expression, and there will still be n - 1 registers available to compute the next subexpression, therefore requiring no loads or stores from memory at all.[1]
If the Ershov number of the root of the expression tree is greater than the number of registers available, the Ershov number can also be used to determine the amount of additional temporary memory space required, for example on the stack.[1]
See also
- Strahler number, the minimum number of registers needed to evaluate an expression without any external storage
- Sethi–Ullman algorithm, basically the same concept
References
- 1 2 3 "Notes on Code generation" (PDF). Department of Computer Science, University of Calgary. 14 September 2007. Retrieved 30 May 2022.
- ↑ "Optimal Code Generation (for Expressions) and Data-Flow Analysis" (PDF). Carleton University. Retrieved 30 May 2022.
- ↑ "Code Generation, Chapter 8" (PDF). Western Michigan University. Retrieved 30 May 2022.