File Intelx86.html    Author McKeeman    Copyright © 2007

Intel x86

Memory

Memory is byte-addressed and little-endian. To keep the explanation simple, suppose there are three different instructions that load an 8-, 16-, or 32-bit value from memory into a 32-bit register:

       LOAD8  addr
       LOAD16 addr
       LOAD32 addr

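... and suppose the successive bytes of memory starting at addr contain (one digit per byte, lowest address first)

       123456789...

... then the loaded register will contain (one digit per byte, most significant byte first)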

       0001
       0021
       4321

The consequence is that a memory dump shows the bytes of a value in the opposite order from a register display. It takes some getting used to, and it requires a code emitter for a little-endian machine to write multi-byte values into memory least significant byte first.
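The three hypothetical loads can be mimicked with Python's struct module. This is only a sketch: the function names load8, load16, load32, and store32 are made up for illustration, and the memory holds the byte values 1 through 9 as in the example above.

```python
import struct

# Memory holding the byte values 1, 2, 3, ... 9 starting at address 0.
memory = bytearray([1, 2, 3, 4, 5, 6, 7, 8, 9])

def load8(mem, addr):
    """Load one byte, zero-extended to 32 bits."""
    return mem[addr]

def load16(mem, addr):
    """Load two bytes as a little-endian unsigned value."""
    return struct.unpack_from("<H", mem, addr)[0]

def load32(mem, addr):
    """Load four bytes as a little-endian unsigned value."""
    return struct.unpack_from("<I", mem, addr)[0]

def store32(mem, addr, value):
    """What an emitter must do: the least significant byte goes to the lowest address."""
    struct.pack_into("<I", mem, addr, value & 0xFFFFFFFF)

print(f"{load8(memory, 0):08X}")   # 00000001
print(f"{load16(memory, 0):08X}")  # 00000201
print(f"{load32(memory, 0):08X}")  # 04030201
```

Storing 0x04030201 with store32 puts the bytes 01 02 03 04 into memory in that order, which is the emitter's half of the same rule.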

Registers

number  name    use
        EIP     program counter
0       EAX     int32 accumulator
1       ECX     int32 accumulator
2       EDX     int32 accumulator
3       EBX     int32 accumulator
4       ESP     int32 stack top
5       EBP     int32 frame base
6       ESI     pointer
7       EDI     pointer
        EFLAGS  status bits
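The register numbers matter because they are the values that appear in the register fields of machine instructions. As a small illustration, the Intel encoding of "move a 32-bit constant into a register" is the opcode byte 0xB8 plus the register number, followed by the constant in little-endian order. The helper name emit_mov_reg_imm32 below is hypothetical and is not taken from xcom.

```python
import struct

# Register numbers from the table above.
REG = {"EAX": 0, "ECX": 1, "EDX": 2, "EBX": 3,
       "ESP": 4, "EBP": 5, "ESI": 6, "EDI": 7}

def emit_mov_reg_imm32(reg_name, value):
    """Encode 'mov reg, imm32': opcode 0xB8 + register number, then the constant."""
    opcode = 0xB8 + REG[reg_name]
    return bytes([opcode]) + struct.pack("<I", value & 0xFFFFFFFF)

print(emit_mov_reg_imm32("EBX", 0x12345678).hex())  # bb78563412
```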

The register names drip with history. This general architecture started as a 4-bit machine, then grew to 8, 16, 32, and 64. The x86 is the 32-bit stopover. Earlier names such as A, AH, AL, and AX survive for parts of what is now EAX, the Extended Accumulator eXtended. The old names are still used for various reasons, one of which is backward compatibility. Arithmetic in the registers is 32-bit two's complement. The values are used for integer arithmetic and memory addressing.
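Both points in this paragraph can be tried out in a few lines. The sketch below models a 32-bit register as a Python integer masked to 32 bits; the function names are invented for the illustration. It shows the wraparound of two's complement addition and how the older names AX, AH, and AL are views of pieces of EAX.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(x):
    """Interpret a 32-bit pattern as a two's complement integer."""
    x &= MASK32
    return x - 0x100000000 if x & 0x80000000 else x

def add32(a, b):
    """32-bit addition: any carry out of bit 31 is dropped."""
    return (a + b) & MASK32

def views(eax):
    """The older register names as slices of EAX."""
    eax &= MASK32
    return {"AX": eax & 0xFFFF, "AH": (eax >> 8) & 0xFF, "AL": eax & 0xFF}

print(to_signed32(add32(0x7FFFFFFF, 1)))  # -2147483648
print({name: hex(v) for name, v in views(0x12345678).items()})
# {'AX': '0x5678', 'AH': '0x56', 'AL': '0x78'}
```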

One can do a little research project to find out why EAX, EBX... are not in alphabetical order. And one can try to guess without looking how the 32-bit names are extended again for the 64-bit CPU.

EIP is the program counter. It addresses 2^32 bytes of memory. Instructions are variable length (1 byte to 10 or more). EIP always points at the next thing to do. It advances all by itself until a branch happens. Branch addresses are normally self-relative, that is, measured from the address of the instruction following the branch. It is easy to make a mistake and be off by one. It is usually exciting to branch into the middle of some other instruction.
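A self-relative branch can be sketched as follows, using the 5-byte form of the Intel jump (opcode 0xE9 followed by a signed 32-bit displacement). The displacement is counted from the address of the next instruction, which is exactly where the off-by-one (or off-by-five) mistakes come from. The function name and the addresses are made up for the example.

```python
import struct

JMP_REL32_LENGTH = 5  # one opcode byte (0xE9) plus a 4-byte displacement

def emit_jmp_rel32(jmp_addr, target_addr):
    """Encode a jump located at jmp_addr that transfers control to target_addr."""
    next_eip = jmp_addr + JMP_REL32_LENGTH   # EIP already points past the jump
    displacement = target_addr - next_eip    # self-relative, may be negative
    return b"\xE9" + struct.pack("<i", displacement)

print(emit_jmp_rel32(0x1000, 0x1005).hex())  # e900000000  (jump to the next instruction)
print(emit_jmp_rel32(0x1000, 0x1000).hex())  # e9fbffffff  (jump to itself: displacement -5)
```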

Accumulators

The registers 0-3 are used by xcom as accumulators. EAX sometimes is "special," so the code generator has to dance around a bit while using it. The registers 4-7 are even more "special."

Calling Sequence

Registers ESP and EBP (4,5) are used exclusively for subroutine call and return. They contain the information for maintaining the hardware run stack. As it happens, the run stack 'grows' toward address zero so the 'top' of the stack is at the lowest address. ESP points at the top of the hardware run stack. EBP points at the base of the current hardware stack frame.

Each subroutine call needs a new frame. Memory beyond ESP (at lower addresses) is free to use, so (the old) EBP is pushed onto the stack and EBP is reset to ESP. The new frame then uses free stack space. The called routine must then set ESP to point beyond the new stack frame, so that ESP resumes its role of pointing to the next available memory.
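The frame bookkeeping can be simulated in a few lines. The sketch below follows the conventional x86 entry and exit sequences (push ebp; mov ebp, esp; sub esp, n on entry, and the reverse on exit); the text does not list xcom's exact instructions, so treat the class and method names as hypothetical.

```python
import struct

class RunStack:
    """A toy hardware run stack that grows toward address zero."""

    def __init__(self, size=64):
        self.mem = bytearray(size)
        self.esp = size  # empty stack: ESP is just past the highest address
        self.ebp = size

    def push(self, value):
        self.esp -= 4
        struct.pack_into("<I", self.mem, self.esp, value & 0xFFFFFFFF)

    def pop(self):
        value = struct.unpack_from("<I", self.mem, self.esp)[0]
        self.esp += 4
        return value

    def enter(self, local_bytes):
        self.push(self.ebp)      # push ebp        (save the caller's frame base)
        self.ebp = self.esp      # mov  ebp, esp   (new frame starts here)
        self.esp -= local_bytes  # sub  esp, n     (ESP moves beyond the new frame)

    def leave(self):
        self.esp = self.ebp      # mov  esp, ebp   (discard the frame)
        self.ebp = self.pop()    # pop  ebp        (restore the caller's frame base)

stack = RunStack()
stack.push(0x1234)               # stands in for the return address
stack.enter(8)
print(stack.ebp, stack.esp)      # 56 48
stack.leave()
print(stack.ebp, stack.esp)      # 64 60
```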

The state of the CPU has to be preserved across the calling sequence. The buzz-words are caller save and callee save. What they mean is either that the caller saves things it needs so that it can restore them after the call returns, or that the subroutine saves things immediately upon entry, then restores them just before returning. There are advantages to both in avoiding unnecessary saves. In xcom, callee save/restore is implemented.

Data Pointers

Registers ESI and EDI (6,7) are normally used to deal with blocks of memory; xcom hijacks ESI as a pointer to the (malloc-d) X frame.

Flags

The flag bits change every time the status changes, perhaps after every instruction. The flag bits get set, for example, by comparison instructions. The flag bit values have to be used "right away," before a later instruction clobbers them.

The flags used by xcom are CF, SF, ZF and PF meaning carry flag, sign flag, zero flag and parity flag. The zero flag ZF is 1, for instance, if the last result was zero.
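As an illustration of what the four flags record, the sketch below computes them for a 32-bit subtraction, which is what a comparison instruction does internally. It follows the Intel definitions (CF is the borrow, SF is bit 31 of the result, PF is 1 when the low byte of the result has an even number of 1 bits). The function name is invented for the example.

```python
MASK32 = 0xFFFFFFFF

def flags_after_compare(a, b):
    """Flags produced by computing a - b on 32-bit values."""
    a &= MASK32
    b &= MASK32
    result = (a - b) & MASK32
    return {
        "CF": int(a < b),                  # borrow: unsigned a is below b
        "SF": result >> 31,                # sign bit of the result
        "ZF": int(result == 0),            # result was zero
        "PF": int(bin(result & 0xFF).count("1") % 2 == 0),  # even parity of low byte
    }

print(flags_after_compare(5, 5))  # {'CF': 0, 'SF': 0, 'ZF': 1, 'PF': 1}
print(flags_after_compare(3, 5))  # {'CF': 1, 'SF': 1, 'ZF': 0, 'PF': 0}
```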

Floating Point Unit

IEEE floating point computations are executed in the FPU. It used to be a separate piece of hardware which had to communicate off-chip with the instruction execution unit. The communication instructions are still used even though it is all on one chip now. The FPU has eight 80-bit wide registers organized as a stack (the floating-point stack, or FPS) and eight 1-bit tags. Tag value 1 means useful data is in the register. The top of the FPS is always numbered R0. The FPS is accessed with Reverse Polish instructions. Here is a picture with two numbers on the FPS.

     value    tag
R0   1.11     1
R1   0.0073   1
R2            0
R3            0
R4            0
R5            0
R6            0
R7            0

Both the significand and exponent carry more bits in the FPU than in memory. The FPU state can be set to ignore the extra significand bits, but not the extra exponent bits. FPS overflow discards the bottom value. Don't cause it to happen. xcom in fact only uses two of the FPS entries.
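The Reverse Polish style of the floating-point stack (FPS) can be sketched with a short simulation. The class below is a toy model, not the hardware: it uses Python floats rather than 80-bit values, and it simply refuses to overflow instead of modelling what the FPU does. The method names fld and faddp echo the instruction names; everything else is invented for the example.

```python
class FloatStack:
    """A toy model of the eight-entry floating-point stack with 1-bit tags."""
    DEPTH = 8

    def __init__(self):
        self.values = []  # values[0] plays the role of R0, the top of the stack

    def fld(self, x):
        """Push a value; the new value becomes R0."""
        if len(self.values) == self.DEPTH:
            raise OverflowError("FPS is full")  # the sketch refuses rather than model overflow
        self.values.insert(0, float(x))

    def pop(self):
        return self.values.pop(0)

    def faddp(self):
        """Add R0 into R1, then pop, leaving the sum on top."""
        top = self.pop()
        self.values[0] += top

    def tags(self):
        """Tag 1 marks a register holding useful data."""
        return [1 if i < len(self.values) else 0 for i in range(self.DEPTH)]

fps = FloatStack()
fps.fld(0.0073)
fps.fld(1.11)          # the state in the picture: R0 = 1.11, R1 = 0.0073
print(fps.tags())      # [1, 1, 0, 0, 0, 0, 0, 0]
fps.faddp()            # operands first, operator last
print(fps.tags())      # [1, 0, 0, 0, 0, 0, 0, 0]
print(fps.values[0])   # the sum, approximately 1.1173
```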

FPU Status Word

Corresponding to the flags, there are status bits in the FPU. They are needed for conditional branches. To get them into the flags, there are instructions that move the FPU status bits into EAX, and then into the flags. This is one of the places where EAX is "special".
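In Intel's instruction set this two-step transfer is FNSTSW AX (store the FPU status word into AX) followed by SAHF (load flags from AH). The sketch below traces the bits: condition codes C0, C2, and C3 sit at bits 8, 10, and 14 of the status word, so after the transfer they land in CF, PF, and ZF. The Python function names are only labels for the two steps.

```python
# Bit positions of the condition codes in the 16-bit FPU status word.
C0, C2, C3 = 1 << 8, 1 << 10, 1 << 14

def fnstsw_ax(status_word, eax):
    """Copy the status word into AX, the low 16 bits of EAX."""
    return (eax & 0xFFFF0000) | (status_word & 0xFFFF)

def sahf(eax):
    """Load flags from AH (bits 8..15 of EAX); only four flags are shown."""
    ah = (eax >> 8) & 0xFF
    return {"CF": ah & 1, "PF": (ah >> 2) & 1, "ZF": (ah >> 6) & 1, "SF": (ah >> 7) & 1}

# After a floating compare of equal operands the FPU sets C3 only.
print(sahf(fnstsw_ax(C3, 0)))  # {'CF': 0, 'PF': 0, 'ZF': 1, 'SF': 0}
# When the top of the stack is the smaller operand the FPU sets C0 only.
print(sahf(fnstsw_ax(C0, 0)))  # {'CF': 1, 'PF': 0, 'ZF': 0, 'SF': 0}
```

After the transfer, the ordinary conditional branches can test the result of a floating compare as if it had been an unsigned integer compare.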

Instructions

The Intel names for instructions are just a little too abstract for direct use in the xcom assembler. It is important to know the addressing mode. Thus there is an xcom convention shown in the following examples.

       fldM    floating load from memory (ESI relative)
       fstMp   floating store into memory (ESI relative) and pop FPS
       faddp   floating add and pop FPS
       fcompp  floating compare and two pops of FPS
       fldA    floating load absolute (literal x86 address)
       movRM   move from memory (ESI relative) to a general register
       movRC   move a constant into a general register
       movMR   move a general register into memory (ESI relative)
       addRR   add register to register
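One way to read the convention in these examples is that capital letters inserted into the Intel mnemonic name the operand kinds, destination first: R for a general register, M for ESI-relative memory, C for a constant, A for an absolute address. The sketch below splits a name on that assumption. It is an interpretation of the nine examples only, not xcom's own code, and the function name describe is hypothetical.

```python
import re

# Operand letters as inferred from the examples above (an assumption).
OPERAND_KINDS = {
    "R": "general register",
    "M": "memory (ESI relative)",
    "C": "constant",
    "A": "absolute address",
}

def describe(name):
    """Split an xcom-style name into the Intel mnemonic and its operand kinds."""
    match = re.fullmatch(r"([a-z]+)([RMCA]*)([a-z]*)", name)
    if match is None:
        raise ValueError(f"not an xcom-style instruction name: {name!r}")
    head, kinds, tail = match.groups()
    return head + tail, [OPERAND_KINDS[k] for k in kinds]

print(describe("movRM"))  # ('mov', ['general register', 'memory (ESI relative)'])
print(describe("fstMp"))  # ('fstp', ['memory (ESI relative)'])
print(describe("faddp"))  # ('faddp', [])
```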

The official documents from Intel: