Instruction Format Design

CIS-77 Home http://www.c-jump.com/CIS77/CIS77syllabus.htm

Instruction Format Design
Encoding The Opcodes
Encoding The Opcodes, Cont.
The Goal to Keep Opcodes Small
Variable-length Opcodes
Variable-length Opcodes, Cont.
Example: One-byte Opcodes
Example: Two-byte Opcodes
Example: Three-byte Opcodes
Opcode Length Trade-offs
Planning for the future
Selecting Instruction Set
Instruction Groups
Encoding Instructions
Opcode Design Trade-offs
Reducing x86 ISA to a Simplified Version
The MOV Instruction
Arithmetic and Logical Instructions
Simplified Instruction Encoding (not x86!)
Simplified Instruction Encoding, Cont. (not x86!)
Simplified Instruction Encoding Example (not x86!)
Simplified Multibyte Instructions (not x86!)
Simplified Multibyte Instructions Cont. (not x86!)
Simplified Special Opcode Instructions (not x86!)
Simplified Jump Instructions (not x86!)
Simplified Conditional Jump Instructions (not x86!)
Simplified Instructions Reserved Opcode (not x86!)
Simplified Zero-Operand Instructions (not x86!)
Extending the Simplified Instruction Set (not x86!)
Problem with Extending the Simplified Instruction Set (not x86!)
Prefix-Extending the Simplified Instruction Set (not x86!)
Prefix-Extending the Simplified Instruction Set Example (not x86!)

1. Instruction Format Design

Because opcodes are fed to decoder circuits, encoding them is considerably more involved than simply assigning numbers.

The most important goal of instruction set design:
- make opcodes easy to decode.
The easiest way to do this:
- break up the opcode into several different bit fields.

Opcode fields:

Each field contributes part of the information necessary to execute the full instruction.
The smaller the bit fields, the easier it is for hardware to decode them.

2. Encoding The Opcodes

Suppose we decided to design a brand-new CPU with a set of 7-bit opcodes.
- With an opcode of this size we could encode 2⁷ = 128 different instructions.
- Decoding individual instructions requires a 7-line to 128-line decoder - an expensive piece of circuitry.
If you have 128 truly unique instructions, there's little you can do other than to decode each instruction individually.
However, assuming our instructions contain certain patterns, we could reduce the hardware cost by replacing this large decoder with a few smaller decoders.
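
As a rough illustration of the saving, the sketch below counts decoder output lines for one 7-to-128 decoder against several smaller decoders. The 3 + 2 + 2 split of the seven opcode bits is a hypothetical choice made for this example, not something the design above prescribes, and output lines are only a crude proxy for hardware cost.

```python
def decoder_outputs(bits):
    """Number of output lines of a full bits-to-2**bits decoder."""
    return 2 ** bits

def split_cost(field_widths):
    """Total output lines when each bit field gets its own small decoder."""
    return sum(decoder_outputs(w) for w in field_widths)

monolithic = decoder_outputs(7)   # one 7-line to 128-line decoder
split = split_cost([3, 2, 2])     # hypothetical 3 + 2 + 2 field split

print(monolithic, split)          # 128 16
```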

3. Encoding The Opcodes, Cont.

For example, on the x86 CPUs the opcodes for

    mov eax, ebx    ; copy data from EBX register to EAX register

and

    mov ecx, edx    ; copy data from EDX register to ECX register

are different, but both instructions are related: they both move data from one register to another.

The only difference between the two MOVs is the source and destination operands.
This suggests that we could encode instructions like MOV with a sub-opcode and encode the operands using other bits within the opcode.
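
On the real x86 this is roughly what happens: the register-to-register form MOV r32, r/m32 uses opcode byte 8Bh followed by a ModR/M byte whose bit fields select the two registers. The sketch below builds that second byte; the helper name mov_reg_reg and the REG table are names introduced here for illustration only.

```python
# Standard x86 32-bit register numbers (only the four used here).
REG = {"eax": 0b000, "ecx": 0b001, "edx": 0b010, "ebx": 0b011}

def mov_reg_reg(dst, src):
    """Encode `mov dst, src` with opcode 8Bh (MOV r32, r/m32).

    ModR/M layout: mod (2 bits) | reg (3 bits) | r/m (3 bits).
    mod = 11 means the r/m field names a register, not memory.
    """
    modrm = (0b11 << 6) | (REG[dst] << 3) | REG[src]
    return bytes([0x8B, modrm])

print(mov_reg_reg("eax", "ebx").hex(" "))   # 8b c3
print(mov_reg_reg("ecx", "edx").hex(" "))   # 8b ca
```

The first byte is identical in both encodings; only the register fields in the second byte differ. An assembler may equally choose the 89h form (MOV r/m32, r32), which swaps the roles of the two register fields.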

4. The Goal to Keep Opcodes Small

Another important criterion: keep instruction sizes within a reasonable range.
A CPU with unnecessarily long instructions wastes memory on program storage.
Long instructions also hurt overall CPU performance, since more bytes must be fetched for each instruction.
An opcode that is n bits long can encode at most 2ⁿ different instructions.
Put the other way around, it seems that 2ⁿ instructions cannot be encoded in fewer than n bits.

5. Variable-length Opcodes

We can make some opcodes longer than n bits...
- ...and that is the secret to reducing the size of a typical program on the CPU!
(This strategy is typical of CISC processors; classic RISC^(*) processors prefer uniform instruction lengths, usually 32 bits.)
Assuming that the CPU reads byte-sized quantities from memory, each opcode must be a whole multiple of 8 bits long.
Another point to consider is the space for instruction operands:
- RISC designers include all operands in their opcode.
- CISC designers, including x86, place constants and address displacements (offsets) apart from the opcode.

______________
^(*) CISC stands for complex instruction set computer design,
while RISC is a reduced instruction set computer.

6. Variable-length Opcodes, Cont.

Most processors predating the 8086 had 8-bit opcodes, allowing 256 different instructions.
A two-byte opcode would allow 65,536 different instructions...
- ...but from a practical standpoint,
  - most-frequently-used instructions continue to have 8-bit opcodes,
  - less-frequently-used instructions have two-byte opcodes,
  - three (or more) byte opcodes are mostly for rarely used instructions.
Such a strategy makes a typical program significantly shorter than one using uniform two-byte opcodes.
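
To see why the mixed scheme wins, one can compute the average opcode size for an instruction mix. The percentages below are invented purely for illustration (they are not measurements of any real program); only the arithmetic is the point.

```python
# Hypothetical instruction mix: share of instructions whose opcode
# is 1, 2, or 3 bytes long. These numbers are made up for illustration.
mix = {1: 0.80, 2: 0.15, 3: 0.05}

variable_avg = sum(length * share for length, share in mix.items())
uniform_avg = 2.0   # every opcode is two bytes long

print(f"variable-length:  {variable_avg:.2f} bytes per opcode")
print(f"uniform two-byte: {uniform_avg:.2f} bytes per opcode")
```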

7. Example: One-byte Opcodes

Assume that the two high-order bits of an imaginary opcode are not 00, and that the opcode is exactly one byte long.
The 6-bit field marked xxxxxx provides 2⁶ = 64 unique bit patterns.
Together with three non-zero high-order combinations (01, 10, and 11), 192 different one-byte instructions can be encoded:
- 64 × 3 = 192

8. Example: Two-byte Opcodes

Assume that when the three high-order bits of the opcode equal 001, the opcode is two bytes long.
If so, the remaining 13 bits of the total 16-bit opcode let us encode
- 2¹³ = 8192
different instructions.

9. Example: Three-byte Opcodes

If the three high-order bits of the opcode equal 000, the imaginary opcode is three bytes long.
If so, the remaining 21 bits of the total 24-bit opcode let us encode about two million (2²¹ = 2,097,152) different instructions.
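
The three examples above amount to a small length-decoding rule applied to the first opcode byte. A minimal sketch, with opcode_length as a name made up for this illustration:

```python
def opcode_length(first_byte):
    """Return the opcode size in bytes, judged from the first byte only."""
    top3 = first_byte >> 5          # three high-order bits
    if top3 == 0b000:
        return 3                    # 000 followed by 21 more opcode bits
    if top3 == 0b001:
        return 2                    # 001 followed by 13 more opcode bits
    return 1                        # two high-order bits are 01, 10, or 11

# Count how many first-byte values start a one-byte opcode.
one_byte = sum(1 for b in range(256) if opcode_length(b) == 1)

print(one_byte)   # 192
print(2 ** 13)    # 8192 two-byte opcodes
print(2 ** 21)    # 2097152 three-byte opcodes
```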

10. Opcode Length Trade-offs

Although we are able to modify opcode sizes to get smaller programs, it comes at a price:
- decoding the instructions is a bit more complicated.
- Before decoding the opcode field, the CPU must first decode the instruction size.
- This extra step hurts performance.
These are the reasons, along with some others, why most popular RISC architectures avoid variable-sized instructions.
However, x86 uses variable-length opcodes, because saving memory was considered a goal worth that cost.

11. Planning for the future

There will be a need for new instructions in the future.
Reserving some opcodes specifically for that purpose is a really good idea.
For example, reserving a block of 64 one-byte opcodes may seem extravagant, but it could prove to be rewarding foresight!

12. Selecting Instruction Set

Keep in mind that it's much easier to add an instruction later than to remove it.
For starters, it's better to stick with a simpler design than a more complex one.
First step: let's choose some generic instruction types for a brand-new CPU.
For example, most processors will have instructions like the following:
- Data movement instructions (e.g., MOV)
- Arithmetic and logical instructions (e.g., ADD, SUB, AND, OR, NOT)
- Comparison instruction, CMP
- A set of conditional jump instructions JE, JNE, etc., generally used after the compare instructions.
- Input/Output instructions GET and PUT.
The bottom line: allow programmers to efficiently write programs using as few instructions as possible.

13. Instruction Groups

Once the initial instruction set is determined, the next step is to assign opcodes to the instructions.
To do so, instructions are separated into groups with common characteristics:
- For example, an ADD instruction supports the exact same set of operands as the SUB instruction.
- The NOT instruction requires a single operand, as does the NEG instruction.
- etc.
Once all the instructions are grouped into their respective categories, the next step is to encode the actual opcodes.

14. Encoding Instructions

Some bits are needed to identify:
- instruction group
- instruction code
- operand types: registers, memory locations, constants.
All of the above have a direct impact on the instruction size.
For example, an 8-bit opcode could be split into
- one 3-bit iii field to describe the instruction and its group, and
- two fields, rr and mmm (5 bits together), to specify where the instruction operands can be found.

An opcode byte:
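
Assuming the three fields are packed from the high-order end as iii (bits 7:5), rr (bits 4:3), and mmm (bits 2:0), hardware or a simulator can pull them apart with shifts and masks. The function name split_opcode is hypothetical:

```python
def split_opcode(byte):
    """Split an 8-bit opcode into its (iii, rr, mmm) bit fields."""
    iii = (byte >> 5) & 0b111   # bits 7:5, instruction and its group
    rr = (byte >> 3) & 0b11     # bits 4:3, first operand field
    mmm = byte & 0b111          # bits 2:0, second operand field
    return iii, rr, mmm

print(split_opcode(0b101_10_011))   # (5, 2, 3)
```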

15. Opcode Design Trade-offs

Encoding operands is always a problem, because instructions have a large number of operand combinations.
For example, the general form of the x86 MOV instruction requires a two-byte opcode.
However, Intel noticed that the two instructions
```
    mov memory, eax   ; store register in a variable
    mov eax, memory   ; load register from a variable
```
occur very frequently. These instructions store the EAX register into memory and load EAX from memory.
As a result, x86 provides special one-byte-opcode versions of these MOV instructions to reduce program size.
Note that Intel did not remove the two-byte versions of these instructions, but a compiler or assembler would normally emit the shorter of the two.
By doing so, Intel made an important trade-off with the MOV instruction encoding:
- giving up extra opcodes in order to provide a shorter version of the MOV sub-family.
Intel used this trick in many places to make frequently used instructions shorter.
This decision dates back to 1978. A design made today could afford the extra bytes, but the cost of memory was high in 1978!

16. Reducing x86 ISA to a Simplified Version

Advances in computer architecture since 1978 have made the encoding of x86 instructions quite complex and somewhat illogical.
Despite this lack of simplicity, the design and encoding of the x86 ISA are well worth studying.
To cope with x86 complexity, let's pretend that we are dealing with a simplified version of the CPU:
- there are only four 16-bit registers: AX, BX, CX, and DX
  - (therefore, register operands can be encoded with just two bits.)
- the address bus is 16-bit with a maximum of 65,536 bytes of addressable memory.
- there are only 20 instructions:
  - MOV (with two forms), ADD, SUB, CMP, AND, OR, NOT,
    JE, JNE, JB, JBE, JA, JAE, JMP,
    BRK, IRET, HALT, GET, and PUT.

17. The MOV Instruction

The two forms of the simplified MOV instruction could look as follows:
```
        mov reg, reg/memory/constant    ; load register
        mov memory, reg                 ; store register in memory
```
where
- mov is instruction mnemonic
- reg is any of AX, BX, CX, or DX,
- constant is a numeric constant (using hexadecimal notation),
- memory is an operand specifying a memory location.

18. Arithmetic and Logical Instructions

The arithmetic and logical instructions could take the following forms:

    add reg, reg/memory/constant

    sub reg, reg/memory/constant

    cmp reg, reg/memory/constant

    and reg, reg/memory/constant

    or  reg, reg/memory/constant

    not reg/memory

One-operand instructions modify their operand.
Two-operand instructions store the result in the destination operand (except CMP, which only sets flags).

19. Simplified Instruction Encoding (not x86!)

The three-bit high-order field, iii, defines the instruction and allows 8 unique bit combinations.
(Since we decided to encode 20 different instructions, three bits alone are not enough, so we'll have to pull some tricks to handle all of the instructions.)

Consider a one-byte opcode with an optional two-byte constant value:

20. Simplified Instruction Encoding, Cont. (not x86!)

There are three iii encoding groups:
1. Special instruction class iii=000 is reserved for instruction set expansion in the future.
2. Two forms of the MOV instruction include:
  - iii=110 specifies that the rr field is the destination,
  - iii=111 specifies that the mmm field is the destination.
3. Remaining codes belong to ADD, SUB, CMP, AND, and OR instructions:

Another opcode field, rr, contains the destination register,
...except for MOV with iii = 111, in which case rr specifies the source register.
The third bit field, mmm, encodes the source operand (again, except for MOV with iii = 111).

21. Simplified Instruction Encoding Example (not x86!)

For example, encoding of instruction
```
        mov ax, bx
```
consists of the following bit fields:
1. iii=110 is the encoding for MOV REG, REG.
2. rr=00 specifies that AX is the destination operand.
3. mmm=001 specifies that BX is the source operand.

Simplified opcode structure:

The encoding produces the one-byte opcode 110 00 001, or 0C1h.
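
The field packing can be checked in a few lines. The register numbering below takes AX = 00 and BX = 01 from the example above; CX = 10 and DX = 11 are assumed to follow the listing order. The names encode, REG, and MOV_TO_REG are introduced only for this sketch.

```python
REG = {"ax": 0b00, "bx": 0b01, "cx": 0b10, "dx": 0b11}
MOV_TO_REG = 0b110   # iii value: the rr field is the destination

def encode(iii, rr, mmm):
    """Pack the iii, rr, and mmm fields into one opcode byte."""
    return (iii << 5) | (rr << 3) | mmm

opcode = encode(MOV_TO_REG, REG["ax"], REG["bx"])   # mov ax, bx
print(f"{opcode:08b} = {opcode:02X}h")              # 11000001 = C1h
```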

22. Simplified Multibyte Instructions (not x86!)

The instruction

        mov ax, [1000h] ; load AX register from memory location 1000h

loads the AX register from memory location 1000h.

The encoding for the opcode is 110 00 110, or 0C6h.
Another encoding,
```
        mov ax, [2000h] ; load AX register from memory location 2000h
```
is exactly the same 0C6h, because none of the opcode fields store the memory address.
To accommodate a 16-bit address or constant value, we must add two more bytes after the instruction opcode.

23. Simplified Multibyte Instructions Cont. (not x86!)

To encode immediate constant values and address modes such as
```
  0xxxxh          ; immediate mode
  [ 0xxxxh ]      ; direct mode
  [ 0xxxxh + bx ] ; fixed base + reg
```
we append a two-byte (16-bit) address or constant value to the opcode:
- low-order byte immediately follows the opcode byte in memory, and
- high-order byte comes after that.

Simplified multibyte instruction encoding:
The three-byte encoding for the MOV AX, [1000h] instruction becomes
```
    C6 00 10
```
and the three-byte encoding for MOV AX, [2000h] is
```
    C6 00 20
```
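
The low-byte-first ordering is easy to reproduce. The sketch below assembles the two instructions for the simplified CPU (not x86); the function name encode_mov_ax_direct is made up for this example.

```python
def encode_mov_ax_direct(address):
    """Encode `mov ax, [address]` for the simplified CPU (not x86)."""
    opcode = 0xC6   # 110 00 110: MOV, destination AX, direct memory operand
    # The 16-bit address follows the opcode, low-order byte first.
    return bytes([opcode]) + address.to_bytes(2, "little")

print(encode_mov_ax_direct(0x1000).hex(" ").upper())   # C6 00 10
print(encode_mov_ax_direct(0x2000).hex(" ").upper())   # C6 00 20
```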

24. Simplified Special Opcode Instructions (not x86!)

The special opcode [7:5]=000 allows our imaginary CPU to expand the set of available instructions.
This opcode handles several zero- and one-operand instructions.

Imaginary Single-Operand Instruction Encoding:

There are four possible one-operand instruction classes, specified by 2-bit ii[4:3] field:
1. The first encoding, ii[4:3]=00, further expands the instruction set with a set of zero-operand instructions.
2. The second encoding, ii[4:3]=01, is also an expansion opcode; it encodes all of the simplified jump instructions.
3. The third encoding, ii[4:3]=10, is the NOT instruction, a bitwise logical NOT operation that inverts all the bits in the destination register or memory operand.
4. The fourth encoding, ii[4:3]=11, is currently unassigned:
  - Any attempt to execute an unassigned opcode will halt the processor with an illegal instruction error.
  - CPU designers often reserve unassigned opcodes to extend the instruction set at a future date. Intel did so when moving from the 80286 processor to the 80386.
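
The four classes listed above can be told apart by looking only at bits [4:3] of a byte whose bits [7:5] are 000. A minimal sketch; the function name special_class and the description strings are this example's own:

```python
def special_class(byte):
    """Classify an opcode whose bits [7:5] are 000 by its ii[4:3] field."""
    if byte >> 5 != 0b000:
        raise ValueError("not a special-class opcode")
    ii = (byte >> 3) & 0b11
    return {
        0b00: "zero-operand expansion",
        0b01: "jump expansion",
        0b10: "NOT",
        0b11: "unassigned (illegal instruction)",
    }[ii]

print(special_class(0b000_10_001))   # NOT
```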

25. Simplified Jump Instructions (not x86!)

There are seven jump instructions in the simplified x86 instruction set. They all take the following form:
```
        jxx address
```
The JMP instruction copies the 16-bit value (address) following the opcode into the IP register.
Therefore, the CPU will fetch the next instruction from this new target address.
Effectively, the program jumps from the point of the JMP instruction to the instruction at the target address.
The JMP instruction is called an unconditional jump instruction because it always transfers control to the target address.

Imaginary jump instruction encodings:

26. Simplified Conditional Jump Instructions (not x86!)

There are six conditional jump instructions:

    JA  - jump if greater than (above)
    JAE - jump if greater than or equal
    JB  - jump if less than (below)
    JBE - jump if less than or equal
    JE  - jump if equal
    JNE - jump if not equal

You would normally execute JE or a similar instruction immediately after a CMP instruction, since CMP sets the less-than and equality flags that the conditional jump instructions examine.

Conditional jump instruction mechanics are:
1. Test some condition, and then jump, but only if the condition was true.
2. Fall through to the next instruction if the condition was false.

Conditional jumps test the results of the preceding CMP instruction. For example,

            cmp     bx, 0           ; Is BX = 0?
            je      is_zero         ; Jump if so
            ...
    is_zero:
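
The jump-or-fall-through mechanics can be modelled in a few lines. This is a sketch under simple assumptions: CMP is reduced to producing the two flags named above, and the addresses 0200h and 0103h are arbitrary values chosen for the demonstration. The names cmp, CONDITIONS, and next_ip are not part of the instruction set.

```python
def cmp(a, b):
    """Model CMP: return the (less_than, equal) flags for a compared to b."""
    return a < b, a == b

# Jump condition for each mnemonic, as a function of the two flags.
CONDITIONS = {
    "ja":  lambda lt, eq: not lt and not eq,
    "jae": lambda lt, eq: not lt,
    "jb":  lambda lt, eq: lt,
    "jbe": lambda lt, eq: lt or eq,
    "je":  lambda lt, eq: eq,
    "jne": lambda lt, eq: not eq,
}

def next_ip(mnemonic, flags, target, fall_through):
    """Return the target address if the condition holds, else fall through."""
    return target if CONDITIONS[mnemonic](*flags) else fall_through

flags = cmp(0, 0)                                  # cmp bx, 0 with BX = 0
print(hex(next_ip("je", flags, 0x0200, 0x0103)))   # 0x200
```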

27. Simplified Instructions Reserved Opcode (not x86!)

Note that there are eight possible jump opcodes, but so far we needed only seven of them.
The eighth opcode, mmm=111, is therefore another illegal opcode.

Imaginary jump instruction encodings:

28. Simplified Zero-Operand Instructions (not x86!)

The last group of instructions is the zero-operand instructions.
Three of the opcodes in this group are illegal instruction opcodes.
The BRK (break) instruction pauses the CPU until the user manually restarts it.
- This is useful for pausing a program during execution to observe results.
The IRET (interrupt return) instruction returns control from an interrupt service routine.
- (We will discuss interrupt service routines later.)

Zero Operand Instruction Encodings:

The HALT instruction terminates program execution.
The GET instruction reads a hexadecimal value from the keyboard and returns this value in the AX register.
The PUT instruction prints the value in the AX register.

29. Extending the Simplified Instruction Set (not x86!)

The simplified CPU architecture design does provide the capability for expansion.
The ability to accomplish this exists in the instruction set through undefined/reserved/illegal opcodes.

Imaginary single-operand instruction encoding:

The first method is to use the undefined opcodes directly to define new instructions
- (this works best when there are undefined bit patterns within an opcode group and the new instruction falls into that same group.)
For example, opcode "000 11 mmm" falls into the same group as the NOT instruction.
If you decided to add a NEG (negate, take the two's complement) instruction,
using opcode "000 11 mmm" would make a lot of sense.
The NEG instruction uses the same syntax and decoding as the NOT instruction.
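
For reference, the two operations differ only slightly in what they compute on a 16-bit operand. The sketch below models both; the names not16, neg16, and MASK16 are introduced here for illustration.

```python
MASK16 = 0xFFFF   # registers in the simplified CPU are 16 bits wide

def not16(x):
    """Bitwise NOT: invert all 16 bits."""
    return ~x & MASK16

def neg16(x):
    """NEG: two's complement, i.e. invert all bits and add one."""
    return (not16(x) + 1) & MASK16

print(f"{not16(0x0001):04X}")   # FFFE
print(f"{neg16(0x0001):04X}")   # FFFF
```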

30. Problem with Extending the Simplified Instruction Set (not x86!)

Unfortunately, the simplified CPU doesn't have that many illegal opcodes available.

For example, suppose we want to add four new single-operand instructions:

    SHL - shift left
    SHR - shift right
    ROL - rotate left
    ROR - rotate right

Imaginary single-operand instruction encoding:

There is insufficient space in the single-operand instruction opcodes for all four.
Currently there is only one open opcode: 000 11 mmm.

31. Prefix-Extending the Simplified Instruction Set (not x86!)

A common way to handle the opcode shortage (one the Intel designers have employed) is to use a prefix opcode byte.
The prefix opcode expansion scheme works as follows:
- Decode the prefix byte in memory.
- Read and decode the next byte in memory as the actual opcode.
- The second opcode byte, however, uses a completely different encoding scheme.
Therefore, a prefix lets you specify as many new instructions as you can encode in that second byte (or bytes, if you prefer).

32. Prefix-Extending the Simplified Instruction Set Example (not x86!)

Using a prefix byte to extend the instruction set:
For example, the opcode 0FFh is illegal, since it corresponds to a
```
        mov const, dx ; error: attempt to modify immediate operand
```
instruction. So, we could use 0FFh as a special prefix byte to further expand the instruction set.
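
A decoder that honours such a prefix might look like the sketch below. The second-byte assignments for SHL, SHR, ROL, and ROR are hypothetical, since no values are assigned to them here; the names PREFIX, EXTENDED, and decode are likewise this example's own.

```python
PREFIX = 0xFF   # illegal as an ordinary opcode, so free to act as a prefix

# Hypothetical second-byte table; these values are assumptions.
EXTENDED = {0x00: "shl", 0x01: "shr", 0x02: "rol", 0x03: "ror"}

def decode(code, ip):
    """Return (description, bytes consumed) for the opcode at code[ip]."""
    first = code[ip]
    if first != PREFIX:
        return f"base opcode {first:02X}h", 1
    second = code[ip + 1]   # decoded with its own, separate table
    return EXTENDED.get(second, "illegal"), 2

print(decode(bytes([0xFF, 0x02]), 0))   # ('rol', 2)
print(decode(bytes([0xC1]), 0))         # ('base opcode C1h', 1)
```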