Compilers, CPU, and memory

CIS-155

Preprocess, Compile, Link, and Execute

The previous section gave an overview of the "Hello, World!" program and of creating a VC++ project, which made it possible to compile our first program both from the command line and from the Visual Studio IDE (integrated development environment). In both cases, the creation of the executable file on disk can be illustrated by the following diagram:

Since preprocessing, compiling, and linking are commonly viewed as steps of a single attempt to create an executable file, they are collectively referred to as the "build" process. Usually, if one of the steps fails, the build stops, and error descriptions are printed either to the command window or to the output window of the IDE.

The following sequence diagram shows the events of a typical successful build, passing between the build components in the order in which they occur:

1.4.1. The preprocessor

A file is the traditional unit of storage in a file system. Similarly, a file is also the traditional unit of program compilation.

Having a complete program in one file is usually impossible. Declarations for the C++ standard library and for operating-system-specific libraries are typically brought into our source file by #include directives. The outline goes like this:

A user presents a source file to the compiler.
The file is preprocessed; that is, macro processing is done and all #include directives bring in the headers. (We will discuss macros in a separate section of this course.)
The result of preprocessing is called a translation unit, which now looks like a single file to the compiler. This unit is what the compiler understands and what the C++ language rules describe.

1.4.2. The compiler

A compiler is a translator program that converts a program written in a high-level language into machine language. The CPU (central processing unit) can directly understand only its own machine language. The typical outline goes like this:

The C++ compiler automatically invokes the preprocessor (described above), which forms a translation unit from our program file.
The compiler translates the unit into object code and saves it as an intermediate file on disk.
The object code produced by the compiler is incomplete. It lacks the actual memory addresses of functions and data variables defined elsewhere. An example of such a gap is a call to a system function that prints characters on the screen. Although the compiler does generate a CALL instruction, the actual address of the function to call remains unresolved at this stage.

1.4.3. The linker

The linker is the program that binds together all separately compiled parts. It links the object code with the code for the missing functions to produce an executable image, and writes the executable file to disk. The C++ and operating system object libraries supply the missing functions. The program can now be loaded into memory and executed.

1.5. Loading and Running Programs

Once the program is ready, it can be run from the command window. The loader component of the operating system takes the executable image from disk and transfers it to memory for execution:

Our program is loaded into memory alongside other user programs as follows:

The operating system occupies the layer between user programs and the central processing unit, or CPU for short.

Other layers may exist between the user programs, the operating system and the CPU. Typical layers are user-mode and kernel-mode debuggers:

A user-mode debugger is a type of process that tells the CPU to execute processor instructions in single-step mode. The debugger loads the user program in a separate process. In this setting, the debugger is the parent and the user program is a child process. The debugger can execute the user program one statement at a time, helping programmers to detect and investigate "bugs". Debuggers are valuable run-time tools for locating and removing logic errors. Microsoft VC++ has its own debugger with a graphical user interface.

Kernel-mode debuggers (KDs) are similar to user-mode debuggers, but when a KD stops execution, the entire operating system stops. KDs often support an interface that allows the programmer to control them remotely from another computer, connected via a null-modem cable. KDs allow developers to find bugs in components of the operating system, such as device drivers.

Although the CPU executes individual program instructions, the operating system has a mechanism to interrupt the execution of a user program and give the next execution time slice to another program. This type of task scheduling gives the user the impression that all programs are running in parallel.

1.5.1. Computer Memory

Computer memory is composed of storage cells sometimes referred to as words. Each cell has a unique numeric address associated with it, which identifies its location in the memory. The number of memory cells in a computer varies but is usually measured in millions of cells. Each memory cell contains a binary number made up of a series of binary digits or bits, usually 8, 16, 32 or 64.

A binary digit has two possible values: 0 or 1. A binary number therefore consists of a sequence of 0s and 1s. The digits contained in each cell are represented by voltage levels, with peaks representing 1s and troughs 0s.

A group of 8 bits is called a byte. The size of memory is usually expressed in kilobytes, shortened to KB, where one kilobyte is 1024 (2¹⁰) bytes. Storage capacity can also be expressed in megabytes (MB), approximately a million (2²⁰ = 1048576) bytes, or gigabytes (GB), approximately a (US) billion (2³⁰ = 1073741824) bytes.

Binary numbers are usually written in hexadecimal notation, that is, as numbers to the base 16, as opposed to the base 2 used by the binary number system or the base 10 used in the general-purpose decimal system. The hexadecimal system has 16 distinct digits. Hexadecimal digits 0 to 9 are the same as those of a decimal number. The letters A through F represent the 6 extra digits; these are equivalent to the decimal numbers 10 to 15:

Binary	Decimal	Hexadecimal
0000	0	0
0001	1	1
0010	2	2
0011	3	3
0100	4	4
0101	5	5
0110	6	6
0111	7	7
1000	8	8
1001	9	9
1010	10	A
1011	11	B
1100	12	C
1101	13	D
1110	14	E
1111	15	F

A binary number can be converted to a hexadecimal number by partitioning the binary number into groups of four bits and evaluating each group as a hexadecimal digit. For example, the binary number

0100110111010000

would be split into

0100  1101  1101  0000
  4     D     D     0

which would evaluate to the number 4DD0 in hexadecimal form.

TIP: programmers often use a desktop calculator to convert large numbers to different bases:

Note that each memory address is represented by a unique number; in practice this is usually written as a hexadecimal number. The lower end of the memory address range is often referred to as low memory (the other end is high memory). The operating system is normally located in low memory.

Address	Binary sequence	Hexadecimal Equivalent	Decimal Equivalent
50004	0100 1101 1101 0000	4DD0	19920
50003	0000 1100 0111 0110	0C76	3190
50002	0100 0001 0000 1001	4109	16649
50001	0010 0001 1111 1101	21FD	8701
50000	0100 1100 0000 0010	4C02	19458

1.5.2. Program in Memory and the CPU

Once in memory, the program is configured to have access to separate logical segments of the memory as follows:

code segment: contains executable CPU instructions
data segment: contains program data, such as variables
stack segment: space to store temporary data

The latest diagram zooms in on the Intel CPU and the program in memory. The CPU has its own built-in memory: the registers. The instruction pointer always points to the next instruction that the processor is about to execute. That instruction is loaded from memory into the instruction register. General-purpose registers hold instruction operands and results of calculations, and can also point to temporary data storage in memory, that is, hold addresses of the memory reserved for the program stack. The program stack provides memory for temporary data, such as temporary variables.

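General-purpose registers have a layout that provides access to individual bytes. For example, the low 16 bits of EAX are accessible as AX, and the high and low bytes of AX are accessible as AH and AL.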

1.6. CPU Instructions

Processor instructions can have one of the following formats:

        Instruction;
-or-
        Instruction Operand;
-or-
        Instruction Operand, Operand;

For example, the following little program uses the Microsoft-specific keyword __asm, which lets you embed individual processor instructions directly in your C++ program (Visual C++ supports it only when targeting 32-bit x86):

int main()
{
    __asm nop
    __asm push EAX
    __asm mov EAX, 5
    __asm pop EAX
}

Here,

nop: the instruction that does nothing (no operation)
push EAX: saves the current value of the EAX register on the stack
mov EAX, 5: sets the EAX register equal to 5
pop EAX: restores the original value of the EAX register from the stack

The mov instruction is one of the most frequently used CPU instructions because it is the way to copy a value from one place to another:

        mov destination, source

The source operand of the mov instruction can be:

source is an immediate value (hard-coded value), such as 5:
mov EAX, 5
source is a register, such as EBX:
mov EAX, EBX
source is a memory reference, such as [EBX]:
mov EAX, [EBX]

The destination operand can be a register or a memory reference. Intel CPUs don't allow both the source and the destination of a mov to be memory references:

destination is a register, such as EAX:
mov EAX, 5
mov EAX, EBX
destination is a memory reference, such as [EAX]:
mov [EAX], EBX

1.6.1. Memory references

A memory reference indicates that an instruction operand is located in memory. A memory reference requires a memory address, which is specified inside square brackets. For simplicity, we will discuss memory references that use a general-purpose register (such as EAX, EBX, ECX, or EDX) to specify the address. For example, the instruction

mov [EAX], EBX

moves the value from register EBX to the memory location specified by the address stored in register EAX.

The lea instruction, whose abbreviation stands for load effective address, loads the destination register with the address of the source operand. For example,

lea EAX, main         ; load address of the main function of our program
lea EAX, DS:411A2Eh   ; Load absolute memory address located in data segment

1.6.2. Memory reference example: code label

Consider the following example. Since there are no data variables, we added a label that gives a "named address" to a particular location in our code:

int main()
{
start:
    __asm push EAX          ; save EAX data

    __asm lea EAX, start    ; load address of the program label into EAX
    __asm mov [EAX], EBX    ; move garbage from EBX to memory location [EAX]

    __asm pop EAX           ; restore EAX data
}

If we compile and run this program, it fails at run time with an error:

    First-chance exception at 0x00411a35 in main.exe:
    0xC0000005: Access violation writing location 0x00411a2e.

Indeed, by default the operating system write-protects the memory that holds a program's executable code, so a user-mode program is not allowed to modify it.

1.6.3. Memory reference example: global variable

In order to further demonstrate the CPU access to memory, we could modify our program to have a global variable named x:

#include <iostream>
int x;
int main()
{
    __asm push EAX
    __asm push EBX

    __asm mov EBX, 5        ; store integer number 5 in register EBX

    __asm lea EAX, x        ; load address of variable x into register EAX
    __asm mov [EAX], EBX    ; move data from EBX to memory location [EAX]

    __asm pop EBX
    __asm pop EAX
    
    std::cout << "x is equal to ";
    std::cout << x;
}

This time, the program compiles and runs fine. If you run it in debug mode, you should be able to see how x changes from zero to five.

1.7. Conclusion

Our examples demonstrated that computer memory is a large array of memory locations. Each location has a unique address. A program is loaded into memory in three logical segments: code, data, and stack, each occupying its own range of addresses.

Address space that does not belong to our program is off limits to us: an attempt to use that memory results in an error known as an access violation exception. Our own program code is read-only to our program.

Each compiled program contains a sequence of processor instructions. An instruction is the smallest command that the processor can execute at one time. Instructions vary in size and can have one, two, or no operands. Registers are among the most important computer resources: they are the fastest type of memory and are referenced directly in CPU instructions. Registers are used heavily when data is moved to and from memory locations.

Visual C++ allows assembly code to be inserted directly into programs. The compiler translates inline assembler commands into CPU instructions. In many cases, when your program crashes, the real difference between solving the bug and screaming in frustration comes down to how well you can read a little assembly language!