Course list http://www.c-jump.com/bcc/
Originally Microprocessor without Interlocked Pipeline Stages
Very widely used in embedded systems
Reduced instruction set computer (RISC) instruction set architecture (ISA) developed by MIPS Technologies
(CISC is complex instruction set computer)
Mnemonic instructions for a specific architecture
Translated to machine code by an Assembler
Machine code:
Binary instructions for a specific architecture
A set of mnemonics for machine instructions
Opcodes, register names, addressing modes
A way to name memory addresses and constants
Other conveniences for generating machine code
Embedded systems - size/speed efficiency
Device drivers - direct control of hardware components
A good way to learn a computer architecture
The machine language the CPU implements
Instruction set architecture (ISA)
Built in data types (integers, floating point numbers)
Fixed set of instructions
Fixed set of on-processor variables (registers)
Interface for reading/writing memory
Mechanisms to do input/output
We cannot refer to individual bits
Addressable groups in MIPS:
byte - 8 bits
word - 4 bytes = 32 bits
halfword - 2 bytes = 16 bits
In math we assume arbitrary precision:
to represent a bigger number, just add more digits
Integers in memory have a fixed size
commonly 32 or 64 bits
based on register size of the architecture
This is significant for two reasons:
risk of overflow
representation of negative numbers
To represent a negative number:
start with the positive binary representation
invert every bit
add 1 to the result
Examples with 4-bit integers:
-1 == 0001 ==> 1110 ==> 1111 -4 == 0100 ==> 1011 ==> 1100 -7 == 0111 ==> 1000 ==> 1001
To change the size of an integer without changing its value
if positive (left-most bit 0), pad left with 0s
if negative (left-most bit 1), pad left with 1s
If result is out of representable range, its an error
Two kinds of errors:
overflow -- the number is too large to be represented
underflow -- the number is tooo small
When adding:
two numbers with different signs
overflow can never occur!
two numbers with the same sign
overflow occurs if the sign changes
Little-endian:
x86
MIPS (MARS simulator)
Big-endian:
Motorola 6800 and 68k
Bi-endian (configurable to be big or little):
ARM, SPARC, PowerPC
MIPS (specification)
More info: Endianness in wikipedia.
Data and code segments
# comment .data # constant and variable definitions go here .text # assembly instructions go here
Comment # any text Assembler directive .data, .asciiz, .global Operation mnemonic add, addi, lw, bne Register name $10, $t2 Address label (decl) hello:, length:, loop: Address label (use) hello, length, loop Integer constant 16, -8, 0xA4 String constant "Hello, world!\n" Character constant 'H', '?', '\n'
.eqv SYSCALL_PRINT_STRING 4 .eqv SYSCALL_EXIT_PROG 10 .data # data segment begins # Define a greeting message: Message: .asciiz "Hello World!\n" .text # Print the greeting message: li $v0, SYSCALL_PRINT_STRING la $a0, Message syscall # print string # Return to the operating system: li $v0, SYSCALL_EXIT_PROG syscall # exit program
Can be overwhelming and confusing
Strategy
develop algorithm in pseudocode
break it into small pieces
implement (and test) each piece in assembly
Tip: annotate your assembly code with descriptive comment or pseudocode
(Actually, this is a requirement in our course homework assignments!)
Number Name Usage Preserved? ------ ------- ----------------------- ---------- $0 $zero constant 0x00000000 N/A $1 $at assembler temporary No $2-$3 $v0-$v1 function return values No $4-$7 $a0-$a3 function arguments No $8-$15 $t0-$t7 temporaries No $16-$23 $s0-$s7 saved temporaries Yes $24-$25 $t8-$t9 more temporaries No $26-$27 $k0-$k1 reserved for OS kernel N/A $28 $gp global pointer Yes $29 $sp stack pointer Yes $30 $fp frame pointer Yes $31 $ra return address Yes
Preserved column idicates who is responsible for saving registers when a procedure is called:
No -- caller is responsible
Yes -- callee is responsible
# Instruction # Meaning in pseudocode add $t1, $t2, $t3 # $t1 = $t2 + $t3 sub $t1, $t2, $t3 # $t1 = $t2 - $t3 and $t1, $t2, $t3 # $t1 = $t2 & $t3 (bitwise and) or $t1, $t2, $t3 # $t1 = $t2 | $t3 (bitwise or) # set if equal: seq $t1, $t2, $t3 # $t1 = $t2 == $t3 ? 1 : 0 # set if less than: slt $t1, $t2, $t3 # $t1 = $t2 < $t3 ? 1 : 0 # set if less than or equal: sle $t1, $t2, $t3 # $t1 = $t2 <= $t3 ? 1 : 0
Other instructions of the same form
xor, nor sne, sgt, sge
Using immediate operand -- the second operand which is a constant
Note: the constant is 16-bits, sign-extended to 32-bits
# add/subtract/and immediate: # Instruction # Meaning in pseudocode addi $t1, $t2, 4 # $t1 = $t2 + 4 subi $t1, $t2, 15 # $t1 = $t2 - 15 andi $t1, $t2, 0x00FF # $t1 = $t2 & 0x00FF # set if less than immediate: slti $t1, $t2, 42 # $t1 = $t2 < 42 ? 1 : 0
Other other immediate instructions: ori and xori
Result of multiplication is a 64-bit number, stored in two 32-bit registers named "hi" and "lo"
# Instruction # Meaning in pseudocode mult $t1, $t2 # hi,lo = $t1 * $t2 mflo $t0 # $t0 = lo mfhi $t3 # $t3 = hi
There is a shortcut (macro instruction):
mul $t0, $t1, $t2 # hi,lo = $t1 * $t2; $t0 = lo
which expands to:
mult $t1, $t2 mflo $t0
Computes quotient and remainder. Simultaneously stores quotient in "lo" and remainder in "hi"
# Instruction # Meaning in pseudocode div $t1, $t2 # lo = $t1 / $t2; hi = $t1 % $t2 mflo $t0 # $t0 = lo quotient mfhi $t3 # $t3 = hi remainder
# Instruction # Meaning in pseudocode # copy register: move $t1, $t2 # $t1 = $t2 # load immediate: load constant into register (16-bit) li $t1, 42 # $t1 = 42 li $t1, 'k' # $t1 = 0x6B # load address into register la $t1, label # $t1 = label
Terminology:
label: is a memory address of a variable
directive is a type of data
constant is the initial value
For example,
.data # A zero-terminated string prompt: .asciiz "Enter a number: " # A variable to store user response num: .word 0
Note: there isn't much difference between constants and variables. Both are values in memory that program can read/write.
Long strings can be split into multiple lines like this:
label: .ascii "blah blah blah" .ascii "blah blah" .asciiz "blah blah blah blah blah blah.\n"
An Initializing an array of data can be specified as
fibs: .word 0, 1, 1, 2, 3, 5, 8, 13, 21, 35, 55, 89, 144
Using .space directive
useful for arrays of data whos values are unknown in advance
the argument is the number of bytes to reserve:
myarray: .space 40
here, myarray is the address of the 0th element of the array. The addresses of integer elements are
myarray # element 0 myarray + 4 # element 1 myarray + 8 myarray + 12 ... myarray + 36 # element 9: 36 = ( 4 times 9 )
Basic instruction to read integer from memory is called load word
lw $t1, 4($t2) # $t1 = Memory[$t2+4]
Here, $t2 contains the base address, and 4 is the offset
Note that there is a shortcut:
lw $t1, 0($t2) lw $t1, $t2 # the same lw $t1, label # $t1 = Memory[label] lw $t1, label + 4 # $t1 = Memory[label+4]
Basic instruction to write integer to memory is called store word
sw $t1, 4($t2) # Memory[$t2+4] = $t1
$t2 contains the base address
4 is the offset
Shortcuts:
sw $t1, 0($t2) sw $t1, $t2 # the same sw $t1, label # Memory[label] = $t1 sw $t1, label + 4 # Memory[label+4] = $t1
# Pseudocode: # c = (a+3) * (b-2) + a # Register mappings: # a: $t0, b: $t1, c: $t2 # tmp1: $t3, tmp2: $t4, tmp3: $t5 addi $t3, $t0, 3 # tmp1 = a+3 subi $t4, $t1, 2 # tmp2 = b-2 mul $t5, $t3, $t4 # tmp3 = tmp1 * tmp2 add $t2, $t5, $t0 # c = tmp3 + a
# Add three numbers in memory and print the result .data # string to print before the result STR_PROMPT: .asciiz "Result: " # numbers to add nums: .word -77, 13, -5 # numbers to add result: .word 0 # result .text # print the initial string li $v0, 4 # ask for print string service la $a0, STR_PROMPT syscall # load three numbers into registers la $t0, nums lw $t1, 0($t0) # lw $t1, nums lw $t2, 4($t0) # lw $t2, nums + 4 lw $t3, 8($t0) # lw $t3, nums + 8 # add and store the result in $a0 for printing add $a0, $t1, $t2 # add the first two numbers add $a0, $a0, $t3 # add the third to the sum # save a0 in memory sw $a0, result # print the result li $v0, 1 # ask for $a0 print service syscall # exit li $v0, 10 # ask for exit service syscall
Load word from memory:
lw $t1, 4($t2) # $t1 = Memory[$t2+4]
$t1 is the destination register
$t2 contains the base address (pointer to memory)
4 is the offset from the base address
Store word in memory:
sw $t1, 4($t2) # Memory[$t2+4] = $t1
$t1 is the source register
$t2 contains the base address (pointer to memory)
4 is the offset from the base address
Macro instructions to read/write a specific address
lw $t1, $t2 # $t1 = Memory[$t2] sw $t1, $t2 # Memory[$t2] = $t1
Macro instructions for reading/writing with labels
lw $t1, label # $t1 = Memory[label] lw $t1, label+4 # $t1 = Memory[label+4] lw $t1, label($t2) # $t1 = Memory[label+$t2] sw $t1, label # Memory[label] = $t1 sw $t1, label+4 # Memory[label+4] = $t1 sw $t1, label($t2) # Memory[label+$t2] = $t1
Reading sub-word data
Instructions loading shorter data into a word register:
lb load byte into 32-bit register (sign extend)
lh load halfword (sign extend)
lbu load byte unsigned (zero extend)
lhu load halfword unsigned (zero extend)
Beware of little-endian addressing
.data date: .byte 1 # month .byte 30 # day .half 2014 # year STR_PROMPT: .asciiz "1234" .text # load some values into registers lbu $t0, date # month lbu $t1, date + 1 # day lhu $t2, date + 2 # year lbu $t3, STR_PROMPT # '1' lbu $t4, STR_PROMPT + 1 # '2' lbu $t5, STR_PROMPT + 2 # '3' lbu $t6, STR_PROMPT + 3 # '4'
sb store byte (low order byte)
sh store halfword (low order halfword)
Important note: lw and sw must respect word boundaries:
the address (base+offset) must be divisible by 4
likewise for lh, lhu, and sh the address must be divisible by 2
System calls are asking OS to perform services
syscall:
checks $v0 for the type of the service
the arguments (if any) are passed in $a0 - $a3
Use MARS help or go to
courses.missouristate.edu/kenvollmar/mars/help/syscallhelp.html
and $t1, $t2, $t3 # $t1 = $t2 & $t3 (bitwise and) or $t1, $t2, $t3 # $t1 = $t2 | $t3 (bitwise or) xor $t1, $t2, $t3 # $t1 = $t2 ^ $t3 (bitwise xor)
Immediate formats
andi $t1, $t2, 0x0F # $t1 = $t2 & 0x0F (bitwise and) ori $t1, $t2, 0xF0 # $t1 = $t2 | 0xF0 (bitwise or) xori $t1, $t2, 0xFF # $t1 = $t2 ^ 0xFF (bitwise xor)
1 0 1 0 and 0 0 1 1 ------- 0 0 1 0 1 0 1 0 or 0 0 1 1 ------- 1 0 1 1 1 0 1 0 xor 0 0 1 1 ------- 1 0 0 1
MARS provides a macro instruction for bitwise not. The instruction inverts every bit
A logical not can be done as follows:
xori $t1, $t2, 1 # $t1 = not $t2 (logical not)
seq $t1, $t2, $t3 # $t1 = $t2 == $t3 ? 1 : 0 sne $t1, $t2, $t3 # $t1 = $t2 != $t3 ? 1 : 0 sge $t1, $t2, $t3 # $t1 = $t2 >= $t3 ? 1 : 0 sgt $t1, $t2, $t3 # $t1 = $t2 > $t3 ? 1 : 0 sle $t1, $t2, $t3 # $t1 = $t2 <= $t3 ? 1 : 0 slt $t1, $t2, $t3 # $t1 = $t2 < $t3 ? 1 : 0
MARS Immediate formats:
slti $t1, $t2, 42 # $t1 = $t2 < 42 ? 1 : 0 ...and so on...
# Pseudocode: # c = ( a < b ) || ( ( a + b ) == 10 ) # Register mappings: # a: t0 # b: t1 # c: t2 add $t3, $t0, $t1 # tmp = a+b li $t4, 10 # tmp = tmp == 10 seq $t3, $t3, $t4 slt $t2, $t0, $t1 # c = a < b or $t2, $t2, $t3 # c = c | tmp
# Pseudocode: # c = (a < b) && ((a+b) % 3) == 2 # Register mappings: # a: t0, b: t1, c: t2 # tmp1: t3, tmp2: t4 add $t3, $t0, $t1 # tmp1 = a+b li $t4, 3 # tmp1 = tmp1 % 3 div $t3, $t4 mfhi $t3 seq $t3, $t3, 2 # tmp1 = tmp1 == 2 slt $t4, $t0, $t1 # tmp2 = a < b and $t2, $t3, $t4 # c = tmp2 & tmp1
Strategy:
insert labels in text segment
jump or conditionally branch to labels
Your only options in MIPS are "goto" or "if-goto"
Unconditional jump instructions (branch instructions)
j label # goto label jr $t1 # goto the address in $t1
# Basic instructions beq $t1, $t2, label # if ($t1 == $t2) goto label bne $t1, $t2, label # if ($t1 != $t2) goto label bgez $t1, label # if ($t1 >= 0) goto label bgtz $t1, label # if ($t1 > 0) goto label blez $t1, label # if ($t1 <= 0) goto label bltz $t1, label # if ($t1 < 0) goto label # Macro instructions beqz $t1, label # if ($t1 == 0) goto label bnez $t1, label # if ($t1 != 0) goto label beq $t1, 123, label # if ($t1 == 123) goto label bne $t1, 123, label # if ($t1 != 123) goto label bge $t1, $t2, label # if ($t1 >= $t2) goto label bgt $t1, $t2, label # if ($t1 > $t2) goto label bge $t1, 123, label # if ($t1 >= 123) goto label bgt $t1, 123, label # if ($t1 > 123) goto label ble ... # similar blt ... # similar
# Pseudocode: # if (a < b + 3) # a = a + 1 # else # a = a + 2 # b = b + a # Register mappings: # a: $t0, b: $t1 addi $t2, $t1, 3 # tmp = b + 3 blt $t0, $t2, ifless # if (a < tmp) addi $t0, $t0, 2 # otherwise a = a + 2 j finish ifless: addi $t0, $t0, 1 # if true, a = a + 1 finish: add $t1, $t1, $t0 # b = b + a
# Pseudocode: # if (a < b + 3) # a = a + 1 # b = b + a # Register mappings: # a: $t0, b: $t1 # One implementation addi $t2, $t1, 3 # tmp = b + 3 blt $t0, $t2, ifless # if (a < tmp) j finish ifless: addi $t0, $t0, 1 # if true, a = a + 1 finish: add $t1, $t1, $t0 # b = b + a # Another implementation addi $t2, $t1, 3 # tmp = b + 3 bge $t0, $t2, finish # if (a >= tmp) goto finish addi $t0, $t0, 1 # a + 1 finish: add $t1, $t1, $t0 # b = b + a
# Translate to lower-level pseudocode: # sum = 0 # i = 0 # while (i < n) { # sum = sum + i # i = i + 1 # } li $t2, 0 # sum = 0 li $t1, 0 # i = 0 loop: bge $t1, $t0, endloop # Loop begins: if i >= n goto endloop add $t2, $t2, $t1 # sum = sum + i addi $t1, $t1, 1 # i = i + 1 j loop endloop: ...
The stack pointer is in register $sp
$sp contains the address of the top of the stack
OS loader initializes $sp when the program is loaded
At runtime, $sp is your responsibility!
For example, stack frames in procedures are managed like this:
addiu $sp, $sp, -24 # allocate 6 words on the stack ... sw $t0, 16($sp) # save $t0 in stack frame ... lw $t0, 16($sp) # restore $t0 from the stack frame ... addiu $sp, $sp, 24 # deallocate 6 words on the stack
# arithmetic and logic add $t1, $t2, $t3 or $t1, $t2, $t3 slt $t1, $t2, $t3 # mult and div mult $t2, $t3 div $t2, $t3 # move from/to mfhi $t1 mflo $t1 # jump register jr $ra
# immediate arithmetic and logic addi $t1, $t2, 345 ori $t1, $t2, 345 slti $t1, $t2, 345 # branch and branch-zero beq $t2, $t3, label bne $t2, $t3, label bgtz $t2, label # load/store lw $t1, 345($t2) sw $t2, 345($t1) lb $t1, 345($t2) sb $t2, 345($t1)
# unconditional jump j label # range: any address in current 256Mb block # jump and link jal label # sets $ra = return address # range: any address in current 256Mb block # conditional branches - beq, bne - range +/- 128KB Jump register - jr # range: any addressable memory location (4GB)
Goal: execute programs faster
How:
separate processor into stages
overlap the execution of consecutive instructions
MIPS is designed for pipelining
Pipeline stages:
IF - Instruction Fetch
ID - Instruction Decode
EX - EXecute
ME - MEmory access
WB - Write Back
Hazard is a dependency that breaks pipelining
Data hazard program needs a value that has not been computed yet
Control hazard program does not know which instruction is next
Data hazard example:
add $t1, $t2, $t3 # IF ID EX ME WB-$t1 is set here addi $t4, $t1, 1 # IF ID-$t1 is read here EX ME WB
Solution 1: processor inserts delays
add $t1, $t2, $t3 # IF ID EX ME WB-$t1 is set here addi $t4, $t1, 1 # IF XX XX XX ID-$t1 is read here EX ME WB
Solution 2: processor reorders instructions to avoid data hazards