What are the aspects of using Assembly language with high-level programming languages?
Assembly is a powerful and flexible programming tool. Every professional programmer, regardless of the chosen programming language, has to master the Assembly language as a second tool. This is similar to a common opinion held by linguists, who say that everyone who chooses to become a professional linguist must master Latin before studying other European languages.
In the flat memory model, all procedure calls are NEAR, which means that they take place within one large memory segment. This makes CALL/RET coordination between multiple modules, potentially written in different programming languages, quite easy.
However, procedure name coordination is more complicated:
With the STDCALL convention, the MASM assembler adds the @N suffix to procedure names, where N is the total size in bytes of the parameters passed on the stack. The Visual C++ compiler does the same thing for __stdcall functions.
MASM generates the leading underscore character automatically if the standard call calling convention, STDCALL, is specified at the beginning of the program.
Coordination of uppercase and lowercase letters is important.
C++ allows function name overloading, so that the same C++ identifier can refer to different functions. For C++ programmers, these functions differ in the number of parameters or in the parameter types (the return type alone cannot distinguish two overloads). Internally, the C++ compiler automatically decorates function names in such a way that different functions can be distinguished by the linker.
By default, Assembly programs use the STDCALL calling convention for passing parameters.
When dealing with high-level languages, the programmer needs to be aware of the calling convention used by that particular language.
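The STDCALL naming rule described above can be sketched in a few lines of C++. This is only an illustration of the rule, not code from the course samples: the function name stdcall_decorated_name is hypothetical, and the sketch assumes 32-bit x86, where every parameter occupies a whole number of 4-byte stack slots.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Builds the 32-bit STDCALL decorated name "_NAME@N" for a procedure.
// param_sizes holds the size in bytes of each parameter. Assumption:
// every parameter is rounded up to a whole number of 4-byte stack slots.
std::string stdcall_decorated_name(const std::string& name,
                                   const std::vector<std::size_t>& param_sizes)
{
    std::size_t total = 0;
    for (std::size_t size : param_sizes) {
        total += (size + 3) / 4 * 4;   // round up to a DWORD boundary
    }
    return "_" + name + "@" + std::to_string(total);
}
```

For a procedure named COPYSTR that takes two 4-byte pointers, the call stdcall_decorated_name("COPYSTR", {4, 4}) yields _COPYSTR@8.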
Types of return values:
In Assembly language, everything is simple: the value is returned in the EAX register with two possibilities:
the value is either a number, or
a pointer to some variable or structure.
If the return value has the WORD data type, it is passed in the least significant word of the EAX register, namely, AX.
(When dealing with the C programming language, type casting of the return value must also be considered.)
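The relationship between EAX and AX can be modeled in C++ as follows. This is a sketch for illustration only: the helper names low_word and word_as_signed are hypothetical, and the code treats a plain 32-bit integer as a stand-in for the register value.

```cpp
#include <cstdint>

// Extracts AX, the least significant word of a 32-bit EAX value.
std::uint16_t low_word(std::uint32_t eax)
{
    return static_cast<std::uint16_t>(eax & 0xFFFFu);
}

// Interprets a WORD result taken from AX as a signed 16-bit quantity.
// The upper half of EAX is ignored, as it would be by a caller that
// expects a 16-bit return type.
int word_as_signed(std::uint32_t eax)
{
    const int value = low_word(eax);
    return value >= 0x8000 ? value - 0x10000 : value;
}
```

The second helper shows why the cast matters: the same bits in AX can mean 65535 or -1, depending on the return type the C or C++ caller declares.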
The following sample (download COPYSTR.zip ) demonstrates a simple module M13_COPYSTR.ASM ( download ) written in Assembly language.
The ASM file contains a procedure that copies one text string to another string.
This module can be linked to different programs written in C++.
The caller program, M13_main.cpp ( download ), is written in C++.
The M13_compile.bat ( download ) is the command file to build the M13.EXE executable.
; The M13_COPYSTR.ASM file
.586P
.MODEL FLAT, stdcall ; Flat memory model
PUBLIC COPYSTR
_TEXT SEGMENT
; Procedure for copying the null-terminated source string to the target string.
;
; Input:
;   str_dest...Target string [ EBP + 08H ]
;   str_src....Source string [ EBP + 0CH ]
; Output:
;   EAX........address of the target string
;
; WARNING! target string length is unchecked
;
COPYSTR PROC str_dest: DWORD, str_src: DWORD
        MOV ESI, str_src            ; DWORD PTR [ EBP + 0CH ]
        MOV EDI, str_dest           ; DWORD PTR [ EBP + 08H ]
L1:
        MOV AL, BYTE PTR [ESI]
        MOV BYTE PTR [EDI], AL
        CMP AL, 0
        JE L2
        INC ESI
        INC EDI
        JMP L1
L2:
        MOV EAX, DWORD PTR [ EBP + 08H ]
        RET
COPYSTR ENDP
_TEXT ENDS
END
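// The M13_main.cpp file
#include <iostream>

extern "C" int __stdcall COPYSTR( char*, char* );

int main()
{
    char destination[ 100 ] = { 0 };
    // A writable array is used because standard C++ does not allow
    // a string literal to initialize a plain char* pointer.
    char source[] = "Hello!";
    COPYSTR( destination, source );
    std::cout << destination;
    return 0;
}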
If you are using a VC++ 2005 project to compile your executable and get run-time errors about a missing MSVCR80.DLL or MSVCR80D.DLL file, change the project settings as follows:
Click Project -> M13 properties
Configuration: change to All configurations, then
click Configuration Properties, General, Use of MFC field: change to Use MFC in a Static Library.
Click Apply and OK.
Recompile and test the application again.
The Assembly procedure called from the main program is declared using the extern "C" and __stdcall modifiers:
// The M13_main.cpp file
extern "C" int __stdcall COPYSTR( char*, char* );
The __stdcall calling convention requires the called procedure to remove its parameters from the stack before it returns.
In the Assembly module, the called procedure must be declared using the PUBLIC directive:
; The M13_COPYSTR.ASM file
.586P
.MODEL FLAT, stdcall
PUBLIC COPYSTR
The C++ compiler automatically adds a leading underscore and the @8 suffix to the public procedure name, producing _COPYSTR@8. The suffix is 8 because the two 4-byte pointer parameters occupy 8 bytes on the stack.
To verify this, one could run the command
dumpbin /disasm Release\M13.exe > Release\M13.txt
and examine the resulting Disassembly file M13.txt.
Please note that when the PROC directive is used with a parameter list to define procedures such as COPYSTR, explicitly setting up and releasing the stack frame with the EBP register isn't necessary. The assembler generates the prologue and epilogue code automatically. Here is an excerpt from the M13.txt dump:
Dump of file Release\M13.exe

File Type: EXECUTABLE IMAGE

_COPYSTR@8:
  00401000: 55                 push        ebp
  00401001: 8B EC              mov         ebp,esp
  00401003: 8B 75 0C           mov         esi,dword ptr [ebp+0Ch]
  00401006: 8B 7D 08           mov         edi,dword ptr [ebp+8]
  00401009: 8A 06              mov         al,byte ptr [esi]
  0040100B: 88 07              mov         byte ptr [edi],al
  0040100D: 3C 00              cmp         al,0
  0040100F: 74 04              je          00401015
  00401011: 46                 inc         esi
  00401012: 47                 inc         edi
  00401013: EB F4              jmp         00401009
  00401015: 8B 45 08           mov         eax,dword ptr [ebp+8]
  00401018: C9                 leave
  00401019: C2 08 00           ret         8
The LEAVE instruction is equivalent to
mov esp, ebp
pop ebp
The C/C++ extern keyword declares that a variable or function has external linkage. The extern "C" form additionally specifies that C linkage conventions, without C++ name decoration, should be used by the declarator, which makes it possible to link with other languages.
extern-declared functions and data become visible to the linker across multiple .OBJ modules.
However, the extern functions must be defined in a separately compiled translation unit.
The following sample demonstrates this technique: COPYNEW.zip contains a C++ program that calls the COPYNEW( ) function defined in a module written in Assembly. The sample contains the following files:
M13_main.cpp ( download ) the main driver program.
M13_NEWARRAY.h ( download ) C++ header file declaring extern functions.
M13_NEWARRAY.cpp ( download ) C++ implementation file defining external functions.
M13_COPYNEW.ASM ( download ) Assembly program that dynamically allocates a block of memory and makes a copy of the source string.
M13_compile.bat ( download ) Batch command file to compile, assemble and link M13.exe.
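The COPYNEW.zip sources are not reproduced on this page. As a rough idea of the behavior described for M13_COPYNEW.ASM (allocate a block, then copy the source string into it), here is a C++ sketch. The function name copy_new_reference is hypothetical, and the real sample may split the work differently between the C++ and Assembly modules.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical C++ model of the behavior described for COPYNEW:
// allocate a block large enough for the source string and copy it.
// The caller owns the returned block and releases it with delete[].
char* copy_new_reference(const char* source)
{
    const std::size_t length = std::strlen(source) + 1;  // include the terminator
    char* copy = new char[length];
    std::memcpy(copy, source, length);
    return copy;
}
```

Whichever module performs the allocation, the same module family should release it: memory obtained with new[] must be freed with delete[].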
Your assignment is to build a small command-based stack machine.
Here is the minimal set of commands that your machine must understand:
push N ; PUSH( N )
add    ; PUSH( POP() + POP() )
hlt    ; HALT the machine
start  ; START the machine
where
PUSH command pushes the decimal value N on the stack. If a stack overflow condition is detected, the command prints an appropriate error message and goes into a HALT state.
ADD command pops two values from the stack and computes their sum. The result is pushed back on the stack. If fewer than two values are present on the stack, the command prints an error message and goes into a HALT state.
HLT command stops the execution and places the machine in the HALT state.
START command sets the instruction pointer to the first instruction, and switches the machine into a RUN state, which executes the loaded program.
Each command needs to print detailed information about its particular execution step. For example, the source program
push 123
push 456
add
add
hlt
start
should generate the following (or similar) output as a result of the execution:
1. PUSH 123
2. PUSH 456
3. ADD 456 + 123 = 579
4. ADD 579 + *** ERROR: THE STACK IS EMPTY
The first column displays an imaginary program counter. This can be either a sequential number or an actual address of the command in memory.
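Before writing the Assembly code, it may help to have a small C++ model of the expected behavior to compare traces against. The sketch below is not a substitute for the required ASM implementation. The names Op, Command, and run_trace, the stack_limit parameter, and the wording of the overflow message are all assumptions made for this illustration; only the PUSH and ADD trace format follows the sample output above.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

enum class Op { Push, Add, Hlt };

struct Command {
    Op op;
    int arg;   // used by Push only
};

// Runs the program and returns the execution trace as text.
// stack_limit is a hypothetical capacity used to detect overflow.
std::string run_trace(const std::vector<Command>& program,
                      std::size_t stack_limit = 16)
{
    std::vector<int> stack;
    std::ostringstream out;
    for (std::size_t pc = 0; pc < program.size(); ++pc) {
        const Command& cmd = program[pc];
        out << (pc + 1) << ". ";              // sequential program counter
        if (cmd.op == Op::Push) {
            if (stack.size() >= stack_limit) {
                out << "PUSH *** ERROR: STACK OVERFLOW\n";  // assumed wording
                break;                        // HALT
            }
            stack.push_back(cmd.arg);
            out << "PUSH " << cmd.arg << "\n";
        } else if (cmd.op == Op::Add) {
            out << "ADD ";
            if (stack.empty()) {
                out << "*** ERROR: THE STACK IS EMPTY\n";
                break;                        // HALT
            }
            const int first = stack.back();
            stack.pop_back();
            out << first << " + ";
            if (stack.empty()) {
                out << "*** ERROR: THE STACK IS EMPTY\n";
                break;                        // HALT
            }
            const int second = stack.back();
            stack.pop_back();
            stack.push_back(first + second);
            out << second << " = " << (first + second) << "\n";
        } else {
            out << "HLT\n";
            break;                            // HALT
        }
    }
    return out.str();
}
```

Feeding this model the sample program (two pushes, two adds, and hlt) gives a trace that can be compared line by line with the output of the Assembly version.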
The stack machine needs two types of memory to run:
memory to store a sequence of encoded commands and their operands
the data stack
Both memories could be allocated either statically or dynamically. It is up to you which way to go. Note that the COPYNEW.zip code example illustrates dynamic memory allocation using the C++ new operator.
Implementation must be a mix of the following parts assembled, compiled, and linked together as a whole:
Main driver: use C/C++ main function as entry point into the application.
Memory allocation: any approach is fine, dynamic or static.
User input: the natural choice is the C/C++ standard I/O facilities, or use IO.ASM, if you like. I recommend using the C/C++ I/O libraries, since they are far more robust than the toy-like IO.ASM bundle.
Input parsing: using C/C++ is highly recommended unless you have lots of extra time and nothing else to do.
Stack Machine State (RUN or HALT): since actual memory allocation can be done by either ASM or C++, use your judgement. The main driver of the application needs to know very little about the stack machine state, so keeping the state private to the ASM module is probably your best choice.
Command execution: you must use ASM code to implement the stack machine logic and manipulate its memory. Each command must be handled by the assembly module. Adding a set of dedicated command-specific PROCs is highly recommended. Using a table of procedure handlers is an ideal approach to implement the stack machine mechanism.
Console output: ASM code can invoke C/C++ function to handle the standard output. However, you can use IO.ASM, if you like.
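The table of procedure handlers recommended above can be prototyped in C++ before it is translated to Assembly. In the sketch below, the Machine structure, the handler names, and the opcode numbering are all hypothetical; the point is only that the opcode selects an entry in a table of procedure addresses, which is then called indirectly.

```cpp
#include <vector>

struct Machine {
    std::vector<int> stack;
    bool running = true;     // RUN or HALT state
};

using Handler = void (*)(Machine&, int);

void do_push(Machine& m, int arg) { m.stack.push_back(arg); }

void do_add(Machine& m, int)
{
    if (m.stack.size() < 2) {    // not enough operands: HALT
        m.running = false;
        return;
    }
    const int a = m.stack.back(); m.stack.pop_back();
    const int b = m.stack.back(); m.stack.pop_back();
    m.stack.push_back(a + b);
}

void do_hlt(Machine& m, int) { m.running = false; }

// Hypothetical opcode numbering: each value indexes the handler table.
enum Opcode { OP_PUSH = 0, OP_ADD = 1, OP_HLT = 2, OP_COUNT };

const Handler handler_table[OP_COUNT] = { do_push, do_add, do_hlt };

// Dispatches one command; returns false for an unknown opcode.
bool execute(Machine& m, int opcode, int arg)
{
    if (opcode < 0 || opcode >= OP_COUNT) {
        return false;
    }
    handler_table[opcode](m, arg);
    return true;
}
```

With this layout, adding a new command means writing one more handler and appending it to the table; the dispatch code itself does not change.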
It's best to begin with a very simple interface to the stack machine implementation. All communication could be accomplished by two procedures,
SM_LOAD_COMMAND( opcode, arg ); // Load command into the machine
SM_RUN();                       // Execute program
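On the C++ side, the input parser has to turn each text line into the ( opcode, arg ) pair that the interface above expects. Here is one possible sketch using the standard stream library. The ParsedCommand structure, the parse_command function, and the numeric opcode values are assumptions made for this example; the actual encoding is up to you.

```cpp
#include <algorithm>
#include <cctype>
#include <sstream>
#include <string>

struct ParsedCommand {
    int opcode;
    int arg;
};

// Hypothetical opcode values shared between the parser and the machine.
enum { OP_PUSH = 0, OP_ADD = 1, OP_HLT = 2, OP_START = 3 };

// Parses one input line such as "push 123" or "add".
// Returns false for blank lines, unknown commands, or a missing operand.
bool parse_command(const std::string& line, ParsedCommand& out)
{
    std::istringstream in(line);
    std::string word;
    if (!(in >> word)) {
        return false;                                   // blank line
    }
    std::transform(word.begin(), word.end(), word.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    out.arg = 0;
    if (word == "push") {
        out.opcode = OP_PUSH;
        return static_cast<bool>(in >> out.arg);        // push requires N
    }
    if (word == "add")   { out.opcode = OP_ADD;   return true; }
    if (word == "hlt")   { out.opcode = OP_HLT;   return true; }
    if (word == "start") { out.opcode = OP_START; return true; }
    return false;                                       // unknown command
}
```

The driver could then pass opcode and arg to SM_LOAD_COMMAND for each successfully parsed line. Whether start is loaded like any other command or simply triggers SM_RUN is a design decision left to the implementation.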
The STM.zip sample demonstrates a prototype of the SM_RUN( ) call. It includes the following files:
M13_main.cpp ( download ) is the main driver of the program.
M13_externs.h ( download ) C++ header file declaring all extern data and functions.
M13_externs.cpp ( download ) C++ implementation file defining externally visible data and functions.
M13_SM_RUN.ASM ( download ) Assembly program simulating the stack machine.
M13_compile.bat ( download ) Batch command file to compile, assemble and link M13.exe.
This is an open-ended project. The possibilities are endless when considering other useful commands and features.
Level difficulty I (easy, 5 xtra pts awarded for each feature):
Add arithmetic commands SUB, MUL, DIV, and MOD (modulus, or remainder of the division).
Add a CMP command to compare the operand with the value on top of the stack. Compute the difference and push the result on the stack:
cmp 0
Level difficulty II (advanced, 15 xtra pts each):
Add a fixed number of register-like variables (e.g. EAX, EBX, ECX, EDX, ...) available to the stack machine programmer.
Add new commands that work with variables:
push 123
pop eax
push eax
pop ebx
Add program comments, for example,
; This is a comment
The comments should be handled by the text input parser, thus not requiring any changes in the stack machine code.
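Comment handling in the parser can be as small as one helper that cuts each line at the first semicolon. The function name strip_comment is hypothetical, and the sketch assumes that a semicolon never appears inside a command operand.

```cpp
#include <cstddef>
#include <string>

// Removes a trailing "; comment" and any whitespace left before it.
// Returns an empty string when the line holds nothing but a comment.
std::string strip_comment(const std::string& line)
{
    const std::string text = line.substr(0, line.find(';'));
    const std::size_t last = text.find_last_not_of(" \t\r");
    return last == std::string::npos ? std::string() : text.substr(0, last + 1);
}
```

Lines that come back empty can simply be skipped, so the stack machine never sees them.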
Level difficulty III (most difficult, 25 xtra pts):
Consider adding code labels and a jump command:
push 10
push 20
push 30
label:
add
jmp label
The above fragment computes the sum (10+20+30) and terminates when ADD no longer finds two values on the stack.
Once the basic jump mechanism is in place, conditional jumps can be added, as well as the subroutine CALL and RET commands.
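One common way to support labels is a first pass over the source that records where each label points, so that a later pass can replace label names in jump commands with instruction indexes. The sketch below shows only the first pass. The function name collect_labels is hypothetical, and the code assumes that every label sits on its own line, ends with a colon, and has already been trimmed of whitespace and comments.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// First pass: maps each label name to the zero-based index of the
// instruction that follows it. Label lines themselves do not occupy
// an instruction slot.
std::map<std::string, std::size_t>
collect_labels(const std::vector<std::string>& lines)
{
    std::map<std::string, std::size_t> labels;
    std::size_t index = 0;
    for (const std::string& line : lines) {
        if (!line.empty() && line.back() == ':') {
            labels[line.substr(0, line.size() - 1)] = index;
        } else {
            ++index;
        }
    }
    return labels;
}
```

For the fragment above, the label refers to the add command, which is the fourth instruction (index 3 when counting from zero). Because the labels are resolved in the parser, the stack machine only needs a jump command that takes a numeric target.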
Submit only C/C++/ASM source files.
Also submit a program that demonstrates the features of your stack machine implementation. Provide comments for anything non-trivial that deserves special attention.
Do not send any EXE, binary, or project files.
Good luck!!
Dealing with user input...
...see C++ stream input, output, and type conversions (CIS-60 material)
C++ Standard Library demo: cjumpstl.exe
Another demo of
C->ASM call and __stdcall calling convention.
ASM->C call.
Best ways to deal with dynamic memory allocation...
Download latest: InClassProject.zip