Data Types and Memory Allocation

Data Organization: DB, DW, and EQU

  • Representing data types in assembly source files requires appropriate assembler directives.

  • The directives allocate data and format x86 little-endian values.

  • Bytes are allocated with DB (define byte).

  • Words are allocated with DW (define word).

  • Both accept a comma-separated list of initializers, so a single directive can allocate more than one byte or word.

  • A question mark (?) as an initializer reserves uninitialized data.

  • A quoted string allocates one byte per character.

  • A label in front of a directive records the offset of that data from the beginning of the segment that contains the directive.

  • The DUP operator repeats an initializer a given number of times. The following two lines produce identical results:

        DB ?, ?, ?, ?, ?
        DB 5 DUP(?)
  • Note that the EQU directive does not allocate any memory: it creates a named constant that the assembler substitutes wherever the name is used:

        CR EQU 13
        DB CR
        mov al, CR
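  • A minimal sketch that puts the directives above together, assuming MASM syntax. The label names (flag, pair, primes, spare, msg, buf, eol) are made up for illustration, and the fragment is meant to sit inside a complete program:

```asm
.data
CR      EQU 13              ; constant only, no memory allocated
LF      EQU 10              ; constant only, no memory allocated
flag    DB  1               ; one byte, initialized to 1
pair    DW  1234h, 5678h    ; two words; bytes in memory: 34 12 78 56
primes  DB  2, 3, 5, 7      ; four bytes from a single directive
spare   DW  ?               ; one uninitialized word
msg     DB  "Hello", 0      ; six bytes: five characters plus a 0 byte
buf     DB  16 DUP(0)       ; sixteen bytes, all zero
eol     DB  CR, LF          ; two bytes: 13, 10

.code
        mov al, flag        ; load the byte stored at label flag
        mov ah, CR          ; assembler substitutes the constant 13
```

  • Each label stands for the offset of its first byte, so pair starts one byte after flag and primes starts four bytes after pair.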
  • Figure: allocation directives

  • At the beginning:

      OllyDbg before execution

  • At the end:

      OllyDbg after execution

  • Different processors store multibyte integers in different orders in memory.

  • There are two popular methods of storing integers: big endian and little endian.

  • The big endian method seems the most natural:

    • the biggest (i.e. most significant) byte is stored first, then the next biggest, etc.

  • IBM mainframes, most RISC processors and Motorola processors all use this big endian method.

  • However, Intel-based processors use the little endian method, in which the least significant byte is stored first.

  • Normally, the programmer does not need to worry about which format is used, unless

    1. Binary data is transferred between different computers, e.g. over a network.

      • All TCP/IP headers store integers in big endian format (called network byte order).

    2. Binary data is written out to memory as a multibyte integer and then read back as individual bytes, or vice versa.

  • Endianness does not apply to the order of array elements.

  • See also: the Wikipedia article about endianness.

      Byte sequence order
    Data type   Value(*)   Big endian    Little endian
    16-bit      1234       12 34         34 12
    32-bit      0047d5a8   00 47 d5 a8   a8 d5 47 00
    32-bit      56789abc   56 78 9a bc   bc 9a 78 56
  • (*) All values shown in base 16.

  • Big Endian:

      Big Endian

  • Little Endian:

      Little Endian
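
  • A hedged sketch of the second case above (writing a multibyte integer and reading it back as individual bytes), assuming MASM syntax on a 32-bit x86 target. The labels w16 and d32 are hypothetical; their values are taken from the byte sequences listed earlier:

```asm
.data
w16     DW  1234h           ; bytes in memory: 34 12
d32     DD  0047D5A8h       ; bytes in memory: A8 D5 47 00

.code
        mov al, BYTE PTR [w16]      ; al = 34h, least significant byte comes first
        mov ah, BYTE PTR [w16+1]    ; ah = 12h, so ax = 1234h again
        mov bl, BYTE PTR [d32]      ; bl = 0A8h
        mov cx, WORD PTR [d32+2]    ; cx = 0047h, the upper word of d32

        mov eax, d32                ; eax = 0047D5A8h
        bswap eax                   ; eax = 0A8D54700h, byte order reversed
                                    ; (BSWAP needs .486 or later enabled)
```

  • Reversing the bytes with BSWAP is one way to convert a 32-bit value between the little endian order used in memory on Intel processors and network byte order.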

Data Allocation Directives, Cont.

Keyword                           Description
BYTE, DB (byte)                   Allocates unsigned numbers from 0 to 255.
SBYTE (signed byte)               Allocates signed numbers from –128 to +127.
WORD, DW (word = 2 bytes)         Allocates unsigned numbers from 0 to 65,535 (64K).
SWORD (signed word)               Allocates signed numbers from –32,768 to +32,767.
DWORD, DD (doubleword = 4 bytes)  Allocates unsigned numbers from 0 to 4,294,967,295 (4G).
SDWORD (signed doubleword)        Allocates signed numbers from –2,147,483,648 to +2,147,483,647.
FWORD, DF (farword = 6 bytes)     Allocates 6-byte (48-bit) integers. These values are normally used only as pointer variables on the 80386/486 processors.
QWORD, DQ (quadword = 8 bytes)    Allocates 8-byte integers used with 8087-family coprocessor instructions.
TBYTE, DT (10 bytes)              Allocates 10-byte (80-bit) integers if the initializer has a radix specifying the base of the number.

    Data Type Bytes
    FWORD 6
    QWORD 8
    TBYTE 10
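
  • A short sketch of the wider directives in use, assuming MASM syntax; the label names are hypothetical and the larger items are left uninitialized to keep the fragment simple:

```asm
.data
big32   DD      4294967295      ; largest unsigned doubleword, 0FFFFFFFFh
low32   SDWORD  -2147483648     ; smallest signed doubleword, 80000000h
farptr  DF      ?               ; 6 bytes reserved
quad    DQ      ?               ; 8 bytes reserved
tenb    DT      ?               ; 10 bytes reserved
```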

  • Storing different data types in registers:

      storing different data types in registers

  • The data types SBYTE, SWORD, and SDWORD tell the assembler to treat the initializers as signed data.

  • It is important to use these signed types with high-level constructs such as .IF, .WHILE, and .REPEAT, and with PROTO and INVOKE directives.

  • For descriptions of these directives, refer to the sections

    • Loop-Generating Directives

    • Declaring Procedure Prototypes

    • Calling Procedures with INVOKE

    in the MASM Programmer's Guide.
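
  • A minimal sketch of why the signed types matter with .IF, assuming MASM 6.x behavior as described in the Programmer's Guide. The labels utemp and stemp are hypothetical; both hold the same 16 bits:

```asm
.data
utemp   WORD    0FFFBh      ; 65,531 when read as unsigned
stemp   SWORD   -5          ; same bit pattern, declared signed

.code
        .IF stemp < 0       ; signed comparison: true, because -5 < 0
            mov stemp, 0
        .ENDIF

        .IF utemp < 0       ; unsigned comparison: never true
            mov utemp, 0
        .ENDIF
```

  • The assembler picks a signed or unsigned conditional jump from the declared type of the operand, so only the first block changes its variable.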

  • Figure: integer formats

