C++ Intro

Lecture 2: C++ built-in types and expressions


Fundamental Types and Declarations

There is a need for a name for "something in memory." This is the simplest and most fundamental notion of an object: an object is a contiguous region of memory. Variables represent objects in our programs. The type of a variable specifies the amount of memory needed to hold the variable. In addition, the type determines the run-time behavior of the variable.

Consider

    x = y + 2;
                

For this to make sense in a C++ program, the names x and y must be suitably declared. That is, the programmer must specify the types of x and y:

    double x;   // x is a floating-point variable
    int y = 7;  // y is an integer variable with initial value 7                    
    x = y + 2;  // add 2 to y and assign the result to x.
                

Furthermore, = (assignment) and + (addition) should be meaningful for these types. Every name (identifier) in a C++ program has a type associated with it. This type determines what operations can be applied to each name.

Fundamental Types of the C++ Language

Category: Integral

  • char: Type char is an integral type that usually contains members of the execution character set. In Microsoft C++, this is ASCII. The C++ compiler treats variables of type char, signed char, and unsigned char as having different types. Variables of type char are promoted to int as if they are type signed char by default, unless the /J compilation option is used. In this case they are treated as type unsigned char and are promoted to int without sign extension.
  • bool: Type bool is an integral type that can have one of the two values true or false. Its size is implementation-defined.
  • short: Type short int (or simply short) is an integral type that is larger than or equal to the size of type char, and shorter than or equal to the size of type int. Objects of type short can be declared as signed short or unsigned short. Signed short is a synonym for short.
  • int: Type int is an integral type that is larger than or equal to the size of type short int, and shorter than or equal to the size of type long. Objects of type int can be declared as signed int or unsigned int. Signed int is a synonym for int.
  • long: Type long (or long int) is an integral type that is larger than or equal to the size of type int. Objects of type long can be declared as signed long or unsigned long. Signed long is a synonym for long.

Category: Floating

  • float: Type float is the smallest floating type.
  • double: Type double is a floating type that is larger than or equal to type float, but shorter than or equal to the size of type long double.
  • long double: Type long double is a floating type that is larger than or equal to type double. In Microsoft C++, it has the same size as type double.

Microsoft VC++ Specific: the following table lists the amount of storage required for fundamental types

Sizes of Fundamental Types

Type                               Size
bool                               1 byte
char, unsigned char, signed char   1 byte
short, unsigned short              2 bytes
int, unsigned int                  4 bytes
long, unsigned long                4 bytes
float                              4 bytes
double                             8 bytes
long double                        8 bytes

The boolean, character, and integer types are collectively called integral types. The integral and floating-point types are collectively called arithmetic types. The fundamental types are also called built-in types.

For most applications, one could simply use bool for logical values, char for characters, int for integer values, and double for floating-point values.

Boolean Type

A Boolean, bool, can have one of the two values: true or false.

A Boolean is used to express the results of logical operations. For example:

    int a = 2;
    int b = 2;

    bool is_equal = ( a == b ); // = is assignment, == is equality
                    

A variable of boolean type can be converted to an integer: true has the value 1 when converted to an integer, and false has the value 0. Conversely, integers can be implicitly converted to bool values: nonzero integers convert to true and 0 converts to false. For example:

    int i = -1;
    bool b = i; // b becomes true, because i is nonzero.
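
The conversions between bool and int described above can be checked with a short sketch. The function names count_true and check_bool_conversions are hypothetical, chosen only for this illustration; assert from <cassert> stops the program if a stated condition does not hold.

```cpp
#include <cassert>

// Hypothetical helper: counts how many of three conditions hold.
// Each bool operand is converted to int (true -> 1, false -> 0)
// before the addition takes place.
int count_true(bool first, bool second, bool third)
{
    return first + second + third;
}

// Hypothetical self-check of the conversion rules.
void check_bool_conversions()
{
    int from_true = true;    // true converts to 1
    int from_false = false;  // false converts to 0
    bool from_nonzero = -1;  // any nonzero integer converts to true
    bool from_zero = 0;      // zero converts to false

    assert(from_true == 1);
    assert(from_false == 0);
    assert(from_nonzero == true);
    assert(from_zero == false);
}
```

Calling check_bool_conversions() from main runs the checks; count_true( 2 == 2, 3 < 1, true ) counts two true conditions.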
                    

Character Type

A variable of type char can hold one character of the platform's character set. For example:

    char ch = 'a';
                    

Almost universally, a char has 8 bits, so it can hold one of 256 different values. Typically, the character set is a variant of the ISO-646 standard, for example ASCII, thus providing the characters appearing on your keyboard.

It is safe to assume that the implementation character set includes the decimal digits, the 26 alphabetic characters of English, and some of the basic punctuation characters.

Values of particular characters are typically represented by constant character literals enclosed in single quotes. Each literal has an integer value. For example, the value of 'b' is 98 in the ASCII character set. Here is a small program that will tell you the integer value of any character you care to input:


    #include <iostream>
    int main()
    {
        char c;
        std::cin >> c;
        std::cout << "the value of '" << c << "' is " << int( c ) << '\n';
        return 0;
    }

                    

In this example, the expression int( c ) gives the integer value of the character c.

Character Literals

Again, a character literal, often called a character constant, is a character enclosed in single quotes, for example, 'a' and '0'. The type of a character literal is char. Such character literals are really symbolic constants for the integer values of the characters in the character set of the machine on which the C++ program is to run. For example, if you are running on a machine using the ASCII character set, the value of '0' is 48. The use of character literals rather than decimal notation makes programs more portable. A few characters also have standard names that use the backslash \ as an escape character. For example, '\n' is a newline and '\t' is a horizontal tab.
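
To illustrate why character literals are more portable than decimal codes, here is a minimal sketch. The names digit_value and check_character_literals are hypothetical. The sketch relies only on the language guarantee that the digits '0' through '9' have consecutive values, so it does not assume ASCII.

```cpp
#include <cassert>

// Hypothetical helper: converts a digit character to the number it denotes.
// Writing '0' instead of 48 keeps the code correct in any character set.
int digit_value(char digit)
{
    return digit - '0';
}

// Hypothetical self-check.
void check_character_literals()
{
    assert(digit_value('0') == 0);
    assert(digit_value('7') == 7);

    // int( c ) gives the integer value of a character.
    assert(int('0') + 7 == int('7'));

    // Escape sequences are ordinary character literals of type char.
    char newline = '\n';
    char tab = '\t';
    assert(newline != tab);
}
```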

Integer Types

Like char, each integer type comes in three forms: "plain" int, signed int, and unsigned int. In addition, integers come in three sizes: short int, "plain" int, and long int.

Integer Literals

There are four kinds of notations for integer literals: decimal, octal, hexadecimal, and character literals. Decimal literals are the most commonly used and look as you would expect them to: 0, 123, 4976, and so on.

A literal starting with zero followed by x (0x) is a hexadecimal (base 16) number. A literal starting with zero followed by a digit is an octal (base 8) number. For example:

decimal: 0 2 63 83
octal: 00 02 077 0123
hexadecimal: 0x0 0x2 0x3f 0x53

As we already know from the previous lecture, letters a, b, c, d, e, and f or their uppercase equivalents represent decimal numbers 10, 11, 12, 13, 14, and 15, respectively. Octal and hexadecimal notations are most useful for expressing bit patterns.
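
The three notations are just different spellings of the same integer values. A small sketch, using the values from the table above (the function name check_integer_literals is hypothetical):

```cpp
#include <cassert>

// Hypothetical self-check: each column of the table denotes one value.
void check_integer_literals()
{
    assert(02 == 2);
    assert(0x2 == 2);

    assert(077 == 63);    // 7 * 8 + 7
    assert(0x3f == 63);   // 3 * 16 + 15

    assert(0123 == 83);   // 1 * 64 + 2 * 8 + 3
    assert(0x53 == 83);   // 5 * 16 + 3

    assert(0X3F == 0x3f); // uppercase and lowercase are equivalent

    // A leading zero changes the base, so 010 is eight, not ten.
    assert(010 == 8);
}
```

The last check shows a common pitfall: padding a decimal number with a leading zero silently turns it into an octal literal.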

Floating-Point Types

The floating-point types represent floating-point numbers.

sizeof operator

Sizes of C++ types and variables are expressed in terms of multiples of the size of a char, so by definition the size of a char is 1. The size of an object or type can be obtained using the sizeof operator:

    #include <iostream>
    int main()
    {
        std::cout << "sizeof( char )==" << sizeof( char );
        return 0;
    }
                    

This is what is guaranteed about sizes of fundamental types:

    1 = sizeof(char) <= sizeof(short) <= sizeof(int) <= sizeof(long)
    
    1 <= sizeof(bool) <= sizeof(long)
    
    sizeof(float) <= sizeof(double) <= sizeof(long double)
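
These guarantees can be stated directly in code. The sketch below assumes a C++11 or later compiler, because it uses static_assert, which checks a condition at compile time. The names bits_in_int and check_sizeof_on_objects are hypothetical.

```cpp
#include <cassert>
#include <climits>  // CHAR_BIT: number of bits in a char

// Compile-time checks of the size guarantees.
static_assert(sizeof(char) == 1, "char is the unit of size");
static_assert(sizeof(char) <= sizeof(short), "short is at least as large as char");
static_assert(sizeof(short) <= sizeof(int), "int is at least as large as short");
static_assert(sizeof(int) <= sizeof(long), "long is at least as large as int");
static_assert(1 <= sizeof(bool) && sizeof(bool) <= sizeof(long), "bool fits in a long");
static_assert(sizeof(float) <= sizeof(double), "double is at least as large as float");
static_assert(sizeof(double) <= sizeof(long double), "long double is at least as large as double");

// Hypothetical helper: number of bits in an int, computed rather than assumed.
int bits_in_int()
{
    return static_cast<int>(sizeof(int)) * CHAR_BIT;
}

// Hypothetical self-check: sizeof also accepts an object, not only a type.
void check_sizeof_on_objects()
{
    double measurement = 0.0;
    assert(sizeof(measurement) == sizeof(double));
    assert(bits_in_int() >= 16);
}
```

Computing a size with sizeof, as bits_in_int does, avoids hard-coding a value such as 4 bytes that holds only on some platforms.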
                    

Assuming the scale 5 bytes = 1 inch, a megabyte of memory would stretch about three miles.

C++ variables

In C++, all variables must be declared before they can be used. A declaration announces the properties of the new variable. Each declaration consists of a type name and a variable name. Multiple comma-separated variable names may appear in a single declaration. However, this practice is not advisable, since the program can quickly become difficult to read:

    int counter, month, day, year;
                    

Better:

    int counter;
    int month;
    int day;
    int year;
                    

Declarations may also have initializers to provide initial values:

    int counter = 1000; // Create variable and assign initial value of 1000
    const double pi = 3.1415926535897932385;
                    

In contrast to a "variable", the keyword const creates a constant. The value of a constant is permanent, which means that the program cannot change it. Therefore, the initializer in the declaration is the only way to provide a value for a constant.
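
A minimal sketch of a constant in use follows. The function names circle_area and check_circle_area are hypothetical; the commented-out line shows the kind of assignment the compiler rejects.

```cpp
#include <cassert>

const double pi = 3.1415926535897932385;

// Hypothetical helper: area of a circle with the given radius.
double circle_area(double radius)
{
    // A local constant: initialized once and never changed afterwards.
    const double radius_squared = radius * radius;

    // pi = 3.0;  // would not compile: a constant cannot be assigned to

    return pi * radius_squared;
}

// Hypothetical self-check.
void check_circle_area()
{
    assert(circle_area(0.0) == 0.0);
    assert(circle_area(1.0) == pi);
}
```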

A variable name (identifier) consists of a sequence of letters, digits, and underscore characters. The first character must be a letter or underscore. Names starting with an underscore are reserved for system use, so such names should be avoided in application programs.

Uppercase and lowercase letters are distinct, so Count and count are two different names. It is unwise to choose names that differ only by capitalization. In general, it is best to avoid names that differ only in subtle ways. Single-letter names are a very poor choice, too. Some names could create significant readability problems:

    O // is this a zero or an uppercase letter 'O' ?
    l // is this a lowercase 'L' or a digit '1' ?
                    

Variables with a large scope ought to have relatively long and reasonably obvious names, such as department_number or phone_book. It is important to maintain a consistent naming style in your programs. Throughout this course, we will revisit this topic a few times.

Scope of Variables

Each declaration introduces a variable into a specific part of the program. The part of the program from which a variable is accessible (visible) is called the scope of the variable. Variables declared outside of functions are called global variables. Such variables belong to the global scope:

    int count;
    int main()
    {
        count = count + 1;
        return 0;
    }
                    

As the above example demonstrates, global variables are visible inside any function. The global scope is a namespace, that is, an encapsulation of names. Global variables of fundamental types that are declared without an initializer are initialized to zero.
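
The zero initialization of global variables can be observed with a short sketch. All names here (call_count, running_total, record_call, check_globals_start_at_zero) are hypothetical.

```cpp
#include <cassert>

int call_count;        // global, no initializer: starts at 0
double running_total;  // global, no initializer: starts at 0.0

// Hypothetical helper: any function can see and modify the global.
int record_call()
{
    call_count = call_count + 1;
    return call_count;
}

// Hypothetical self-check; meant to run before record_call() is first used.
void check_globals_start_at_zero()
{
    assert(call_count == 0);
    assert(running_total == 0.0);
}
```

The fact that record_call can silently change call_count from anywhere in the program is also the reason global variables are discouraged later in this lecture.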

Alternatively, a variable declared inside a function is said to be a local variable:

    int main()
    {
        int count;
        count = 1000;
        return 0;
    }
                    

Local variables are visible only inside the function in which they are declared. Scopes are not limited to functions; even smaller scopes may exist. For example,

    #include <iostream>
    int main()
    {
        int count = 10;
        while ( count > 0 ) {
            int var = -count;
            std::cout << var;
            count = count - 1;
        }
        return 0;
    }
                    

This program prints -10-9-8-7-6-5-4-3-2-1 on the screen. The scope of the variable var is limited to the while loop inside the main function.

Declaration is a statement

Initializers of local variables are executed whenever the thread of control passes through the declaration. The reason for allowing declarations wherever a statement can be used is to minimize the errors caused by uninitialized variables. There is rarely a reason to introduce a variable before there is a value for it to hold.

The most common reason to declare a variable without an initializer is that it requires a separate statement to initialize it. Examples are input variables:

    #include <iostream>
    int main()
    {
        int number;
        std::cin >> number;
        std::cout << "You entered: " << number;
        return 0;
    }
                    

Arithmetic Operators

Arithmetic operators are:

Addition          x + 7
Subtraction       a - b
Multiplication    x * y
Division          x / y
Modulus           r % s

Integer division (i.e., where both the numerator and the denominator are integers) yields an integer quotient; for example, the expression 7 / 4 evaluates to 1 and the expression 17 / 5 evaluates to 3. Note that any fractional part in integer division is discarded (i.e., truncated) and no rounding occurs.

The modulus operator, %, yields the remainder after integer division. The modulus operator can be used only with integer operands. The expression x % y yields the remainder after x is divided by y. Thus, 7 % 4 yields 3 and 17 % 5 yields 2. Modulus helps determine whether one number is a multiple of another and, as a special case, whether a number is odd or even.
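
The division and modulus examples above, together with the even/odd and multiple-of tests, can be sketched as follows. The function names is_even, is_multiple_of, and check_division_and_modulus are hypothetical.

```cpp
#include <cassert>

// Hypothetical helper: a number is even if dividing by 2 leaves no remainder.
bool is_even(int number)
{
    return number % 2 == 0;
}

// Hypothetical helper: divisor must be nonzero.
bool is_multiple_of(int number, int divisor)
{
    return number % divisor == 0;
}

// Hypothetical self-check of the examples from the text.
void check_division_and_modulus()
{
    assert(7 / 4 == 1);   // fractional part .75 is discarded
    assert(17 / 5 == 3);  // fractional part .4 is discarded
    assert(7 % 4 == 3);
    assert(17 % 5 == 2);

    // Quotient and remainder together reconstruct the dividend.
    assert((17 / 5) * 5 + 17 % 5 == 17);
}
```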


Equality and Relational Operators


Relational operators provide a means of making decisions:

Is x greater than y?                x > y
Is x less than y?                   x < y
Is x greater than or equal to y?    x >= y
Is x less than or equal to y?       x <= y

Equality operators also provide a means of making decisions:

Is x equal to y?                    x == y
Is x not equal to y?                x != y

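Each C++ operator has a built-in, immutable property of associativity, which determines how operands are grouped when the same operator appears more than once in an expression.

Some operators with left-to-right associativity:

operator    name                       expression    grouped as
a + b       binary plus                a + b + c     ( a + b ) + c
x++         unary postfix increment    x++ ++        ( x++ ) ++
x--         unary postfix decrement    x-- --        ( x-- ) --

The postfix rows illustrate grouping only: for built-in types, ( x++ ) ++ does not compile, because x++ does not yield a modifiable value.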

In contrast, right-to-left associativity operators have the following forms:

Some operators with right-to-left associativity:

operator     name                      expression           grouped as
a = b        assignment                a = b = c            a = ( b = c )
++x          unary prefix increment    ++ ++x               ++( ++x )
--x          unary prefix decrement    -- --x               --( --x )
a ? b : c    conditional               a ? b : c ? d : e    a ? b : ( c ? d : e )
             (arithmetic if)

WARNING: it is a common mistake to think that associativity rules define the order in which the parts of an expression (i.e., subexpressions) are evaluated. You cannot assume a left-to-right order of evaluation. Why? Because without order restrictions the compiler is free to generate faster, more efficient code. Parentheses change how operands are grouped, not the order in which they are evaluated; if you need a particular order, compute the subexpressions in separate statements.
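
A short sketch of how associativity determines grouping. The function names sign and check_associativity are hypothetical; sign relies on the right-to-left grouping of the conditional operator.

```cpp
#include <cassert>

// Hypothetical helper: returns 1, -1, or 0 depending on the sign of value.
// The expression groups as value > 0 ? 1 : ( value < 0 ? -1 : 0 ).
int sign(int value)
{
    return value > 0 ? 1 : value < 0 ? -1 : 0;
}

// Hypothetical self-check.
void check_associativity()
{
    // Binary minus groups left to right: 10 - 4 - 3 is ( 10 - 4 ) - 3.
    assert(10 - 4 - 3 == 3);
    assert(10 - (4 - 3) == 9);

    // Assignment groups right to left: a = b = c is a = ( b = c ).
    int a = 0;
    int b = 0;
    int c = 5;
    a = b = c;
    assert(a == 5);
    assert(b == 5);
}
```

Subtraction makes the effect of grouping visible because, unlike addition of small integers, the two groupings give different results.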

The operators , (comma), && (logical and), and || (logical or) guarantee that their left-hand operand is evaluated before their right-hand operand. For example,

    b = ( a = 2, a + 1 )

assigns 3 to b .

The second operand of

    expr1 && expr2

 is evaluated only if its first operand is true.

The second operand of

    expr1 || expr2

 is evaluated only if its first operand is false; this is sometimes called short-circuit evaluation. Short-circuiting should be considered very carefully if your expressions have side effects, such as increment or decrement of a variable.
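
Short-circuit evaluation and its interaction with side effects can be sketched as follows. The names divides_evenly and check_short_circuit_side_effects are hypothetical.

```cpp
#include <cassert>

// Hypothetical helper: the right-hand operand is evaluated only when
// divisor is nonzero, so the % operation never divides by zero.
bool divides_evenly(int number, int divisor)
{
    return divisor != 0 && number % divisor == 0;
}

// Hypothetical self-check: a skipped operand leaves its side effect undone.
void check_short_circuit_side_effects()
{
    int counter = 0;
    bool never = false;
    bool always = true;

    bool result = never && (++counter > 0);  // right operand skipped
    assert(!result && counter == 0);

    result = always || (++counter > 0);      // right operand skipped
    assert(result && counter == 0);

    result = always && (++counter > 0);      // right operand evaluated
    assert(result && counter == 1);
}
```

The counter is incremented only once, even though ++counter is written three times.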

Parentheses can be used to force grouping, that is, to decide which operation is applied to which operands. For example,

    a * b / c

means

    ( a * b ) / c

so parentheses must be used to get a * ( b / c ) if that is what the programmer wants. In particular, for many floating-point computations a * ( b / c ) and ( a * b ) / c are significantly different, so a compiler will evaluate such expressions exactly as written.
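
A sketch of how the two groupings can differ. The function names are hypothetical. The integer case follows from truncating division; the floating-point case assumes IEEE 754 doubles, where 1e308 * 10.0 overflows to infinity.

```cpp
#include <cassert>
#include <cmath>  // std::isinf

// Hypothetical helpers that differ only in grouping.
int multiply_then_divide(int a, int b, int c)
{
    return a * b / c;    // grouped as ( a * b ) / c
}

int divide_then_multiply(int a, int b, int c)
{
    return a * (b / c);  // b / c is computed first and truncated
}

// Hypothetical self-check.
void check_grouping()
{
    assert(multiply_then_divide(7, 2, 4) == 3);  // 14 / 4
    assert(divide_then_multiply(7, 2, 4) == 0);  // 7 * 0

    // Assumption: IEEE 754 doubles.
    double big = 1e308;
    double ten = 10.0;
    assert(std::isinf(big * ten / ten));  // ( big * ten ) overflows first
    assert(big * (ten / ten) == big);     // ten / ten is exactly 1.0
}
```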


Arithmetic If Operator


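Decision Making: arithmetic if

(condition) ? expression_true : expression_false

If condition is true, the result is the value of expression_true; otherwise, the result is the value of expression_false. Only the selected expression is evaluated.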
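
A minimal sketch of the arithmetic if (conditional operator) in use: the condition is tested, and the whole expression takes the value of the first alternative when the condition is true and of the second otherwise. The function names larger_of, absolute_value, and check_conditional are hypothetical.

```cpp
#include <cassert>

// Hypothetical helper: the larger of two integers.
int larger_of(int first, int second)
{
    return (first > second) ? first : second;
}

// Hypothetical helper: the value without its sign.
int absolute_value(int value)
{
    return (value < 0) ? -value : value;
}

// Hypothetical self-check.
void check_conditional()
{
    assert(larger_of(3, 7) == 7);
    assert(larger_of(7, 3) == 7);
    assert(absolute_value(-5) == 5);
    assert(absolute_value(5) == 5);
}
```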


Assignment Operators


The assignment operator = assigns a value to a variable. For example,

    x = 1

sets x to 1, and

    a = b

sets a to whatever b's value is.

Any time there is a pattern

    variable = variable operator expression

where operator is any of the binary arithmetic operators we've seen so far, and expression is any expression, we can and should replace it with the simplified

    variable operator= expression

For example, you can replace the expressions

    i = i + 1
    j = j - 10
    k = k * (n + 1)

with

    i += 1
    j -= 10
    k *= n + 1

Please note that you don't always need as many explicit parentheses when using the arithmetic assignment operators: the expression

    k *= n + 1

is interpreted as

    k = k * (n + 1)
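
The difference between the compound form and a hand-written expansion without parentheses can be checked with a short sketch. The function names scale_by_successor and check_compound_assignment are hypothetical.

```cpp
#include <cassert>

// Hypothetical helper: multiplies k by ( n + 1 ) using compound assignment.
int scale_by_successor(int k, int n)
{
    k *= n + 1;  // the whole right-hand side is evaluated first
    return k;
}

// Hypothetical self-check.
void check_compound_assignment()
{
    int n = 4;

    int k = 3;
    k *= n + 1;     // k = k * ( n + 1 ), that is 3 * 5
    assert(k == 15);

    int m = 3;
    m = m * n + 1;  // without parentheses: ( 3 * 4 ) + 1
    assert(m == 13);

    int r = 17;
    r %= 5;         // r = r % 5
    assert(r == 2);
}
```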

Increment and Decrement


When we add the constant 1 to a variable or subtract it from a variable, C++ provides an extremely useful set of shortcuts: the increment and decrement operators. In their simplest forms, they look like this:

    ++i; // add one to i
    --j; // subtract one from j

These expressions correspond to their longer equivalents,

    i += 1; // or even longer form i = i + 1
    j -= 1; // or even longer form j = j - 1

The ++ and -- operators apply to one operand. Therefore, they are unary operators. The expression ++i adds 1 to i, and stores the incremented result back in i. This means that increment and decrement operators don't just compute new values; they also modify the value of the variable. Consequently, increment and decrement share a common property with the assignment operators: they modify the variable.

We may say that these operators all have side effects. That is, besides the fact that a new value (i + 1) is being computed and returned as a result of the expression ++i, the side effect is also that the new value is being implicitly assigned to the variable i. In contrast, the expression (i + 1) clearly has no side effects.

The incremented or decremented result is made available to the rest of the expression, so an expression like

    a = ++i;

means "add one to i, store result in i, and assign i to a:"

    i = i + 1, a = i;

Both the ++ and -- operators have an unusual property: they can be used in two ways, depending on whether they are written to the left or the right of the variable they're operating on. In either case, they increment or decrement the variable they're operating on; the difference concerns whether it's the old or the new value that is returned to the surrounding expression.

The prefix form ++i increments i and returns the incremented value. The postfix form i++ increments i, but returns the prior, non-incremented value. A rewrite of the previous example,

    a = i++;

means "assign i to a, add one to i, and store result in i:"

    a = i, i = i + 1;

The distinction between the prefix and postfix forms of ++ and -- probably seems strange at first, but it will make more sense once we begin using these operators in more realistic situations.
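
The difference between the two forms can be checked with a short sketch. The function names value_of_prefix, value_of_postfix, and check_prefix_and_postfix are hypothetical.

```cpp
#include <cassert>

// Hypothetical helpers: each returns what the surrounding expression sees.
int value_of_prefix(int i)
{
    return ++i;  // increments first, yields the new value
}

int value_of_postfix(int i)
{
    return i++;  // yields the old value; the increment happens as a side effect
}

// Hypothetical self-check.
void check_prefix_and_postfix()
{
    int i = 5;
    int a = ++i;  // i becomes 6, a receives 6
    assert(i == 6);
    assert(a == 6);

    int j = 5;
    int b = j++;  // b receives 5, j becomes 6
    assert(j == 6);
    assert(b == 5);

    int k = 5;
    int c = k--;  // c receives 5, k becomes 4
    int d = --k;  // k becomes 3, d receives 3
    assert(c == 5);
    assert(d == 3);
    assert(k == 3);
}
```

In both forms the variable ends up changed by one; only the value handed to the rest of the expression differs.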

The increment ++ and decrement -- operators must be used in conjunction with modifiable values, such as variables. For example, the following statement will not compile:

    a = ++( x + y ); // *** will result in compiler error

The programmer probably intended to write

    a = x + y + 1;

Good Programming Practice


  • Refer to the operator precedence and associativity chart when writing expressions containing many operators;
  • If uncertain about the order of evaluation, break the expression into smaller expressions;
  • Use parentheses to make the intended grouping explicit and attain peace of mind;
  • Do not use global variables; they pollute the global scope;
  • Limit scope of your local variables;
  • Keep common and local names short, and keep uncommon and non-local names longer;
  • Avoid unnecessary assumptions about the sizes of variables;
  • Prefer a plain int over a short int or a long int ;
  • Prefer a double over a float ;
  • Avoid complicated expressions;
  • Don't declare a variable until you have a value to initialize it with;
  • Don't panic! All will become clear in time! ( Bjarne Stroustrup, The C++ Programming Language, 3rd Edition)