There are a few fundamental concepts that affect how expressions are evaluated. We start by briefly discussing the concepts that apply to most (if not all) expressions. Subsequent sections will cover these topics in more detail.
There are both unary operators and binary operators. Unary operators, such as address-of (&) and dereference (*), act on one operand. Binary operators, such as equality (==) and multiplication (*), act on two operands. There is also one ternary operator that takes three operands, and one operator, function call, that takes an unlimited number of operands.
Some symbols, such as *, are used as both a unary (dereference) and a binary (multiplication) operator. The context in which a symbol is used determines whether the symbol represents a unary or binary operator. The uses of such symbols are independent; it can be helpful to think of them as two different symbols.
Understanding expressions with multiple operators requires understanding the precedence and associativity of the operators and may depend on the order of evaluation of the operands. For example, the result of the following expression depends on how the operands are grouped to the operators:
5 + 10 * 20/2;
The operands to the * operator could be 10 and 20, or 10 and 20/2, or 15 and 20, or 15 and 20/2. Understanding such expressions is the topic of the next section.
As part of evaluating an expression, operands are often converted from one type to another. For example, the binary operators usually expect operands with the same type. These operators can be used on operands with differing types so long as the operands can be converted (§ 2.1.2, p. 35) to a common type.
Although the rules are somewhat complicated, for the most part conversions happen in unsurprising ways. For example, we can convert an integer to floating-point, and vice versa, but we cannot convert a pointer type to floating-point. What may be a bit surprising is that small integral type operands (e.g., bool, char, short, etc.) are generally promoted to a larger integral type, typically int. We’ll look in detail at conversions in § 4.11 (p. 159).
The language defines what the operators mean when applied to built-in and compound types. We can also define what most operators mean when applied to class types. Because such definitions give an alternative meaning to an existing operator symbol, we refer to them as overloaded operators. The IO library >> and << operators and the operators we used with strings, vectors, and iterators are all overloaded operators.
When we use an overloaded operator, the meaning of the operator, including the type of its operand(s) and the result, depends on how the operator is defined. However, the number of operands and the precedence and the associativity of the operator cannot be changed.
Every expression in C++ is either an rvalue (pronounced “are-value”) or an lvalue (pronounced “ell-value”). These names are inherited from C and originally had a simple mnemonic purpose: lvalues could stand on the left-hand side of an assignment whereas rvalues could not.
In C++, the distinction is less simple. In C++, an lvalue expression yields an object or a function. However, some lvalues, such as const objects, may not be the left-hand operand of an assignment. Moreover, some expressions yield objects but return them as rvalues, not lvalues. Roughly speaking, when we use an object as an rvalue, we use the object’s value (its contents). When we use an object as an lvalue, we use the object’s identity (its location in memory).
Operators differ as to whether they require lvalue or rvalue operands and as to whether they return lvalues or rvalues. The important point is that (with one exception that we’ll cover in § 13.6 (p. 531)) we can use an lvalue when an rvalue is required, but we cannot use an rvalue when an lvalue (i.e., a location) is required. When we use an lvalue in place of an rvalue, the object’s contents (its value) are used. We have already used several operators that involve lvalues.
• Assignment requires a (nonconst) lvalue as its left-hand operand and yields its left-hand operand as an lvalue.
• The address-of operator (§ 2.3.2, p. 52) requires an lvalue operand and returns a pointer to its operand as an rvalue.
• The built-in dereference and subscript operators (§ 2.3.2, p. 53, and § 3.5.2, p. 116) and the iterator dereference and string and vector subscript operators (§ 3.4.1, p. 106, § 3.2.3, p. 93, and § 3.3.3, p. 102) all yield lvalues.
• The built-in and iterator increment and decrement operators (§ 1.4.1, p. 12, and § 3.4.1, p. 107) require lvalue operands and the prefix versions (which are the ones we have used so far) also yield lvalues.
As we present the operators, we will note whether an operand must be an lvalue and whether the operator returns an lvalue.
Lvalues and rvalues also differ when used with decltype (§ 2.5.3, p. 70). When we apply decltype to an expression (other than a variable), the result is a reference type if the expression yields an lvalue. As an example, assume p is an int*. Because dereference yields an lvalue, decltype(*p) is int&. On the other hand, because the address-of operator yields an rvalue, decltype(&p) is int**, that is, a pointer to a pointer to type int.
An expression with two or more operators is a compound expression. Evaluating a compound expression involves grouping the operands to the operators. Precedence and associativity determine how the operands are grouped. That is, they determine which parts of the expression are the operands for each of the operators in the expression. Programmers can override these rules by parenthesizing compound expressions to force a particular grouping.
In general, the value of an expression depends on how the subexpressions are grouped. Operands of operators with higher precedence group more tightly than operands of operators at lower precedence. Associativity determines how to group operands with the same precedence. For example, multiplication and division have the same precedence as each other, but they have higher precedence than addition. Therefore, operands to multiplication and division group before operands to addition and subtraction. The arithmetic operators are left associative, which means operators at the same precedence group left to right:
• Because of precedence, the expression 3+4*5 is 23, not 35.
• Because of associativity, the expression 20-15-3 is 2, not 8.
As a more complicated example, a left-to-right evaluation of the following expression yields 20:
6 + 3 * 4 / 2 + 2
Other imaginable results include 9, 14, and 36. In C++, the result is 14, because this expression is equivalent to
// parentheses in this expression match default precedence and associativity
((6 + ((3 * 4) / 2)) + 2)
We can override the normal grouping with parentheses. Parenthesized expressions are evaluated by treating each parenthesized subexpression as a unit and otherwise applying the normal precedence rules. For example, we can parenthesize the expression above to force the result to be any of the four possible values:
// parentheses result in alternative groupings
cout << (6 + 3) * (4 / 2 + 2) << endl; // prints 36
cout << ((6 + 3) * 4) / 2 + 2 << endl; // prints 20
cout << 6 + 3 * 4 / (2 + 2) << endl; // prints 9
We have already seen examples where precedence affects the correctness of our programs. For example, consider the discussion in § 3.5.3 (p. 120) about dereference and pointer arithmetic:
int ia[] = {0,2,4,6,8}; // array with five elements of type int
int last = *(ia + 4); // initializes last to 8, the value of ia[4]
last = *ia + 4; // last = 4, equivalent to ia[0] + 4
If we want to access the element at the location ia+4, then the parentheses around the addition are essential. Without parentheses, *ia is grouped first and 4 is added to the value in *ia.
The most common case that we’ve seen in which associativity matters is in input and output expressions. As we’ll see in § 4.8 (p. 155), the operators used for IO are left associative. This associativity means we can combine several IO operations in a single expression:
cin >> v1 >> v2; // read into v1 and then into v2
Table 4.12 (p. 166) lists all the operators organized into segments separated by double lines. Operators in each segment have the same precedence, and have higher precedence than operators in subsequent segments. For example, the prefix increment and dereference operators share the same precedence, which is higher than that of the arithmetic operators. The table includes a page reference to each operator’s description. We have seen some of these operators already and will cover most of the rest in this chapter. However, there are a few operators that we will not cover until later.
Exercises Section 4.1.2
Exercise 4.2: Using Table 4.12 (p. 166), parenthesize the following expressions to indicate the order in which the operands are grouped:
(a) * vec.begin()
(b) * vec.begin() + 1
Precedence specifies how the operands are grouped. It says nothing about the order in which the operands are evaluated. In most cases, the order is largely unspecified. In the following expression
int i = f1() * f2();
we know that f1 and f2 must be called before the multiplication can be done. After all, it is their results that are multiplied. However, we have no way of knowing whether f1 will be called before f2 or vice versa.
For operators that do not specify evaluation order, it is an error for an expression to refer to and change the same object. Expressions that do so have undefined behavior (§ 2.1.2, p. 36). As a simple example, the << operator makes no guarantees about when or how its operands are evaluated. As a result, the following output expression is undefined:
int i = 0;
cout << i << " " << ++i << endl; // undefined
Because this program is undefined, we cannot draw any conclusions about how it might behave. The compiler might evaluate ++i before evaluating i, in which case the output will be 1 1. Or the compiler might evaluate i first, in which case the output will be 0 1. Or the compiler might do something else entirely. Because this expression has undefined behavior, the program is in error, regardless of what code the compiler generates.
There are four operators that do guarantee the order in which operands are evaluated. We saw in § 3.2.3 (p. 94) that the logical AND (&&) operator guarantees that its left-hand operand is evaluated first. Moreover, we are also guaranteed that the right-hand operand is evaluated only if the left-hand operand is true. The only other operators that guarantee the order in which operands are evaluated are the logical OR (||) operator (§ 4.3, p. 141), the conditional (? :) operator (§ 4.7, p. 151), and the comma (,) operator (§ 4.10, p. 157).
Order of operand evaluation is independent of precedence and associativity. In an expression such as f() + g() * h() + j():
• Precedence guarantees that the results of g() and h() are multiplied.
• Associativity guarantees that the result of f() is added to the product of g() and h() and that the result of that addition is added to the value of j().
• There are no guarantees as to the order in which these functions are called.
If f, g, h, and j are independent functions that do not affect the state of the same objects or perform IO, then the order in which the functions are called is irrelevant. If any of these functions do affect the same object, then the result depends on the unspecified order in which the functions are called, and an expression that relies on any particular order is in error.
Exercises Section 4.1.3
Exercise 4.3: Order of evaluation for most of the binary operators is left undefined to give the compiler opportunities for optimization. This strategy presents a trade-off between efficient code generation and potential pitfalls in the use of the language by the programmer. Do you consider that an acceptable trade-off? Why or why not?
When you write compound expressions, two rules of thumb can be helpful:
1. When in doubt, parenthesize expressions to force the grouping that the logic of your program requires.
2. If you change the value of an operand, don’t use that operand elsewhere in the same expression.
An important exception to the second rule occurs when the subexpression that changes the operand is itself the operand of another subexpression. For example, in *++iter, the increment changes the value of iter. The (now changed) value of iter is the operand to the dereference operator. In this (and similar) expressions, order of evaluation isn’t an issue. The increment (i.e., the subexpression that changes the operand) must be evaluated before the dereference can be evaluated. Such usage poses no problems and is quite common.