Although most programmers are familiar with data structures such as vector
s and list
s, many have never used an associative data structure. Before we look at the details of how the library supports these types, it will be helpful to start with examples of how we can use these containers.
A map
is a collection of key–value pairs. For example, each pair might contain a person’s name as a key and a phone number as its value. We speak of such a data structure as “mapping names to phone numbers.” The map
type is often referred to as an associative array. An associative array is like a “normal” array except that its subscripts don’t have to be integers. Values in a map
are found by a key rather than by their position. Given a map
of names to phone numbers, we’d use a person’s name as a subscript to fetch that person’s phone number.
In contrast, a set
is simply a collection of keys. A set
is most useful when we simply want to know whether a value is present. For example, a business might define a set
named bad_checks
to hold the names of individuals who have written bad checks. Before accepting a check, that business would query bad_checks
to see whether the customer’s name was present.
map
A classic example that relies on associative arrays is a word-counting program:
// count the number of times each word occurs in the input
map<string, size_t> word_count; // empty map from string to size_t
string word;
while (cin >> word)
++word_count[word]; // fetch and increment the counter for word
for (const auto &w : word_count) // for each element in the map
// print the results
cout << w.first << " occurs " << w.second
<< ((w.second > 1) ? " times" : " time") << endl;
This program reads its input and reports how often each word appears.
Like the sequential containers, the associative containers are templates (§ 3.3, p. 96). To define a map
, we must specify both the key and value types. In this program, the map
stores elements in which the keys are string
s and the values are size_t
s (§ 3.5.2, p. 116). When we subscript word_count
, we use a string
as the subscript, and we get back the size_t
counter associated with that string
.
The while
loop reads the standard input one word at a time. It uses each word to subscript word_count
. If word
is not already in the map
, the subscript operator creates a new element whose key is word
and whose value is 0
. Regardless of whether the element had to be created, we increment the value.
Once we’ve read all the input, the range for
(§ 3.2.3, p. 91) iterates through the map
, printing each word and the corresponding counter. When we fetch an element from a map
, we get an object of type pair
, which we’ll describe in § 11.2.3 (p. 426). Briefly, a pair
is a template type that holds two (public
) data elements named first
and second
. The pair
s used by map
have a first
member that is the key and a second
member that is the corresponding value. Thus, the effect of the output statement is to print each word and its associated counter.
If we ran this program on the text of the first paragraph in this section, our output would be
Although occurs 1 time
Before occurs 1 time
an occurs 1 time
and occurs 1 time
...
set
A logical extension to our program is to ignore common words like “the,” “and,” “or,” and so on. We’ll use a set
to hold the words we want to ignore and count only those words that are not in this set:
// count the number of times each word occurs in the input
map<string, size_t> word_count; // empty map from string to size_t
set<string> exclude = {"The", "But", "And", "Or", "An", "A",
"the", "but", "and", "or", "an", "a"};
string word;
while (cin >> word)
// count only words that are not in exclude
if (exclude.find(word) == exclude.end())
++word_count[word]; // fetch and increment the counter for word
Like the other containers, set
is a template. To define a set
, we specify the type of its elements, which in this case are string
s. As with the sequential containers, we can list initialize (§ 9.2.4, p. 336) the elements of an associative container. Our exclude
set holds the 12 words we want to ignore.
The important difference between this program and the previous program is that before counting each word, we check whether the word is in the exclusion set. We do this check in the if
:
// count only words that are not in exclude
if (exclude.find(word) == exclude.end())
The call to find
returns an iterator. If the given key is in the set
, the iterator refers to that key. If the element is not found, find
returns the off-the-end iterator. In this version, we update the counter for word
only if word
is not in exclude
.
If we run this version on the same input as before, our output would be
Although occurs 1 time
Before occurs 1 time
are occurs 1 time
as occurs 1 time
...
Exercises Section 11.1
Exercise 11.1: Describe the differences between a
map
and avector
.Exercise 11.2: Give an example of when each of
list
,vector
,deque
,map
, andset
might be most useful.Exercise 11.3: Write your own version of the word-counting program.
Exercise 11.4: Extend your program to ignore case and punctuation. For example, “example.” “example,” and “Example” should all increment the same counter.