11.1. Using an Associative Container

Although most programmers are familiar with data structures such as vectors and lists, many have never used an associative data structure. Before we look at the details of how the library supports these types, it will be helpful to start with examples of how we can use these containers.

A map is a collection of key–value pairs. For example, each pair might contain a person’s name as a key and a phone number as its value. We speak of such a data structure as “mapping names to phone numbers.” The map type is often referred to as an associative array. An associative array is like a “normal” array except that its subscripts don’t have to be integers. Values in a map are found by a key rather than by their position. Given a map of names to phone numbers, we’d use a person’s name as a subscript to fetch that person’s phone number.

In contrast, a set is simply a collection of keys. A set is most useful when we simply want to know whether a value is present. For example, a business might define a set named bad_checks to hold the names of individuals who have written bad checks. Before accepting a check, that business would query bad_checks to see whether the customer’s name was present.

Using a `map`

A classic example that relies on associative arrays is a word-counting program:

// count the number of times each word occurs in the input map<string, size_t> word_count; // empty map from string to size_t string word; while (cin >> word) ++word_count[word]; // fetch and increment the counter for word for (const auto &w : word_count) // for each element in the map // print the results cout << w.first << " occurs " << w.second << ((w.second > 1) ? " times" : " time") << endl;

This program reads its input and reports how often each word appears.

Like the sequential containers, the associative containers are templates (§ 3.3, p. 96). To define a map, we must specify both the key and value types. In this program, the map stores elements in which the keys are strings and the values are size_ts (§ 3.5.2, p. 116). When we subscript word_count, we use a string as the subscript, and we get back the size_t counter associated with that string.

The while loop reads the standard input one word at a time. It uses each word to subscript word_count. If word is not already in the map, the subscript operator creates a new element whose key is word and whose value is 0. Regardless of whether the element had to be created, we increment the value.

Once we’ve read all the input, the range for (§ 3.2.3, p. 91) iterates through the map, printing each word and the corresponding counter. When we fetch an element from a map, we get an object of type pair, which we’ll describe in § 11.2.3 (p. 426). Briefly, a pair is a template type that holds two (public) data elements named first and second. The pairs used by map have a first member that is the key and a second member that is the corresponding value. Thus, the effect of the output statement is to print each word and its associated counter.

If we ran this program on the text of the first paragraph in this section, our output would be

Although occurs 1 time Before occurs 1 time an occurs 1 time and occurs 1 time ...

Using a `set`

A logical extension to our program is to ignore common words like “the,” “and,” “or,” and so on. We’ll use a set to hold the words we want to ignore and count only those words that are not in this set:

// count the number of times each word occurs in the input map<string, size_t> word_count; // empty map from string to size_t set<string> exclude = {"The", "But", "And", "Or", "An", "A", "the", "but", "and", "or", "an", "a"}; string word; while (cin >> word) // count only words that are not in exclude if (exclude.find(word) == exclude.end()) ++word_count[word]; // fetch and increment the counter for word

Like the other containers, set is a template. To define a set, we specify the type of its elements, which in this case are strings. As with the sequential containers, we can list initialize (§ 9.2.4, p. 336) the elements of an associative container. Our exclude set holds the 12 words we want to ignore.

The important difference between this program and the previous program is that before counting each word, we check whether the word is in the exclusion set. We do this check in the if:

// count only words that are not in exclude if (exclude.find(word) == exclude.end())

The call to find returns an iterator. If the given key is in the set, the iterator refers to that key. If the element is not found, find returns the off-the-end iterator. In this version, we update the counter for word only if word is not in exclude.

If we run this version on the same input as before, our output would be

Although occurs 1 time Before occurs 1 time are occurs 1 time as occurs 1 time ...

Exercises Section 11.1

Exercise 11.1: Describe the differences between a map and a vector.

Exercise 11.2: Give an example of when each of list, vector, deque, map, and set might be most useful.

Exercise 11.3: Write your own version of the word-counting program.

Exercise 11.4: Extend your program to ignore case and punctuation. For example, “example.” “example,” and “Example” should all increment the same counter.