Collecting all values (here: docIDs) for a given key (here: termID) into one list is the task of the inverters in the reduce phase. The master assigns each term partition to a different inverter and, as in the case of parsers, reassigns term partitions in case of failing or slow inverters. Each term partition
(corresponding to r segment files, one on each parser) is processed by one inverter. We assume here that segment files are of a size that a single machine
can handle (Exercise 4.9). Finally, the list of values is sorted for each key and
written to the final sorted postings list (“postings” in the figure). (Note that
postings in Figure 4.6 include term frequencies, whereas each posting in the
other sections of this chapter is simply a docID without term frequency information.) The data flow is shown for a–f in Figure 4.5. This completes the
construction of the inverted index.
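
The work of a single inverter can be illustrated with a minimal Python sketch. It is not a definitive implementation: the function name `invert`, the in-memory representation of segment files as lists of (termID, docID) pairs, and the sample data are all assumptions made for illustration. Following the convention used in most of this chapter, each posting is a bare docID without term frequency, so duplicate pairs are collapsed.

```python
from collections import defaultdict


def invert(segment_files):
    """Reduce step for one term partition (illustrative sketch).

    segment_files: iterable of segments, one per parser; each segment is
    assumed to be a list of (term_id, doc_id) pairs held in memory.
    Returns a dict mapping each term_id to its sorted list of doc_ids.
    """
    postings = defaultdict(set)
    # Collect all values (docIDs) for each key (termID) across the segments.
    for segment in segment_files:
        for term_id, doc_id in segment:
            postings[term_id].add(doc_id)  # the set drops duplicate pairs
    # Sort the values for each key; keys are emitted in termID order.
    return {
        term_id: sorted(doc_ids)
        for term_id, doc_ids in sorted(postings.items())
    }


# Hypothetical input: r = 3 segment files for one term partition.
segments = [
    [(1, 4), (2, 4), (1, 7)],
    [(2, 1), (1, 2)],
    [(1, 7), (3, 9)],
]
index = invert(segments)
```

In a real deployment the segment files would be read from disk on the parser machines and the postings lists written back to disk; holding everything in a dictionary here relies on the assumption stated above that a term partition fits on a single machine.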