To make index construction more efficient, we represent terms as termIDs
(instead of strings as we did in Figure 1.4), where each termID is a unique
serial number. We can build the mapping from terms to termIDs on the fly
while we are processing the collection; or, in a two-pass approach, we compile the vocabulary in the first pass and construct the inverted index in the
second pass. The index construction algorithms described in this chapter all
do a single pass through the data. Section 4.7 gives references to multipass
algorithms that are preferable in certain applications, for example, when disk
space is scarce.
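
As an illustration of building the term-to-termID mapping on the fly, the following is a minimal sketch and not one of the algorithms described in this chapter. The function name `assign_term_ids`, the whitespace-and-lowercase tokenization, the choice to start termIDs at 0, and the two sample documents are all assumptions made for the example.

```python
from typing import Dict, Iterable, List, Tuple


def assign_term_ids(
    docs: Iterable[Tuple[int, str]],
) -> Tuple[Dict[str, int], List[Tuple[int, int]]]:
    """Make a single pass over (docID, text) pairs.

    Each previously unseen term receives the next serial number as its
    termID, and every token is emitted as a (termID, docID) pair.
    """
    term_to_id: Dict[str, int] = {}
    pairs: List[Tuple[int, int]] = []
    for doc_id, text in docs:
        # Assumed tokenization: lowercase, split on whitespace.
        for token in text.lower().split():
            # setdefault returns the existing termID, or stores and returns
            # the next serial number (the current vocabulary size).
            term_id = term_to_id.setdefault(token, len(term_to_id))
            pairs.append((term_id, doc_id))
    return term_to_id, pairs


# Hypothetical two-document collection.
docs = [(1, "Caesar was killed"), (2, "Brutus killed Caesar")]
term_to_id, pairs = assign_term_ids(docs)
```

In this sketch the vocabulary and the (termID, docID) pairs are produced in the same pass, so the collection is read only once. A two-pass variant would first fill `term_to_id` and then reread the documents to emit the pairs.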