Collections are often so large that we cannot perform index construction efficiently
on a single machine. This is particularly true of the World Wide Web
for which we need large computer clusters to construct any reasonably sized
web index. Web search engines, therefore, use distributed indexing algorithms
for index construction. The result of the construction process is a distributed
index that is partitioned across several machines – either according to term
or according to document. In this section, we describe distributed indexing
for a term-partitioned index. Most large search engines prefer a document-partitioned index (which can be easily generated from a term-partitioned
index). We discuss this topic further in Section 20.3 (page 454).
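
To make the two partitioning schemes concrete, the following minimal Python sketch splits one toy inverted index across nodes in both ways. The postings data, the function names, and the assignment rules (round-robin over sorted terms, docID modulo the number of nodes) are illustrative assumptions, not the scheme of any particular search engine.

```python
from collections import defaultdict

# Toy inverted index: term -> sorted list of docIDs (hypothetical data).
index = {
    "brutus": [1, 2, 4],
    "caesar": [1, 2, 3, 5],
    "calpurnia": [2],
}

def partition_by_term(index, num_nodes):
    """Term partitioning: each node stores the complete postings
    lists for its own subset of the terms."""
    nodes = [{} for _ in range(num_nodes)]
    # Assumption: round-robin over sorted terms; real systems
    # might use term ranges or a hash function instead.
    for i, term in enumerate(sorted(index)):
        nodes[i % num_nodes][term] = list(index[term])
    return nodes

def partition_by_document(index, num_nodes):
    """Document partitioning: each node stores postings for all
    terms, restricted to its own subset of the documents."""
    nodes = [defaultdict(list) for _ in range(num_nodes)]
    for term in sorted(index):
        for doc_id in index[term]:
            # Assumption: documents are assigned by docID modulo num_nodes.
            nodes[doc_id % num_nodes][term].append(doc_id)
    return [dict(node) for node in nodes]

by_term = partition_by_term(index, 2)
by_doc = partition_by_document(index, 2)
```

In this sketch a term's postings list lives on exactly one node under term partitioning, whereas under document partitioning it is split across the nodes, each of which holds only the entries for its own documents.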