Collections are often so large that we cannot perform index construction
efficiently on a single machine. This is particularly true of the World Wide Web,
for which we need large computer clusters to construct any reasonably sized
web index. Web search engines, therefore, use distributed indexing algorithms
for index construction. The result of the construction process is a distributed
index that is partitioned across several machines, either according to term
or according to document. In this section, we describe distributed indexing
for a term-partitioned index. Most large search engines prefer a document-