In this way, each page is associated with a set of term data (query terms and/or tag terms) and a set of usage data (the selection, tag, share, and voting counts). The term data is represented as a Lucene (lucene.apache.org) index table, with each page indexed under its associated query and tag terms, and provides the basis for retrieving and ranking promotion candidates. The usage data provides an additional source of evidence that can be used to filter results and to generate a final set of recommendations. At search time, a set of recommendations is produced in three stages: relevant results are retrieved and ranked from the Lucene stak index; these promotion candidates are filtered based on an evidence model to eliminate noisy recommendations; and the remaining results are added to the Google result-list according to a set of recommendation rules.
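The page representation described above can be illustrated with a minimal sketch. It is not the HeyStaks implementation: the real system stores term data in a Lucene index, whereas here a plain Python dictionary stands in as an inverted index, and the names `StakPage` and `build_index` are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class StakPage:
    """Hypothetical record for one page in a stak (illustrative only)."""
    url: str
    # Term data: the query and tag terms associated with the page.
    query_terms: list = field(default_factory=list)
    tag_terms: list = field(default_factory=list)
    # Usage data: counts that can later serve as filtering evidence.
    selections: int = 0
    tags: int = 0
    shares: int = 0
    votes: int = 0

    def terms(self):
        """All terms under which the page is indexed."""
        return self.query_terms + self.tag_terms


def build_index(pages):
    """Map each lower-cased term to the set of page URLs indexed under it.

    A stand-in for the Lucene index table, assumed here for illustration.
    """
    index = defaultdict(set)
    for page in pages:
        for term in page.terms():
            index[term.lower()].add(page.url)
    return dict(index)
```

Keeping the usage counts on the page record, separate from the term index, mirrors the split in the text: the index supports retrieval and ranking, while the counts are consulted only afterwards as evidence.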
Briefly, there are two types of promotion candidates: primary promotions are results that come from the active stak St, whereas secondary promotions come from other staks in the searcher’s stak-list. To generate these promotion candidates, the HeyStaks server uses the current query qt as a probe into each stak index, Si, to identify a set of relevant stak pages P(Si, qt). Each candidate page, p, is scored using Lucene’s TF.IDF retrieval function as per Equation 18.5, and this score serves as the basis for an initial recommendation ranking.
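The candidate generation step can be sketched as follows. This is an illustration under stated assumptions rather than the HeyStaks code: each stak is assumed to be a dictionary from page URL to its list of index terms, the scoring is a simplified TF.IDF (raw term frequency times squared inverse document frequency) and not Lucene's actual scoring function, and the function names are hypothetical.

```python
import math
from collections import Counter


def tfidf_score(query_terms, page_terms, doc_freq, n_pages):
    """Simplified TF.IDF score of one page for a query (not Lucene's formula)."""
    tf = Counter(page_terms)
    score = 0.0
    for t in query_terms:
        if tf[t] and doc_freq.get(t):
            idf = 1.0 + math.log(n_pages / doc_freq[t])
            score += tf[t] * idf ** 2
    return score


def candidates(query, stak):
    """Return P(Si, qt): pages of one stak matching the query, best first."""
    q = query.lower().split()
    # Document frequency: number of pages in this stak containing each term.
    doc_freq = Counter(t for terms in stak.values() for t in set(terms))
    scored = [(url, tfidf_score(q, terms, doc_freq, len(stak)))
              for url, terms in stak.items()]
    matches = [(url, s) for url, s in scored if s > 0]
    return sorted(matches, key=lambda pair: -pair[1])


def promotion_candidates(query, active_stak, stak_list):
    """Split candidates into primary (active stak) and secondary (other staks)."""
    primary = candidates(query, stak_list[active_stak])
    secondary = {name: candidates(query, stak)
                 for name, stak in stak_list.items() if name != active_stak}
    return primary, secondary
```

For example, given a stak-list with an active stak and one other stak, `promotion_candidates("python tutorial", "py", staks)` returns the ranked primary candidates from `"py"` together with a dictionary of ranked secondary candidates keyed by stak name. The scores are computed per stak, so they are comparable within a stak but not necessarily across staks.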