Introduction
Large investments in genome sequencing and gene analysis
were originally made with the notion that, although
expensive in the short term, they would be highly
cost-effective in the end. It is already the case that gene discovery
using heterologous probes has been largely
replaced because of our ability to use sequence information
generated through large-scale sequencing programs.
The enormous value of partial cDNA sequence (expressed
sequence tag, EST) collections was first realised in the
plant model species Arabidopsis thaliana [1] and has ignited
interest in the analogous application of this approach to
various other plant species, including crop species. Large-scale
applications of EST sequencing, however, have
revealed not only the potential, but also the limitations of
this procedure. These limitations are imposed mainly by
the fact that, in every organism, the majority of genes are
not represented in any given library.
Recognition of this limitation and the desire to gain additional
information about genome structure (e.g. about
regulatory elements involved in the control of gene expression)
inspired the initiation of genomic sequencing
programs for Arabidopsis (a dicot) and rice (a monocot) [2•]