The US Digital Library Initiative (DLI) is a federal program funding research at six universities to further the state of the art in searching for and displaying selections from large, heterogeneous collections of on-line reference materials. The DLI project at the University of Illinois is building a testbed for querying multiple SGML repositories of scientific publications over the Internet as if they were organized into a single collection. This process of merging the diverse text collections into a single virtual repository is called ``federation''. Since the document collections use different DTDs, the project uses a canonical DTD based on the ISO 12083 Article Document Type Definition. Each collection is translated from its original DTD into the canonical DTD for indexing and retrieval. Pat is used as the search engine for the federated repository.
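The translation step behind federation can be illustrated with a minimal sketch. It is not the DLI testbed's implementation: the collection names, the element names, and the `to_canonical` helper are all hypothetical, and the sample documents are written as well-formed XML so that Python's standard library can parse them (real SGML would need an SGML-aware parser). The sketch only shows the idea of renaming collection-specific elements to a shared canonical vocabulary so that one index can serve every collection.

```python
import xml.etree.ElementTree as ET

# Hypothetical per-collection tag maps: source element name -> canonical element name.
# A real translation between DTDs would also have to handle attributes, content
# models, and structures with no direct counterpart in the canonical DTD.
TAG_MAPS = {
    "publisher_a": {"paper": "article", "heading": "title", "para": "p"},
    "publisher_b": {"doc": "article", "ttl": "title", "text": "p"},
}


def to_canonical(source_markup, collection):
    """Rename every element of a document to its canonical equivalent."""
    tag_map = TAG_MAPS[collection]
    root = ET.fromstring(source_markup)
    for element in root.iter():
        # Elements without an entry in the map keep their original name.
        element.tag = tag_map.get(element.tag, element.tag)
    return ET.tostring(root, encoding="unicode")


doc_a = "<paper><heading>Federated search</heading><para>Body text.</para></paper>"
doc_b = "<doc><ttl>Federated search</ttl><text>Body text.</text></doc>"

canonical_a = to_canonical(doc_a, "publisher_a")
canonical_b = to_canonical(doc_b, "publisher_b")
```

After translation both documents share the same element names, so a single query against the canonical `title` element would reach titles from either collection.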
The DLI testbed's federation approach could be useful for merging the APIB with other repositories of international standards. ISO has developed its own DTD for representing international standards documents. This DTD could serve as the canonical DTD for a federation of international standards collections. However, because the ISO DTD is general rather than tailored to any one collection, a repository federated under it would not support searches that rely on structures specific to a particular collection.
SGML-tagged query results are displayed using Panorama[SQ], a commercial SGML viewer, rather than being translated to HTML for display with a web browser. The DLI testbed's implementation includes a mapping between elements in the canonical DTD and display styles used by Panorama. This mapping is analogous to the APIB gateway's SGML-to-HTML translator. Because query results are displayed in their native SGML rather than in HTML, they can include items that HTML cannot represent, such as complex mathematical equations. Panorama also supports part of the HyTime[HYTIME] international standard (ISO 10744), an application of SGML that describes the structure of documents for use in hypertext and multimedia applications. As a result, the DLI testbed can support a richer variety of hyperlinks between documents than an HTML browser alone can. However, users must have HyTime-aware SGML viewing software installed locally in order to use the DLI testbed.
In addition to the testbed, the University of Illinois DLI project is conducting research in several other areas. These include the integration of multiple query engines and information retrieval from digital libraries based on deep semantics. Another research area is techniques for keeping track of state information in federated CGI applications. The University of Illinois DLI project, along with the other DLI projects, has the potential to make a major contribution to the global infrastructure for on-line information retrieval over the next several years.