Publications

Data Acquisition and Integration in the DGRC's Energy Data Collection Project

Abstract

The EDC project is developing new methods to make data that has been represented in disparate ways and stored in heterogeneous forms available to users in an integrated, manageable, and understandable way. Our approach is to represent the structure and types of data in the disparate collections in a standard format (called a domain model) and then to embed the domain model(s) into a large overarching taxonomy of terms (the ontology). Once thus standardized, the data collections can be found and retrieved via a single interface and access planning system. A major bottleneck, however, is the rapid inclusion of new data collections into this system. This paper describes some recent research in developing methods to automate the acquisition and inclusion processes.

Date
January 1, 1970
Authors
EH Hovy, Andrew Philpot, Jose Luis Ambite, Yigal Arens, Judith Klavans, Walter Bourne, Deniz Saroz
Journal
Proceedings of the NSF's dg.o