Publications

An unsupervised instance matcher for schema-free RDF data

Abstract

This article presents an unsupervised system that performs instance matching between entities in schema-free Resource Description Framework (RDF) files. Rather than relying on domain expertise or manually labeled samples, the system automatically generates its own heuristic training set. The system first uses this training set to align the properties in the input graphs. The property alignment and the training set are then used together to learn two functions simultaneously, one for the blocking step of instance matching and the other for the classification step. Finally, the learned functions are used to perform instance matching. The full system is implemented as a sequence of components that can be executed iteratively to boost performance. Evaluations on a suite of ten test cases show individual components to be competitive with state-of-the-art baselines. The system as a whole is shown to compete effectively …
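
The pipeline described in the abstract can be illustrated with a toy sketch. The Python below is not the authors' implementation: the dictionaries standing in for RDF entities, the Jaccard token similarity, the fixed 0.5 thresholds, and every function name are assumptions made for illustration. In particular, the article's system learns its blocking and classification functions, whereas this sketch substitutes hand-fixed rules for both.

```python
from itertools import product

def value_tokens(value):
    """Lowercased whitespace tokens of one property value."""
    return set(str(value).lower().split())

def entity_tokens(entity):
    """All tokens of an entity, ignoring which property they came from."""
    return set().union(*(value_tokens(v) for v in entity.values()))

def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def heuristic_training_set(source, target, threshold=0.5):
    """Noisy positive pairs chosen by token overlap, with no manual labels."""
    return [(s, t) for s, t in product(source, target)
            if jaccard(entity_tokens(source[s]), entity_tokens(target[t])) >= threshold]

def align_properties(source, target, pairs):
    """Map each source property to the target property it agrees with most."""
    scores = {}
    for s, t in pairs:
        for p, v in source[s].items():
            for q, w in target[t].items():
                sim = jaccard(value_tokens(v), value_tokens(w))
                scores[(p, q)] = scores.get((p, q), 0.0) + sim
    alignment = {}
    for (p, q), score in scores.items():
        best_so_far = scores.get((p, alignment.get(p)), 0.0)
        if score > best_so_far:
            alignment[p] = q
    return alignment

def block(source, target, alignment):
    """Candidate pairs that share at least one token on an aligned property."""
    index = {}
    for t, props in target.items():
        for q in set(alignment.values()):
            for tok in value_tokens(props.get(q, "")):
                index.setdefault((q, tok), set()).add(t)
    candidates = set()
    for s, props in source.items():
        for p, q in alignment.items():
            for tok in value_tokens(props.get(p, "")):
                candidates.update((s, t) for t in index.get((q, tok), ()))
    return candidates

def classify(source, target, alignment, candidates, threshold=0.5):
    """Accept a candidate when its mean similarity over aligned properties is high."""
    matches = set()
    for s, t in candidates:
        sims = [jaccard(value_tokens(source[s].get(p, "")),
                        value_tokens(target[t].get(q, "")))
                for p, q in alignment.items()]
        if sims and sum(sims) / len(sims) >= threshold:
            matches.add((s, t))
    return matches

# Hypothetical toy graphs: the two sides use different property names.
source = {
    "a1": {"name": "Ada Lovelace", "born": "1815"},
    "a2": {"name": "Alan Turing", "born": "1912"},
}
target = {
    "b1": {"label": "Ada Lovelace", "birthYear": "1815"},
    "b2": {"label": "Alan Turing", "birthYear": "1912"},
    "b3": {"label": "Alan Kay", "birthYear": "1940"},
}

training = heuristic_training_set(source, target)
alignment = align_properties(source, target, training)
candidates = block(source, target, alignment)
matches = classify(source, target, alignment, candidates)
print(alignment)
print(sorted(matches))
```

In this toy run, blocking keeps the pair ("a2", "b3") as a candidate because both names contain "alan", and the classification step then rejects it. That division of labor, a cheap filter followed by a stricter decision, is the part of the abstract the sketch is meant to make concrete.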

Date: December 1, 2015
Authors: Mayank Kejriwal, Daniel P Miranker
Journal: Journal of Web Semantics
Volume: 35
Pages: 102-123
Publisher: Elsevier