Harvesting maps on the Web

Abstract

Maps are among the most valuable documents for gathering geospatial information about a region. Yet finding a collection of diverse, high-quality maps is a significant challenge, because little content-specific metadata is available to distinguish them from other images on the Web. For this reason, it is desirable to analyze the content of each image. The problem is further complicated by the variations between different types of maps, such as street maps and contour maps, and by the fact that many high-quality maps are embedded within other documents such as PDF reports. In this paper, we present an automatic method to find high-quality maps for a given geographic region. Our method finds not only documents that are themselves maps, but also maps that are embedded within other documents. We have developed a Content-Based Image Retrieval (CBIR) approach that uses a new set of …

Date: 2011
Authors: Aman Goel, Matthew Michelson, Craig A Knoblock
Journal: International Journal on Document Analysis and Recognition (IJDAR)
Volume: 14
Pages: 349-372
Publisher: Springer-Verlag


