Pay-as-you-go Reconciliation in Schema Matching Networks

Publication Date: 
Friday, 4 April, 2014
Published in: 
ICDE 2014
Nguyen Quoc Viet Hung, Nguyen Thanh Tam, Zoltan Miklos, Karl Aberer, Avigdor Gal and Matthias Weidlich (EPFL)

Schema matching is the process of establishing correspondences between the attributes of database schemas for data integration purposes. Although several automatic schema matching tools have been developed, their results are often incomplete or erroneous. To obtain a correct set of correspondences, a human expert is usually required to validate the generated correspondences. We analyze this reconciliation process in a setting where a number of schemas needs to be matched, in the presence of consistency expectations about the network of attribute correspondences. We develop a probabilistic model that helps to identify the most uncertain correspondences, thus allowing us to guide the expert’s work and collect his input about the most problematic cases. As the availability of such experts is often limited, we develop techniques that can construct a set of good quality correspondences with a high probability, even if the expert does not validate all the necessary correspondences. We demonstrate the efficiency of our techniques through extensive experimentation using real-world datasets.

PDF icon ICDE14_conf_full_131.pdf399.39 KB