Venkatesan T. Chakaravarthy, Fabio Checconi, et al.
IEEE TPDS
Faced with growing knowledge management needs, enterprises are increasingly realizing the importance of interlinking critical business information distributed across structured and unstructured data sources. We present a novel system, called EROCS, for linking a given text document with relevant structured data. EROCS views the structured data as a predefined set of “entities” and identifies the entities that best match the given document. EROCS also embeds the identified entities in the document, effectively creating links between the structured data and segments within the document. Unlike prior approaches, EROCS identifies such links even when the relevant entity is not explicitly mentioned in the document. EROCS uses an efficient algorithm that performs this task while keeping the amount of information retrieved from the database to a minimum. Our evaluation shows that EROCS achieves high accuracy with reasonable overhead.
Gang Luo
VLDB 2006
Himanshu Gupta, Vinay J. Ribeiro, et al.
MASCOTS 2010
Shariq Rizvi, Alberto Mendelzon, et al.
SIGMOD 2004