Aditya Kalyanpur, Siddharth Patwardhan, et al.
CIKM 2011
Information extraction from large data repositories is critical to Information Management solutions. Beyond the prerequisite corpus analysis needed to determine domain-specific characteristics of text resources, developing, refining and evaluating analytics is a complex and lengthy process that typically requires more than just domain expertise. Modern architectures for text processing, while facilitating reuse and (re-)composition of analytical pipelines, place additional constraints on analytics development, as domain experts need not only to configure individual annotator components but also to situate them within a fully functional annotator pipeline. We present the design, and current status, of a tool for configuring model-driven annotators, which abstracts away from annotator implementation details, pipeline composition constraints, and data management. Instead, the tool supports all stages of the ontology-centric model development cycle: from corpus analysis and concept definition, to model development and testing, to large-scale evaluation, to easy and rapid composition of text applications deploying these concept models. With our design, we aim to meet the needs of domain experts, who are not necessarily expert NLP practitioners.
Jennifer Chu-Carroll, John Prager, et al.
SIGIR 2006
Avik Sinha, Amit Paradkar, et al.
DSN 2009
Brian Davis, Siegfried Handschuh, et al.
LREC 2008