Ruoming Jin, Yang Xiang, et al.
SIGMOD 2008
Several XML DBMSs support XQuery and/or SQL/XML languages, which are based on navigational primitives in the form of XPath expressions. Typically, these systems either model each XPath step as a separate query plan operator, or employ holistic approaches that can evaluate multiple steps of a single XPath expression. There have also been proposals to execute as many XPath expressions as possible within a single FLWOR block simultaneously in a data streaming context. We observe that blindly combining all possible XPath expressions for concurrent execution can result in significant performance degradation in a database system. We identify two main problems with this strategy. First, the simple strategy of grouping all XPath expressions on a single document does not always work if the query involves more than one data source or has nested query blocks. Second, merging XPath expressions may result in unnecessary execution of branches that can be filtered by predicates in other branches or elsewhere in the query. To rectify these problems, IBM® DB2® pureXML™ adopts a combination of heuristic-based rewrite transformations, to decide which XPath expressions should be grouped for concurrent evaluation, and cost-based optimization to globally order the groups within the query execution plan, and locally order the branches within individual groups. Experimental evaluation confirms that selectively grouping multiple XPath expressions allows for better query evaluation performance and reduces the query optimization complexity. These optimization techniques have been implemented as part of IBM DB2 9.5 (pureXML). Copyright 2008 ACM.
Andrey Balmin, Tim Kaldewey, et al.
SIGMOD 2012
Diptikalyan Saha, Avrilia Floratou, et al.
VLDB 2016
Kevin Beyer, Roberta Cochrane, et al.
IBM Systems Journal