Foxhound is the better* Database Monitor for SQL Anywhere.
*better: More thorough, more relevant, more effective.
...more Alerts, more All Clears, more details, more control in your hands.
|
[Home] | [Table of Contents] | [Previous Section] | [Next Section] |
Breck Carter
Last modified: February 12, 1998
mail to: bcarter@bcarter.com
A correlated subquery is a nested select that refers to a column from the outer select. In some cases they can rank among the poorest performing SQL statements imaginable because the inner result set must be constructed for every single row that is a candidate for inclusion in the outer result set. If both the inner and outer result sets are large the amount of processing required can be huge.
Figure 29A shows two solutions for the same query: "Display the type and quantity for each row, together with the average quantity for that value of type." The first solution involves a correlated subquery that recalculates the average quantity corresponding to each row in the outer result set. The second solution involves the creation of a temporary table to hold average quantities by type, followed by a join to get the final answer.
Figure 29A - Correlated Subquery Versus Temporary Table
|
The correlated subquery solution in Figure 29A takes almost 14 seconds for ten rows and over 65 seconds for fifty rows. By contrast the second solution is not only much faster but it takes the same amount of time for either 10 rows or 50 rows: 3 seconds to both build the temporary table and do the join.
Correlated subqueries represent a challenge when you're trying to optimize a query. First of all, you should be absolutely sure there is a problem before trying to change anything. Run some tests, don't rely on the PLAN to predict a problem. For example SQL Anywhere used exactly the same execution strategy "PLAN> A (test1): B (seq)" for both solutions in Figure 29A but the actual run times were dramatically different.
Another challenge is that correlated subqueries aren't always evil things. Figure 29B shows two solutions to a very common query: "Find all the parent rows for which no child rows exist." The first solution uses a NOT EXISTS operator with a correlated subquery in the outer WHERE clause, whereas the second solution uses an outer join between the parent and child tables and calls IfNull() in the WHERE clause to select only childless parents.
Figure 29B - Correlated Subqueries Aren't Always Bad
|
The two alternatives in Figure 29B require exactly the same number of I/O operations and almost the same amount of time. This indicates that SQL Anywhere may be using the same technique for both even though the I/O estimates are vastly different. The correlated subquery performs just fine and there is no real advantage to the outer join solution.
[Home] | [Table of Contents] | [Previous Section] | [Next Section] |