Foxhound is the better* Database Monitor for SQL Anywhere.
*better: More thorough, more relevant, more effective.
...more Alerts, more All Clears, more details, more control in your hands.
|
Breck Carter
Last modified: June 7, 1996
mail to: bcarter@bcarter.com
Yesterday's Pop Quiz Watcom 'Rithmetic contained a huge error, completely separate from the relatively minor problem with integer - versus - real arithmetic. Did you spot it?
If not, here it is again, repeated as today's Pop Quiz: What's really wrong with this select?
select product.name, sum ( sales_order_items.quantity / 12 ) / 365 from product, sales_order_items where sales_order_items.ship_date >= '1994 01 01' and sales_order_items.ship_date <= '1994 12 31' group by product.name order by product.name;
Here's a clue: Have a look at Too Many Rows?
Yes, indeed... if it wasn't for the group by clause the cartesian product would be obvious. Every single quantity is being summarized for every product; i.e., all the sales are being paired with all the products. The where clause is missing a line:
and product.id = sales_order_items.prod_id
Before the sum() function and group by clause summarized everything, the result set should have contained 411 product/quantity pairs. But without the join on product.id it actually contained 4110 rows.
If it wasn't for this explosion of data the arithmetic problem might have been more obvious. This select...
select product.name, sum ( sales_order_items.quantity / 12 ) / 365 from product, sales_order_items where sales_order_items.ship_date >= '1994 01 01' and sales_order_items.ship_date <= '1994 12 31' and product.id = sales_order_items.prod_id group by product.name order by product.name;
...returns all zeros for the daily average sales figures:
Baseball Cap 0 Shorts 0 Sweatshirt 0 Tee Shirt 0 Visor 0
Here's another question: If there are 5 products, and 411 order quantities, why did the cartesian product contain 4110 rows? Shouldn't it contain only 2055 rows, or 411 times 5?
Have another look at the select: The group by is on product.name whereas the missing join is on product.id. The name column isn't unique. There are actually 10 different id values and only 5 different names: different styles of product have different ids but the same name.
For the record, here's the final solution:
select product.name, sum ( cast ( sales_order_items.quantity as real ) / 12 ) / 365 from product, sales_order_items where sales_order_items.ship_date >= '1994 01 01' and sales_order_items.ship_date <= '1994 12 31' and product.id = sales_order_items.prod_id group by product.name order by product.name;
Here's the final result set, with neither an arithmetic problem nor a cartesian product:
Baseball Cap 0.556393 Shorts 0.419178 Sweatshirt 0.394521 Tee Shirt 0.646575 Visor 0.430137
And the moral of this story? If you're going to use a group by clause to summarize data make sure the base result set contains the right rows. Test the select without the summarization first. A group by clause can hide a huge mistake, even a cartesian product!