MySQL aggregation (returning columns not in the GROUP BY clause)
Tuesday, December 18th, 2007More good news for today - again this is another thing that i presumed *everybody* knew about MySQL - but they don’t.
MySQL allows for slightly naughty but useful GROUP BY activities (for instance returning scalar values without using aggregations for columns that aren’t included in the Group By). Basically it returns a non-specified random value from that set so it needs to be used with a bit of intelligence.
Say we’ve got a Sale entity and a Product entity which has a SaleId. What we want is a query where we can show the product name, product RRP (recommended retail price) which are on the product table, and the number of sales from the Sale table (which relate to the product).
We could write this as a subquery but it would be inefficient when we need to select from some criteria that’s found in the Sale table. Instead we write it as a simple INNER JOIN with a GROUP BY.
SELECT p.productid, p.productname, p.rrp, COUNT(s.productid) AS NumberOfSales
FROM product p
INNER JOIN sale s ON s.productid = p.productid
GROUP BY p.productid
Now in most DBMSs this would fail as the product name and the RRP aren’t in aggregate functions and aren’t specified in the GROUP BY. So we’d end up wrapping each in MIN(p.productname), MIN(p.rrp) or adding them to the GROUP BY.
However, we know this isn’t necessary as the productname and rrp are associated with the productid, so the MIN aggregate function is just a waste of time. So for MySQL we can leave the original query and it’ll run nicely.
Another winning tool for MySQL users.
Disclaimer: this functionality is a tough one to fully comprehend in all circumstances - so use with care. Other DBMSs don’t allow this for a reason!
