I’m going to be performing a lot of query optimization this year, and knowing the order in which SQL Server executes queries is going to play a major part in the decisions I make when re-writing slow running queries.
Below is the exact order in which a query is executed:
I found it interesting that the FROM and WHERE were executed first, mainly because before I found this out – I’d always assumed that SQL worked in the same way that many other programming languages work, in that they execute from top to bottom, but the above does make perfect sense.
You can’t get data until you know where it comes from, hence the FROM clause being executed first.
The data from the tables your query now has access to is then filtered down by the WHERE clause, as it would be inefficient to return everything, and then filter it out.
The GROUP BY and HAVING clauses are then executed, if specified. Again, this makes perfect sense as the HAVING clause is essentially another WHERE, but with the aggregation of a column, and you can’t have a HAVING without a GROUP BY to get that aggregation!
Next the SELECT is executed to bring back all of the filtered and / or aggregated data.
Lastly, the ORDER BY is called.
The fact that the ORDER BY is called after the SELECT, but the GROUP BY isn’t, makes it clear as to why an aliase can be used within the ORDER BY clause, but not the GROUP BY.
Because the FROM is executed first, it’s a good idea to do as much filtering here as possible. Doing this will improve effeciency by having less data returned in the first place to be filtered by the WHERE.
Below is an example of a query filtering in the FROM, and below that is the same query written with the filtering in the WHERE.
SELECT * FROM Orders O JOIN Products P ON P.ProductId = O.ProductId AND O.IsShipped = 1
Is equivalent to:
SELECT * FROM Orders O JOIN Products P ON P.ProductId = O.ProductId WHERE O.IsShipped = 1