Taking a look at CROSS APPLY
Applying a Sub-Query, Joining a Derived Table …
I think the easiest way to think of CROSS APPLY is that it is like doing a CROSS JOIN with a correlated sub-query instead of a derived table. Let's see if I can explain that …. A derived table is "self-contained", in that all tables and columns in the parent SELECT are not accessible (though variables and parameters can be referenced). For example, consider:
select A.*, b.X
from A
cross join (select B.X from B where B.Val=A.Val) b
from A
cross join (select B.X from B where B.Val=A.Val) b
That is not legal because A.Val is out of scope within the derived table; this is because the derived table is evaluated independently of the other tables in the SELECT. To limit the rows in table B so that B.Val = A.Val, we must do that outside of the derived table via a join or in the criteria:
select A.*, b.X
from A
cross join (select * from B) b
where A.Val = b.Val
from A
cross join (select * from B) b
where A.Val = b.Val
(Of course, the above is equivalent to doing an INNER JOIN to the derived table, or just joining to the table B.)
Also, keep in mind that the scope-of-derived-tables rule isn't just for CROSS JOINS, it's for all JOINS -- CROSS, INNER, OUTER and even UNION; they all use "self-contained" derived tables.
This is in contrast to a correlated sub-query, where the parent SELECT is in scope for the sub-query; the sub-query is evaluated for each row in the query, so the other tables and columns in the SELECT are all available:
select A.*, (select B.X from B where B.Val=A.Val) as X
from A
from A
(Note: I am ignoring for now the fact that returning multiple rows in a sub-query will return an error.)
This is an easy way to think of the difference between CROSS JOIN and CROSS APPLY. CROSS JOIN, as we saw, joins to a derived table; however, CROSS APPLY, despite looking like a JOIN, actually is applying a correlated sub-query. This imposes both the advantages of a correlated sub-query but also the performance implications.
So, we can simply rewrite our first example using CROSS APPLY like this:
select A.*, b.X
from A
cross apply (select B.X from B where B.Val=A.Val) b
from A
cross apply (select B.X from B where B.Val=A.Val) b
Since we are performing an APPLY and not a JOIN, A.Val is in scope and it works just fine.
Table Valued User Defined Functions
Note that the same rules apply when using Table-Valued User-Defined Functions:
select A.*, B.X
from A
cross join dbo.UDF(A.Val) B
from A
cross join dbo.UDF(A.Val) B
is not legal; once again, A.Val is not in scope for the user-defined function. The best we can do before SQL 2005 was to use a correlated sub-query:
select A.*, (select X from dbo.UDF(A.Val)) X
from A
from A
However, that is not logically equivalent; the UDF cannot return more than 1 row or it will result in an error, and wouldn't make logical sense anyway if it did.
Starting with SQL 2005, we can now use CROSS APPLY and it will work fine:
select A.*, b.X
from A
cross apply dbo.UDF(A.Val) b
from A
cross apply dbo.UDF(A.Val) b
So, that is one way to think of the difference between a JOIN and an APPLY; a JOIN combines two separate result sets, but APPLY is more of a loop that evaluates one result set over and over for each row in another. This means that, in general, APPLY will be less efficient than a JOIN, just as, in general, correlated sub-queries are less efficient than derived tables. (The optimizer, however, is generally quite good at optimized correlated sub-queries when possible.)
So, why use CROSS APPLY instead of a correlated sub-query? What's the advantage? Actually, quite a lot -- it is much more powerful!
CROSS APPLY can return multiple rows
Unlike correlate sub-queries, CROSS APPLY works with multiple rows. This allows us to do things like "joining" a table to a function that parses a CSV column in that table into multiple rows:
select A.ID, b.Val
from A
cross apply dbo.ParseCSV(A.CSV) b
from A
cross apply dbo.ParseCSV(A.CSV) b
When the ParseCSV() function returns multiple rows, it simply acts as if we have joined Table A to the function's return table, duplicating the rows in Table A for each row in the joined table. This is not possible with a correlated sub-query, and will result in an error. This is a quick and easy way to parse a table of data into multiple rows in an efficient "set-based" manner when the algorithm requires a complex User-Defined Function. (The CSV is not a great example, but other parsing routines are not as easily accomplished via a JOIN to a numbers table.)
CROSS APPLY can return multiple columns
Again, in a correlated sub-query, we can only return a single value. If we write a SQL statement that returns a running sum, we can use a correlated sub-query like this:
select o.*,
(select sum(Amount) from Order o
where p.OrderDate <= o.OrderDate) as RunningSum
from Order o
(select sum(Amount) from Order o
where p.OrderDate <= o.OrderDate) as RunningSum
from Order o
However, what if we'd like to return an additional running sum of Orders based on some other criteria (e.g., for orders with the same "OrderCode")? We'd need another correlated sub-query, greatly reducing the efficiency of our SELECT:
select o.*,
(select sum(Amount) from Order o
where p.OrderDate <= o.OrderDate) as RunningSum,
(select sum(Amount) from Order o
where p.OrderCode = o.OrderCode and p.OrderDate <= o.OrderDate) as SameCode
from Order o
(select sum(Amount) from Order o
where p.OrderDate <= o.OrderDate) as RunningSum,
(select sum(Amount) from Order o
where p.OrderCode = o.OrderCode and p.OrderDate <= o.OrderDate) as SameCode
from Order o
However, we can easily re-write that using a single CROSS APPLY:
select o.*, rs.RunningSum, rs.SameCode
from Order o
cross apply
(
select
sum(Amount) as RunningSum,
sum(case when p.OrderCode = o.OrderCode then Amount else 0 end) as SameCode
from Order P
where P.OrderDate <= O.OrderDate
) rs
from Order o
cross apply
(
select
sum(Amount) as RunningSum,
sum(case when p.OrderCode = o.OrderCode then Amount else 0 end) as SameCode
from Order P
where P.OrderDate <= O.OrderDate
) rs
So, we get the benefit of return multiple columns like a derived table, and we also get the ability to reference outer values in our SELECT to use in criteria and CASE statements. Very, very powerful.
CROSS APPLY also allows us to quickly get columns from the "previous" row in a table quite easily:
select o.*, prev.*
from Order o
cross apply
(
select top 1 *
from Order P where P.OrderDate < O.OrderDate
order by OrderDate DESC
) prev
from Order o
cross apply
(
select top 1 *
from Order P where P.OrderDate < O.OrderDate
order by OrderDate DESC
) prev
We can express the running total and counts as CROSS JOINS or INNER JOINS by reworking a correlated sub-query, but using TOP and referencing the outer SELECT is not easily done in that manner.
Note that the above CROSS APPLY will not return any orders without a previous order; we would use OUTER APPLY to ensure that all orders are returned even if no previous orders exist:
select o.*, prev.*
from Order o
outer apply
(
select top 1 *
from Order P where P.OrderDate < O.OrderDate
order by OrderDate DESC
) prev
from Order o
outer apply
(
select top 1 *
from Order P where P.OrderDate < O.OrderDate
order by OrderDate DESC
) prev
We can also CROSS APPLY to a Table-Valued User Defined Function that returns exactly one row, but with multiple columns, to return separate pieces of data from a single function call. For example, we can parse an email address into separate username and domain columns. A few months back I wrote a SQL Team article that discusses that concept.
Summary
CROSS and OUTER APPLY are very powerful and can be very useful, but we must be careful to use them only when necessary; I am still testing the possibilities, but in general an APPLY will tend to be not as efficient as a JOIN. In addition, APPLY is mostly demonstrated by applying table-valued user-defined functions, but it can be used with in-line SELECT statements as well.
see also:
- Taking a look at CROSS APPLY
- The "Nested WHERE-IN" SQL Anti-Pattern
- Using GROUP BY to avoid self-joins
- Criteria on Outer Joined Tables
- Better Alternatives to a FULL OUTER JOIN
- Conditional Joins in SQL Server
- How to JOIN Multiple Transactional Tables in SQL
- The power of the Cross Join
Legacy Comments
Dan Kennedy
2007-10-22 |
re: Taking a look at CROSS APPLY Great article. I've struggled to find many uses for CROSS APPLY as yet but this has certainly helped. Just a minor point though, is there a slight naming error on the line sum(case when p.OrderCount = o.OrderCode then Amount else 0 end) as SameCode shouldn't it be sum(case when p.OrderCode = o.OrderCode then Amount else 0 end) as SameCode Cheers Dan |
Jeff
2007-10-22 |
re: Taking a look at CROSS APPLY Hi Dan -- Fixed. Thank you!! - Jeff |
Keith Bloom
2007-10-24 |
re: Taking a look at CROSS APPLY I happened to use a CROSS APPLY for the first time a couple of days ago. I wanted to produce a table of messages where the message may be sent to one or more recipients. Using the CROSS APPLY I was able to generate a delimited list of recipients in the subquery and join it to the main table. Bye bye cursor. If you are interested I adapted the final query in this example: http://zulfiqar.typepad.com/zulfiqars_web/2005/04/tsql_concatenat.html Keith. |
Peso
2008-05-22 |
re: Taking a look at CROSS APPLY Some qaution is required for CROSS APPLY. Analyzing the query plan reveals that it's behaviour is very much like a correlated subquery, but with the ability to return multiple columns/records. |
Jeff
2008-05-22 |
re: Taking a look at CROSS APPLY Hi Peso -- Thanks! That's pretty much the entire point of my article! :) From the article: "This is an easy way to think of the difference between CROSS JOIN and CROSS APPLY. CROSS JOIN, as we saw, joins to a derived table; however, CROSS APPLY, despite looking like a JOIN, actually is applying a correlated sub-query. This imposes both the advantages of a correlated sub-query but also the performance implications." |
sachin
2008-10-21 |
re: Taking a look at CROSS APPLY nice post, thanks |
Karim
2009-11-19 |
re: Taking a look at CROSS APPLY Thanks, really well presented. |
Anil Rajendran
2009-11-22 |
re: Taking a look at CROSS APPLY The Best article in using SQL queries. Thanks for that. |
Prabhakar J
2009-12-10 |
re: Taking a look at CROSS APPLY Good one mate!.. Thanks for the article. thoroughly understood |
Josue Monteiro Viana
2010-09-13 |
re: Taking a look at CROSS APPLY Hello people, I tried to use the examples and got in trouble because “cross apply” only works on SQL 90 compatibility mode – not in SQL 2005 with mode 80 enabled (my case). Since I can’t change the database, I create the following work around: -- Code USE tempdb – tempdb is a native SQL 2005 DB SELECT tb.field1, tb.field2, fc.* FROM #tempTB as tb CROSS APPLY OficialDB_Compatibility80.dbo.myUDF (tb.field1, tb.field2, tb.field3, @dinamicVAR) AS fc USE OficialDB_Compatibility80 -- Code This worked for me. Good lock! Bye Josué Monteiro Viana |