Issue
I am trying to join multiple tables using LEFT OUTER JOIN in SQL Server. In the sample below, the table on the left is not that big, but the table on the right has about 4-5 million rows, and that's just two tables. There are 3 more tables that need to be joined, and they are rather large.
-- Example of join for 2 tables
SELECT a.id, a.user_name, a.start_timestamp, a.ref1, a.ref2, a.ref3, a.ref4,
       b.start_timestamp, b.ref1, b.ref2, b.ref3, b.ref4
FROM
(
    SELECT id, user_name, start_timestamp, ref1, ref2, ref3, ref4
    FROM [user]  -- bracketed because USER is a reserved keyword in T-SQL
    WHERE DATEDIFF(day, [start_timestamp], GETDATE()) BETWEEN 0 AND 7
) a
LEFT JOIN
(
    SELECT id, user_name, start_timestamp, ref1, ref2, ref3, ref4
    FROM user_activity
    WHERE DATEDIFF(day, [start_timestamp], GETDATE()) BETWEEN 0 AND 7
) b
ON a.id = b.id
I limited the columns to only the necessary fields and restricted the data to the last 7 days, but the query is still very slow. Is there a way to optimize the joins?
Solution
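The subqueries are not of any help, so I would remove them.
Then, rephrase the date logic as direct comparisons on the column. Wrapping start_timestamp in DATEDIFF() means the function has to be evaluated for every row, which generally prevents an index seek on that column. For this, I will assume that there are no future dates.
Finally, add indexes. The rewritten query: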
select . . .
from [user] u left join  -- [user] is bracketed because USER is a reserved keyword in T-SQL
     user_activity ua
     on ua.id = u.id and
        ua.start_timestamp >= dateadd(day, -7, convert(date, getdate()))
where u.start_timestamp >= dateadd(day, -7, convert(date, getdate()))
For this query, you want indexes on user(start_timestamp) and user_activity(id, start_timestamp).
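As a rough sketch, the two indexes could be created as follows. The index names are made up for illustration, and the statements assume the table and column names from the question.

```sql
-- Hypothetical index names; [user] is bracketed because USER is a reserved keyword in T-SQL.

-- Supports the range filter on the driving table.
CREATE INDEX IX_user_start_timestamp
    ON [user] (start_timestamp);

-- Supports the join on id plus the range filter on start_timestamp.
CREATE INDEX IX_user_activity_id_start_timestamp
    ON user_activity (id, start_timestamp);
```

If the query selects only a few additional columns, listing them in an INCLUDE clause may avoid key lookups. Whether that is worthwhile depends on the table's width and write load, so check the execution plan before and after.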
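The filtering condition on the first table goes in the where clause. For the subsequent tables, it goes in the on clause. A condition on a right-hand table placed in the where clause would reject the NULL rows produced for unmatched users, which effectively turns the left join into an inner join.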
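The difference between the two placements can be seen with a small self-contained sketch. The table variables @u and @ua and their rows are invented purely for illustration; they stand in for user and user_activity.

```sql
-- Hypothetical stand-ins for the real tables.
DECLARE @u  TABLE (id int, start_timestamp datetime);
DECLARE @ua TABLE (id int, start_timestamp datetime);

INSERT INTO @u  VALUES (1, GETDATE()), (2, GETDATE());
-- User 1 has recent activity; user 2 only has activity from 30 days ago.
INSERT INTO @ua VALUES (1, GETDATE()), (2, DATEADD(day, -30, GETDATE()));

DECLARE @cutoff date = DATEADD(day, -7, CONVERT(date, GETDATE()));

-- Filter in ON: both users are returned; user 2 gets NULL for activity_ts.
SELECT u.id, ua.start_timestamp AS activity_ts
FROM @u u
LEFT JOIN @ua ua
    ON ua.id = u.id
   AND ua.start_timestamp >= @cutoff;

-- Filter in WHERE: user 2 is dropped, so the join behaves like an inner join.
SELECT u.id, ua.start_timestamp AS activity_ts
FROM @u u
LEFT JOIN @ua ua
    ON ua.id = u.id
WHERE ua.start_timestamp >= @cutoff;
```

The first query returns two rows and the second returns one, which is why the filter on user_activity belongs in the on clause when every user should still appear in the result.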
Note also the use of meaningful table aliases rather than arbitrary letters such as a and b. This makes the query much easier to follow.
Answered By - Gordon Linoff Answer Checked By - David Marino (PHPFixing Volunteer)