Issue

I have this query:

SELECT TOP 1 
    MAX(HORA_LEIT), ST_BOMBA, Q_BOMBA, SEQUENCIAL 
FROM 
    DADOS 
WHERE 
    COD_PONTO = 2085 
    AND (ST_BOMBA = 'ON' OR ST_BOMBA = 'OFF') 
GROUP BY 
    ST_BOMBA, Q_BOMBA, SEQUENCIAL 
ORDER BY 
    MAX(HORA_LEIT) DESC

I decided to create two indexes:

CREATE INDEX ix_1 
ON dados (cod_ponto, St_bomba) 
INCLUDE (q_bomba, sequencial, hora_leit) 
WHERE St_bomba IN ('ON', 'OFF')

Actual execution plan: https://www.brentozar.com/pastetheplan/?id=HkxiKmDXs

and

CREATE INDEX ix_2 
ON dados (cod_ponto, hora_leit) 
INCLUDE (St_bomba, q_bomba, sequencial) 
WHERE St_bomba IN ('ON', 'OFF')

Actual execution plan: https://www.brentozar.com/pastetheplan/?id=By_1tmDQj

I figured out, as far as I can see, that the first execution plan is better, although the query optimizer is giving preference to the second one.

Am I misevaluating the performance?

Solution

The ix_2 is clearly better.

The first thing to note is that the query is written in a very convoluted way.

The query

SELECT TOP 1 
    MAX(HORA_LEIT), ST_BOMBA, Q_BOMBA, SEQUENCIAL 
FROM 
    DADOS 
WHERE 
    COD_PONTO = 2085 
    AND (ST_BOMBA = 'ON' OR ST_BOMBA = 'OFF') 
GROUP BY 
    ST_BOMBA, Q_BOMBA, SEQUENCIAL 
ORDER BY 
    MAX(HORA_LEIT) DESC

is equivalent to

SELECT TOP 1 
    HORA_LEIT, ST_BOMBA, Q_BOMBA, SEQUENCIAL 
FROM 
    DADOS 
WHERE 
    COD_PONTO = 2085 
    AND (ST_BOMBA = 'ON' OR ST_BOMBA = 'OFF') 
ORDER BY 
    HORA_LEIT DESC

As you are only interested in the TOP 1 the GROUP BY can be optimized out here.

Side note this may not be immediately apparent why but...

Take a row with the highest HORA_LEIT in the table (matching the where conditions). This is going to be in a group at least tied for first place when ordered by MAX(HORA_LEIT) DESC.

So taking the ST_BOMBA, Q_BOMBA, SEQUENCIAL from that row is going to produce the values for a valid group that satisfies the initial query. If different rows have the same HORA_LEIT equalling the MAX but different ST_BOMBA, Q_BOMBA, SEQUENCIAL then it is undeterminstic which one you get in both versions of the query.

Both execution plans recognize this and don't contain any aggregation operators.

For index 2 the filtered index condition guarantees that all rows match the ST_BOMBA condition. It just has to do a backward ordered index seek on COD_PONTO = 2085 and read the first row and then stop (as the second key column is HORA_LEIT).

For index 1 the index seek is actually two seeks.

seek on (COD_PONTO, ST_BOMBA) = (2085, 'ON')
seek on (COD_PONTO, ST_BOMBA) = (2085, 'OFF')

The combined rows matching either of those conditions then go into the Top N sort to get the TOP 1 matching row as ordered by HORA_LEIT DESC. It is only 96 rows in this case but is potentially unbounded and just depends on your data.

Reading a single row and stopping is better than reading an arbitrary number of rows and sorting them.

Answered By - Martin Smith

Answer Checked By - Willingham (PHPFixing Volunteer)

Monday, October 31, 2022

[FIXED] Which one is the better query plan?

Issue

Solution

0 Comments:

Post a Comment

Total Pageviews

Featured Post

Why Learn PHP Programming

Monday, October 31, 2022

Issue

Solution

0 Comments:

Post a Comment

Total Pageviews

Featured Post

Why Learn PHP Programming

Subscribe To