postgresql: разные скорости выполнения запроса при использовании JOIN

Рейтинг: 0Ответов: 0Опубликовано: 04.08.2023

объясните пожалуйста с чем связано различие в скорости выполнения запросов (примерно в 10 раз):

быстрый запрос:

WITH selected_t AS (
    SELECT 
        *
    FROM
        tbl_02
)
SELECT 
    COUNT(*) AS count, year AS value
FROM
    selected_t
GROUP BY year;

медленный запрос:

WITH selected_t AS (
    SELECT 
        *
    FROM
        tbl_01
    LEFT JOIN tbl_02 USING (id)
)
SELECT 
    COUNT(*) AS count, year AS value
FROM
    selected_t
GROUP BY year;

поле year находится в таблице tbl_02 и индексировано

соединение таблиц происходит по одинаковому полю id, которое индексировано в обоих таблицах

кол-во записей в таблицах одинаково, т.е. одинаковое поле id и значения одинаковые в id, а остальные поля разные

в таблицах по 40 миллионов записей

P.S.

понимаю, что задача немного экзотичная и избыточная, но это часть более сложного запроса в котором WHERE будет

P.P.S.

данные по EXPLAIN ANALYZE:

быстрый запрос:

"Finalize GroupAggregate  (cost=1000.50..367357.36 rows=4 width=12) (actual time=1027.565..1062.806 rows=4 loops=1)"
"  Group Key: tbl2.year"
"  ->  Gather Merge  (cost=1000.50..367357.24 rows=16 width=12) (actual time=1027.321..1062.769 rows=18 loops=1)"
"        Workers Planned: 4"
"        Workers Launched: 4"
"        ->  Partial GroupAggregate  (cost=0.44..366355.28 rows=4 width=12) (actual time=64.494..806.830 rows=4 loops=5)"
"              Group Key: tbl2.year"
"              ->  Parallel Index Only Scan using idx_tbl2_year on tbl2  (cost=0.44..314513.04 rows=10368440 width=4) (actual time=0.075..460.769 rows=8178446 loops=5)"
"                    Heap Fetches: 0"
"Planning Time: 0.167 ms"
"JIT:"
"  Functions: 18"
"  Options: Inlining false, Optimization false, Expressions true, Deforming true"
"  Timing: Generation 3.901 ms, Inlining 0.000 ms, Optimization 1.187 ms, Emission 22.171 ms, Total 27.259 ms"
"Execution Time: 1064.528 ms"

медленный запрос:

"Finalize GroupAggregate  (cost=1061734.21..1061736.25 rows=4 width=12) (actual time=8805.749..9342.205 rows=4 loops=1)"
"  Group Key: tbl2.year"
"  ->  Gather Merge  (cost=1061734.21..1061736.13 rows=16 width=12) (actual time=8805.742..9342.197 rows=20 loops=1)"
"        Workers Planned: 4"
"        Workers Launched: 4"
"        ->  Sort  (cost=1060734.16..1060734.17 rows=4 width=12) (actual time=8690.508..8690.511 rows=4 loops=5)"
"              Sort Key: tbl2.year"
"              Sort Method: quicksort  Memory: 25kB"
"              Worker 0:  Sort Method: quicksort  Memory: 25kB"
"              Worker 1:  Sort Method: quicksort  Memory: 25kB"
"              Worker 2:  Sort Method: quicksort  Memory: 25kB"
"              Worker 3:  Sort Method: quicksort  Memory: 25kB"
"              ->  Partial HashAggregate  (cost=1060734.07..1060734.11 rows=4 width=12) (actual time=8690.449..8690.452 rows=4 loops=5)"
"                    Group Key: tbl2.year"
"                    Batches: 1  Memory Usage: 24kB"
"                    Worker 0:  Batches: 1  Memory Usage: 24kB"
"                    Worker 1:  Batches: 1  Memory Usage: 24kB"
"                    Worker 2:  Batches: 1  Memory Usage: 24kB"
"                    Worker 3:  Batches: 1  Memory Usage: 24kB"
"                    ->  Parallel Hash Join  (cost=516397.31..1009960.33 rows=10154748 width=4) (actual time=4424.311..8112.672 rows=8178446 loops=5)"
"                          Hash Cond: (tbl1.id = tbl2.id)"
"                          ->  Parallel Index Only Scan using idx_tbl1_id on tbl1  (cost=0.56..315855.51 rows=10154748 width=8) (actual time=0.066..651.861 rows=8178446 loops=5)"
"                                Heap Fetches: 0"
"                          ->  Parallel Hash  (cost=386791.24..386791.24 rows=10368440 width=12) (actual time=4375.725..4375.726 rows=8178446 loops=5)"
"                                Buckets: 67108864  Batches: 1  Memory Usage: 2443072kB"
"                                ->  Parallel Index Scan using idx_tbl2_year on tbl2  (cost=0.44..386791.24 rows=10368440 width=12) (actual time=107.607..1968.085 rows=8178446 loops=5)"
"Planning Time: 0.576 ms"
"JIT:"
"  Functions: 68"
"  Options: Inlining true, Optimization true, Expressions true, Deforming true"
"  Timing: Generation 17.020 ms, Inlining 265.814 ms, Optimization 185.610 ms, Emission 84.950 ms, Total 553.394 ms"
"Execution Time: 9345.371 ms"

Ответы

Ответов пока нет.