PostgreSQL limit N的成本估算,是通過計(jì)算總成本A,以及估算得到的總記錄數(shù)B得到:
成都創(chuàng)新互聯(lián)公司主營蓬溪網(wǎng)站建設(shè)的網(wǎng)絡(luò)公司,主營網(wǎng)站建設(shè)方案,成都app開發(fā),蓬溪h5小程序開發(fā)搭建,蓬溪網(wǎng)站營銷推廣歡迎蓬溪等地區(qū)企業(yè)咨詢
(N/B)*A 大概意思就是占比的方法計(jì)算
對(duì)于單表查詢,這種方法通常來說比較適用,但是如果數(shù)據(jù)分布有傾斜,實(shí)際上也并不一定適用,例如以下兩種情況:
1、符合條件的數(shù)據(jù)占總記錄數(shù)的50%,但是全部分布在表的末尾,那么limit 10000 條到底是走索引快還是走全表掃描快呢?
2、符合條件的數(shù)據(jù)占總記錄數(shù)的50%,全部分布在表的頭部,那么LIMIT 10000 條,肯定是全表掃描快了。
對(duì)于JOIN的情況,同樣有類似的問題:
比如JOIN并且?guī)l件時(shí),LIMIT N,是走嵌套循環(huán)快,還是走M(jìn)ERGE 或 HASH JOIN快?
1、嵌套循環(huán)+where+LIMIT的成本計(jì)算方法,可以使用LIMIT占總估算記錄數(shù)占比的方法得到,還算是比較合理。
2、MERGE JOIN+where+LIMIT的成本計(jì)算方法,必須考慮啟動(dòng)成本,例如WHERE條件在A表上(可以走索引直接從條件位置開始掃描),B表則需要從索引的開頭開始掃描(到與A表的索引匹配時(shí),也許需要掃描很多的索引ENTRY,這個(gè)啟動(dòng)成本可能會(huì)很高),啟動(dòng)成本,加上LIMIT條數(shù)在剩余的所有成本中的一個(gè)占比,得到的成本是一個(gè)比較合理的成本。
3、hash join+where+limit的成本計(jì)算方法,使用啟動(dòng)成本+LIMIT占總估算記錄數(shù)占比的方法得到,優(yōu)化器目前就是這么做的,比較合理。
然而,對(duì)于MERGE JOIN,目前在使用LIMIT時(shí),PG沒有加上這個(gè)啟動(dòng)成本,使得最后得到的執(zhí)行計(jì)劃可能不準(zhǔn)確。
改進(jìn)方法建議可以加入merge join啟動(dòng)成本。
1、建表如下:
postgres=# create table test1(a int, b text, primary key(a)); CREATE TABLE postgres=# create table test2(a int, b text, primary key(a)); CREATE TABLE postgres=# alter table test1 add constraint testcheck foreign key(a) references test2(a); ALTER TABLE postgres=# insert into test2 select generate_series(1,1000000),'abcdefg'; INSERT 0 1000000 postgres=# insert into test1 select generate_series(1,1000000,2),'abcdefg'; INSERT 0 500000 analyze test1; analyze test2;
2、查詢SQL如下:
explain (analyze,verbose,timing,costs,buffers) select * from test2 left join test1 on test2.a = test1.a where test2.a > 500000 limit 10;
該語句中表結(jié)構(gòu)比較特殊,兩個(gè)關(guān)聯(lián)字段都是主鍵,并且存在外鍵約束關(guān)系,查詢計(jì)劃如下:
QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=0.73..0.89 rows=10 width=24) (actual time=54.729..54.739 rows=10 loops=1) Output: test2.a, test2.b, test1.a, test1.b Buffers: shared hit=2042 -> Merge Left Join (cost=0.73..7929.35 rows=498340 width=24) (actual time=54.728..54.735 rows=10 loops=1) Output: test2.a, test2.b, test1.a, test1.b Inner Unique: true Merge Cond: (test2.a = test1.a) Buffers: shared hit=2042 -> Index Scan using test2_pkey on public.test2 (cost=0.37..3395.42 rows=498340 width=12) (actual time=0.017..0.020 rows=10 loops=1) Output: test2.a, test2.b Index Cond: (test2.a > 500000) Buffers: shared hit=4 -> Index Scan using test1_pkey on public.test1 (cost=0.37..2322.99 rows=500000 width=12) (actual time=0.006..34.120 rows=250006 loops=1) Output: test1.a, test1.b Buffers: shared hit=2038 Planning Time: 0.216 ms Execution Time: 54.765 ms (17 rows)
從執(zhí)行計(jì)劃上可以看出,根據(jù)test2表先查詢出滿足條件的10條記錄,然后和test1表采用mergejoin關(guān)聯(lián),由于在估算的時(shí)候沒有考慮到limit的影響,估算的行數(shù)非常大,是498340行,
實(shí)際采用nestloop效果會(huì)更好(關(guān)閉掉seqscan和megejoin)
postgres=# set enable_seqscan =off; SET postgres=# set enable_mergejoin =off; SET postgres=# explain (analyze,verbose,timing,costs,buffers) select * from test2 left join test1 on test2.a = test1.a where test2.a > 500000 limit 10; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=0.73..4.53 rows=10 width=24) (actual time=0.040..0.060 rows=10 loops=1) Output: test2.a, test2.b, test1.a, test1.b Buffers: shared hit=39 -> Nested Loop Left Join (cost=0.73..189339.64 rows=498340 width=24) (actual time=0.039..0.057 rows=10 loops=1) Output: test2.a, test2.b, test1.a, test1.b Inner Unique: true Buffers: shared hit=39 -> Index Scan using test2_pkey on public.test2 (cost=0.37..3395.42 rows=498340 width=12) (actual time=0.025..0.027 rows=10 loops=1) Output: test2.a, test2.b Index Cond: (test2.a > 500000) Buffers: shared hit=4 -> Index Scan using test1_pkey on public.test1 (cost=0.37..0.37 rows=1 width=12) (actual time=0.002..0.002 rows=0 loops=10) Output: test1.a, test1.b Index Cond: (test2.a = test1.a) Buffers: shared hit=35 Planning Time: 0.112 ms Execution Time: 0.078 ms (17 rows)
但是從評(píng)估的成本來看,merge join+limit 比 nestloop+limit更低,原因是nestloop的總成本更高(189339 比 7929)。所以優(yōu)化器根據(jù)比例算法(未參照merge join的啟動(dòng)成本),認(rèn)為在LIMIT的情況下,也是merge join成本更低。
實(shí)際情況是,MERGE JOIN的沒帶查詢條件的B表,需要從索引的頭部開始掃,而不是從指定位置開始掃。 因此實(shí)際情況是merge join是更慢的。
目前優(yōu)化器使用hash join時(shí),已經(jīng)算上了startup cost,例子
postgres=# set enable_mergejoin =off; SET postgres=# set enable_seqscan =off; SET postgres=# set enable_nestloop =off; SET 啟動(dòng)成本=3536.51 postgres=# explain (analyze,verbose,timing,costs,buffers) select * from test2 left join test1 on test2.a = test1.a where test2.a > 500000 limit 10; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=3536.51..3536.61 rows=10 width=24) (actual time=158.148..158.219 rows=10 loops=1) Output: test2.a, test2.b, test1.a, test1.b Buffers: shared hit=4079, temp written=1464 -> Hash Left Join (cost=3536.51..8135.83 rows=494590 width=24) (actual time=158.147..158.215 rows=10 loops=1) Output: test2.a, test2.b, test1.a, test1.b Inner Unique: true Hash Cond: (test2.a = test1.a) Buffers: shared hit=4079, temp written=1464 -> Index Scan using test2_pkey on public.test2 (cost=0.37..3369.86 rows=494590 width=12) (actual time=0.023..0.027 rows=26 loops=1) Output: test2.a, test2.b Index Cond: (test2.a > 500000) Buffers: shared hit=4 -> Hash (cost=2322.99..2322.99 rows=500000 width=12) (actual time=156.848..156.849 rows=500000 loops=1) Output: test1.a, test1.b Buckets: 262144 Batches: 4 Memory Usage: 7418kB Buffers: shared hit=4072, temp written=1464 -> Index Scan using test1_pkey on public.test1 (cost=0.37..2322.99 rows=500000 width=12) (actual time=0.011..72.506 rows=500000 loops=1) Output: test1.a, test1.b Buffers: shared hit=4072 Planning Time: 0.141 ms Execution Time: 162.086 ms (21 rows)
針對(duì)test1表,需要估算a<500000有多少行,作為索引掃描的startup成本。
postgres=# explain select * from test1 where a<500000; QUERY PLAN --------------------------------------------------------------------------------- Index Scan using test1_pkey on test1 (cost=0.37..1702.83 rows=249893 width=12) Index Cond: (a < 500000) (2 rows) postgres=# explain select * from test1; QUERY PLAN ------------------------------------------------------------- Seq Scan on test1 (cost=0.00..133.15 rows=500000 width=12) (1 row)
所以,索引掃描test1(where a > 500000)的merge join啟動(dòng)成本應(yīng)該有 1702,加上這個(gè)成本后,成本遠(yuǎn)大于NEST LOOP JOIN的成本,所以不會(huì)選擇merge join。
create table test1(a int, b varchar2(4000), primary key(a)); create table test2(a int, b varchar2(4000), primary key(a)); alter table test1 add constraint testcheck foreign key(a) references test2(a); insert into test2 select rownum, 'abcdefg' from dual connect by level <=1000000; insert into test1 select * from (select rownum as rn, 'abcdefg' from dual connect by level <=1000000) t where mod(rn,2)=1;
exec DBMS_STATS.GATHER_TABLE_STATS('DIGOAL', 'TEST1', method_opt => 'FOR COLUMNS (a, b)'); exec DBMS_STATS.GATHER_TABLE_STATS('DIGOAL', 'TEST2', method_opt => 'FOR COLUMNS (a, b)');
查詢SQL如下:
set autotrace on set linesize 120 set pagesize 200 set wrap off select * from test2 left join test1 on test2.a = test1.a where test2.a > 500000 and rownum<=10; A B ---------- ------------------------------------------------------------------------------------------------------------- 500001 abcdefg 500002 abcdefg 500003 abcdefg 500004 abcdefg 500005 abcdefg 500006 abcdefg 500007 abcdefg 500008 abcdefg 500009 abcdefg 500010 abcdefg 10 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 3391785554 ----------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 10 | 500 | 15 (0)| 00:00:01 | |* 1 | COUNT STOPKEY | | | | | | | 2 | NESTED LOOPS OUTER | | 10 | 500 | 15 (0)| 00:00:01 | | 3 | TABLE ACCESS BY INDEX ROWID| TEST2 | 10 | 250 | 4 (0)| 00:00:01 | |* 4 | INDEX RANGE SCAN | SYS_C00151146 | 9000 | | 3 (0)| 00:00:01 | | 5 | TABLE ACCESS BY INDEX ROWID| TEST1 | 1 | 25 | 2 (0)| 00:00:01 | |* 6 | INDEX UNIQUE SCAN | SYS_C00151145 | 1 | | 1 (0)| 00:00:01 | ----------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter(ROWNUM<=10) 4 - access("TEST2"."A">500000) 6 - access("TEST2"."A"="TEST1"."A"(+)) filter("TEST1"."A"(+)>500000) Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 25 consistent gets 0 physical reads 0 redo size 937 bytes sent via SQL*Net to client 500 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 10 rows processed
Oracle 選擇了nestloop join。
使用HINT,讓Oracle使用merge join,看看成本是多少,是否與修正PostgreSQL merge join啟動(dòng)成本接近。
select /*+ USE_MERGE(test2,test1) */ * from test2 left join test1 on test2.a = test1.a where test2.a > 500000 and rownum<=10; Execution Plan ---------------------------------------------------------- Plan hash value: 492577188 ------------------------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 10 | 750 | 29 (7)| 00:00:01 | |* 1 | COUNT STOPKEY | | | | | | | 2 | MERGE JOIN OUTER | | 10 | 750 | 29 (7)| 00:00:01 | | 3 | TABLE ACCESS BY INDEX ROWID | TEST2 | 10 | 250 | 4 (0)| 00:00:01 | |* 4 | INDEX RANGE SCAN | SYS_C00151146 | 9000 | | 3 (0)| 00:00:01 | |* 5 | SORT JOIN | | 25000 | 610K| 25 (8)| 00:00:01 | | 6 | TABLE ACCESS BY INDEX ROWID| TEST1 | 25000 | 610K| 23 (0)| 00:00:01 | |* 7 | INDEX RANGE SCAN | SYS_C00151145 | 4500 | | 11 (0)| 00:00:01 | ------------------------------------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter(ROWNUM<=10) 4 - access("TEST2"."A">500000) 5 - access("TEST2"."A"="TEST1"."A"(+)) filter("TEST2"."A"="TEST1"."A"(+)) 7 - access("TEST1"."A"(+)>500000) Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 1099 consistent gets 0 physical reads 0 redo size 937 bytes sent via SQL*Net to client 500 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 1 sorts (memory) 0 sorts (disk) 10 rows processed
1、PostgreSQL 在計(jì)算merge join+limit的成本時(shí),優(yōu)化器有優(yōu)化的空間,可以考慮把啟動(dòng)成本算進(jìn)來,提高優(yōu)化器選擇帶limit輸出的SQL的JOIN方法的正確性。
2、如果是inner join,通過query rewrite可以對(duì)merge join進(jìn)行優(yōu)化,跳過不符合條件的頭部INDEX SCAN。
postgres=# explain (analyze,verbose,timing,costs,buffers) select * from test2 join test1 on test2.a = test1.a where test2.a > 500000 limit 10; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=0.77..1.09 rows=10 width=24) (actual time=54.626..54.638 rows=10 loops=1) Output: test2.a, test2.b, test1.a, test1.b Buffers: shared hit=2042 -> Merge Join (cost=0.77..7895.19 rows=247295 width=24) (actual time=54.625..54.635 rows=10 loops=1) Output: test2.a, test2.b, test1.a, test1.b Inner Unique: true Merge Cond: (test2.a = test1.a) Buffers: shared hit=2042 -> Index Scan using test2_pkey on public.test2 (cost=0.37..3369.86 rows=494590 width=12) (actual time=0.017..0.020 rows=19 loops=1) Output: test2.a, test2.b Index Cond: (test2.a > 500000) Buffers: shared hit=4 -> Index Scan using test1_pkey on public.test1 (cost=0.37..2322.99 rows=500000 width=12) (actual time=0.008..34.009 rows=250010 loops=1) Output: test1.a, test1.b Buffers: shared hit=2038 Planning Time: 0.244 ms Execution Time: 54.669 ms (17 rows) sql rewrite: 可以做到內(nèi)核里面,這樣就不需要改SQL了。效果如下,超好。 postgres=# explain (analyze,verbose,timing,costs,buffers) select * from test2 join test1 on test2.a = test1.a where test2.a > 500000 and test1.a > 500000limit 10; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=0.75..1.30 rows=10 width=24) (actual time=0.035..0.048 rows=10 loops=1) Output: test2.a, test2.b, test1.a, test1.b Buffers: shared hit=8 -> Merge Join (cost=0.75..6711.51 rows=123700 width=24) (actual time=0.034..0.044 rows=10 loops=1) Output: test2.a, test2.b, test1.a, test1.b Inner Unique: true Merge Cond: (test2.a = test1.a) Buffers: shared hit=8 -> Index Scan using test2_pkey on public.test2 (cost=0.37..3369.86 rows=494590 width=12) (actual time=0.015..0.019 rows=19 loops=1) Output: test2.a, test2.b Index Cond: (test2.a > 500000) Buffers: shared hit=4 -> Index Scan using test1_pkey on public.test1 (cost=0.37..1704.30 rows=250106 width=12) (actual time=0.015..0.017 rows=10 loops=1) Output: test1.a, test1.b Index Cond: (test1.a > 500000) Buffers: shared hit=4 Planning Time: 0.276 ms Execution Time: 0.074 ms (18 rows)
《PostgreSQL 優(yōu)化器案例之 - order by limit 索引選擇問題》
src/backend/optimizer/path/costsize.c
原文地址:https://github.com/digoal/blog/blob/master/201810/20181004_03.md