真实的国产乱ⅩXXX66竹夫人,五月香六月婷婷激情综合,亚洲日本VA一区二区三区,亚洲精品一区二区三区麻豆

成都創(chuàng)新互聯(lián)網(wǎng)站制作重慶分公司

深入解析:從源碼窺探MySQL優(yōu)化器

深入解析:從源碼窺探MySQL優(yōu)化器

南川網(wǎng)站建設(shè)公司創(chuàng)新互聯(lián),南川網(wǎng)站設(shè)計(jì)制作,有大型網(wǎng)站制作公司豐富經(jīng)驗(yàn)。已為南川上1000家提供企業(yè)網(wǎng)站建設(shè)服務(wù)。企業(yè)網(wǎng)站搭建\成都外貿(mào)網(wǎng)站制作要多少錢(qián),請(qǐng)找那個(gè)售后服務(wù)好的南川做網(wǎng)站的公司定做!

作者 | 湯愛(ài)中,云和恩墨SQM開(kāi)發(fā)者,Oracle/MySQL/DB2的SQL解析引擎、SQL審核與智能優(yōu)化引擎的重要貢獻(xiàn)者,產(chǎn)品廣泛應(yīng)用于金融、電信等行業(yè)客戶中。

摘要

優(yōu)化器是邏輯SQL到物理存儲(chǔ)的解釋器,是一個(gè)復(fù)雜而“愚蠢”的數(shù)學(xué)模型,它的入?yún)⑼ǔJ荢QL、統(tǒng)計(jì)信息以及優(yōu)化器參數(shù)等,而輸出通常一個(gè)可執(zhí)行的查詢計(jì)劃,因此優(yōu)化器的優(yōu)劣取決于數(shù)學(xué)模型的穩(wěn)定性和健壯性,理解這個(gè)數(shù)學(xué)模型就能理解數(shù)據(jù)庫(kù)的SQL處理流程。

01 優(yōu)化器的執(zhí)行流程 

深入解析:從源碼窺探MySQL優(yōu)化器

(注:此圖出自李海翔)

上圖展示了優(yōu)化器的大致執(zhí)行過(guò)程,可以簡(jiǎn)單描述為:

1 根據(jù)語(yǔ)法樹(shù)及統(tǒng)計(jì)統(tǒng)計(jì),構(gòu)建初始表訪問(wèn)數(shù)組(init_plan_arrays)

2 根據(jù)表訪問(wèn)數(shù)組,計(jì)算每個(gè)表的最佳訪問(wèn)路徑(find_best_ref),同時(shí)保存當(dāng)前最優(yōu)執(zhí)行計(jì)劃(COST最?。?/p>

3 如果找到更優(yōu)的執(zhí)行計(jì)劃則更新最優(yōu)執(zhí)行計(jì)劃,否則優(yōu)化結(jié)束。

從上述流程可以看出,執(zhí)行計(jì)劃的生成是一個(gè)“動(dòng)態(tài)規(guī)劃/貪心算法”的過(guò)程,動(dòng)態(tài)規(guī)劃公式可以表示為:Min(Cost(Tn+1)) = Min(Cost(T1))+Min(Cost(T2))+...Min(Cost(Tn-1))+Min(Cost(Tn)),其中Cost(Tn)表示訪問(wèn)表T1 T2一直到Tn的代價(jià)。如果優(yōu)化器沒(méi)有任何先驗(yàn)知識(shí),則需要進(jìn)行 A(n,n) 次循環(huán),是一個(gè)全排列過(guò)程,很顯然優(yōu)化器是有先驗(yàn)知識(shí)的,如表大小,外連接,子查詢等都會(huì)使得表的訪問(wèn)是部分有序的,可以理解為一個(gè)“被裁減”的動(dòng)態(tài)規(guī)劃,動(dòng)態(tài)規(guī)則的核心函數(shù)為:Join::Best_extention_limited_search,在源碼中有如下代碼結(jié)構(gòu):

bool Optimize_table_order::best_extension_by_limited_search(
         table_map remaining_tables,
         uint      idx,
         uint      current_search_depth)
{
      for (JOIN_TAB **pos= join->best_ref + idx; *pos; pos++)
    {
      ......
         best_access_path(s, remaining_tables, idx, false,
                       idx ? (position-1)->prefix_rowcount : 1.0,
                       position);
......
  if (best_extension_by_limited_search(remaining_tables_after,
                                             idx + 1,
                                             current_search_depth - 1))
          ......
          backout_nj_state(remaining_tables, s);
......
}
}

以上代碼是在一個(gè)for循環(huán)中遞歸搜索,這是一個(gè)典型的全排列的算法。

02優(yōu)化器參數(shù)

MySQL的優(yōu)化器對(duì)于Oracle來(lái)說(shuō)還顯得比較幼稚。Oracle有著各種豐富的統(tǒng)計(jì)信息,比如系統(tǒng)統(tǒng)計(jì)信息,表統(tǒng)計(jì)信息,索引統(tǒng)計(jì)信息等,而MySQL則需要更多的常量,其中MySQL5.7提供了表mysql.server_cost和表mysql.engine_cost,可以供用戶配置,使得用戶能夠調(diào)整優(yōu)化器模型,下面就幾個(gè)常見(jiàn)而又非常重要的參數(shù)進(jìn)行介紹:

1 #define ROW_EVALUATE_COST 0.2f

  計(jì)算符合條件的行的代價(jià),行數(shù)越多,代價(jià)越大

2 #define IO_BLOCK_READ_COST 1.0f

  從磁盤(pán)讀取一個(gè)Page的代價(jià)

3 #define MEMORY_BLOCK_READ_COST 1.0f

 從內(nèi)存讀取一個(gè)Page的代價(jià),對(duì)于Innodb來(lái)說(shuō),表示從一個(gè)Buffer Pool讀取一個(gè)Page的代價(jià),因此讀取內(nèi)存頁(yè)和磁盤(pán)頁(yè)的默認(rèn)代價(jià)是一樣的

4 #define COND_FILTER_EQUALITY 0.1f

等值過(guò)濾條件默認(rèn)值為0.1,例如name = ‘lily’, 表大小為100,則返回10行數(shù)據(jù)

5 #define COND_FILTER_INEQUALITY 0.3333f

非等值過(guò)濾條件的默認(rèn)值是0.3333,例如col1>col2

6 #define COND_FILTER_BETWEEN 0.1111f

 Between過(guò)濾的默認(rèn)值是0.1111f,例如:col1 between a and b

......

這樣的常量很多,涉及到過(guò)濾條件、JOIN緩存、臨時(shí)表等等各種代價(jià),理解這些常量后,看到執(zhí)行計(jì)劃的Cost后,你會(huì)有種豁然開(kāi)朗的感覺(jué)!

03 優(yōu)化器選項(xiàng)

在MySQL中,執(zhí)行select @@optimizer_trace, 得到如下參數(shù):

index_merge=on,index_merge_union=off,index_merge_sort_union=off, index_merge_intersection=on, engine_condition_pushdown=on, index_condition_pushdown=on, mrr=on,mrr_cost_based=on,block_nested_loop=on,batched_key_access=off,materialization=on,semijoin=on,loosescan=on,firstmatch=on,subquery_materialization_cost_based=on, use_index_extensions=on, condition_fanout_filter=on

04 Optimize Trace是如何生成的?

在流程圖中的函數(shù)中,存在大量如下代碼:

Opt_trace_object trace_ls(trace, "searching_loose_scan_index");

因此,在優(yōu)化器運(yùn)行過(guò)程中,優(yōu)化器的執(zhí)行路徑也被保存在Opt_trace_object中,進(jìn)而保存在information_schema.optimizer_trace中,方便用戶查詢和跟蹤。

05 優(yōu)化器的典型使用場(chǎng)景

5.1 全表掃描

select * from sakila.actor;

表actor統(tǒng)計(jì)信息如下:

Db_name

Table_name

Last_update

n_rows

Cluster_index_size

Other_index

sakila

actor

2018-11-20 16:20:12

200

1

0

 

主鍵actor_id統(tǒng)計(jì)信息如下:

Index_name

Last_update

Stat_name

Stat_value

Sample_size

Stat_description

PRIMARY

2018-11-14 14:25:49

n_diff_pfx01

200

1

actor_id

PRIMARY

2018-11-14 14:25:49

n_leaf_pages

1

NULL

Number of leaf pages in the index

PRIMARY

2018-11-14 14:25:49

size

1

NULL

Number of pages in the index

執(zhí)行計(jì)劃:  

{                                                                                           
  "query_block": {                                                                                        
    "select_id": 1,                                                                       
    "cost_info": {                                                                                            
      "query_cost": "41.00"                                                                                    
    },                                                       
    "table": {                                                            
      "table_name": "actor",                                                                                    
      "access_type": "ALL",                                                                      
      "rows_examined_per_scan": 200,                                                                 
      "rows_produced_per_join": 200,                              
      "filtered": "100.00",                                                                  
      "cost_info": {                                                          
        "read_cost": "1.00",                                                                    
        "eval_cost": "40.00",                                                              
        "prefix_cost": "41.00",                                                               
        "data_read_per_join": "56K"  
      },                                                       
      "used_columns": [                                                                 
        "actor_id",                                                      
        "first_name",                                                            
        "last_name",                                                             
        "last_update",                                                            
        "id"                                                          
      ]                                                         
    }                                                                 
  }                                                                
}
IO_COST = CLUSTER_INDEX_SIZE * PAGE_READ_TIME = 1 * 1 =1;
EVAL_COST = TABLE_ROWS*EVALUATE_COST = 200 * 0.2 =40;
PREFIX_COST = IO_COST + EVAL_COST;

注意以上過(guò)程忽略了內(nèi)存頁(yè)和磁盤(pán)頁(yè)的訪問(wèn)代價(jià)差異。

 5.2 表連接時(shí)使用全表掃描

SELECT
  *
FROM
  sakila.actor a,
  sakila.film_actor b
WHERE a.actor_id = b.actor_id

Db_name

Table_name

Last_update

n_rows

Cluster_index_size

Other_index_size

Sakila

Film_actor

2018-11-20 16:55:31

5462

12

5

表film_actor中索引(actor_id,film_id)統(tǒng)計(jì)信息如下:

Index_name

Last_update

Stat_name

Stat_value

Sample_size

Stat_description

PRIMARY

2018-11-14 14:25:49

n_diff_pfx01

200

1

actor_id

PRIMARY

2018-11-14 14:25:49

n_diff_pfx02

5462

1

actor_id,film_id

PRIMARY

2018-11-14 14:25:49

n_leaf_pages

11

NULL

Number of leaf pages in the index

PRIMARY

2018-11-14 14:25:49

size

12

NULL

Number of pages in theindex

{                                      
  "query_block": {                   
    "select_id": 1,                    
    "cost_info": {                   
      "query_cost": "1338.07"             
    },                                
    "nested_loop": [                
      {                              
        "table": {       
          "table_name": "a",  
          "access_type": "ALL", 
          "possible_keys": [   
            "PRIMARY"            
          ],                  
          "rows_examined_per_scan": 200, 
          "rows_produced_per_join": 200, 
          "filtered": "100.00", 
          "cost_info": {       
            "read_cost": "1.00", 
            "eval_cost": "40.00", 
            "prefix_cost": "41.00",  
            "data_read_per_join": "54K"  
          },                     
          "used_columns": [     
            "actor_id",      
            "first_name",    
            "last_name",      
            "last_update"          
          ]                          
        }                              
      },                              
      {                            
        "table": {               
          "table_name": "b",     
          "access_type": "ref",  
          "possible_keys": [  
            "PRIMARY"               
          ],                     
          "key": "PRIMARY",       
          "used_key_parts": [     
            "actor_id"              
          ],                    
          "key_length": "2",     
          "ref": [            
            "sakila.a.actor_id"    
          ],                 
          "rows_examined_per_scan": 27,  
          "rows_produced_per_join": 5461, 
          "filtered": "100.00",     
          "cost_info": {          
            "read_cost": "204.67", 
            "eval_cost": "1092.40", 
            "prefix_cost": "1338.07", 
            "data_read_per_join": "85K" 
          }, 
          "used_columns": [  
            "actor_id",            
            "film_id",        
            "last_update"           
          ]                              
        }                               
      }                                 
    ]                                     
  }                                          
}

第一張表actor的全表掃代價(jià)為41,可以參考示例1。

        第二個(gè)表就是NET LOOP 代價(jià):

read_cost(204.67) =prefix_rowcount * (1 + keys_per_value/table_rows*cluster_index_size =

200 * (1+27/13863*12)*1

注意:27 相當(dāng)于對(duì)于每個(gè)actor_id,film_actor的索引估計(jì),對(duì)于每個(gè)actor_id,平均有27條記錄=5462/200

Table_rows是如何計(jì)算的呢?

Film_actor表的實(shí)際記錄數(shù)是5462,一共12個(gè)page,11個(gè)葉子頁(yè),總大小為11*16K(默認(rèn)頁(yè)大小)=180224Byte, 最小記錄長(zhǎng)度為26(通過(guò)計(jì)算字段長(zhǎng)度可得),13863 = 180224/26*2, 2是安全因子,做最差的代價(jià)估計(jì)。

表連接返回行數(shù)=200*5462/200,因此行估算代價(jià)為5462*0.2=1902.4

5.3 IN查詢

表film_actor中索引idx_id(film_id)統(tǒng)計(jì)信息如下:

Index_name

Last_update

Stat_name

Stat_value

Sample_size

Stat_description

idx_id

2018-11-14 14:25:49

n_diff_pfx01

997

4

actor_id

idx_id

2018-11-14 14:25:49

n_diff_pfx02

5462

4

film_id,actor_id

idx_id

2018-11-14 14:25:49

n_leaf_pages

4

NULL

Number of leaf pages in the index

idx_id

2018-11-14 14:25:49

size

5

NULL

Number of pages in the index

EXPLAIN SELECT * FROM ACTOR WHERE actor_id IN (SELECT film_id FROM film_actor)
{                                  
  "query_block": {              
    "select_id": 1,             
    "cost_info": {                  
      "query_cost": "460.79"    
    },                            
    "nested_loop": [               
      {                      
        "table": {        
          "table_name": "ACTOR", 
          "access_type": "ALL", 
          "possible_keys": [ 
            "PRIMARY"          
          ],               
          "rows_examined_per_scan": 200, 
          "rows_produced_per_join": 200, 
          "filtered": "100.00", 
          "cost_info": { 
            "read_cost": "1.00", 
            "eval_cost": "40.00", 
            "prefix_cost": "41.00", 
            "data_read_per_join": "56K" 
          },                 
          "used_columns": [ 
            "actor_id",  
            "first_name",   
            "last_name", 
            "last_update",   
            "id"                 
          ]                    
        }                         
      },                         
      {                        
        "table": {     
          "table_name": "film_actor",     
          "access_type": "ref", 
          "possible_keys": [ 
            "idx_id"            
          ],                
          "key": "idx_id", 
          "used_key_parts": [ 
            "film_id"           
          ],               
          "key_length": "2",   
          "ref": [      
            "sakila.ACTOR.actor_id" 
          ],           
          "rows_examined_per_scan": 5, 
          "rows_produced_per_join": 200, 
          "filtered": "100.00", 
          "using_index": true, 
          "first_match": "ACTOR", 
          "cost_info": { 
            "read_cost": "200.66", 
            "eval_cost": "40.00", 
            "prefix_cost": "460.79", 
            "data_read_per_join": "3K" 
          },                      
          "used_columns": [  
            "film_id"            
          ],                      
          "attached_condition": "(`sakila`.`actor`.`actor_id` = `sakila`.`film_actor`.`film_id`)"                                                                        
        }                            
      }                                
    ]                                     
  }                                       
}
id  select_type  table       partitions  type    possible_keys  key     key_len  ref                      rows  filtered  Extra                                        
------  -----------  ----------  ----------  ------  -------------  ------  -------  ---------------------  ------  --------  ---------------------------------------------
     1  SIMPLE       ACTOR       (NULL)      ALL     PRIMARY        (NULL)  (NULL)   (NULL)                    200    100.00  (NULL)                                       
     1  SIMPLE       film_actor  (NULL)      ref     idx_id         idx_id  2 
sakila.ACTOR.actor_id       5    100.00  Using where; Using index; FirstMatch(ACTOR)

從執(zhí)行計(jì)劃中可以看出,MySQL采用FirstMatch方式。在MySQL中,半鏈接優(yōu)化方式為:Materialization Strategy,LooseScan,F(xiàn)irstMatch,DuplicateWeedout,默認(rèn)情況下四種優(yōu)化方式都是存在的,選取方式基于最小COST?,F(xiàn)在我們以FirstMatch為例,講解優(yōu)化器的執(zhí)行流程。

 SQL如下:

  select * from Country
where Country.code IN (select City.Country
                       from City
                       where City.Population > 1*1000*1000)
      and Country.continent='Europe'

深入解析:從源碼窺探MySQL優(yōu)化器

從上圖可以看出,F(xiàn)irstMatch是通過(guò)判斷記錄是否已經(jīng)在結(jié)果集中存在來(lái)減少查詢和匹配流程。

表actor的訪問(wèn)代價(jià)可以參考示例1.

表film_actor表的訪問(wèn)代價(jià)200.66是如何計(jì)算的呢?

訪問(wèn)表film_actor中索引字段film_id,MySQL會(huì)走覆蓋索引掃,即IDEX_ONLY_SCAN,一次索引訪問(wèn)的代價(jià)是如何計(jì)算的呢?

參考函數(shù)double handler::index_only_read_time(uint keynr, double records)

索引塊大小為16K,并且MySQL假設(shè)塊都是半滿的,則一個(gè)塊能夠存放的索引記錄數(shù)為:

16K/2/(索引長(zhǎng)度+主鍵長(zhǎng)度(注:二級(jí)索引存儲(chǔ)的是主鍵的引用))=16K/2/(2+4)+1=1366,

其中主鍵為(actor_id,film_id),兩個(gè)字段都是smallint,占用4個(gè)字節(jié),而索引idx_id(film_id)是2個(gè)字節(jié),因此每次訪問(wèn)索引的代價(jià)為:(5.47+1366-1)/1366 = 1.0032, 訪問(wèn)film_actor表一共需要200次,總訪問(wèn)代價(jià)為:200*1.0032=200.66

總代價(jià)460.79 = 表actor的訪問(wèn)代價(jià)+表film_actor訪問(wèn)代價(jià)+行估算代價(jià)=

41+200.66+200*1*5.47*1*02,其中兩個(gè)1分別表示過(guò)濾因子,由于兩個(gè)表均沒(méi)有過(guò)濾條件因此過(guò)濾因子都是1。


當(dāng)前題目:深入解析:從源碼窺探MySQL優(yōu)化器
當(dāng)前網(wǎng)址:http://weahome.cn/article/jgieoe.html

其他資訊

在線咨詢

微信咨詢

電話咨詢

028-86922220(工作日)

18980820575(7×24)

提交需求

返回頂部