報錯如下MySQL不能正常啟動
成都創(chuàng)新互聯(lián)公司專注于梨林企業(yè)網(wǎng)站建設,成都響應式網(wǎng)站建設公司,商城網(wǎng)站制作。梨林網(wǎng)站建設公司,為梨林等地區(qū)提供建站服務。全流程按需定制,專業(yè)設計,全程項目跟蹤,成都創(chuàng)新互聯(lián)公司專業(yè)和態(tài)度為您提供的服務
2017-09-22 10:39:05 21409 [Note] InnoDB: Database was not shutdown normally! 2017-09-22 10:39:05 21409 [Note] InnoDB: Starting crash recovery. 2017-09-22 10:39:05 21409 [Note] InnoDB: Reading tablespace information from the .ibd files... 2017-09-22 10:39:05 21409 [ERROR] InnoDB: Space id in fsp header 1416128883,but in the page header 824195850
我曾經(jīng)寫過這樣一個文章如下:
InnoDB: Error: space id and page n:o stored in the page?
http://blog.itpub.net/7728585/viewspace-2121548/
但是上面的文章只是人為模擬,實際上在生產(chǎn)情況中嚴重得多,基本是塊的物理順壞,今天又有網(wǎng)友問我這個問題,實際上遇到這種問題一般都涉及到塊的物理損壞了,修復的可能性并不大,我們來看看源碼拋錯位置:
/**********************************************************************//** Reads the space id from the first page of a tablespace. @return space id, ULINT UNDEFINED if error */ ulint fsp_header_get_space_id( /*====================*/ const page_t* page) /*!< in: first page of a tablespace */ { ulint fsp_id; ulint id; fsp_id = mach_read_from_4(FSP_HEADER_OFFSET + page + FSP_SPACE_ID); id = mach_read_from_4(page + FIL_PAGE_ARCH_LOG_NO_OR_SPACE_ID); DBUG_EXECUTE_IF("fsp_header_get_space_id_failure", id = ULINT_UNDEFINED;); if (id != fsp_id) { ib::error() << "Space ID in fsp header is " << fsp_id << ", but in the page header it is " << id << "."; return(ULINT_UNDEFINED); } return(id); }
File space header是每一個表空間的第一個塊才會有的,存儲的信息如下:
/* SPACE HEADER ============ File space header data structure: this data structure is contained in the first page of a space. The space for this header is reserved in every extent descriptor page, but used only in the first. */ /*-------------------------------------*/ #define FSP_SPACE_ID 0 /* space id */ #define FSP_NOT_USED 4 /* this field contained a value up to which we know that the modifications in the database have been flushed to the file space; not used now */ #define FSP_SIZE 8 /* Current size of the space in pages */ #define FSP_FREE_LIMIT 12 /* Minimum page number for which the free list has not been initialized: the pages >= this limit are, by definition, free; note that in a single-table tablespace where size < 64 pages, this number is 64, i.e., we have initialized the space about the first extent, but have not physically allocated those pages to the file */ #define FSP_SPACE_FLAGS 16 /* fsp_space_t.flags, similar to dict_table_t::flags */ #define FSP_FRAG_N_USED 20 /* number of used pages in the FSP_FREE_FRAG list */ #define FSP_FREE 24 /* list of free extents */ #define FSP_FREE_FRAG (24 + FLST_BASE_NODE_SIZE) /* list of partially free extents not belonging to any segment */ #define FSP_FULL_FRAG (24 + 2 * FLST_BASE_NODE_SIZE) /* list of full extents not belonging to any segment */ #define FSP_SEG_ID (24 + 3 * FLST_BASE_NODE_SIZE) /* 8 bytes which give the first unused segment id */ #define FSP_SEG_INODES_FULL (32 + 3 * FLST_BASE_NODE_SIZE) /* list of pages containing segment headers, where all the segment inode slots are reserved */ #define FSP_SEG_INODES_FREE (32 + 4 * FLST_BASE_NODE_SIZE) /* list of pages containing segment headers, where not all the segment header slots are reserved */
那么這個錯誤簡單的說就是某個表空間ibd文件的第一個塊的34后4字節(jié)和38字節(jié)后4個字節(jié)不能通過檢測,但是要注意這僅僅是這8個字節(jié)一個檢測。實際上這種錯誤基本對應著表空間文件的物理順壞,一般來說其他字節(jié)也出現(xiàn)了問題,如果沒有備份或者其他雙機手段可能要導致這個表空間的數(shù)據(jù)丟失。好了我們回到案例。
隨即我問這位朋友是否是獨立表空間,他說是的,然后我們就需要找到到底哪個表空間出了問題。我曾經(jīng)有一個讀取二進制文件的工具,放到百度云盤:
http://pan.baidu.com/s/1num76RJ
可以直接讀取這樣的信息如下:
****************************************************************** This Tool Is Uesed For Find The Data In Binary format(Hexadecimal) Usage:./bcview file blocksize offset cnt-bytes! file: Is Your File Will To Find Data! blocksize: Is N kb Block.Eg: 8 Is 8 Kb Blocksize(Oracle)! Eg: 16 Is 16 Kb Blocksize(Innodb)! offset:Is Every Block Offset Your Want Start! cnt-bytes:Is After Offset,How Bytes Your Want Gets! Edtor QQ:22389860! Used gcc version 4.1.2 20080704 (Red Hat 4.1.2-46) ****************************************************************** ----Current file size is :0.109375 Mb ----Current use set blockszie is 16 Kb ----Current file name is t6.ibd current block:00000000--Offset:00036--cnt bytes:08--data is:001e0000001e0000
我們可以得到數(shù)據(jù)
current block:00000000--Offset:00036--cnt bytes:08--data is:001e0000001e0000
但是要掃描全部的ibd文件才能找到是哪個表空間檢測出錯,但是這個哥們shell也是比較好,寫了一個如下的shell來完成:
find /opt/app/mysql5/var/ -iname \*.ibd > a for i in `cat ./a`;do ./bcview $i 16 34 8 | head -15 >> a.log;done
這樣將所有表空間的第一個塊的信息的34-42字節(jié)信息都提取出來了,我隨即查看了一下,找到了報錯的表空間:
current block:00000000--Offset:00034--cnt bytes:08--data is:31203b0a54686973
顯然他們是不相等,正常情況下這8個字節(jié)4字節(jié)4字節(jié)比較是相同的,然后我查看了這個文件中塊的checksum也是不相等,這個時候只有2個正常的處理方式:
顯然這里也再次重申了備份的重要性,遇到這樣的錯誤,最安全的方式還是進行數(shù)據(jù)恢復,當然某些工具可以直接提取數(shù)據(jù)文件里面的數(shù)據(jù),但是現(xiàn)有的版本支持并不是太好,而且風險也是特別大。
作者微信