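This section briefly walks through how PostgreSQL processes a manually issued VACUUM. It focuses on the implementation of heap_execute_freeze_tuple, reached through the call chain ExecVacuum->vacuum->vacuum_rel->heap_vacuum_rel->lazy_scan_heap->heap_execute_freeze_tuple. This function performs the actual tuple freeze once the preparation work has been done.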
Macro definitions
Vacuum and Analyze command options
/* ----------------------
* Vacuum and Analyze Statements
*
* Even though these are nominally two statements, it's convenient to use
* just one node type for both. Note that at least one of VACOPT_VACUUM
* and VACOPT_ANALYZE must be set in options.
* ----------------------
*/
typedef enum VacuumOption
{
VACOPT_VACUUM = 1 << 0, /* do VACUUM */
VACOPT_ANALYZE = 1 << 1, /* do ANALYZE */
VACOPT_VERBOSE = 1 << 2, /* print progress info */
VACOPT_FREEZE = 1 << 3, /* FREEZE option */
VACOPT_FULL = 1 << 4, /* FULL (non-concurrent) vacuum */
VACOPT_SKIP_LOCKED = 1 << 5, /* skip if cannot get lock */
VACOPT_SKIPTOAST = 1 << 6, /* don't process the TOAST table, if any */
VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7 /* don't skip any pages */
} VacuumOption;
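Because each option occupies its own bit, a set of options is just an integer built with bitwise OR and tested with bitwise AND. The following is a minimal sketch of that idea. The enum is copied from the listing above so the snippet compiles on its own, and the helpers `vacopt_has` and `vacopt_is_valid` are hypothetical names introduced here for illustration, not PostgreSQL functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Copy of the enum shown above, so that this sketch is self-contained. */
typedef enum VacuumOption
{
    VACOPT_VACUUM = 1 << 0,
    VACOPT_ANALYZE = 1 << 1,
    VACOPT_VERBOSE = 1 << 2,
    VACOPT_FREEZE = 1 << 3,
    VACOPT_FULL = 1 << 4,
    VACOPT_SKIP_LOCKED = 1 << 5,
    VACOPT_SKIPTOAST = 1 << 6,
    VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7
} VacuumOption;

/* Hypothetical helper: true if every bit in "wanted" is set in "options". */
bool
vacopt_has(int options, int wanted)
{
    return (options & wanted) == wanted;
}

/*
 * Hypothetical helper: the header comment requires that at least one of
 * VACOPT_VACUUM and VACOPT_ANALYZE is set.
 */
bool
vacopt_is_valid(int options)
{
    return (options & (VACOPT_VACUUM | VACOPT_ANALYZE)) != 0;
}
```

For instance, `VACOPT_VACUUM | VACOPT_FREEZE | VACOPT_VERBOSE` describes a verbose freezing vacuum, and `vacopt_has(opts, VACOPT_FREEZE)` reports whether freezing was requested.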
HeapTupleHeaderData
The heap tuple header. To avoid wasting space, its fields are laid out so that no alignment padding is needed.
/*
* Heap tuple header. To avoid wasting space, the fields should be
* laid out in such a way as to avoid structure padding.
*
* Datums of composite types (row types) share the same general structure
* as on-disk tuples, so that the same routines can be used to build and
* examine them. However the requirements are slightly different: a Datum
* does not need any transaction visibility information, and it does need
* a length word and some embedded type information. We can achieve this
* by overlaying the xmin/cmin/xmax/cmax/xvac fields of a heap tuple
* with the fields needed in the Datum case. Typically, all tuples built
* in-memory will be initialized with the Datum fields; but when a tuple is
* about to be inserted in a table, the transaction fields will be filled,
* overwriting the datum fields.
*
* The overall structure of a heap tuple looks like:
* fixed fields (HeapTupleHeaderData struct)
* nulls bitmap (if HEAP_HASNULL is set in t_infomask)
* alignment padding (as needed to make user data MAXALIGN'd)
* object ID (if HEAP_HASOID_OLD is set in t_infomask, not created
* anymore)
* user data fields
*
* We store five "virtual" fields Xmin, Cmin, Xmax, Cmax, and Xvac in three
* physical fields. Xmin and Xmax are always really stored, but Cmin, Cmax
* and Xvac share a field. This works because we know that Cmin and Cmax
* are only interesting for the lifetime of the inserting and deleting
* transaction respectively. If a tuple is inserted and deleted in the same
* transaction, we store a "combo" command id that can be mapped to the real
* cmin and cmax, but only by use of local state within the originating
* backend. See combocid.c for more details. Meanwhile, Xvac is only set by
* old-style VACUUM FULL, which does not have any command sub-structure and so
* does not need either Cmin or Cmax. (This requires that old-style VACUUM
* FULL never try to move a tuple whose Cmin or Cmax is still interesting,
* ie, an insert-in-progress or delete-in-progress tuple.)
*
* A word about t_ctid: whenever a new tuple is stored on disk, its t_ctid
* is initialized with its own TID (location). If the tuple is ever updated,
* its t_ctid is changed to point to the replacement version of the tuple. Or
* if the tuple is moved from one partition to another, due to an update of
* the partition key, t_ctid is set to a special value to indicate that
* (see ItemPointerSetMovedPartitions). Thus, a tuple is the latest version
* of its row iff XMAX is invalid or
* t_ctid points to itself (in which case, if XMAX is valid, the tuple is
* either locked or deleted). One can follow the chain of t_ctid links
* to find the newest version of the row, unless it was moved to a different
* partition. Beware however that VACUUM might
* erase the pointed-to (newer) tuple before erasing the pointing (older)
* tuple. Hence, when following a t_ctid link, it is necessary to check
* to see if the referenced slot is empty or contains an unrelated tuple.
* Check that the referenced tuple has XMIN equal to the referencing tuple's
* XMAX to verify that it is actually the descendant version and not an
* unrelated tuple stored into a slot recently freed by VACUUM. If either
* check fails, one may assume that there is no live descendant version.
*
* t_ctid is sometimes used to store a speculative insertion token, instead
* of a real TID. A speculative token is set on a tuple that's being
* inserted, until the inserter is sure that it wants to go ahead with the
* insertion. Hence a token should only be seen on a tuple with an XMAX
* that's still in-progress, or invalid/aborted. The token is replaced with
* the tuple's real TID when the insertion is confirmed. One should never
* see a speculative insertion token while following a chain of t_ctid links,
* because they are not used on updates, only insertions.
*
* Following the fixed header fields, the nulls bitmap is stored (beginning
* at t_bits). The bitmap is *not* stored if t_infomask shows that there
* are no nulls in the tuple. If an OID field is present (as indicated by
* t_infomask), then it is stored just before the user data, which begins at
* the offset shown by t_hoff. Note that t_hoff must be a multiple of
* MAXALIGN.
*/
typedef struct HeapTupleFields
{
TransactionId t_xmin; /* inserting xact ID */
TransactionId t_xmax; /* deleting or locking xact ID */
union
{
CommandId t_cid; /* inserting or deleting command ID, or both */
TransactionId t_xvac; /* old-style VACUUM FULL xact ID */
} t_field3; /* union: t_cid and t_xvac share storage */
} HeapTupleFields; /* transaction-related header fields */
typedef struct DatumTupleFields
{
int32 datum_len_; /* varlena header (do not touch directly!) */
int32 datum_typmod; /* -1, or identifier of a record type */
Oid datum_typeid; /* composite type OID, or RECORDOID */
/*
* datum_typeid cannot be a domain over composite, only plain composite,
* even if the datum is meant as a value of a domain-over-composite type.
* This is in line with the general principle that CoerceToDomain does not
* change the physical representation of the base type value.
*
* Note: field ordering is chosen with thought that Oid might someday
* widen to 64 bits.
*/
} DatumTupleFields;
struct HeapTupleHeaderData
{
union
{
HeapTupleFields t_heap;
DatumTupleFields t_datum;
} t_choice;
ItemPointerData t_ctid; /* current TID of this or newer tuple (or a
* speculative insertion token) */
/* Fields below here must match MinimalTupleData! */
#define FIELDNO_HEAPTUPLEHEADERDATA_INFOMASK2 2
uint16 t_infomask2; /* number of attributes + various flags */
#define FIELDNO_HEAPTUPLEHEADERDATA_INFOMASK 3
uint16 t_infomask; /* various flag bits, see below */
#define FIELDNO_HEAPTUPLEHEADERDATA_HOFF 4
uint8 t_hoff; /* sizeof header incl. bitmap, padding */
/* ^ - 23 bytes - ^ */
#define FIELDNO_HEAPTUPLEHEADERDATA_BITS 5
bits8 t_bits[FLEXIBLE_ARRAY_MEMBER]; /* bitmap of NULLs */
/* MORE DATA FOLLOWS AT END OF STRUCT */
};
typedef HeapTupleHeaderData* HeapTupleHeader;
/*
Expanded structure layout:
Field        Type             Length   Offset  Description
t_xmin       TransactionId    4 bytes  0       insert XID stamp
t_xmax       TransactionId    4 bytes  4       delete XID stamp
t_cid        CommandId        4 bytes  8       insert and/or delete CID stamp (overlays with t_xvac)
t_xvac       TransactionId    4 bytes  8       XID for VACUUM operation moving a row version
t_ctid       ItemPointerData  6 bytes  12      current TID of this or newer row version
t_infomask2  uint16           2 bytes  18      number of attributes, plus various flag bits
t_infomask   uint16           2 bytes  20      various flag bits
t_hoff       uint8            1 byte   22      offset to user data
Note: t_cid and t_xvac form a union and share the same storage.
*/
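The offsets in the layout table can be checked with `offsetof`. The sketch below rebuilds the header from stand-in types so it compiles without the PostgreSQL headers. The name `MockHeapTupleHeader`, the stand-in typedefs, and the helper `layout_matches_table` are assumptions made for this illustration, and the result assumes a typical ABI where `uint32_t` is 4-byte aligned and `uint16_t` is 2-byte aligned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in typedefs; the real ones come from PostgreSQL's headers. */
typedef uint32_t TransactionId;
typedef uint32_t CommandId;
typedef uint32_t Oid;

typedef struct { uint16_t bi_hi; uint16_t bi_lo; } BlockIdData;
typedef struct { BlockIdData ip_blkid; uint16_t ip_posid; } ItemPointerData;

typedef struct
{
    TransactionId t_xmin;
    TransactionId t_xmax;
    union { CommandId t_cid; TransactionId t_xvac; } t_field3;
} HeapTupleFields;

typedef struct
{
    int32_t datum_len_;
    int32_t datum_typmod;
    Oid     datum_typeid;
} DatumTupleFields;

/* Mock of HeapTupleHeaderData with the same field order. */
typedef struct
{
    union { HeapTupleFields t_heap; DatumTupleFields t_datum; } t_choice;
    ItemPointerData t_ctid;
    uint16_t t_infomask2;
    uint16_t t_infomask;
    uint8_t  t_hoff;
    uint8_t  t_bits[];          /* flexible array member: NULL bitmap */
} MockHeapTupleHeader;

/* Hypothetical helper: true if the mock layout agrees with the table. */
bool
layout_matches_table(void)
{
    return offsetof(HeapTupleFields, t_xmin) == 0 &&
           offsetof(HeapTupleFields, t_xmax) == 4 &&
           offsetof(HeapTupleFields, t_field3) == 8 &&
           offsetof(MockHeapTupleHeader, t_ctid) == 12 &&
           offsetof(MockHeapTupleHeader, t_infomask2) == 18 &&
           offsetof(MockHeapTupleHeader, t_infomask) == 20 &&
           offsetof(MockHeapTupleHeader, t_hoff) == 22 &&
           offsetof(MockHeapTupleHeader, t_bits) == 23;
}
```

The last check corresponds to the "23 bytes" marker in the struct: the fixed part of the header ends at byte 23, which is where the NULL bitmap begins.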
// Example: t_infomask = 0x0802 (decimal 2050, binary 100000000010)
// t_infomask flag bits; each comment starts with the value in binary
#define HEAP_HASNULL 0x0001 /* 0000000000000001: has null attribute(s) */
#define HEAP_HASVARWIDTH 0x0002 /* 0000000000000010: has variable-width attribute(s) */
#define HEAP_HASEXTERNAL 0x0004 /* 0000000000000100: has external stored attribute(s) */
#define HEAP_HASOID 0x0008 /* 0000000000001000: has an object-id field
 * (named HEAP_HASOID_OLD in PostgreSQL 12 and later) */
#define HEAP_XMAX_KEYSHR_LOCK 0x0010 /* 0000000000010000: xmax is a key-shared locker */
#define HEAP_COMBOCID 0x0020 /* 0000000000100000: t_cid is a combo cid */
#define HEAP_XMAX_EXCL_LOCK 0x0040 /* 0000000001000000: xmax is exclusive locker */
#define HEAP_XMAX_LOCK_ONLY 0x0080 /* 0000000010000000: xmax, if valid, is only a locker */
/* xmax is a shared locker */
#define HEAP_XMAX_SHR_LOCK (HEAP_XMAX_EXCL_LOCK | HEAP_XMAX_KEYSHR_LOCK)
#define HEAP_LOCK_MASK (HEAP_XMAX_SHR_LOCK | HEAP_XMAX_EXCL_LOCK | \
HEAP_XMAX_KEYSHR_LOCK)
#define HEAP_XMIN_COMMITTED 0x0100 /* 0000000100000000: t_xmin committed */
#define HEAP_XMIN_INVALID 0x0200 /* 0000001000000000: t_xmin invalid/aborted */
#define HEAP_XMIN_FROZEN (HEAP_XMIN_COMMITTED|HEAP_XMIN_INVALID)
#define HEAP_XMAX_COMMITTED 0x0400 /* 0000010000000000: t_xmax committed */
#define HEAP_XMAX_INVALID 0x0800 /* 0000100000000000: t_xmax invalid/aborted */
#define HEAP_XMAX_IS_MULTI 0x1000 /* 0001000000000000: t_xmax is a MultiXactId */
#define HEAP_UPDATED 0x2000 /* 0010000000000000: this is UPDATEd version of row */
#define HEAP_MOVED_OFF 0x4000 /* 0100000000000000: moved to another place by pre-9.0
 * VACUUM FULL; kept for binary
 * upgrade support */
#define HEAP_MOVED_IN 0x8000 /* 1000000000000000: moved from another place by pre-9.0
 * VACUUM FULL; kept for binary
 * upgrade support */
#define HEAP_MOVED (HEAP_MOVED_OFF | HEAP_MOVED_IN)
#define HEAP_XACT_MASK 0xFFF0 /* 1111111111110000: visibility-related bits */
// 0x0802 is binary 100000000010: bits 2 and 12 (counting from 1 at the low end) are set,
// meaning the tuple has variable-width attributes (HEAP_HASVARWIDTH) and its XMAX is invalid (HEAP_XMAX_INVALID)
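Decoding a t_infomask value by hand gets tedious, so here is a minimal sketch that does it with a lookup table. The `InfomaskFlag` type and the `describe_infomask` function are hypothetical helpers written for this article, and the table contains only the single-bit flags listed above, using the names from that listing.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* One entry per single-bit t_infomask flag, in ascending bit order. */
typedef struct { uint16_t bit; const char *name; } InfomaskFlag;

static const InfomaskFlag infomask_flags[] = {
    {0x0001, "HEAP_HASNULL"},
    {0x0002, "HEAP_HASVARWIDTH"},
    {0x0004, "HEAP_HASEXTERNAL"},
    {0x0008, "HEAP_HASOID"},
    {0x0010, "HEAP_XMAX_KEYSHR_LOCK"},
    {0x0020, "HEAP_COMBOCID"},
    {0x0040, "HEAP_XMAX_EXCL_LOCK"},
    {0x0080, "HEAP_XMAX_LOCK_ONLY"},
    {0x0100, "HEAP_XMIN_COMMITTED"},
    {0x0200, "HEAP_XMIN_INVALID"},
    {0x0400, "HEAP_XMAX_COMMITTED"},
    {0x0800, "HEAP_XMAX_INVALID"},
    {0x1000, "HEAP_XMAX_IS_MULTI"},
    {0x2000, "HEAP_UPDATED"},
    {0x4000, "HEAP_MOVED_OFF"},
    {0x8000, "HEAP_MOVED_IN"},
};

/*
 * Hypothetical helper: write the space-separated names of all set flags
 * into buf and return how many names were written. Stops early if buf
 * is too small; buf is always NUL-terminated when buflen > 0.
 */
int
describe_infomask(uint16_t infomask, char *buf, size_t buflen)
{
    int     count = 0;
    size_t  used = 0;
    size_t  nflags = sizeof(infomask_flags) / sizeof(infomask_flags[0]);

    if (buflen == 0)
        return 0;
    buf[0] = '\0';

    for (size_t i = 0; i < nflags; i++)
    {
        if (infomask & infomask_flags[i].bit)
        {
            int n = snprintf(buf + used, buflen - used, "%s%s",
                             count > 0 ? " " : "", infomask_flags[i].name);

            if (n < 0 || (size_t) n >= buflen - used)
                break;          /* out of space */
            used += (size_t) n;
            count++;
        }
    }
    return count;
}
```

Called with 0x0802, the helper reports exactly the two flags discussed above, HEAP_HASVARWIDTH and HEAP_XMAX_INVALID.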
/*
* information stored in t_infomask2:
*/
#define HEAP_NATTS_MASK 0x07FF /* 11 bits for number of attributes */
/* bits 0x1800 are available */
#define HEAP_KEYS_UPDATED 0x2000 /* tuple was updated and key cols
* modified, or tuple deleted */
#define HEAP_HOT_UPDATED 0x4000 /* tuple was HOT-updated */
#define HEAP_ONLY_TUPLE 0x8000 /* this is heap-only tuple */
#define HEAP2_XACT_MASK 0xE000 /* visibility-related bits */
// Binary forms of the hexadecimal values:
//   HEAP_NATTS_MASK        0x07FF  0000011111111111
//   HEAP_KEYS_UPDATED      0x2000  0010000000000000
//   HEAP_HOT_UPDATED       0x4000  0100000000000000
//   HEAP_ONLY_TUPLE        0x8000  1000000000000000
//   HEAP2_XACT_MASK        0xE000  1110000000000000
//   SpecTokenOffsetNumber  0xfffe  1111111111111110
// The low 11 bits hold the attribute count, so a t_infomask2 value of 3 means the tuple has 3 attributes (columns)
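Splitting t_infomask2 into the attribute count and the flag bits is a matter of masking. The sketch below shows this with three small functions. They are hypothetical helpers named for this illustration (PostgreSQL itself does the same masking through macros such as HeapTupleHeaderGetNatts), and the mask values are copied from the listing above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the t_infomask2 definitions above. */
#define HEAP_NATTS_MASK     0x07FF
#define HEAP_KEYS_UPDATED   0x2000
#define HEAP_HOT_UPDATED    0x4000
#define HEAP_ONLY_TUPLE     0x8000

/* Number of attributes: the low 11 bits of t_infomask2. */
int
infomask2_natts(uint16_t infomask2)
{
    return infomask2 & HEAP_NATTS_MASK;
}

/* True if the tuple was HOT-updated. */
bool
infomask2_is_hot_updated(uint16_t infomask2)
{
    return (infomask2 & HEAP_HOT_UPDATED) != 0;
}

/* True if the tuple is a heap-only tuple. */
bool
infomask2_is_heap_only(uint16_t infomask2)
{
    return (infomask2 & HEAP_ONLY_TUPLE) != 0;
}
```

A value of 0x4003, for example, describes a HOT-updated tuple with three attributes: the flag bits sit above the 11-bit count and do not disturb it.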
xl_heap_freeze_tuple
xl_heap_freeze_tuple represents a 'freeze plan': the information needed to freeze a single tuple during vacuum.
/*
* This struct represents a 'freeze plan', which is what we need to know about
* a single tuple being frozen during vacuum.
*/
/* 0x01 was XLH_FREEZE_XMIN */
#define XLH_FREEZE_XVAC 0x02
#define XLH_INVALID_XVAC 0x04
typedef struct xl_heap_freeze_tuple
{
TransactionId xmax;
OffsetNumber offset;
uint16 t_infomask2;
uint16 t_infomask;
uint8 frzflags;
} xl_heap_freeze_tuple;
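To make the idea of a plan concrete, here is a simplified sketch of how one might be filled in for the most common case, marking xmin as frozen. It sets both HEAP_XMIN_COMMITTED and HEAP_XMIN_INVALID, which together form HEAP_XMIN_FROZEN. The function `make_xmin_freeze_plan` is a hypothetical helper for illustration only. In PostgreSQL the plan is built by the preparation step (heap_prepare_freeze_tuple), which handles many more cases, such as xmax, MultiXactIds and xvac.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in typedefs and constants so the sketch compiles on its own. */
typedef uint32_t TransactionId;
typedef uint16_t OffsetNumber;

#define HEAP_XMIN_COMMITTED 0x0100
#define HEAP_XMIN_INVALID   0x0200
#define HEAP_XMIN_FROZEN    (HEAP_XMIN_COMMITTED | HEAP_XMIN_INVALID)

/* Same fields as the struct shown above. */
typedef struct xl_heap_freeze_tuple
{
    TransactionId xmax;
    OffsetNumber  offset;
    uint16_t      t_infomask2;
    uint16_t      t_infomask;
    uint8_t       frzflags;
} xl_heap_freeze_tuple;

/*
 * Hypothetical helper: build a plan that freezes only xmin. The tuple's
 * current xmax and t_infomask2 are carried over unchanged, and no xvac
 * handling is requested (frzflags = 0).
 */
xl_heap_freeze_tuple
make_xmin_freeze_plan(OffsetNumber offset, TransactionId xmax,
                      uint16_t infomask, uint16_t infomask2)
{
    xl_heap_freeze_tuple frz;

    frz.xmax = xmax;
    frz.offset = offset;
    frz.t_infomask2 = infomask2;
    frz.t_infomask = infomask | HEAP_XMIN_FROZEN;
    frz.frzflags = 0;
    return frz;
}
```

Note that the plan records the complete new infomask words rather than a list of bits to flip, which is why executing it later is a plain assignment.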
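heap_execute_freeze_tuple performs the actual tuple freeze once the plan has been prepared. The logic is simple: it sets xmax, sets xvac to FrozenTransactionId or InvalidTransactionId when the plan's frzflags request it, and overwrites t_infomask and t_infomask2 with the values stored in the plan.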
/*
* heap_execute_freeze_tuple
* Execute the prepared freezing of a tuple.
*
* Caller is responsible for ensuring that no other backend can access the
* storage underlying this tuple, either by holding an exclusive lock on the
* buffer containing it (which is what lazy VACUUM does), or by having it be
* in private storage (which is what CLUSTER and friends do).
*
* Note: it might seem we could make the changes without exclusive lock, since
* TransactionId read/write is assumed atomic anyway. However there is a race
* condition: someone who just fetched an old XID that we overwrite here could
* conceivably not finish checking the XID against pg_xact before we finish
* the VACUUM and perhaps truncate off the part of pg_xact he needs. Getting
* exclusive lock ensures no other backend is in process of checking the
* tuple status. Also, getting exclusive lock makes it safe to adjust the
* infomask bits.
*
* NB: All code in here must be safe to execute during crash recovery!
*/
void
heap_execute_freeze_tuple(HeapTupleHeader tuple, xl_heap_freeze_tuple *frz)
{
HeapTupleHeaderSetXmax(tuple, frz->xmax);
if (frz->frzflags & XLH_FREEZE_XVAC)
HeapTupleHeaderSetXvac(tuple, FrozenTransactionId);
if (frz->frzflags & XLH_INVALID_XVAC)
HeapTupleHeaderSetXvac(tuple, InvalidTransactionId);
tuple->t_infomask = frz->t_infomask;
tuple->t_infomask2 = frz->t_infomask2;
}
// Set the tuple's xmax value
#define HeapTupleHeaderSetXmax(tup, xid) \
( \
(tup)->t_choice.t_heap.t_xmax = (xid) \
)
// Set the tuple's xvac value (asserts that one of the HEAP_MOVED bits is set)
#define HeapTupleHeaderSetXvac(tup, xid) \
do { \
Assert((tup)->t_infomask & HEAP_MOVED); \
(tup)->t_choice.t_heap.t_field3.t_xvac = (xid); \
} while (0)
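The effect of executing a plan can be tried outside the server with mock types. The sketch below mirrors the function body above on a flattened header. `MockTupleHeader`, `MockFreezePlan` and `mock_execute_freeze` are hypothetical names introduced for this illustration. The mock keeps xvac in its own field, whereas the real header overlays it with t_cid.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t TransactionId;

#define InvalidTransactionId ((TransactionId) 0)
#define FrozenTransactionId  ((TransactionId) 2)

#define HEAP_MOVED_OFF   0x4000
#define HEAP_MOVED_IN    0x8000
#define HEAP_MOVED       (HEAP_MOVED_OFF | HEAP_MOVED_IN)

#define XLH_FREEZE_XVAC  0x02
#define XLH_INVALID_XVAC 0x04

/* Flattened stand-in for the transaction fields of the tuple header. */
typedef struct
{
    TransactionId t_xmin;
    TransactionId t_xmax;
    TransactionId t_xvac;       /* shares storage with t_cid in the real header */
    uint16_t      t_infomask2;
    uint16_t      t_infomask;
} MockTupleHeader;

/* Stand-in for xl_heap_freeze_tuple (offset omitted: not used here). */
typedef struct
{
    TransactionId xmax;
    uint16_t      t_infomask2;
    uint16_t      t_infomask;
    uint8_t       frzflags;
} MockFreezePlan;

/* Mirrors heap_execute_freeze_tuple on the mock types. */
void
mock_execute_freeze(MockTupleHeader *tuple, const MockFreezePlan *frz)
{
    tuple->t_xmax = frz->xmax;

    if (frz->frzflags & XLH_FREEZE_XVAC)
    {
        assert(tuple->t_infomask & HEAP_MOVED);
        tuple->t_xvac = FrozenTransactionId;
    }
    if (frz->frzflags & XLH_INVALID_XVAC)
    {
        assert(tuple->t_infomask & HEAP_MOVED);
        tuple->t_xvac = InvalidTransactionId;
    }

    /* The infomask words are replaced wholesale with the planned values. */
    tuple->t_infomask = frz->t_infomask;
    tuple->t_infomask2 = frz->t_infomask2;
}
```

One detail this makes visible: the function never writes t_xmin. In this code path a frozen xmin is expressed only through the HEAP_XMIN_FROZEN bits carried in the plan's t_infomask, so the original xmin value stays in the tuple.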
PG Source Code