PostgreSQL 源码解读(127)- MVCC#11(vacuum过程-vacuum_rel函数)
本节简单介绍了PostgreSQL手工执行vacuum的主处理流程,主要分析了ExecVacuum->vacuum->vacuum_rel函数的实现逻辑。
一、数据结构
宏定义
Vacuum和Analyze命令选项
/* ---------------------- * Vacuum and Analyze Statements * Vacuum和Analyze命令选项 * * Even though these are nominally two statements, it's convenient to use * just one node type for both. Note that at least one of VACOPT_VACUUM * and VACOPT_ANALYZE must be set in options. * 虽然在这里有两种不同的语句,但只需要使用统一的Node类型即可. * 注意至少VACOPT_VACUUM/VACOPT_ANALYZE在选项中设置. * ---------------------- */typedef enum VacuumOption{ VACOPT_VACUUM = 1 << 0, /* do VACUUM */ VACOPT_ANALYZE = 1 << 1, /* do ANALYZE */ VACOPT_VERBOSE = 1 << 2, /* print progress info */ VACOPT_FREEZE = 1 << 3, /* FREEZE option */ VACOPT_FULL = 1 << 4, /* FULL (non-concurrent) vacuum */ VACOPT_SKIP_LOCKED = 1 << 5, /* skip if cannot get lock */ VACOPT_SKIPTOAST = 1 << 6, /* don't process the TOAST table, if any */ VACOPT_DISABLE_PAGE_SKIPPING = 1 << 7 /* don't skip any pages */} VacuumOption;
VacuumStmt
存储vacuum命令的option&Relation链表
typedef struct VacuumStmt{ NodeTag type;//Tag //VacuumOption位标记 int options; /* OR of VacuumOption flags */ //VacuumRelation链表,如为NIL-->所有Relation. List *rels; /* list of VacuumRelation, or NIL for all */} VacuumStmt;
VacuumParams
vacuum命令参数
/* * Parameters customizing behavior of VACUUM and ANALYZE. * 客户端调用VACUUM/ANALYZE时的定制化参数 */typedef struct VacuumParams{ //最小freeze age,-1表示使用默认 int freeze_min_age; /* min freeze age, -1 to use default */ //扫描整个table的freeze age int freeze_table_age; /* age at which to scan whole table */ //最小的multixact freeze age,-1表示默认 int multixact_freeze_min_age; /* min multixact freeze age, -1 to * use default */ //扫描全表的freeze age,-1表示默认 int multixact_freeze_table_age; /* multixact age at which to scan * whole table */ //是否强制wraparound? bool is_wraparound; /* force a for-wraparound vacuum */ //以毫秒为单位的最小执行阈值 int log_min_duration; /* minimum execution threshold in ms at * which verbose logs are activated, -1 * to use default */} VacuumParams;
VacuumRelation
VACUUM/ANALYZE命令的目标表信息
/* * Info about a single target table of VACUUM/ANALYZE. * VACUUM/ANALYZE命令的目标表信息. * * If the OID field is set, it always identifies the table to process. * Then the relation field can be NULL; if it isn't, it's used only to report * failure to open/lock the relation. * 如设置了OID字段,该值通常是将要处理的数据表. * 那么关系字段可以为空;如果不是,则仅用于报告未能打开/锁定关系。 */typedef struct VacuumRelation{ NodeTag type; RangeVar *relation; /* table name to process, or NULL */ Oid oid; /* table's OID; InvalidOid if not looked up */ List *va_cols; /* list of column names, or NIL for all */} VacuumRelation;
二、源码解读
vacuum_rel():vacuum one heap relation
大体逻辑如下:
1.启动事务,快照入栈,设置事务状态为PROC_IN_VACUUM
2.打开relation,请求合适的锁(FULL->AccessExclusiveLock,Concurrent->ShareUpdateExclusiveLock)
3.执行相应的检查(owner/relkind/临时表/分区表…)
4.执行TOAST相关处理
5.执行前期准备工作(切换user等)
6.执行实际的工作
6.1.FULL->cluster_rel
6.2.Concurrent->heap_vacuum_rel
7.执行收尾工作
8.如存在TOAST,在执行TOAST表的vacuum
/* * vacuum_rel() -- vacuum one heap relation * vacuum_rel() -- vacuum一个heap relation * * relid identifies the relation to vacuum. If relation is supplied, * use the name therein for reporting any failure to open/lock the rel; * do not use it once we've successfully opened the rel, since it might * be stale. * relid是需要vacuum的relation. * 如提供了relation,使用其中的名称报告任何未能打开/锁定rel的情况. * 只要我们成功打开了relation,那么就不需要再使用relation了,因为该变量可能已失效. * * Returns true if it's okay to proceed with a requested ANALYZE * operation on this table. * 如执行在请求的数据表上执行ANALYZE操作是OK的,返回T. * * Doing one heap at a time incurs extra overhead, since we need to * check that the heap exists again just before we vacuum it. The * reason that we do this is so that vacuuming can be spread across * many small transactions. Otherwise, two-phase locking would require * us to lock the entire database during one pass of the vacuum cleaner. * 一次只做一个堆会带来额外的开销,因为我们需要在vacuum之前检查它是否再次存在。 * 我们这样做的原因是,vacuuming可以分散到许多小事务中。 * 否则,两阶段锁定将要求我们在一次vacuum时锁定整个数据库。 * * At entry and exit, we are not inside a transaction. * 在入口和退出处,不在事务中. */static boolvacuum_rel(Oid relid, RangeVar *relation, int options, VacuumParams *params){ LOCKMODE lmode; Relation onerel; LockRelId onerelid; Oid toast_relid; Oid save_userid; int save_sec_context; int save_nestlevel; Assert(params != NULL); /* Begin a transaction for vacuuming this relation */ //vacuuming relation时开启一个事务 StartTransactionCommand(); /* * Functions in indexes may want a snapshot set. Also, setting a snapshot * ensures that RecentGlobalXmin is kept truly recent. * 处理索引上的函数可能需要快照集. * 同样的设置一个快照可以确保RecentGlobalXmin保持为最近的. */ PushActiveSnapshot(GetTransactionSnapshot()); if (!(options & VACOPT_FULL)) { /* * In lazy vacuum, we can set the PROC_IN_VACUUM flag, which lets * other concurrent VACUUMs know that they can ignore this one while * determining their OldestXmin. (The reason we don't set it during a * full VACUUM is exactly that we may have to run user-defined * functions for functional indexes, and we want to make sure that if * they use the snapshot set above, any tuples it requires can't get * removed from other tables. An index function that depends on the * contents of other tables is arguably broken, but we won't break it * here by violating transaction semantics.) * 在lazy vacuum,我们可以设置PROC_IN_VACUUM标记,这样可以让其他并发的VACUUMs * 知道他们在确定OldestXmin时可以忽略这个. * (不在full VACUUM中设置的原因正好是我们可能不得不为函数索引执行用户自定义函数, * 我们希望确保如果使用上面的快照集,快照需要的所有元组不能在其他表中清除). * 依赖于其他表的内容的索引函数可能会被破坏,但是我们不会违反事务语义来破坏它。 * * We also set the VACUUM_FOR_WRAPAROUND flag, which is passed down by * autovacuum; it's used to avoid canceling a vacuum that was invoked * in an emergency. * 我们还设置了VACUUM_FOR_WRAPAROUND标记,通过autovacuum传递下去, * 这可以用于避免取消在紧急情况下启动的vacuum进程. * * Note: these flags remain set until CommitTransaction or * AbortTransaction. We don't want to clear them until we reset * MyPgXact->xid/xmin, else OldestXmin might appear to go backwards, * which is probably Not Good. * 注意:这些标记一直保留到CommitTransaction或AbortTransaction. * 我们不希望清除这些标记,直至重置MyPgXact->xid/xmin,否则旧的xmin可能会出现倒退,这可能不好. */ LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE); MyPgXact->vacuumFlags |= PROC_IN_VACUUM; if (params->is_wraparound) MyPgXact->vacuumFlags |= PROC_VACUUM_FOR_WRAPAROUND; LWLockRelease(ProcArrayLock); } /* * Check for user-requested abort. Note we want this to be inside a * transaction, so xact.c doesn't issue useless WARNING. * 检查用户请求的abort. * 注意我们希望这个操作在事务中,因此xact.c不需要发起不必要的WARNING. */ CHECK_FOR_INTERRUPTS(); /* * Determine the type of lock we want --- hard exclusive lock for a FULL * vacuum, but just ShareUpdateExclusiveLock for concurrent vacuum. Either * way, we can be sure that no other backend is vacuuming the same table. * 确定需要的锁类型. * 唯一锁 for FULL vacuum. * ShareUpdateExclusiveLock for 普通的并发vacuum. * 不管哪种途径,可以确保的是没有那个后台进程同时在vacuum同一张表. */ lmode = (options & VACOPT_FULL) ? AccessExclusiveLock : ShareUpdateExclusiveLock; /* open the relation and get the appropriate lock on it */ //打开relation,获取合适的锁. onerel = vacuum_open_relation(relid, relation, params, options, lmode); /* leave if relation could not be opened or locked */ //如relation不能被打开/锁定,则退出 if (!onerel) { PopActiveSnapshot(); CommitTransactionCommand(); return false; } /* * Check if relation needs to be skipped based on ownership. This check * happens also when building the relation list to vacuum for a manual * operation, and needs to be done additionally here as VACUUM could * happen across multiple transactions where relation ownership could have * changed in-between. Make sure to only generate logs for VACUUM in this * case. * 检查relation是否可以跳过. * 在构建relation链表准备vacuum手工操作也需要检查, * 并且需要在这里进行额外的操作,因为VACUUM可能发生在多个事务之间, * 而这些事务之间的关系所有权可能已经发生了变化. * 在这种情况下,确保只为VACUUM生成日志。 */ if (!vacuum_is_relation_owner(RelationGetRelid(onerel), onerel->rd_rel, options & VACOPT_VACUUM)) { relation_close(onerel, lmode); PopActiveSnapshot(); CommitTransactionCommand(); return false; } /* * Check that it's of a vacuumable relkind. * 检查relation是vacuumable relkind */ if (onerel->rd_rel->relkind != RELKIND_RELATION && onerel->rd_rel->relkind != RELKIND_MATVIEW && onerel->rd_rel->relkind != RELKIND_TOASTVALUE && onerel->rd_rel->relkind != RELKIND_PARTITIONED_TABLE) { ereport(WARNING, (errmsg("skipping \"%s\" --- cannot vacuum non-tables or special system tables", RelationGetRelationName(onerel)))); relation_close(onerel, lmode); PopActiveSnapshot(); CommitTransactionCommand(); return false; } /* * Silently ignore tables that are temp tables of other backends --- * trying to vacuum these will lead to great unhappiness, since their * contents are probably not up-to-date on disk. (We don't throw a * warning here; it would just lead to chatter during a database-wide * VACUUM.) * 忽略其他进程产生的临时表. * 如果尝试vacuum这些表会导致相当"不愉快"的事情发生,因为这些表的内容可能在磁盘上不是最新的. * (在这里不会产生警告,它只会在数据库范围内的VACUUM引起抖动。) */ if (RELATION_IS_OTHER_TEMP(onerel)) { relation_close(onerel, lmode); PopActiveSnapshot(); CommitTransactionCommand(); return false; } /* * Silently ignore partitioned tables as there is no work to be done. The * useful work is on their child partitions, which have been queued up for * us separately. * 忽略分区表. * 有用的工作只是在他们的子分区上,这些表已经加入处理队列中. */ if (onerel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE) { relation_close(onerel, lmode); PopActiveSnapshot(); CommitTransactionCommand(); /* It's OK to proceed with ANALYZE on this table */ return true; } /* * Get a session-level lock too. This will protect our access to the * relation across multiple transactions, so that we can vacuum the * relation's TOAST table (if any) secure in the knowledge that no one is * deleting the parent relation. * 获取session-level锁. * 这可以保护在跨多事务时对relation的访问, * 因此我们可以安全的vacuum关系的TOAST表,在没有进程删除父relation的情况下. * * NOTE: this cannot block, even if someone else is waiting for access, * because the lock manager knows that both lock requests are from the * same process. * 注意:这个处理不能被阻塞,即时有其他进程在等待访问, * 因为锁管理器知道这两个所请求来自于同一个进程. */ onerelid = onerel->rd_lockInfo.lockRelId; LockRelationIdForSession(&onerelid, lmode); /* * Remember the relation's TOAST relation for later, if the caller asked * us to process it. In VACUUM FULL, though, the toast table is * automatically rebuilt by cluster_rel so we shouldn't recurse to it. * 如调用者要求我们处理TOAST,则为后续的处理标记relation's TOAST relation * 但在VACUUM FULL,toast table会通过cluster_rel自动重建,因此无需处理. */ if (!(options & VACOPT_SKIPTOAST) && !(options & VACOPT_FULL)) toast_relid = onerel->rd_rel->reltoastrelid; else toast_relid = InvalidOid; /* * Switch to the table owner's userid, so that any index functions are run * as that user. Also lock down security-restricted operations and * arrange to make GUC variable changes local to this command. (This is * unnecessary, but harmless, for lazy VACUUM.) * 切换至owner的userid,以便所有的索引函数可以作为此用户运行. * 同时,锁定security-restricted操作并重新组织,为该命令修改本地GUC变量. * (这其实是不需要的,但悲催的是,需要为lazy VACUUM设置) */ GetUserIdAndSecContext(&save_userid, &save_sec_context); SetUserIdAndSecContext(onerel->rd_rel->relowner, save_sec_context | SECURITY_RESTRICTED_OPERATION); save_nestlevel = NewGUCNestLevel(); /* * Do the actual work --- either FULL or "lazy" vacuum * 执行实际的工作. * 不管是FULL 或"lazy" vacuum */ if (options & VACOPT_FULL) { int cluster_options = 0; /* close relation before vacuuming, but hold lock until commit */ //在vacuuming前关闭relation,持有锁直至commit relation_close(onerel, NoLock); onerel = NULL; if ((options & VACOPT_VERBOSE) != 0) cluster_options |= CLUOPT_VERBOSE; /* VACUUM FULL is now a variant of CLUSTER; see cluster.c */ //VACUUM FULL现在是CLUSTER的一个变体,详细请查看cluster.c cluster_rel(relid, InvalidOid, cluster_options); } else heap_vacuum_rel(onerel, options, params, vac_strategy); /* Roll back any GUC changes executed by index functions */ //回滚索引函数修改的GUC参数 AtEOXact_GUC(false, save_nestlevel); /* Restore userid and security context */ //恢复userid和安全上下文 SetUserIdAndSecContext(save_userid, save_sec_context); /* all done with this class, but hold lock until commit */ //所有的工作已完成,但持有锁,直到提交 if (onerel) relation_close(onerel, NoLock); /* * Complete the transaction and free all temporary memory used. * 完成事务,并释放索引临时内存. */ PopActiveSnapshot(); CommitTransactionCommand(); /* * If the relation has a secondary toast rel, vacuum that too while we * still hold the session lock on the master table. Note however that * "analyze" will not get done on the toast table. This is good, because * the toaster always uses hardcoded index access and statistics are * totally unimportant for toast relations. * 如果relation有第二个toast rel,vacuum仍然会在主表上持有session lock. * 注意analyze不会在toast table上完成.这样操作没有问题, * 因为toaster总是使用硬编码的索引访问,而统计数据对于toast relations来说不重要. */ if (toast_relid != InvalidOid) vacuum_rel(toast_relid, NULL, options, params); /* * Now release the session-level lock on the master table. * 现在可以释放session-level锁了. */ UnlockRelationIdForSession(&onerelid, lmode); /* Report that we really did it. */ //DONE! return true;}
三、跟踪分析
测试脚本
17:19:28 (xdb@[local]:5432)testdb=# vacuum t1;
启动gdb,设置断点
(gdb) b vacuum_relBreakpoint 2 at 0x6bb319: file vacuum.c, line 1310.(gdb) cContinuing.Breakpoint 2, vacuum_rel (relid=42634, relation=0x22948d0, options=1, params=0x7fff403d8880) at vacuum.c:13101310 bool rel_lock = true;(gdb)
输入参数
relid=42634 -> t1
relation=0x22948d0 -> t1
使用默认的vacuum参数
###16:20:00 (xdb@[local]:5432)testdb=# select oid,relname,reltype from pg_class where oid=42634; oid | relname | reltype -------+---------+--------- 42634 | t1 | 42636(1 row)###(gdb) p *relation$5 = {type = T_RangeVar, catalogname = 0x0, schemaname = 0x0, relname = 0x22948b0 "t1", inh = true, relpersistence = 112 'p', alias = 0x0, location = 7}(gdb) (gdb) p *params$6 = {freeze_min_age = -1, freeze_table_age = -1, multixact_freeze_min_age = -1, multixact_freeze_table_age = -1, is_wraparound = false, log_min_duration = -1}(gdb)
启动事务,快照入栈,设置事务状态为PROC_IN_VACUUM
(gdb) n1312 Assert(params != NULL);(gdb) 1315 StartTransactionCommand();(gdb) 1321 PushActiveSnapshot(GetTransactionSnapshot());(gdb) 1323 if (!(options & VACOPT_FULL))(gdb) 1345 LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);(gdb) 1346 MyPgXact->vacuumFlags |= PROC_IN_VACUUM;(gdb) 1347 if (params->is_wraparound)(gdb) 1349 LWLockRelease(ProcArrayLock);(gdb) 1356 CHECK_FOR_INTERRUPTS();(gdb)
打开relation,请求合适的锁(FULL->AccessExclusiveLock,Concurrent->ShareUpdateExclusiveLock)
(gdb) 1363 lmode = (options & VACOPT_FULL) ? AccessExclusiveLock : ShareUpdateExclusiveLock;(gdb) 1374 if (!(options & VACOPT_NOWAIT))(gdb) 1375 onerel = try_relation_open(relid, lmode);(gdb) 1388 if (!onerel)(gdb) p *onerel$7 = {rd_node = {spcNode = 1663, dbNode = 16402, relNode = 42634}, rd_smgr = 0x0, rd_refcnt = 1, rd_backend = -1, rd_islocaltemp = false, rd_isnailed = false, rd_isvalid = true, rd_indexvalid = 0 '\000', rd_statvalid = false, rd_createSubid = 0, rd_newRelfilenodeSubid = 0, rd_rel = 0x7f2d2571bbb8, rd_att = 0x7f2d25637268, rd_id = 42634, rd_lockInfo = {lockRelId = {relId = 42634, dbId = 16402}}, rd_rules = 0x0, rd_rulescxt = 0x0, trigdesc = 0x0, rd_rsdesc = 0x0, rd_fkeylist = 0x0, rd_fkeyvalid = false, rd_partkeycxt = 0x0, rd_partkey = 0x0, rd_pdcxt = 0x0, rd_partdesc = 0x0, rd_partcheck = 0x0, rd_indexlist = 0x0, rd_oidindex = 0, rd_pkindex = 0, rd_replidindex = 0, rd_statlist = 0x0, rd_indexattr = 0x0, rd_projindexattr = 0x0, rd_keyattr = 0x0, rd_pkattr = 0x0, rd_idattr = 0x0, rd_projidx = 0x0, rd_pubactions = 0x0, rd_options = 0x0, rd_index = 0x0, rd_indextuple = 0x0, rd_amhandler = 0, rd_indexcxt = 0x0, rd_amroutine = 0x0, rd_opfamily = 0x0, rd_opcintype = 0x0, rd_support = 0x0, rd_supportinfo = 0x0, rd_indoption = 0x0, rd_indexprs = 0x0, rd_indpred = 0x0, rd_exclops = 0x0, rd_exclprocs = 0x0, rd_exclstrats = 0x0, rd_amcache = 0x0, rd_indcollation = 0x0, rd_fdwroutine = 0x0, rd_toastoid = 0, pgstat_info = 0x2313030}(gdb)
执行相应的检查(owner/relkind/临时表/分区表…)
(gdb) n1442 if (!(pg_class_ownercheck(RelationGetRelid(onerel), GetUserId()) ||(gdb) p GetUserId()$8 = 10(gdb) p RelationGetRelid(onerel)$9 = 42634(gdb) n1466 if (onerel->rd_rel->relkind != RELKIND_RELATION &&(gdb) 1487 if (RELATION_IS_OTHER_TEMP(onerel))(gdb) 1500 if (onerel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)(gdb) 1519 onerelid = onerel->rd_lockInfo.lockRelId;(gdb) 1520 LockRelationIdForSession(&onerelid, lmode);(gdb) p onerelid$10 = {relId = 42634, dbId = 16402}(gdb)
执行TOAST相关处理
(gdb) n1527 if (!(options & VACOPT_SKIPTOAST) && !(options & VACOPT_FULL))(gdb) 1528 toast_relid = onerel->rd_rel->reltoastrelid;(gdb) 1538 GetUserIdAndSecContext(&save_userid, &save_sec_context);(gdb) p toast_relid$11 = 0(gdb)
执行前期准备工作(切换user等)
(gdb) n1539 SetUserIdAndSecContext(onerel->rd_rel->relowner,(gdb) 1541 save_nestlevel = NewGUCNestLevel();(gdb) 1546 if (options & VACOPT_FULL)(gdb) p save_nestlevel$12 = 2(gdb) n
执行实际的工作
Concurrent->heap_vacuum_rel(11.1版本为lazy_vacuum_rel函数)
(gdb) n1557 lazy_vacuum_rel(onerel, options, params, vac_strategy);(gdb)
执行收尾工作
(gdb) 1560 AtEOXact_GUC(false, save_nestlevel);(gdb) 1563 SetUserIdAndSecContext(save_userid, save_sec_context);(gdb) 1566 if (onerel)(gdb) 1567 relation_close(onerel, NoLock);(gdb) 1572 PopActiveSnapshot();(gdb) 1573 CommitTransactionCommand();(gdb) 1582 if (toast_relid != InvalidOid)(gdb)
如存在TOAST,在执行TOAST表的vacuum
(gdb) 1582 if (toast_relid != InvalidOid)(gdb) 1588 UnlockRelationIdForSession(&onerelid, lmode);(gdb) 1591 return true;(gdb)
执行完成
1592 }(gdb) vacuum (options=1, relations=0x23525f0, params=0x7fff403d8880, bstrategy=0x2352478, isTopLevel=true) at vacuum.c:344344 if (options & VACOPT_ANALYZE)(gdb)
DONE!
四、参考资料
PG Source Code