PostgreSQL 源码解读(98)- 分区表#4(数据查询路由#1-“扩展”分区表)
在查询分区表的时候PG如何确定查询的是哪个分区?如何确定?相关的机制是什么?接下来几个章节将一一介绍,本节是第一部分。
零、实现机制
我们先看下面的例子,两个普通表t_normal_1和t_normal_2,执行UNION ALL操作:
drop table if exists t_normal_1;drop table if exists t_normal_2;create table t_normal_1 (c1 int not null,c2 varchar(40),c3 varchar(40));create table t_normal_2 (c1 int not null,c2 varchar(40),c3 varchar(40));insert into t_normal_1(c1,c2,c3) VALUES(0,'HASH0','HAHS0');insert into t_normal_2(c1,c2,c3) VALUES(0,'HASH0','HAHS0');testdb=# explain verbose select * from t_normal_1 where c1 = 0testdb-# union alltestdb-# select * from t_normal_2 where c1 <> 0; QUERY PLAN ---------------------------------------------------------------------------- Append (cost=0.00..34.00 rows=350 width=200) -> Seq Scan on public.t_normal_1 (cost=0.00..14.38 rows=2 width=200) Output: t_normal_1.c1, t_normal_1.c2, t_normal_1.c3 Filter: (t_normal_1.c1 = 0) -> Seq Scan on public.t_normal_2 (cost=0.00..14.38 rows=348 width=200) Output: t_normal_2.c1, t_normal_2.c2, t_normal_2.c3 Filter: (t_normal_2.c1 <> 0)(7 rows)
两张普通表的UNION ALL,PG使用APPEND操作符把t_normal_1顺序扫描的结果集和t_normal_2顺序扫描的结果集"APPEND"在一起作为最终的结果集输出.
分区表的查询也是类似的机制,把各个分区的结果集APPEND在一起,然后作为最终的结果集输出,如下例所示:
testdb=# explain verbose select * from t_hash_partition where c1 = 1 OR c1 = 2; QUERY PLAN ------------------------------------------------------------------------------------- Append (cost=0.00..30.53 rows=6 width=200) -> Seq Scan on public.t_hash_partition_1 (cost=0.00..15.25 rows=3 width=200) Output: t_hash_partition_1.c1, t_hash_partition_1.c2, t_hash_partition_1.c3 Filter: ((t_hash_partition_1.c1 = 1) OR (t_hash_partition_1.c1 = 2)) -> Seq Scan on public.t_hash_partition_3 (cost=0.00..15.25 rows=3 width=200) Output: t_hash_partition_3.c1, t_hash_partition_3.c2, t_hash_partition_3.c3 Filter: ((t_hash_partition_3.c1 = 1) OR (t_hash_partition_3.c1 = 2))(7 rows)
查询分区表t_hash_partition,条件为c1 = 1 OR c1 = 2,从执行计划可见是把t_hash_partition_1顺序扫描的结果集和t_hash_partition_3顺序扫描的结果集"APPEND"在一起作为最终的结果集输出.
这里面有几个问题需要解决:
1.识别分区表并找到所有的分区子表;
2.根据约束条件识别需要查询的分区,这是出于性能的考虑;
3.对结果集执行APPEND,作为最终结果输出.
本节介绍了PG如何识别分区表并找到所有的分区子表,实现的函数是expand_inherited_tables.
一、数据结构
AppendRelInfo
Append-relation信息.
当我们将可继承表(分区表)或UNION-ALL子查询展开为"追加关系"(本质上是子RTE的链表)时,为每个子RTE构建一个AppendRelInfo。
AppendRelInfos链表指示在展开父节点时必须包含哪些子rte,每个节点具有将引用父节点的Vars转换为引用该子节点的Vars所需的所有信息。
/* * Append-relation info. * Append-relation信息. * * When we expand an inheritable table or a UNION-ALL subselect into an * "append relation" (essentially, a list of child RTEs), we build an * AppendRelInfo for each child RTE. The list of AppendRelInfos indicates * which child RTEs must be included when expanding the parent, and each node * carries information needed to translate Vars referencing the parent into * Vars referencing that child. * 当我们将可继承表(分区表)或UNION-ALL子查询展开为"追加关系"(本质上是子RTE的链表)时, * 为每个子RTE构建一个AppendRelInfo。 * AppendRelInfos链表指示在展开父节点时必须包含哪些子rte, * 每个节点具有将引用父节点的Vars转换为引用该子节点的Vars所需的所有信息。 * * These structs are kept in the PlannerInfo node's append_rel_list. * Note that we just throw all the structs into one list, and scan the * whole list when desiring to expand any one parent. We could have used * a more complex data structure (eg, one list per parent), but this would * be harder to update during operations such as pulling up subqueries, * and not really any easier to scan. Considering that typical queries * will not have many different append parents, it doesn't seem worthwhile * to complicate things. * 这些结构体保存在PlannerInfo节点的append_rel_list中。 * 注意,只是将所有的结构体放入一个链表中,并在希望展开任何父类时扫描整个链表。 * 本可以使用更复杂的数据结构(例如,每个父节点一个列表), * 但是在提取子查询之类的操作中更新它会更困难, * 而且实际上也不会更容易扫描。 * 考虑到典型的查询不会有很多不同的附加项,因此似乎不值得将事情复杂化。 * * Note: after completion of the planner prep phase, any given RTE is an * append parent having entries in append_rel_list if and only if its * "inh" flag is set. We clear "inh" for plain tables that turn out not * to have inheritance children, and (in an abuse of the original meaning * of the flag) we set "inh" for subquery RTEs that turn out to be * flattenable UNION ALL queries. This lets us avoid useless searches * of append_rel_list. * 注意:计划准备阶段完成后, * 当且仅当它的"inh"标志已设置时,给定的RTE是一个append parent在append_rel_list中的一个条目。 * 我们为没有child的平面表清除"inh"标记, * 同时(有滥用标记的嫌疑)为UNION ALL查询中的子查询RTEs设置"inh"标记。 * 这样可以避免对append_rel_list进行无用的搜索。 * * Note: the data structure assumes that append-rel members are single * baserels. This is OK for inheritance, but it prevents us from pulling * up a UNION ALL member subquery if it contains a join. While that could * be fixed with a more complex data structure, at present there's not much * point because no improvement in the plan could result. * 注意:数据结构假定附加的rel成员是独立的baserels。 * 这对于继承来说是可以的,但是如果UNION ALL member子查询包含一个join, * 那么它将阻止我们提取UNION ALL member子查询。 * 虽然可以用更复杂的数据结构解决这个问题,但目前没有太大意义,因为该计划可能不会有任何改进。 */typedef struct AppendRelInfo{ NodeTag type; /* * These fields uniquely identify this append relationship. There can be * (in fact, always should be) multiple AppendRelInfos for the same * parent_relid, but never more than one per child_relid, since a given * RTE cannot be a child of more than one append parent. * 这些字段惟一地标识这个append relationship。 * 对于同一个parent_relid可以有(实际上应该总是)多个AppendRelInfos, * 但是每个child_relid不能有多个AppendRelInfos, * 因为给定的RTE不能是多个append parent的子节点。 */ Index parent_relid; /* parent rel的RT索引;RT index of append parent rel */ Index child_relid; /* child rel的RT索引;RT index of append child rel */ /* * For an inheritance appendrel, the parent and child are both regular * relations, and we store their rowtype OIDs here for use in translating * whole-row Vars. For a UNION-ALL appendrel, the parent and child are * both subqueries with no named rowtype, and we store InvalidOid here. * 对于继承appendrel,父类和子类都是普通关系, * 我们将它们的rowtype OIDs存储在这里,用于转换whole-row Vars。 * 对于UNION-ALL appendrel,父查询和子查询都是没有指定行类型的子查询, * 我们在这里存储InvalidOid。 */ Oid parent_reltype; /* OID of parent's composite type */ Oid child_reltype; /* OID of child's composite type */ /* * The N'th element of this list is a Var or expression representing the * child column corresponding to the N'th column of the parent. This is * used to translate Vars referencing the parent rel into references to * the child. A list element is NULL if it corresponds to a dropped * column of the parent (this is only possible for inheritance cases, not * UNION ALL). The list elements are always simple Vars for inheritance * cases, but can be arbitrary expressions in UNION ALL cases. * 这个列表的第N个元素是一个Var或表达式,表示与父元素的第N列对应的子列。 * 这用于将引用parent rel的Vars转换为对子rel的引用。 * 如果链表元素与父元素的已删除列相对应,则该元素为NULL * (这只适用于继承情况,而不是UNION ALL)。 * 对于继承情况,链表元素总是简单的变量,但是可以是UNION ALL情况下的任意表达式。 * * Notice we only store entries for user columns (attno > 0). Whole-row * Vars are special-cased, and system columns (attno < 0) need no special * translation since their attnos are the same for all tables. * 注意,我们只存储用户列的条目(attno > 0)。 * Whole-row Vars是大小写敏感的,系统列(attno < 0)不需要特别的转换, * 因为它们的attno对所有表都是相同的。 * * Caution: the Vars have varlevelsup = 0. Be careful to adjust as needed * when copying into a subquery. * 注意:Vars的varlevelsup = 0。 * 在将数据复制到子查询时,要注意根据需要进行调整。 */ //child's Vars中的表达式 List *translated_vars; /* Expressions in the child's Vars */ /* * We store the parent table's OID here for inheritance, or InvalidOid for * UNION ALL. This is only needed to help in generating error messages if * an attempt is made to reference a dropped parent column. * 我们将父表的OID存储在这里用于继承, * 如为UNION ALL,则这里存储的是InvalidOid。 * 只有在试图引用已删除的父列时,才需要这样做来帮助生成错误消息。 */ Oid parent_reloid; /* OID of parent relation */} AppendRelInfo;
PlannerInfo
该数据结构用于存储查询语句在规划/优化过程中的相关信息
/*---------- * PlannerInfo * Per-query information for planning/optimization * 用于规划/优化的每个查询信息 * * This struct is conventionally called "root" in all the planner routines. * It holds links to all of the planner's working state, in addition to the * original Query. Note that at present the planner extensively modifies * the passed-in Query data structure; someday that should stop. * 在所有计划程序例程中,这个结构通常称为"root"。 * 除了原始查询之外,它还保存到所有计划器工作状态的链接。 * 注意,目前计划器会毫无节制的修改传入的查询数据结构,相信总有一天这种情况会停止的。 *---------- */struct AppendRelInfo;typedef struct PlannerInfo{ NodeTag type;//Node标识 //查询树 Query *parse; /* the Query being planned */ //当前的planner全局信息 PlannerGlobal *glob; /* global info for current planner run */ //查询层次,1标识最高层 Index query_level; /* 1 at the outermost Query */ // 如为子计划,则这里存储父计划器指针,NULL标识最高层 struct PlannerInfo *parent_root; /* NULL at outermost Query */ /* * plan_params contains the expressions that this query level needs to * make available to a lower query level that is currently being planned. * outer_params contains the paramIds of PARAM_EXEC Params that outer * query levels will make available to this query level. * plan_params包含该查询级别需要提供给当前计划的较低查询级别的表达式。 * outer_params包含PARAM_EXEC Params的参数,外部查询级别将使该查询级别可用这些参数。 */ List *plan_params; /* list of PlannerParamItems, see below */ Bitmapset *outer_params; /* * simple_rel_array holds pointers to "base rels" and "other rels" (see * comments for RelOptInfo for more info). It is indexed by rangetable * index (so entry 0 is always wasted). Entries can be NULL when an RTE * does not correspond to a base relation, such as a join RTE or an * unreferenced view RTE; or if the RelOptInfo hasn't been made yet. * simple_rel_array保存指向"base rels"和"other rels"的指针 * (有关RelOptInfo的更多信息,请参见注释)。 * 它由可范围表索引建立索引(因此条目0总是被浪费)。 * 当RTE与基本关系(如JOIN RTE或未被引用的视图RTE时)不相对应 * 或者如果RelOptInfo还没有生成,条目可以为NULL。 */ //RelOptInfo数组,存储"base rels",比如基表/子查询等. //该数组与RTE的顺序一一对应,而且是从1开始,因此[0]无用 */ struct RelOptInfo **simple_rel_array; /* All 1-rel RelOptInfos */ int simple_rel_array_size; /* 数组大小,allocated size of array */ /* * simple_rte_array is the same length as simple_rel_array and holds * pointers to the associated rangetable entries. This lets us avoid * rt_fetch(), which can be a bit slow once large inheritance sets have * been expanded. * simple_rte_array的长度与simple_rel_array相同, * 并保存指向相应范围表条目的指针。 * 这使我们可以避免执行rt_fetch(),因为一旦扩展了大型继承集,rt_fetch()可能会有点慢。 */ //RTE数组 RangeTblEntry **simple_rte_array; /* rangetable as an array */ /* * append_rel_array is the same length as the above arrays, and holds * pointers to the corresponding AppendRelInfo entry indexed by * child_relid, or NULL if none. The array itself is not allocated if * append_rel_list is empty. * append_rel_array与上述数组的长度相同, * 并保存指向对应的AppendRelInfo条目的指针,该条目由child_relid索引, * 如果没有索引则为NULL。 * 如果append_rel_list为空,则不分配数组本身。 */ //处理集合操作如UNION ALL时使用和分区表时使用 struct AppendRelInfo **append_rel_array; /* * all_baserels is a Relids set of all base relids (but not "other" * relids) in the query; that is, the Relids identifier of the final join * we need to form. This is computed in make_one_rel, just before we * start making Paths. * all_baserels是查询中所有base relids(但不是"other" relids)的一个Relids集合; * 也就是说,这是需要形成的最终连接的Relids标识符。 * 这是在开始创建路径之前在make_one_rel中计算的。 */ Relids all_baserels;//"base rels" /* * nullable_baserels is a Relids set of base relids that are nullable by * some outer join in the jointree; these are rels that are potentially * nullable below the WHERE clause, SELECT targetlist, etc. This is * computed in deconstruct_jointree. * nullable_baserels是由jointree中的某些外连接中值可为空的base Relids集合; * 这些是在WHERE子句、SELECT targetlist等下面可能为空的树。 * 这是在deconstruct_jointree中处理获得的。 */ //Nullable-side端的"base rels" Relids nullable_baserels; /* * join_rel_list is a list of all join-relation RelOptInfos we have * considered in this planning run. For small problems we just scan the * list to do lookups, but when there are many join relations we build a * hash table for faster lookups. The hash table is present and valid * when join_rel_hash is not NULL. Note that we still maintain the list * even when using the hash table for lookups; this simplifies life for * GEQO. * join_rel_list是在计划执行中考虑的所有连接关系RelOptInfos的链表。 * 对于小问题,只需要扫描链表执行查找,但是当存在许多连接关系时, * 需要构建一个散列表来进行更快的查找。 * 当join_rel_hash不为空时,哈希表是有效可用于查询的。 * 注意,即使在使用哈希表进行查找时,仍然维护该链表;这简化了GEQO(遗传算法)的生命周期。 */ //参与连接的Relation的RelOptInfo链表 List *join_rel_list; /* list of join-relation RelOptInfos */ //可加快链表访问的hash表 struct HTAB *join_rel_hash; /* optional hashtable for join relations */ /* * When doing a dynamic-programming-style join search, join_rel_level[k] * is a list of all join-relation RelOptInfos of level k, and * join_cur_level is the current level. New join-relation RelOptInfos are * automatically added to the join_rel_level[join_cur_level] list. * join_rel_level is NULL if not in use. * 在执行动态规划算法的连接搜索时,join_rel_level[k]是k级的所有连接关系RelOptInfos的列表, * join_cur_level是当前级别。 * 新的连接关系RelOptInfos会自动添加到join_rel_level[join_cur_level]链表中。 * 如果不使用join_rel_level,则为NULL。 */ //RelOptInfo指针链表数组,k层的join存储在[k]中 List **join_rel_level; /* lists of join-relation RelOptInfos */ //当前的join层次 int join_cur_level; /* index of list being extended */ //查询的初始化计划链表 List *init_plans; /* init SubPlans for query */ //CTE子计划ID链表 List *cte_plan_ids; /* per-CTE-item list of subplan IDs */ //MULTIEXPR子查询输出的参数链表的链表 List *multiexpr_params; /* List of Lists of Params for MULTIEXPR * subquery outputs */ //活动的等价类链表 List *eq_classes; /* list of active EquivalenceClasses */ //规范化的PathKey链表 List *canon_pathkeys; /* list of "canonical" PathKeys */ //外连接约束条件链表(左) List *left_join_clauses; /* list of RestrictInfos for mergejoinable * outer join clauses w/nonnullable var on * left */ //外连接约束条件链表(右) List *right_join_clauses; /* list of RestrictInfos for mergejoinable * outer join clauses w/nonnullable var on * right */ //全连接约束条件链表 List *full_join_clauses; /* list of RestrictInfos for mergejoinable * full join clauses */ //特殊连接信息链表 List *join_info_list; /* list of SpecialJoinInfos */ //AppendRelInfo链表 List *append_rel_list; /* list of AppendRelInfos */ //PlanRowMarks链表 List *rowMarks; /* list of PlanRowMarks */ //PHI链表 List *placeholder_list; /* list of PlaceHolderInfos */ // 外键信息链表 List *fkey_list; /* list of ForeignKeyOptInfos */ //query_planner()要求的PathKeys链表 List *query_pathkeys; /* desired pathkeys for query_planner() */ //分组子句路径键 List *group_pathkeys; /* groupClause pathkeys, if any */ //窗口函数路径键 List *window_pathkeys; /* pathkeys of bottom window, if any */ //distinctClause路径键 List *distinct_pathkeys; /* distinctClause pathkeys, if any */ //排序路径键 List *sort_pathkeys; /* sortClause pathkeys, if any */ //已规范化的分区Schema List *part_schemes; /* Canonicalised partition schemes used in the * query. */ //尝试连接的RelOptInfo链表 List *initial_rels; /* RelOptInfos we are now trying to join */ /* Use fetch_upper_rel() to get any particular upper rel */ //上层的RelOptInfo链表 List *upper_rels[UPPERREL_FINAL + 1]; /* upper-rel RelOptInfos */ /* Result tlists chosen by grouping_planner for upper-stage processing */ //grouping_planner为上层处理选择的结果tlists struct PathTarget *upper_targets[UPPERREL_FINAL + 1];// /* * grouping_planner passes back its final processed targetlist here, for * use in relabeling the topmost tlist of the finished Plan. * grouping_planner在这里传回它最终处理过的targetlist,用于重新标记已完成计划的最顶层tlist。 */ ////最后需处理的投影列 List *processed_tlist; /* Fields filled during create_plan() for use in setrefs.c */ //setrefs.c中在create_plan()函数调用期间填充的字段 //分组函数属性映射 AttrNumber *grouping_map; /* for GroupingFunc fixup */ //MinMaxAggInfos链表 List *minmax_aggs; /* List of MinMaxAggInfos */ //内存上下文 MemoryContext planner_cxt; /* context holding PlannerInfo */ //关系的page计数 double total_table_pages; /* # of pages in all tables of query */ //query_planner输入参数:元组处理比例 double tuple_fraction; /* tuple_fraction passed to query_planner */ //query_planner输入参数:limit_tuple double limit_tuples; /* limit_tuples passed to query_planner */ //表达式的最小安全等级 Index qual_security_level; /* minimum security_level for quals */ /* Note: qual_security_level is zero if there are no securityQuals */ //注意:如果没有securityQuals, 则qual_security_level是NULL(0) //如目标relation是分区表的child/partition/分区表,则通过此字段标记 InheritanceKind inhTargetKind; /* indicates if the target relation is an * inheritance child or partition or a * partitioned table */ //是否存在RTE_JOIN的RTE bool hasJoinRTEs; /* true if any RTEs are RTE_JOIN kind */ //是否存在标记为LATERAL的RTE bool hasLateralRTEs; /* true if any RTEs are marked LATERAL */ //是否存在已在jointree删除的RTE bool hasDeletedRTEs; /* true if any RTE was deleted from jointree */ //是否存在Having子句 bool hasHavingQual; /* true if havingQual was non-null */ //如约束条件中存在pseudoconstant = true,则此字段为T bool hasPseudoConstantQuals; /* true if any RestrictInfo has * pseudoconstant = true */ //是否存在递归语句 bool hasRecursion; /* true if planning a recursive WITH item */ /* These fields are used only when hasRecursion is true: */ //这些字段仅在hasRecursion为T时使用: //工作表的PARAM_EXEC ID int wt_param_id; /* PARAM_EXEC ID for the work table */ //非递归模式的访问路径 struct Path *non_recursive_path; /* a path for non-recursive term */ /* These fields are workspace for createplan.c */ //这些字段用于createplan.c //当前节点之上的外部rels Relids curOuterRels; /* outer rels above current node */ //未赋值的NestLoopParams参数 List *curOuterParams; /* not-yet-assigned NestLoopParams */ /* optional private data for join_search_hook, e.g., GEQO */ //可选的join_search_hook私有数据,例如GEQO void *join_search_private; /* Does this query modify any partition key columns? */ //该查询是否更新分区键列? bool partColsUpdated;} PlannerInfo;
二、源码解读
expand_inherited_tables函数将表示继承集合的每个范围表条目展开为"append relation"。
/* * expand_inherited_tables * Expand each rangetable entry that represents an inheritance set * into an "append relation". At the conclusion of this process, * the "inh" flag is set in all and only those RTEs that are append * relation parents. * 将表示继承集合的每个范围表条目展开为"append relation"。 * 在这个过程结束时,"inh"标志被设置在所有且只有那些作为append * relation parents的RTEs中。 */voidexpand_inherited_tables(PlannerInfo *root){ Index nrtes; Index rti; ListCell *rl; /* * expand_inherited_rtentry may add RTEs to parse->rtable. The function is * expected to recursively handle any RTEs that it creates with inh=true. * So just scan as far as the original end of the rtable list. * expand_inherited_rtentry可以添加RTEs到parse->rtable中。 * 这个函数被期望递归地处理它用inh = true创建的所有RTEs。 * 所以只要扫描到rtable链表最开始的末尾即可。 */ nrtes = list_length(root->parse->rtable); rl = list_head(root->parse->rtable); for (rti = 1; rti <= nrtes; rti++) { RangeTblEntry *rte = (RangeTblEntry *) lfirst(rl); expand_inherited_rtentry(root, rte, rti); rl = lnext(rl); }}/* * expand_inherited_rtentry * Check whether a rangetable entry represents an inheritance set. * If so, add entries for all the child tables to the query's * rangetable, and build AppendRelInfo nodes for all the child tables * and add them to root->append_rel_list. If not, clear the entry's * "inh" flag to prevent later code from looking for AppendRelInfos. * 检查范围表条目是否表示继承集合。 * 如是,将所有子表的条目添加到查询的范围表中, * 并为所有子表构建AppendRelInfo节点,并将它们添加到root->append_rel_list。 * 如没有,清除条目的"inh"标志,以防止以后的代码寻找AppendRelInfos。 * * Note that the original RTE is considered to represent the whole * inheritance set. The first of the generated RTEs is an RTE for the same * table, but with inh = false, to represent the parent table in its role * as a simple member of the inheritance set. * 注意,原始的RTEs被认为代表了整个继承集合。 * 生成的第一个RTE是同一个表的RTE,但inh = false表示父表作为继承集的一个简单成员的角色。 * * A childless table is never considered to be an inheritance set. For * regular inheritance, a parent RTE must always have at least two associated * AppendRelInfos: one corresponding to the parent table as a simple member of * inheritance set and one or more corresponding to the actual children. * Since a partitioned table is not scanned, it might have only one associated * AppendRelInfo. * 无子表的关系永远不会被认为是继承集合。 * 对于常规继承,父RTE必须始终至少有两个相关的AppendRelInfos: * 一个作为继承集的简单成员与父表相对应, * 另一个或多个与实际的子表相对应。 * 因为没有扫描分区表,所以它可能只有一个关联的AppendRelInfo。 */static voidexpand_inherited_rtentry(PlannerInfo *root, RangeTblEntry *rte, Index rti){ Oid parentOID; PlanRowMark *oldrc; Relation oldrelation; LOCKMODE lockmode; List *inhOIDs; ListCell *l; /* Does RT entry allow inheritance? */ //是否分区表? if (!rte->inh) return; /* Ignore any already-expanded UNION ALL nodes */ //忽略所有已扩展的UNION ALL节点 if (rte->rtekind != RTE_RELATION) { Assert(rte->rtekind == RTE_SUBQUERY); return;//返回 } /* Fast path for common case of childless table */ //对于常规的无子表的关系,快速判断 parentOID = rte->relid; if (!has_subclass(parentOID)) { /* Clear flag before returning */ //无子表,设置标记并返回 rte->inh = false; return; } /* * The rewriter should already have obtained an appropriate lock on each * relation named in the query. However, for each child relation we add * to the query, we must obtain an appropriate lock, because this will be * the first use of those relations in the parse/rewrite/plan pipeline. * Child rels should use the same lockmode as their parent. * 查询rewriter程序应该已经在查询中命名的每个关系上获得了适当的锁。 * 但是,对于添加到查询中的每个子关系,必须获得适当的锁, * 因为这将是解析/重写/计划过程中这些关系的第一次使用。 * 子树应该使用与父树相同的锁模式。 */ lockmode = rte->rellockmode; /* Scan for all members of inheritance set, acquire needed locks */ //扫描继承集的所有成员,获取所需的锁 inhOIDs = find_all_inheritors(parentOID, lockmode, NULL); /* * Check that there's at least one descendant, else treat as no-child * case. This could happen despite above has_subclass() check, if table * once had a child but no longer does. * 检查是否至少有一个后代,否则视为无子女情况。 * 尽管上面有has_subclass()检查,但如果table曾经有一个子元素, * 但现在不再有了,则可能发生这种情况。 */ if (list_length(inhOIDs) < 2) { /* Clear flag before returning */ //清除标记,返回 rte->inh = false; return; } /* * If parent relation is selected FOR UPDATE/SHARE, we need to mark its * PlanRowMark as isParent = true, and generate a new PlanRowMark for each * child. * 如果父关系是 selected FOR UPDATE/SHARE, * 则需要将其PlanRowMark标记为isParent = true, * 并为每个子关系生成一个新的PlanRowMark。 */ oldrc = get_plan_rowmark(root->rowMarks, rti); if (oldrc) oldrc->isParent = true; /* * Must open the parent relation to examine its tupdesc. We need not lock * it; we assume the rewriter already did. * 必须打开父关系以检查其tupdesc。 * 不需要锁定,我们假设查询重写已经这么做了。 */ oldrelation = heap_open(parentOID, NoLock); /* Scan the inheritance set and expand it */ //扫描继承集合并扩展之 if (RelationGetPartitionDesc(oldrelation) != NULL)// { Assert(rte->relkind == RELKIND_PARTITIONED_TABLE); /* * If this table has partitions, recursively expand them in the order * in which they appear in the PartitionDesc. While at it, also * extract the partition key columns of all the partitioned tables. * 如果这个表有分区,则按分区在PartitionDesc中出现的顺序递归展开它们。 * 同时,还提取所有分区表的分区键列。 */ expand_partitioned_rtentry(root, rte, rti, oldrelation, oldrc, lockmode, &root->append_rel_list); } else { //分区描述符获取不成功(没有分区信息) List *appinfos = NIL; RangeTblEntry *childrte; Index childRTindex; /* * This table has no partitions. Expand any plain inheritance * children in the order the OIDs were returned by * find_all_inheritors. * 这个表没有分区。 * 按find_all_inheritors返回的OIDs的顺序展开所有普通继承子元素。 */ foreach(l, inhOIDs)//遍历OIDs { Oid childOID = lfirst_oid(l); Relation newrelation; /* Open rel if needed; we already have required locks */ //如有需要,打开rel(已获得锁) if (childOID != parentOID) newrelation = heap_open(childOID, NoLock); else newrelation = oldrelation; /* * It is possible that the parent table has children that are temp * tables of other backends. We cannot safely access such tables * (because of buffering issues), and the best thing to do seems * to be to silently ignore them. * 父表的子表可能是其他后台的临时表。 * 我们不能安全地访问这些表(因为存在缓冲问题),最好的办法似乎是悄悄地忽略它们。 */ if (childOID != parentOID && RELATION_IS_OTHER_TEMP(newrelation)) { heap_close(newrelation, lockmode);//忽略它们 continue; } expand_single_inheritance_child(root, rte, rti, oldrelation, oldrc, newrelation, &appinfos, &childrte, &childRTindex);//展开 /* Close child relations, but keep locks */ //关闭子表,但仍持有锁 if (childOID != parentOID) heap_close(newrelation, NoLock); } /* * If all the children were temp tables, pretend it's a * non-inheritance situation; we don't need Append node in that case. * The duplicate RTE we added for the parent table is harmless, so we * don't bother to get rid of it; ditto for the useless PlanRowMark * node. * 如果所有的子表都是临时表,则假设这是非继承情况; * 在这种情况下,不需要APPEND NODE。 * 我们为父表添加重复的RTE是无关紧要的, * 因此我们不必费心删除它;无用的PlanRowMark节点也是如此。 */ if (list_length(appinfos) < 2) rte->inh = false;//设置标记 else root->append_rel_list = list_concat(root->append_rel_list, appinfos);//添加到链表中 } heap_close(oldrelation, NoLock);//关闭relation}/* * expand_partitioned_rtentry * Recursively expand an RTE for a partitioned table. * 递归扩展分区表RTE */static voidexpand_partitioned_rtentry(PlannerInfo *root, RangeTblEntry *parentrte, Index parentRTindex, Relation parentrel, PlanRowMark *top_parentrc, LOCKMODE lockmode, List **appinfos){ int i; RangeTblEntry *childrte; Index childRTindex; PartitionDesc partdesc = RelationGetPartitionDesc(parentrel); check_stack_depth(); /* A partitioned table should always have a partition descriptor. */ //分配表通常应具备分区描述符 Assert(partdesc); Assert(parentrte->inh); /* * Note down whether any partition key cols are being updated. Though it's * the root partitioned table's updatedCols we are interested in, we * instead use parentrte to get the updatedCols. This is convenient * because parentrte already has the root partrel's updatedCols translated * to match the attribute ordering of parentrel. * 请注意是否正在更新分区键cols。 * 虽然感兴趣的是根分区表的updatedCols,但是使用parentrte来获取updatedCols。 * 这很方便,因为parentrte已经将root partrel的updatedCols转换为匹配parentrel的属性顺序。 */ if (!root->partColsUpdated) root->partColsUpdated = has_partition_attrs(parentrel, parentrte->updatedCols, NULL); /* First expand the partitioned table itself. */ // expand_single_inheritance_child(root, parentrte, parentRTindex, parentrel, top_parentrc, parentrel, appinfos, &childrte, &childRTindex); /* * If the partitioned table has no partitions, treat this as the * non-inheritance case. * 如果分区表没有分区,则将其视为非继承情况。 */ if (partdesc->nparts == 0) { parentrte->inh = false; return; } for (i = 0; i < partdesc->nparts; i++) { Oid childOID = partdesc->oids[i]; Relation childrel; /* Open rel; we already have required locks */ //打开rel childrel = heap_open(childOID, NoLock); /* * Temporary partitions belonging to other sessions should have been * disallowed at definition, but for paranoia's sake, let's double * check. * 属于其他会话的临时分区在定义时应该是不允许的,但是出于偏执狂的考虑,再检查一下。 */ if (RELATION_IS_OTHER_TEMP(childrel)) elog(ERROR, "temporary relation from another session found as partition"); //扩展之 expand_single_inheritance_child(root, parentrte, parentRTindex, parentrel, top_parentrc, childrel, appinfos, &childrte, &childRTindex); /* If this child is itself partitioned, recurse */ //子关系是分区表,递归扩展 if (childrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE) expand_partitioned_rtentry(root, childrte, childRTindex, childrel, top_parentrc, lockmode, appinfos); /* Close child relation, but keep locks */ //关闭子关系,但仍持有锁 heap_close(childrel, NoLock); }} /* expand_single_inheritance_child * Build a RangeTblEntry and an AppendRelInfo, if appropriate, plus * maybe a PlanRowMark. * 构建一个RangeTblEntry和一个AppendRelInfo,如果合适的话,再加上一个PlanRowMark。 * * We now expand the partition hierarchy level by level, creating a * corresponding hierarchy of AppendRelInfos and RelOptInfos, where each * partitioned descendant acts as a parent of its immediate partitions. * (This is a difference from what older versions of PostgreSQL did and what * is still done in the case of table inheritance for unpartitioned tables, * where the hierarchy is flattened during RTE expansion.) * 现在我们逐层扩展分区层次结构,创建一个对应的AppendRelInfos和RelOptInfos层次结构, * 其中每个分区的后代充当其直接分区的父级。 * (在未分区表的表继承中, * 层次结构在RTE扩展期间被扁平化,这与老版本的PostgreSQL有所不同。) * * PlanRowMarks still carry the top-parent's RTI, and the top-parent's * allMarkTypes field still accumulates values from all descendents. * PlanRowMarks仍然具有顶级父类的RTI信息, * 而顶级父类的allMarkTypes字段仍然从所有子类累积。 * * "parentrte" and "parentRTindex" are immediate parent's RTE and * RTI. "top_parentrc" is top parent's PlanRowMark. * "parentrte"和"parentRTindex"是直接父级的RTE和RTI。 * "top_parentrc"是top父类的PlanRowMark。 * * The child RangeTblEntry and its RTI are returned in "childrte_p" and * "childRTindex_p" resp. * 子RTE及其RTI在"childrte_p"和"childRTindex_p"resp中返回。 */static voidexpand_single_inheritance_child(PlannerInfo *root, RangeTblEntry *parentrte, Index parentRTindex, Relation parentrel, PlanRowMark *top_parentrc, Relation childrel, List **appinfos, RangeTblEntry **childrte_p, Index *childRTindex_p){ Query *parse = root->parse; Oid parentOID = RelationGetRelid(parentrel);//父关系 Oid childOID = RelationGetRelid(childrel);//子关系 RangeTblEntry *childrte; Index childRTindex; AppendRelInfo *appinfo; /* * Build an RTE for the child, and attach to query's rangetable list. We * copy most fields of the parent's RTE, but replace relation OID and * relkind, and set inh = false. Also, set requiredPerms to zero since * all required permissions checks are done on the original RTE. Likewise, * set the child's securityQuals to empty, because we only want to apply * the parent's RLS conditions regardless of what RLS properties * individual children may have. (This is an intentional choice to make * inherited RLS work like regular permissions checks.) The parent * securityQuals will be propagated to children along with other base * restriction clauses, so we don't need to do it here. * 为子元素构建一个RTE,并附加到query的范围表链表中。 * 我们复制父RTE的大部分字段,但是替换关系OID和relkind,并设置inh = false。 * 另外,将requiredPerms设置为0,因为所有需要的权限检查都是在原始RTE上完成的。 * 同样,将子元素securityQuals设置为空,因为只想应用父元素的RLS条件, * 而不管每个子元素可能具有什么RLS属性。 * (这是一种有意的选择,目的是让继承的RLS像常规权限检查一样工作。) * 父安全条件quals将与其他基本限制条款一起传播到子级,因此不需要在这里这样做。 */ childrte = copyObject(parentrte); *childrte_p = childrte; childrte->relid = childOID; childrte->relkind = childrel->rd_rel->relkind; /* A partitioned child will need to be expanded further. */ //分区表的子关系会在"将来"扩展 if (childOID != parentOID && childrte->relkind == RELKIND_PARTITIONED_TABLE) childrte->inh = true; else childrte->inh = false; childrte->requiredPerms = 0; childrte->securityQuals = NIL; parse->rtable = lappend(parse->rtable, childrte); childRTindex = list_length(parse->rtable); *childRTindex_p = childRTindex; /* * We need an AppendRelInfo if paths will be built for the child RTE. If * childrte->inh is true, then we'll always need to generate append paths * for it. If childrte->inh is false, we must scan it if it's not a * partitioned table; but if it is a partitioned table, then it never has * any data of its own and need not be scanned. * 如果要为子RTE构建路径,则需要一个AppendRelInfo。 * 如果children ->inh为真,那么我们总是需要为它生成APPEND访问路径。 * 如果children ->inh为假,则必须扫描它,如果它不是分区表; * 但是如果它是一个分区表,那么它永远不会有任何自己的数据,也不需要扫描。 */ if (childrte->relkind != RELKIND_PARTITIONED_TABLE || childrte->inh) { appinfo = makeNode(AppendRelInfo); appinfo->parent_relid = parentRTindex; appinfo->child_relid = childRTindex; appinfo->parent_reltype = parentrel->rd_rel->reltype; appinfo->child_reltype = childrel->rd_rel->reltype; make_inh_translation_list(parentrel, childrel, childRTindex, &appinfo->translated_vars); appinfo->parent_reloid = parentOID; *appinfos = lappend(*appinfos, appinfo); /* * Translate the column permissions bitmaps to the child's attnums (we * have to build the translated_vars list before we can do this). But * if this is the parent table, leave copyObject's result alone. * 将列权限位图转换为子节点的attnums(在此之前必须构建translated_vars列表)。 * 但是,如果这是父表,则不要理会copyObject的结果。 * * Note: we need to do this even though the executor won't run any * permissions checks on the child RTE. The insertedCols/updatedCols * bitmaps may be examined for trigger-firing purposes. * 注意:即使执行程序不会在子RTE上运行任何权限检查,我们也需要这样做。 * 可以检查插入的tedcols /updatedCols位图是否具有触发目的。 */ if (childOID != parentOID) { childrte->selectedCols = translate_col_privs(parentrte->selectedCols, appinfo->translated_vars); childrte->insertedCols = translate_col_privs(parentrte->insertedCols, appinfo->translated_vars); childrte->updatedCols = translate_col_privs(parentrte->updatedCols, appinfo->translated_vars); } } /* * Build a PlanRowMark if parent is marked FOR UPDATE/SHARE. * 如父关系标记为FOR UPDATE/SHARE,则创建PlanRowMark */ if (top_parentrc) { PlanRowMark *childrc = makeNode(PlanRowMark); childrc->rti = childRTindex; childrc->prti = top_parentrc->rti; childrc->rowmarkId = top_parentrc->rowmarkId; /* Reselect rowmark type, because relkind might not match parent */ //重新选择rowmark类型,因为relkind可能与父类不匹配 childrc->markType = select_rowmark_type(childrte, top_parentrc->strength); childrc->allMarkTypes = (1 << childrc->markType); childrc->strength = top_parentrc->strength; childrc->waitPolicy = top_parentrc->waitPolicy; /* * We mark RowMarks for partitioned child tables as parent RowMarks so * that the executor ignores them (except their existence means that * the child tables be locked using appropriate mode). * 我们将分区的子表的RowMarks标记为父RowMarks, * 以便执行程序忽略它们(除非它们的存在意味着子表使用适当的模式被锁定)。 */ childrc->isParent = (childrte->relkind == RELKIND_PARTITIONED_TABLE); /* Include child's rowmark type in top parent's allMarkTypes */ //在父类的allMarkTypes中包含子类的rowmark类型 top_parentrc->allMarkTypes |= childrc->allMarkTypes; root->rowMarks = lappend(root->rowMarks, childrc); }}
三、跟踪分析
测试脚本如下
testdb=# explain verbose select * from t_hash_partition where c1 = 1 OR c1 = 2; QUERY PLAN ------------------------------------------------------------------------------------- Append (cost=0.00..30.53 rows=6 width=200) -> Seq Scan on public.t_hash_partition_1 (cost=0.00..15.25 rows=3 width=200) Output: t_hash_partition_1.c1, t_hash_partition_1.c2, t_hash_partition_1.c3 Filter: ((t_hash_partition_1.c1 = 1) OR (t_hash_partition_1.c1 = 2)) -> Seq Scan on public.t_hash_partition_3 (cost=0.00..15.25 rows=3 width=200) Output: t_hash_partition_3.c1, t_hash_partition_3.c2, t_hash_partition_3.c3 Filter: ((t_hash_partition_3.c1 = 1) OR (t_hash_partition_3.c1 = 2))(7 rows)
启动gdb,设置断点
(gdb) b expand_inherited_tablesBreakpoint 1 at 0x7e53ba: file prepunion.c, line 1483.(gdb) cContinuing.Breakpoint 1, expand_inherited_tables (root=0x28fcdc8) at prepunion.c:14831483 nrtes = list_length(root->parse->rtable);
获取RTE的个数和链表元素
(gdb) n1484 rl = list_head(root->parse->rtable);(gdb) 1485 for (rti = 1; rti <= nrtes; rti++)(gdb) p nrtes$1 = 1(gdb) p *rl$2 = {data = {ptr_value = 0x28d83d0, int_value = 42828752, oid_value = 42828752}, next = 0x0}(gdb)
循环处理RTE
(gdb) n1487 RangeTblEntry *rte = (RangeTblEntry *) lfirst(rl);(gdb) 1489 expand_inherited_rtentry(root, rte, rti);(gdb) p *rte$3 = {type = T_RangeTblEntry, rtekind = RTE_RELATION, relid = 16986, relkind = 112 'p', tablesample = 0x0, subquery = 0x0, security_barrier = false, jointype = JOIN_INNER, joinaliasvars = 0x0, functions = 0x0, funcordinality = false, tablefunc = 0x0, values_lists = 0x0, ctename = 0x0, ctelevelsup = 0, self_reference = false, coltypes = 0x0, coltypmods = 0x0, colcollations = 0x0, enrname = 0x0, enrtuples = 0, alias = 0x0, eref = 0x28d84e8, lateral = false, inh = true, inFromCl = true, requiredPerms = 2, checkAsUser = 0, selectedCols = 0x28d8c40, insertedCols = 0x0, updatedCols = 0x0, securityQuals = 0x0}
进入expand_inherited_rtentry
(gdb) stepexpand_inherited_rtentry (root=0x28fcdc8, rte=0x28d83d0, rti=1) at prepunion.c:15171517 Query *parse = root->parse;
expand_inherited_rtentry->分区表标记为T
1526 if (!rte->inh)(gdb) p rte->inh$4 = true
expand_inherited_rtentry->执行相关判断
(gdb) n1529 if (rte->rtekind != RTE_RELATION)(gdb) p rte->rtekind$5 = RTE_RELATION(gdb) n1535 parentOID = rte->relid;(gdb) 1536 if (!has_subclass(parentOID))(gdb) p parentOID$6 = 16986(gdb) n1556 oldrc = get_plan_rowmark(root->rowMarks, rti);(gdb) 1557 if (rti == parse->resultRelation)(gdb) p *oldrcCannot access memory at address 0x0
expand_inherited_rtentry->扫描继承集的所有成员,获取所需的锁,并构建OIDs链表
(gdb) n1559 else if (oldrc && RowMarkRequiresRowShareLock(oldrc->markType))(gdb) 1562 lockmode = AccessShareLock;(gdb) 1565 inhOIDs = find_all_inheritors(parentOID, lockmode, NULL);(gdb) 1572 if (list_length(inhOIDs) < 2)(gdb) p inhOIDs$7 = (List *) 0x28fd208(gdb) p *inhOIDs$8 = {type = T_OidList, length = 7, head = 0x28fd1e0, tail = 0x28fd778}(gdb)
expand_inherited_rtentry->打开relation
(gdb) n1584 if (oldrc)(gdb) 1591 oldrelation = heap_open(parentOID, NoLock);
expand_inherited_rtentry->成功获取分区描述符,调用expand_partitioned_rtentry
(gdb) 1594 if (RelationGetPartitionDesc(oldrelation) != NULL)(gdb) 1596 Assert(rte->relkind == RELKIND_PARTITIONED_TABLE);(gdb) 1603 expand_partitioned_rtentry(root, rte, rti, oldrelation, oldrc,(gdb)
expand_inherited_rtentry->进入expand_partitioned_rtentry
(gdb) stepexpand_partitioned_rtentry (root=0x28fcdc8, parentrte=0x28d83d0, parentRTindex=1, parentrel=0x7f4e66827980, top_parentrc=0x0, lockmode=1, appinfos=0x28fce98) at prepunion.c:16841684 PartitionDesc partdesc = RelationGetPartitionDesc(parentrel);
expand_partitioned_rtentry->获取分区描述符
1684 PartitionDesc partdesc = RelationGetPartitionDesc(parentrel);(gdb) n1686 check_stack_depth();(gdb) p *partdesc$9 = {nparts = 6, oids = 0x298e4f8, boundinfo = 0x298e530}
expand_partitioned_rtentry->执行相关校验
(gdb) n1689 Assert(partdesc);(gdb) 1691 Assert(parentrte->inh);(gdb) 1700 if (!root->partColsUpdated)(gdb) 1702 has_partition_attrs(parentrel, parentrte->updatedCols, NULL);(gdb) 1701 root->partColsUpdated =(gdb) 1705 expand_single_inheritance_child(root, parentrte, parentRTindex, parentrel,
expand_partitioned_rtentry->首先展开分区表本身,进入expand_single_inheritance_child
(gdb) stepexpand_single_inheritance_child (root=0x28fcdc8, parentrte=0x28d83d0, parentRTindex=1, parentrel=0x7f4e66827980, top_parentrc=0x0, childrel=0x7f4e66827980, appinfos=0x28fce98, childrte_p=0x7ffd1928d2f8, childRTindex_p=0x7ffd1928d2f4) at prepunion.c:17781778 Query *parse = root->parse;
expand_single_inheritance_child->执行相关初始化(childrte)
(gdb) n1779 Oid parentOID = RelationGetRelid(parentrel);(gdb) 1780 Oid childOID = RelationGetRelid(childrel);(gdb) 1797 childrte = copyObject(parentrte);(gdb) p parentOID$10 = 16986(gdb) p childOID$11 = 16986(gdb) n1798 *childrte_p = childrte;(gdb) 1799 childrte->relid = childOID;(gdb) 1800 childrte->relkind = childrel->rd_rel->relkind;(gdb) 1802 if (childOID != parentOID &&(gdb) 1806 childrte->inh = false;(gdb) 1807 childrte->requiredPerms = 0;(gdb) 1808 childrte->securityQuals = NIL;(gdb) 1809 parse->rtable = lappend(parse->rtable, childrte);(gdb) 1810 childRTindex = list_length(parse->rtable);(gdb) 1811 *childRTindex_p = childRTindex;(gdb) p *childrte -->relid = 16986,仍为分区表$12 = {type = T_RangeTblEntry, rtekind = RTE_RELATION, relid = 16986, relkind = 112 'p', tablesample = 0x0, subquery = 0x0, security_barrier = false, jointype = JOIN_INNER, joinaliasvars = 0x0, functions = 0x0, funcordinality = false, tablefunc = 0x0, values_lists = 0x0, ctename = 0x0, ctelevelsup = 0, self_reference = false, coltypes = 0x0, coltypmods = 0x0, colcollations = 0x0, enrname = 0x0, enrtuples = 0, alias = 0x0, eref = 0x28fd268, lateral = false, inh = false, inFromCl = true, requiredPerms = 0, checkAsUser = 0, selectedCols = 0x28fd898, insertedCols = 0x0, updatedCols = 0x0, securityQuals = 0x0}(gdb) p *childRTindex_p$13 = 0
expand_single_inheritance_child->完成分区表本身的扩展,回到expand_partitioned_rtentry
(gdb) p *childRTindex_p$13 = 0(gdb) n1820 if (childrte->relkind != RELKIND_PARTITIONED_TABLE || childrte->inh)(gdb) 1855 if (top_parentrc)(gdb) 1881 }(gdb) expand_partitioned_rtentry (root=0x28fcdc8, parentrte=0x28d83d0, parentRTindex=1, parentrel=0x7f4e66827980, top_parentrc=0x0, lockmode=1, appinfos=0x28fce98) at prepunion.c:17131713 if (partdesc->nparts == 0)
expand_partitioned_rtentry->开始遍历分区描述符中的分区
1713 if (partdesc->nparts == 0)(gdb) n1719 for (i = 0; i < partdesc->nparts; i++)(gdb) 1721 Oid childOID = partdesc->oids[i];(gdb) 1725 childrel = heap_open(childOID, NoLock);(gdb) 1732 if (RELATION_IS_OTHER_TEMP(childrel))(gdb) 1735 expand_single_inheritance_child(root, parentrte, parentRTindex,(gdb) p childOID$14 = 16989 ----------------------------------------testdb=# select relname from pg_class where oid=16989; relname -------------------- t_hash_partition_1(1 row)----------------------------------------
expand_single_inheritance_child->再次进入expand_single_inheritance_child
(gdb) stepexpand_single_inheritance_child (root=0x28fcdc8, parentrte=0x28d83d0, parentRTindex=1, parentrel=0x7f4e66827980, top_parentrc=0x0, childrel=0x7f4e668306a0, appinfos=0x28fce98, childrte_p=0x7ffd1928d2f8, childRTindex_p=0x7ffd1928d2f4) at prepunion.c:17781778 Query *parse = root->parse;
expand_single_inheritance_child->开始构建AppendRelInfo
...1820 if (childrte->relkind != RELKIND_PARTITIONED_TABLE || childrte->inh)(gdb) 1822 appinfo = makeNode(AppendRelInfo);(gdb) p *childrte$17 = {type = T_RangeTblEntry, rtekind = RTE_RELATION, relid = 16989, relkind = 114 'r', tablesample = 0x0, subquery = 0x0, security_barrier = false, jointype = JOIN_INNER, joinaliasvars = 0x0, functions = 0x0, funcordinality = false, tablefunc = 0x0, values_lists = 0x0, ctename = 0x0, ctelevelsup = 0, self_reference = false, coltypes = 0x0, coltypmods = 0x0, colcollations = 0x0, enrname = 0x0, enrtuples = 0, alias = 0x0, eref = 0x28fd9d0, lateral = false, inh = false, inFromCl = true, requiredPerms = 0, checkAsUser = 0, selectedCols = 0x28fdbc8, insertedCols = 0x0, updatedCols = 0x0, securityQuals = 0x0}(gdb) p *childrte->relkindCannot access memory at address 0x72(gdb) p childrte->relkind$18 = 114 'r'(gdb) p childrte->inh$19 = false
expand_single_inheritance_child->构建完毕,查看AppendRelInfo结构体
(gdb) n1823 appinfo->parent_relid = parentRTindex;(gdb) 1824 appinfo->child_relid = childRTindex;(gdb) 1825 appinfo->parent_reltype = parentrel->rd_rel->reltype;(gdb) 1826 appinfo->child_reltype = childrel->rd_rel->reltype;(gdb) 1827 make_inh_translation_list(parentrel, childrel, childRTindex,(gdb) 1829 appinfo->parent_reloid = parentOID;(gdb) 1830 *appinfos = lappend(*appinfos, appinfo);(gdb) 1841 if (childOID != parentOID)(gdb) 1843 childrte->selectedCols = translate_col_privs(parentrte->selectedCols,(gdb) 1845 childrte->insertedCols = translate_col_privs(parentrte->insertedCols,(gdb) 1847 childrte->updatedCols = translate_col_privs(parentrte->updatedCols,(gdb) 1855 if (top_parentrc)(gdb) p *appinfo$20 = {type = T_AppendRelInfo, parent_relid = 1, child_relid = 3, parent_reltype = 16988, child_reltype = 16991, translated_vars = 0x28fdc90, parent_reloid = 16986}
expand_single_inheritance_child->完成调用,返回
(gdb) 1855 if (top_parentrc)(gdb) p *appinfo$20 = {type = T_AppendRelInfo, parent_relid = 1, child_relid = 3, parent_reltype = 16988, child_reltype = 16991, translated_vars = 0x28fdc90, parent_reloid = 16986}(gdb) n1881 }(gdb) expand_partitioned_rtentry (root=0x28fcdc8, parentrte=0x28d83d0, parentRTindex=1, parentrel=0x7f4e66827980, top_parentrc=0x0, lockmode=1, appinfos=0x28fce98) at prepunion.c:17401740 if (childrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
expand_inherited_rtentry->完成expand_partitioned_rtentry过程调用,回到expand_inherited_rtentry
(gdb) finishRun till exit from #0 expand_partitioned_rtentry (root=0x28fcdc8, parentrte=0x28d83d0, parentRTindex=1, parentrel=0x7f4e66827980, top_parentrc=0x0, lockmode=1, appinfos=0x28fce98) at prepunion.c:17400x00000000007e55e3 in expand_inherited_rtentry (root=0x28fcdc8, rte=0x28d83d0, rti=1) at prepunion.c:16031603 expand_partitioned_rtentry(root, rte, rti, oldrelation, oldrc,(gdb)
expand_inherited_rtentry->完成expand_inherited_rtentry的调用,回到expand_inherited_tables
(gdb) n1665 heap_close(oldrelation, NoLock);(gdb) 1666 }(gdb) expand_inherited_tables (root=0x28fcdc8) at prepunion.c:14901490 rl = lnext(rl);(gdb)
expand_inherited_tables->完成expand_inherited_tables调用,回到subquery_planner
(gdb) n1485 for (rti = 1; rti <= nrtes; rti++)(gdb) 1492 }(gdb) subquery_planner (glob=0x28fcd30, parse=0x28d82b8, parent_root=0x0, hasRecursion=false, tuple_fraction=0) at planner.c:719719 root->hasHavingQual = (parse->havingQual != NULL);(gdb)
DONE!
四、参考资料
Parallel Append implementation
Partition Elimination in PostgreSQL 11