千家信息网

PostgreSQL的set_base_rel_pathlists函数及其子函数分析

发表于:2024-11-22 作者:千家信息网编辑
千家信息网最后更新 2024年11月22日,这篇文章主要讲解了"PostgreSQL的set_base_rel_pathlists函数及其子函数分析",文中的讲解内容简单清晰,易于学习与理解,下面请大家跟着小编的思路慢慢深入,一起来研究和学习"
千家信息网最后更新 2024年11月22日PostgreSQL的set_base_rel_pathlists函数及其子函数分析

这篇文章主要讲解了"PostgreSQL的set_base_rel_pathlists函数及其子函数分析",文中的讲解内容简单清晰,易于学习与理解,下面请大家跟着小编的思路慢慢深入,一起来研究和学习"PostgreSQL的set_base_rel_pathlists函数及其子函数分析"吧!

set_base_rel_pathlists函数的目的是为每一个base rel找出所有可用的访问路径(包括顺序扫描和所有可用的索引),每一个可用的路径都会添加到pathlist链表中。这一小节主要介绍常规(区别于并行)顺序扫描部分。

make_one_rel源代码:

 RelOptInfo * make_one_rel(PlannerInfo *root, List *joinlist) {     //...     /*      * Compute size estimates and consider_parallel flags for each base rel,      * then generate access paths.      */     set_base_rel_sizes(root);//估算Relation的Size并且设置consider_parallel标记     set_base_rel_pathlists(root);//生成Relation的扫描(访问)路径      /*      * Generate access paths for the entire join tree.      * 通过动态规划或遗传算法生成连接路径       */     rel = make_rel_from_joinlist(root, joinlist);      /*      * The result should join all and only the query's base rels.      */     Assert(bms_equal(rel->relids, root->all_baserels));   //返回最上层的RelOptInfo     return rel; }

一、数据结构

RelOptInfo

 typedef struct RelOptInfo {     NodeTag     type;//节点标识      RelOptKind  reloptkind;//RelOpt类型      /* all relations included in this RelOptInfo */     Relids      relids;         /*Relids(rtindex)集合 set of base relids (rangetable indexes) */      /* size estimates generated by planner */     double      rows;           /*结果元组的估算数量 estimated number of result tuples */      /* per-relation planner control flags */     bool        consider_startup;   /*是否考虑启动成本?是,需要保留启动成本低的路径 keep cheap-startup-cost paths? */     bool        consider_param_startup; /*是否考虑参数化?的路径 ditto, for parameterized paths? */     bool        consider_parallel;  /*是否考虑并行处理路径 consider parallel paths? */      /* default result targetlist for Paths scanning this relation */     struct PathTarget *reltarget;   /*扫描该Relation时默认的结果 list of Vars/Exprs, cost, width */      /* materialization information */     List       *pathlist;       /*访问路径链表 Path structures */     List       *ppilist;        /*路径链表中使用参数化路径进行 ParamPathInfos used in pathlist */     List       *partial_pathlist;   /* partial Paths */     struct Path *cheapest_startup_path;//代价最低的启动路径     struct Path *cheapest_total_path;//代价最低的整体路径     struct Path *cheapest_unique_path;//代价最低的获取唯一值的路径     List       *cheapest_parameterized_paths;//代价最低的参数化路径链表      /* parameterization information needed for both base rels and join rels */     /* (see also lateral_vars and lateral_referencers) */     Relids      direct_lateral_relids;  /*使用lateral语法,需依赖的Relids rels directly laterally referenced */     Relids      lateral_relids; /* minimum parameterization of rel */      /* information about a base rel (not set for join rels!) */     //reloptkind=RELOPT_BASEREL时使用的数据结构     Index       relid;          /* Relation ID */     Oid         reltablespace;  /* 表空间 containing tablespace */     RTEKind     rtekind;        /* 基表?子查询?还是函数等等?RELATION, SUBQUERY, FUNCTION, etc */     AttrNumber  min_attr;       /* 最小的属性编号 smallest attrno of rel (often <0) */     AttrNumber  max_attr;       /* 最大的属性编号 largest attrno of rel */     Relids     *attr_needed;    /* 数组 array indexed [min_attr .. max_attr] */     int32      *attr_widths;    /* 属性宽度 array indexed [min_attr .. max_attr] */     List       *lateral_vars;   /* 关系依赖的Vars/PHVs LATERAL Vars and PHVs referenced by rel */     Relids      lateral_referencers;    /*依赖该关系的Relids rels that reference me laterally */     List       *indexlist;      /* 该关系的IndexOptInfo链表 list of IndexOptInfo */     List       *statlist;       /* 统计信息链表 list of StatisticExtInfo */     BlockNumber pages;          /* 块数 size estimates derived from pg_class */     double      tuples;         /* 元组数 */     double      allvisfrac;     /* ? */     PlannerInfo *subroot;       /* 如为子查询,存储子查询的root if subquery */     List       *subplan_params; /* 如为子查询,存储子查询的参数 if subquery */     int         rel_parallel_workers;   /* 并行执行,需要多少个workers? wanted number of parallel workers */      /* Information about foreign tables and foreign joins */     //FDW相关信息     Oid         serverid;       /* identifies server for the table or join */     Oid         userid;         /* identifies user to check access as */     bool        useridiscurrent;    /* join is only valid for current user */     /* use "struct FdwRoutine" to avoid including fdwapi.h here */     struct FdwRoutine *fdwroutine;     void       *fdw_private;      /* cache space for remembering if we have proven this relation unique */     //已知的,可保证唯一元组返回的Relids链表     List       *unique_for_rels;    /* known unique for these other relid                                      * set(s) */     List       *non_unique_for_rels;    /* 已知的,返回的数据不唯一的Relids链表 known not unique for these set(s) */      /* used by various scans and joins: */     List       *baserestrictinfo;   /* 如为基本关系,则存储约束条件 RestrictInfo structures (if base rel) */     QualCost    baserestrictcost;   /* 解析约束表达式的成本? cost of evaluating the above */     Index       baserestrict_min_security;  /* 最低安全等级 min security_level found in                                              * baserestrictinfo */     List       *joininfo;       /* 连接语句的约束条件信息 RestrictInfo structures for join clauses                                  * involving this rel */     bool        has_eclass_joins;   /* 是否存在等价类连接? True意味着joininfo并不完整,,T means joininfo is incomplete */      /* used by partitionwise joins: */       //是否尝试partitionwise连接,这是PG 11的一个新特性.     bool        consider_partitionwise_join;    /* consider partitionwise                                                  * join paths? (if                                                  * partitioned rel) */     Relids      top_parent_relids;  /* Relids of topmost parents (if "other"                                      * rel) */      /* used for partitioned relations */     //分区表使用     PartitionScheme part_scheme;    /* 分区的schema Partitioning scheme. */     int         nparts;         /* 分区数 number of partitions */     struct PartitionBoundInfoData *boundinfo;   /* 分区边界信息 Partition bounds */     List       *partition_qual; /* 分区约束 partition constraint */     struct RelOptInfo **part_rels;  /* 分区的RelOptInfo数组 Array of RelOptInfos of partitions,                                      * stored in the same order of bounds */     List      **partexprs;      /* 非空分区键表达式 Non-nullable partition key expressions. */     List      **nullable_partexprs; /* 可为空的分区键表达式 Nullable partition key expressions. */     List       *partitioned_child_rels; /* RT Indexes链表 List of RT indexes. */ } RelOptInfo;

ParamPathInfo

 /*  * ParamPathInfo  *  * All parameterized paths for a given relation with given required outer rels  * link to a single ParamPathInfo, which stores common information such as  * the estimated rowcount for this parameterization.  We do this partly to  * avoid recalculations, but mostly to ensure that the estimated rowcount  * is in fact the same for every such path.  *  * Note: ppi_clauses is only used in ParamPathInfos for base relation paths;  * in join cases it's NIL because the set of relevant clauses varies depending  * on how the join is formed.  The relevant clauses will appear in each  * parameterized join path's joinrestrictinfo list, instead.  */ typedef struct ParamPathInfo {     NodeTag     type;//节点类型      Relids      ppi_req_outer;  /* 访问路径需要使用的Relids参数,rels supplying parameters used by path */     double      ppi_rows;       /* 估算的结果元组数,estimated number of result tuples */     List       *ppi_clauses;    /* 外部rels提供的可用连接条件链表,join clauses available from outer rels */ } ParamPathInfo;

Cost相关
注意:实际使用的参数值通过系统配置文件定义,而不是这里的常量定义!

/*  * The cost estimate produced by cost_qual_eval() includes both a one-time  * (startup) cost, and a per-tuple cost.  */ typedef struct QualCost {     Cost        startup;        /* 启动成本,one-time cost */     Cost        per_tuple;      /* 每个元组的成本,per-evaluation cost */ } QualCost; typedef double Cost; /* execution cost (in page-access units) */ /* defaults for costsize.c's Cost parameters */ /* NB: cost-estimation code should use the variables, not these constants! */ /* 注意:实际值通过系统配置文件定义,而不是这里的常量定义! */ /* If you change these, update backend/utils/misc/postgresql.sample.conf */ #define DEFAULT_SEQ_PAGE_COST  1.0       //顺序扫描page的成本 #define DEFAULT_RANDOM_PAGE_COST  4.0      //随机扫描page的成本 #define DEFAULT_CPU_TUPLE_COST  0.01     //处理一个元组的CPU成本 #define DEFAULT_CPU_INDEX_TUPLE_COST 0.005   //处理一个索引元组的CPU成本 #define DEFAULT_CPU_OPERATOR_COST  0.0025    //执行一次操作或函数的CPU成本 #define DEFAULT_PARALLEL_TUPLE_COST 0.1    //并行执行,从一个worker传输一个元组到另一个worker的成本 #define DEFAULT_PARALLEL_SETUP_COST  1000.0  //构建并行执行环境的成本  #define DEFAULT_EFFECTIVE_CACHE_SIZE  524288    /*先前已有介绍, measured in pages */ double      seq_page_cost = DEFAULT_SEQ_PAGE_COST; double      random_page_cost = DEFAULT_RANDOM_PAGE_COST; double      cpu_tuple_cost = DEFAULT_CPU_TUPLE_COST; double      cpu_index_tuple_cost = DEFAULT_CPU_INDEX_TUPLE_COST; double      cpu_operator_cost = DEFAULT_CPU_OPERATOR_COST; double      parallel_tuple_cost = DEFAULT_PARALLEL_TUPLE_COST; double      parallel_setup_cost = DEFAULT_PARALLEL_SETUP_COST;  int         effective_cache_size = DEFAULT_EFFECTIVE_CACHE_SIZE;  Cost        disable_cost = 1.0e10;//1后面10个0,通过设置一个巨大的成本,让优化器自动放弃此路径  int         max_parallel_workers_per_gather = 2;//每次gather使用的worker数

二、源码解读

set_base_rel_pathlists函数遍历RelOptInfo数组,为每一个Rel构造访问路径.

//-------------------------------------------------------- /*  * set_base_rel_pathlists  *    Finds all paths available for scanning each base-relation entry.  *    Sequential scan and any available indices are considered.  *    Each useful path is attached to its relation's 'pathlist' field.  *  *    为每一个base rels找出所有可用的访问路径(包括顺序扫描和所有可用的索引)  *    每一个可用的路径都会添加到pathlist链表中  *  */ static void set_base_rel_pathlists(PlannerInfo *root) {     Index       rti;      for (rti = 1; rti < root->simple_rel_array_size; rti++)//遍历RelOptInfo数组     {         RelOptInfo *rel = root->simple_rel_array[rti];          /* there may be empty slots corresponding to non-baserel RTEs */         if (rel == NULL)             continue;          Assert(rel->relid == rti);  /* sanity check on array */          /* ignore RTEs that are "other rels" */         if (rel->reloptkind != RELOPT_BASEREL)             continue;          set_rel_pathlist(root, rel, rti, root->simple_rte_array[rti]);     } }//-------------------------------------------------------- set_rel_pathlist /*  * set_rel_pathlist  *    Build access paths for a base relation  */ static void set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,                  Index rti, RangeTblEntry *rte) {     if (IS_DUMMY_REL(rel))     {         /* We already proved the relation empty, so nothing more to do */     }     else if (rte->inh)//inherit     {         /* It's an "append relation", process accordingly */         set_append_rel_pathlist(root, rel, rti, rte);     }     else//常规     {         switch (rel->rtekind)         {             case RTE_RELATION://数据表                 if (rte->relkind == RELKIND_FOREIGN_TABLE)//FDW                 {                     /* Foreign table */                     set_foreign_pathlist(root, rel, rte);                 }                 else if (rte->tablesample != NULL)//采样表                 {                     /* Sampled relation */                     set_tablesample_rel_pathlist(root, rel, rte);                 }                 else//常规数据表                 {                     /* Plain relation */                     set_plain_rel_pathlist(root, rel, rte);                 }                 break;             case RTE_SUBQUERY://子查询                 /* Subquery --- 已在set_rel_size处理,fully handled during set_rel_size */         /* 函数:set_subquery_pathlist */                 break;             case RTE_FUNCTION:                 /* RangeFunction */                 set_function_pathlist(root, rel, rte);                 break;             case RTE_TABLEFUNC:                 /* Table Function */                 set_tablefunc_pathlist(root, rel, rte);                 break;             case RTE_VALUES:                 /* Values list */                 set_values_pathlist(root, rel, rte);                 break;             case RTE_CTE:                 /* CTE reference --- fully handled during set_rel_size */                 break;             case RTE_NAMEDTUPLESTORE:                 /* tuplestore reference --- fully handled during set_rel_size */                 break;             default:                 elog(ERROR, "unexpected rtekind: %d", (int) rel->rtekind);                 break;         }     }      /*      * If this is a baserel, we should normally consider gathering any partial      * paths we may have created for it.      *      * However, if this is an inheritance child, skip it.  Otherwise, we could      * end up with a very large number of gather nodes, each trying to grab      * its own pool of workers.  Instead, we'll consider gathering partial      * paths for the parent appendrel.      *      * Also, if this is the topmost scan/join rel (that is, the only baserel),      * we postpone this until the final scan/join targelist is available (see      * grouping_planner).      */     if (rel->reloptkind == RELOPT_BASEREL &&         bms_membership(root->all_baserels) != BMS_SINGLETON)         generate_gather_paths(root, rel, false);      /*      * Allow a plugin to editorialize on the set of Paths for this base      * relation.  It could add new paths (such as CustomPaths) by calling      * add_path(), or delete or modify paths added by the core code.      */     if (set_rel_pathlist_hook)//钩子函数         (*set_rel_pathlist_hook) (root, rel, rti, rte);      /* Now find the cheapest of the paths for this rel */     set_cheapest(rel);//找出代价最低的访问路径  #ifdef OPTIMIZER_DEBUG     debug_print_rel(root, rel); #endif }//-------------------------------------------------------- set_plain_rel_pathlist /*  * set_plain_rel_pathlist  *    Build access paths for a plain relation (no subquery, no inheritance)  */ static void set_plain_rel_pathlist(PlannerInfo *root, RelOptInfo *rel, RangeTblEntry *rte) {     Relids      required_outer;      /*      * We don't support pushing join clauses into the quals of a seqscan, but      * it could still have required parameterization due to LATERAL refs in      * its tlist.      */     required_outer = rel->lateral_relids;//需依赖的上层Relids      /* 顺序扫描,Consider sequential scan */     add_path(rel, create_seqscan_path(root, rel, required_outer, 0));      /* 如合适,尝试并行顺序扫描,If appropriate, consider parallel sequential scan */     if (rel->consider_parallel && required_outer == NULL)         create_plain_partial_paths(root, rel);      /* 索引扫描,Consider index scans */     create_index_paths(root, rel);      /* TID扫描,Consider TID scans */     create_tidscan_paths(root, rel); }//-------------------------------------------------------- create_seqscan_path /*  * create_seqscan_path  *    Creates a path corresponding to a sequential scan, returning the  *    pathnode.  */ Path * create_seqscan_path(PlannerInfo *root, RelOptInfo *rel,                     Relids required_outer, int parallel_workers) {     Path       *pathnode = makeNode(Path);//顺序扫描路径      pathnode->pathtype = T_SeqScan;//顺序扫描     pathnode->parent = rel;//RelOptInfo     pathnode->pathtarget = rel->reltarget;//投影列     pathnode->param_info = get_baserel_parampathinfo(root, rel,                                                      required_outer);//获取参数化路径信息ParamPathInfo     pathnode->parallel_aware = parallel_workers > 0 ? true : false;//并行     pathnode->parallel_safe = rel->consider_parallel;//     pathnode->parallel_workers = parallel_workers;//并行数     pathnode->pathkeys = NIL;   /* 顺序扫描,不执行排序,seqscan has unordered result */      cost_seqscan(pathnode, root, rel, pathnode->param_info);      return pathnode; }//-------------------------------------------- get_baserel_parampathinfo /*  * get_baserel_parampathinfo  *      Get the ParamPathInfo for a parameterized path for a base relation,  *      constructing one if we don't have one already.  *  *    获取base rel的参数化路径,存储在结构体ParamPathInfo中.如果没有,那么构造一个.  *  * This centralizes estimating the rowcounts for parameterized paths.  * We need to cache those to be sure we use the same rowcount for all paths  * of the same parameterization for a given rel.  This is also a convenient  * place to determine which movable join clauses the parameterized path will  * be responsible for evaluating.  *  * 统一/集中估计参数化路径的行数。通过缓存这些数据,对于相同的参数,可以快速给出相应的行数。  */ ParamPathInfo * get_baserel_parampathinfo(PlannerInfo *root, RelOptInfo *baserel,                           Relids required_outer) {     ParamPathInfo *ppi;//ppi变量     Relids      joinrelids;//参与连接的relids     List       *pclauses;//条件链表     double      rows;//行数     ListCell   *lc;//临时变量      /* Unparameterized paths have no ParamPathInfo */     if (bms_is_empty(required_outer))         return NULL;      Assert(!bms_overlap(baserel->relids, required_outer));      /* If we already have a PPI for this parameterization, just return it */     if ((ppi = find_param_path_info(baserel, required_outer)))//已有缓存?         return ppi;//直接返回      /*      * Identify all joinclauses that are movable to this base rel given this      * parameterization.      * 在给定参数条件下,识别所有可移动到此基础关系的连接子句。      */     joinrelids = bms_union(baserel->relids, required_outer);//合并relids     pclauses = NIL;     foreach(lc, baserel->joininfo)//遍历连接条件     {         RestrictInfo *rinfo = (RestrictInfo *) lfirst(lc);//连接条件          if (join_clause_is_movable_into(rinfo,                                         baserel->relids,                                         joinrelids))             pclauses = lappend(pclauses, rinfo);//如可以移动,添加到链表中     }      /*      * Add in joinclauses generated by EquivalenceClasses, too.  (These      * necessarily satisfy join_clause_is_movable_into.)      * 通过等价类产生的连接条件一并合并到链表中      */     pclauses = list_concat(pclauses,                            generate_join_implied_equalities(root,                                                             joinrelids,                                                             required_outer,                                                             baserel));      /* Estimate the number of rows returned by the parameterized scan */     rows = get_parameterized_baserel_size(root, baserel, pclauses);//获取估算行数      /* And now we can build the ParamPathInfo */     ppi = makeNode(ParamPathInfo);//构造PPI返回节点     ppi->ppi_req_outer = required_outer;     ppi->ppi_rows = rows;     ppi->ppi_clauses = pclauses;     baserel->ppilist = lappend(baserel->ppilist, ppi);      return ppi; } //--------------------------------- get_parameterized_baserel_size /*  * get_parameterized_baserel_size  *      Make a size estimate for a parameterized scan of a base relation.  *    估算参数化扫描基础关系的大小  *  * 'param_clauses' lists the additional join clauses to be used.  * param_clauses是使用的连接条件链表  *   * set_baserel_size_estimates must have been applied already.  * 调用此函数前,要求已调用set_baserel_size_estimates函数  */ double get_parameterized_baserel_size(PlannerInfo *root, RelOptInfo *rel,                                List *param_clauses) {     List       *allclauses;     double      nrows;      /*      * Estimate the number of rows returned by the parameterized scan, knowing      * that it will apply all the extra join clauses as well as the rel's own      * restriction clauses.  Note that we force the clauses to be treated as      * non-join clauses during selectivity estimation.      */     allclauses = list_concat(list_copy(param_clauses),                              rel->baserestrictinfo);//合并条件     nrows = rel->tuples *         clauselist_selectivity(root,                                allclauses,                                rel->relid,  /* do not use 0! */                                JOIN_INNER,                                NULL);//获取行数     nrows = clamp_row_est(nrows);     /* For safety, make sure result is not more than the base estimate */     if (nrows > rel->rows)         nrows = rel->rows;     return nrows;//返回 }//-------------------------------------------- cost_seqscan /*  * cost_seqscan  *    Determines and returns the cost of scanning a relation sequentially.  *    计算顺序扫描Rel的成本并返回  *  * 'baserel' is the relation to be scanned  * baserel:即将被扫描的Rel  *  * 'param_info' is the ParamPathInfo if this is a parameterized path, else NULL  * param_info:ppi结构体  */ void cost_seqscan(Path *path, PlannerInfo *root,              RelOptInfo *baserel, ParamPathInfo *param_info) {     Cost        startup_cost = 0;//启动成本     Cost        cpu_run_cost;//CPU成本     Cost        disk_run_cost;//IO成本     double      spc_seq_page_cost;//     QualCost    qpqual_cost;//表达式成本     Cost        cpu_per_tuple;//每个元组的CPU成本      /* Should only be applied to base relations */     Assert(baserel->relid > 0);     Assert(baserel->rtekind == RTE_RELATION);      /* Mark the path with the correct row estimate */     if (param_info)//存在PPI         path->rows = param_info->ppi_rows;//行数     else         path->rows = baserel->rows;//直接取基础关系行数      if (!enable_seqscan)         startup_cost += disable_cost;//不允许顺序扫描,disable_cost=1.0e10,即1后面10个0,这样的路径无需考虑      /* fetch estimated page cost for tablespace containing table */     get_tablespace_page_costs(baserel->reltablespace,                               NULL,                               &spc_seq_page_cost);//获取顺序访问表空间page的成本      /*      * disk costs      */     disk_run_cost = spc_seq_page_cost * baserel->pages;//IO成本      /* CPU costs */     get_restriction_qual_cost(root, baserel, param_info, &qpqual_cost);//CPU成本      startup_cost += qpqual_cost.startup;//启动成本     cpu_per_tuple = cpu_tuple_cost + qpqual_cost.per_tuple;//处理每个元组的成本     cpu_run_cost = cpu_per_tuple * baserel->tuples;//CPU执行过程中的成本     /* tlist eval costs are paid per output row, not per tuple scanned */     startup_cost += path->pathtarget->cost.startup;//加上获取最终投影列的成本     cpu_run_cost += path->pathtarget->cost.per_tuple * path->rows;//加上获取最终投影列的成本      /* Adjust costing for parallelism, if used. */     if (path->parallel_workers > 0)//并行执行     {         double      parallel_divisor = get_parallel_divisor(path);//拆分多少份          /* The CPU cost is divided among all the workers. */         cpu_run_cost /= parallel_divisor;//每一份的成本          /*          * It may be possible to amortize some of the I/O cost, but probably          * not very much, because most operating systems already do aggressive          * prefetching.  For now, we assume that the disk run cost can't be          * amortized at all.          */          /*          * In the case of a parallel plan, the row count needs to represent          * the number of tuples processed per worker.          */         path->rows = clamp_row_est(path->rows / parallel_divisor);//每一份的行数     }      path->startup_cost = startup_cost;//赋值     path->total_cost = startup_cost + cpu_run_cost + disk_run_cost;//总成本=启动 + 执行期的CPU和IO成本 }//-------------------------------- get_tablespace_page_costs /*  * get_tablespace_page_costs  *      Return random and/or sequential page costs for a given tablespace.  *    返回给定表空间的随机/顺序读取page的成本  *  *      This value is not locked by the transaction, so this value may  *      be changed while a SELECT that has used these values for planning  *      is still executing.  */ void get_tablespace_page_costs(Oid spcid,//表空间Oid                           double *spc_random_page_cost,                           double *spc_seq_page_cost) {     TableSpaceCacheEntry *spc = get_tablespace(spcid);      Assert(spc != NULL);      if (spc_random_page_cost)//随机读取     {         if (!spc->opts || spc->opts->random_page_cost < 0)             *spc_random_page_cost = random_page_cost;         else             *spc_random_page_cost = spc->opts->random_page_cost;     }      if (spc_seq_page_cost)//顺序读取     {         if (!spc->opts || spc->opts->seq_page_cost < 0)             *spc_seq_page_cost = seq_page_cost;         else             *spc_seq_page_cost = spc->opts->seq_page_cost;     } }//-------------------------------- get_restriction_qual_cost /*  * get_restriction_qual_cost  *    Compute evaluation costs of a baserel's restriction quals, plus any  *    movable join quals that have been pushed down to the scan.  *    Results are returned into *qpqual_cost.  *    计算base rel限制条件的估算成本,包括被下推到限制条件的连接条件  *  * This is a convenience subroutine that works for seqscans and other cases  * where all the given quals will be evaluated the hard way.  It's not useful  * for cost_index(), for example, where the index machinery takes care of  * some of the quals.  We assume baserestrictcost was previously set by  * set_baserel_size_estimates().  */ static void get_restriction_qual_cost(PlannerInfo *root, RelOptInfo *baserel,                           ParamPathInfo *param_info,                           QualCost *qpqual_cost) {     if (param_info)//参数化信息     {         /* Include costs of pushed-down clauses */         cost_qual_eval(qpqual_cost, param_info->ppi_clauses, root);//评估成本          qpqual_cost->startup += baserel->baserestrictcost.startup;         qpqual_cost->per_tuple += baserel->baserestrictcost.per_tuple;     }     else         *qpqual_cost = baserel->baserestrictcost;//没有参数化信息,直接返回 }//------------------- cost_qual_eval /*  * cost_qual_eval  *      Estimate the CPU costs of evaluating a WHERE clause.  *      The input can be either an implicitly-ANDed list of boolean  *      expressions, or a list of RestrictInfo nodes.  (The latter is  *      preferred since it allows caching of the results.)  *      The result includes both a one-time (startup) component,  *      and a per-evaluation component.  */ void cost_qual_eval(QualCost *cost, List *quals, PlannerInfo *root) {     cost_qual_eval_context context;     ListCell   *l;      context.root = root;     context.total.startup = 0;     context.total.per_tuple = 0;      /* We don't charge any cost for the implicit ANDing at top level ... */      foreach(l, quals)//遍历链表     {         Node       *qual = (Node *) lfirst(l);          cost_qual_eval_walker(qual, &context);//遍历表达式     }      *cost = context.total;//返回总成本 } //------------ cost_qual_eval_walker static bool cost_qual_eval_walker(Node *node, cost_qual_eval_context *context) {     if (node == NULL)         return false;      /*      * RestrictInfo nodes contain an eval_cost field reserved for this      * routine's use, so that it's not necessary to evaluate the qual clause's      * cost more than once.  If the clause's cost hasn't been computed yet,      * the field's startup value will contain -1.      */     if (IsA(node, RestrictInfo))//约束条件     {         RestrictInfo *rinfo = (RestrictInfo *) node;          if (rinfo->eval_cost.startup < 0)//未计算成本,初始值为-1         {             cost_qual_eval_context locContext;              locContext.root = context->root;             locContext.total.startup = 0;             locContext.total.per_tuple = 0;              /*              * For an OR clause, recurse into the marked-up tree so that we              * set the eval_cost for contained RestrictInfos too.              */             if (rinfo->orclause)                 cost_qual_eval_walker((Node *) rinfo->orclause, &locContext);//递归OR条件             else                 cost_qual_eval_walker((Node *) rinfo->clause, &locContext);//递归              /*              * If the RestrictInfo is marked pseudoconstant, it will be tested              * only once, so treat its cost as all startup cost.              */             if (rinfo->pseudoconstant)//pseudoconstant标志为T             {                 /* count one execution during startup */                 locContext.total.startup += locContext.total.per_tuple;                 locContext.total.per_tuple = 0;             }             rinfo->eval_cost = locContext.total;         }         context->total.startup += rinfo->eval_cost.startup;         context->total.per_tuple += rinfo->eval_cost.per_tuple;         /* do NOT recurse into children */         return false;     }      /*      * For each operator or function node in the given tree, we charge the      * estimated execution cost given by pg_proc.procost (remember to multiply      * this by cpu_operator_cost).      *      * Vars and Consts are charged zero, and so are boolean operators (AND,      * OR, NOT). Simplistic, but a lot better than no model at all.      *      * Should we try to account for the possibility of short-circuit      * evaluation of AND/OR?  Probably *not*, because that would make the      * results depend on the clause ordering, and we are not in any position      * to expect that the current ordering of the clauses is the one that's      * going to end up being used.  The above per-RestrictInfo caching would      * not mix well with trying to re-order clauses anyway.      *      * Another issue that is entirely ignored here is that if a set-returning      * function is below top level in the tree, the functions/operators above      * it will need to be evaluated multiple times.  In practical use, such      * cases arise so seldom as to not be worth the added complexity needed;      * moreover, since our rowcount estimates for functions tend to be pretty      * phony, the results would also be pretty phony.      */     if (IsA(node, FuncExpr))//函数表达式     {         context->total.per_tuple +=             get_func_cost(((FuncExpr *) node)->funcid) * cpu_operator_cost;//调用get_func_cost函数     }     else if (IsA(node, OpExpr) ||              IsA(node, DistinctExpr) ||              IsA(node, NullIfExpr))//操作符/Distinct/NullIf判断,调用get_func_cost     {         /* rely on struct equivalence to treat these all alike */         set_opfuncid((OpExpr *) node);         context->total.per_tuple +=             get_func_cost(((OpExpr *) node)->opfuncid) * cpu_operator_cost;     }     else if (IsA(node, ScalarArrayOpExpr))//数组     {         /*          * Estimate that the operator will be applied to about half of the          * array elements before the answer is determined.          */         ScalarArrayOpExpr *saop = (ScalarArrayOpExpr *) node;         Node       *arraynode = (Node *) lsecond(saop->args);          set_sa_opfuncid(saop);         context->total.per_tuple += get_func_cost(saop->opfuncid) *             cpu_operator_cost * estimate_array_length(arraynode) * 0.5;     }     else if (IsA(node, Aggref) ||              IsA(node, WindowFunc))//聚合函数或者窗口函数     {         /*          * Aggref and WindowFunc nodes are (and should be) treated like Vars,          * ie, zero execution cost in the current model, because they behave          * essentially like Vars at execution.  We disregard the costs of          * their input expressions for the same reason.  The actual execution          * costs of the aggregate/window functions and their arguments have to          * be factored into plan-node-specific costing of the Agg or WindowAgg          * plan node.          */         return false;           /* 0成本,不再递归到子节点中,don't recurse into children */     }     else if (IsA(node, CoerceViaIO))//CoerceViaIO     {         CoerceViaIO *iocoerce = (CoerceViaIO *) node;         Oid         iofunc;         Oid         typioparam;         bool        typisvarlena;          /* check the result type's input function */         getTypeInputInfo(iocoerce->resulttype,                          &iofunc, &typioparam);         context->total.per_tuple += get_func_cost(iofunc) * cpu_operator_cost;         /* check the input type's output function */         getTypeOutputInfo(exprType((Node *) iocoerce->arg),                           &iofunc, &typisvarlena);         context->total.per_tuple += get_func_cost(iofunc) * cpu_operator_cost;     }     else if (IsA(node, ArrayCoerceExpr))//ArrayCoerceExpr     {         ArrayCoerceExpr *acoerce = (ArrayCoerceExpr *) node;         QualCost    perelemcost;          cost_qual_eval_node(&perelemcost, (Node *) acoerce->elemexpr,                             context->root);         context->total.startup += perelemcost.startup;         if (perelemcost.per_tuple > 0)             context->total.per_tuple += perelemcost.per_tuple *                 estimate_array_length((Node *) acoerce->arg);     }     else if (IsA(node, RowCompareExpr))//RowCompareExpr     {         /* Conservatively assume we will check all the columns */         RowCompareExpr *rcexpr = (RowCompareExpr *) node;         ListCell   *lc;          foreach(lc, rcexpr->opnos)         {             Oid         opid = lfirst_oid(lc);              context->total.per_tuple += get_func_cost(get_opcode(opid)) *                 cpu_operator_cost;         }     }     else if (IsA(node, MinMaxExpr) ||              IsA(node, SQLValueFunction) ||              IsA(node, XmlExpr) ||              IsA(node, CoerceToDomain) ||              IsA(node, NextValueExpr))//最小最大值/SQLValueFunction/XML表达式/CoerceToDomain/NextValueExpr     {         /* Treat all these as having cost 1 */         context->total.per_tuple += cpu_operator_cost;     }     else if (IsA(node, CurrentOfExpr))//CurrentOfExpr     {         /* Report high cost to prevent selection of anything but TID scan */         context->total.startup += disable_cost;//不考虑顺序扫描,使用TID扫描     }     else if (IsA(node, SubLink))     {         /* This routine should not be applied to un-planned expressions */         elog(ERROR, "cannot handle unplanned sub-select");//子链接,报错     }     else if (IsA(node, SubPlan))//子计划     {         /*          * A subplan node in an expression typically indicates that the          * subplan will be executed on each evaluation, so charge accordingly.          * (Sub-selects that can be executed as InitPlans have already been          * removed from the expression.)          */         SubPlan    *subplan = (SubPlan *) node;          context->total.startup += subplan->startup_cost;//直接相加         context->total.per_tuple += subplan->per_call_cost;          /*          * We don't want to recurse into the testexpr, because it was already          * counted in the SubPlan node's costs.  So we're done.          */         return false;     }     else if (IsA(node, AlternativeSubPlan))//AlternativeSubPlan     {         /*          * Arbitrarily use the first alternative plan for costing.  (We should          * certainly only include one alternative, and we don't yet have          * enough information to know which one the executor is most likely to          * use.)          */         AlternativeSubPlan *asplan = (AlternativeSubPlan *) node;          return cost_qual_eval_walker((Node *) linitial(asplan->subplans),                                      context);     }     else if (IsA(node, PlaceHolderVar))//PHV,成本为0     {         /*          * A PlaceHolderVar should be given cost zero when considering general          * expression evaluation costs.  The expense of doing the contained          * expression is charged as part of the tlist eval costs of the scan          * or join where the PHV is first computed (see set_rel_width and          * add_placeholders_to_joinrel).  If we charged it again here, we'd be          * double-counting the cost for each level of plan that the PHV          * bubbles up through.  Hence, return without recursing into the          * phexpr.          */         return false;     }      /* recurse into children */     return expression_tree_walker(node, cost_qual_eval_walker,                                   (void *) context);//递归到子节点中 }//------- get_func_cost /*  * get_func_cost  *      Given procedure id, return the function's procost field.  */ float4 get_func_cost(Oid funcid) {     HeapTuple   tp;     float4      result;      tp = SearchSysCache1(PROCOID, ObjectIdGetDatum(funcid));//获取函数Oid     if (!HeapTupleIsValid(tp))         elog(ERROR, "cache lookup failed for function %u", funcid);   //查询数据字典:select proname,procost from pg_proc where procost > 1;     result = ((Form_pg_proc) GETSTRUCT(tp))->procost;//直接获取函数的procost     ReleaseSysCache(tp);     return result; }

三、跟踪分析

启动gdb:

(gdb) b set_base_rel_pathlistsBreakpoint 1 at 0x73bfb5: file allpaths.c, line 296.(gdb) cContinuing.Breakpoint 1, set_base_rel_pathlists (root=0x2fd9418) at allpaths.c:296296   for (rti = 1; rti < root->simple_rel_array_size; rti++)

进入set_plain_rel_pathlist:

(gdb) 452           set_plain_rel_pathlist(root, rel, rte);(gdb) stepset_plain_rel_pathlist (root=0x2fd9418, rel=0x2f98278, rte=0x2eaa5d8) at allpaths.c:704704   required_outer = rel->lateral_relids;

进入create_seqscan_path:

(gdb) stepcreate_seqscan_path (root=0x2fd9418, rel=0x2f98278, required_outer=0x0, parallel_workers=0) at pathnode.c:957957   Path     *pathnode = makeNode(Path);(gdb) 969   cost_seqscan(pathnode, root, rel, pathnode->param_info);(gdb) p *pathnode$2 = {type = T_Path, pathtype = T_SeqScan, parent = 0x2f98278, pathtarget = 0x2f98488, param_info = 0x0,   parallel_aware = false, parallel_safe = true, parallel_workers = 0, rows = 0, startup_cost = 0, total_cost = 0,   pathkeys = 0x0}

进入cost_seqscan:

...(gdb) 230     path->rows = baserel->rows;#rows的获得上一节已作介绍(gdb) p baserel->rows$4 = 10926...#表空间顺序扫描的成本(gdb) p spc_seq_page_cost$5 = 1#IO成本(gdb) p disk_run_cost$6 = 726

进入get_restriction_qual_cost:

(gdb) stepget_restriction_qual_cost (root=0x2fd9418, baserel=0x2f98278, param_info=0x0, qpqual_cost=0x7ffe12ca4620) at costsize.c:39993999    if (param_info)#没有参数化信息,直接使用baserel->baserestrictcost(gdb) n4008      *qpqual_cost = baserel->baserestrictcost;(gdb) p baserel->baserestrictcost$7 = {startup = 0, per_tuple = 0.0050000000000000001}

回到cost_seqscan

(gdb) cost_seqscan (path=0x2f98990, root=0x2fd9418, baserel=0x2f98278, param_info=0x0) at costsize.c:248248   startup_cost += qpqual_cost.startup;...

执行cost_seqscan,最终的path:

(gdb) p *path$8 = {type = T_Path, pathtype = T_SeqScan, parent = 0x2f98278, pathtarget = 0x2f98488, param_info = 0x0,   parallel_aware = false, parallel_safe = true, parallel_workers = 0, rows = 10926, startup_cost = 0, total_cost = 2226,   pathkeys = 0x0}(gdb) p cpu_run_cost$9 = 1500(gdb) p disk_run_cost$10 = 726

回到上层函数:

(gdb) ncreate_seqscan_path (root=0x2fd9418, rel=0x2f98278, required_outer=0x0, parallel_workers=0) at pathnode.c:971971   return pathnode;(gdb) 972 }(gdb) set_plain_rel_pathlist (root=0x2fd9418, rel=0x2f98278, rte=0x2eaa5d8) at allpaths.c:710710   if (rel->consider_parallel && required_outer == NULL)

继续执行构建索引扫描路径/TID扫描路径函数:

714   create_index_paths(root, rel);(gdb) 717   create_tidscan_paths(root, rel);(gdb) n718 }

索引扫描路径的结果,rows = 10926, startup_cost = 324.40899999999999,total_cost = 1214.299

(gdb) p *rel->pathlist$14 = {type = T_List, length = 1, head = 0x2fe8d10, tail = 0x2fe8d10}(gdb) p *(Path *)rel->pathlist->head->data.ptr_value$15 = {type = T_BitmapHeapPath, pathtype = T_BitmapHeapScan, parent = 0x2f98278, pathtarget = 0x2f98488, param_info = 0x0,   parallel_aware = false, parallel_safe = true, parallel_workers = 0, rows = 10926, startup_cost = 324.40899999999999,   total_cost = 1214.299, pathkeys = 0x0}

结束调用

(gdb) set_base_rel_pathlists (root=0x2fd9418) at allpaths.c:296296   for (rti = 1; rti < root->simple_rel_array_size; rti++)(gdb) 312 }(gdb) make_one_rel (root=0x2fd9418, joinlist=0x2f985d8) at allpaths.c:185185   rel = make_rel_from_joinlist(root, joinlist);#DONE!

相应的SQL执行计划,cost=324.41..1214.30请参照索引扫描路径结果(这部分源码下一节分析):

testdb=# explain analyze verbose select t1.* from t_dwxx t1 where dwbh > '10000' and dwbh < '20000';                                                           QUERY PLAN                                                           -------------------------------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on public.t_dwxx t1  (cost=324.41..1214.30 rows=10926 width=23) (actual time=3.196..4.959 rows=11111 loops=1)   Output: dwmc, dwbh, dwdz   Recheck Cond: (((t1.dwbh)::text > '10000'::text) AND ((t1.dwbh)::text < '20000'::text))   Heap Blocks: exact=85   ->  Bitmap Index Scan on t_dwxx_pkey  (cost=0.00..321.68 rows=10926 width=0) (actual time=3.159..3.159 rows=11111 loops=1)         Index Cond: (((t1.dwbh)::text > '10000'::text) AND ((t1.dwbh)::text < '20000'::text)) Planning Time: 0.315 ms Execution Time: 5.673 ms(8 rows)

感谢各位的阅读,以上就是"PostgreSQL的set_base_rel_pathlists函数及其子函数分析"的内容了,经过本文的学习后,相信大家对PostgreSQL的set_base_rel_pathlists函数及其子函数分析这一问题有了更深刻的体会,具体使用情况还需要大家实践验证。这里是,小编将为大家推送更多相关知识点的文章,欢迎关注!

0