千家信息网

PostgreSQL源码中NOT IN的作用是什么

发表于:2025-01-24 作者:千家信息网编辑
千家信息网最后更新 2025年01月24日,本篇内容主要讲解"PostgreSQL源码中NOT IN的作用是什么",感兴趣的朋友不妨来看看。本文介绍的方法操作简单快捷,实用性强。下面就让小编来带大家学习"PostgreSQL源码中NOT IN的
千家信息网最后更新 2025年01月24日PostgreSQL源码中NOT IN的作用是什么

本篇内容主要讲解"PostgreSQL源码中NOT IN的作用是什么",感兴趣的朋友不妨来看看。本文介绍的方法操作简单快捷,实用性强。下面就让小编来带大家学习"PostgreSQL源码中NOT IN的作用是什么"吧!

一、数据结构

SubPlanState
子计划运行期状态

/* ---------------- *        SubPlanState node * ---------------- */typedef struct SubPlanState{    NodeTag        type;    SubPlan    *subplan;        /* expression plan node */    struct PlanState *planstate;    /* subselect plan's state tree */    struct PlanState *parent;    /* parent plan node's state tree */    ExprState  *testexpr;        /* 组合表达式状态;state of combining expression */    List       *args;            /* 参数表达式状态;states of argument _expression(s) */    HeapTuple    curTuple;        /* subplan最近的元组;copy of most recent tuple from subplan */    Datum        curArray;        /* most recent array from ARRAY() subplan */    /* these are used when hashing the subselect's output: */    TupleDesc    descRight;        /* 投影后的子查询描述符;subselect desc after projection */    ProjectionInfo *projLeft;    /* for projecting lefthand exprs */    ProjectionInfo *projRight;    /* for projecting subselect output */    TupleHashTable hashtable;    /* hash table for no-nulls subselect rows */    TupleHashTable hashnulls;    /* hash table for rows with null(s) */    bool        havehashrows;    /* true if hashtable is not empty */    bool        havenullrows;    /* true if hashnulls is not empty */    MemoryContext hashtablecxt; /* memory context containing hash tables */    MemoryContext hashtempcxt;    /* temp memory context for hash tables */    ExprContext *innerecontext; /* econtext for computing inner tuples */    AttrNumber *keyColIdx;        /* control data for hash tables */    Oid           *tab_eq_funcoids;    /* equality func oids for table                                     * datatype(s) */    Oid           *tab_collations; /* collations for hash and comparison */    FmgrInfo   *tab_hash_funcs; /* hash functions for table datatype(s) */    FmgrInfo   *tab_eq_funcs;    /* equality functions for table datatype(s) */    FmgrInfo   *lhs_hash_funcs; /* hash functions for lefthand datatype(s) */    FmgrInfo   *cur_eq_funcs;    /* equality functions for LHS vs. table */    ExprState  *cur_eq_comp;    /* equality comparator for LHS vs. table */} SubPlanState;

SubPlan
子查询计划

/* * SubPlan - executable expression node for a subplan (sub-SELECT) * * The planner replaces SubLink nodes in expression trees with SubPlan * nodes after it has finished planning the subquery.  SubPlan references * a sub-plantree stored in the subplans list of the toplevel PlannedStmt. * (We avoid a direct link to make it easier to copy expression trees * without causing multiple processing of the subplan.) * 查询规划器在完成子查询的规划后使用SubPlan节点替换表达式树中的SubLink节点。 * SubPlan引用了存储在高层PlannedStmt中的subplans链表中的sub-plantree。 * (避免使用直接链接,从而使得拷贝表达式树相对比较简单) * * In an ordinary subplan, testexpr points to an executable expression * (OpExpr, an AND/OR tree of OpExprs, or RowCompareExpr) for the combining * operator(s); the left-hand arguments are the original lefthand expressions, * and the right-hand arguments are PARAM_EXEC Param nodes representing the * outputs of the sub-select.  (NOTE: runtime coercion functions may be * inserted as well.)  This is just the same expression tree as testexpr in * the original SubLink node, but the PARAM_SUBLINK nodes are replaced by * suitably numbered PARAM_EXEC nodes. * 常规情况下,testexpr指向用于组合操作的可执行表达式(OpExpr、OpExprs的AND/OR树或者RowCompareExpr); * 左参数是原始的左表达式,右参数是PARAM_EXEC参数节点用以表示子查询的输出。 * 与原始SubLink节点的testexpr具有相同的表达式树,但PARAM_SUBLINK节点则使用合适的已编号PARAM_EXEC节点替代。 * * If the sub-select becomes an initplan rather than a subplan, the executable * expression is part of the outer plan's expression tree (and the SubPlan * node itself is not, but rather is found in the outer plan's initPlan * list).  In this case testexpr is NULL to avoid duplication. * 如果子查询成了initplan而不是subplan,可执行的表达式是外层plan表达式树的一部分。 * 这种情况下,testexpr为NULL以避免重复。 * * The planner also derives lists of the values that need to be passed into * and out of the subplan.  Input values are represented as a list "args" of * expressions to be evaluated in the outer-query context (currently these * args are always just Vars, but in principle they could be any expression). * The values are assigned to the global PARAM_EXEC params indexed by parParam * (the parParam and args lists must have the same ordering).  setParam is a * list of the PARAM_EXEC params that are computed by the sub-select, if it * is an initplan; they are listed in order by sub-select output column * position.  (parParam and setParam are integer Lists, not Bitmapsets, * because their ordering is significant.) * 规划器还派生了需要传入和传出子计划的值的链表。 * 输入值标识位表达式的"args"链表,在外层查询上下文中进行解析。 * (这些args通常是Vars,但原则上它们可以是任意表达式) * 这些值以parParam为索引给全局PARAM_EXEC参数赋值。 * setParam是PARAM_EXEC参数链表,通过子查询(如为initplan)计算所得。 * 它们按子查询输出列的位置进行排序组织为链表形式。 * (parParam和setParam是整型链表,而不是Bitmapsets链表) * * Also, the planner computes startup and per-call costs for use of the * SubPlan.  Note that these include the cost of the subquery proper, * evaluation of the testexpr if any, and any hashtable management overhead. * 同时,规划器计算SubPlan启动和每次调用的成本。注意:包括子查询正常解析testexpr的成本以及哈希表管理成本。 */typedef struct SubPlan{    Expr        xpr;//表达式    /* Fields copied from original SubLink: */    //从SubLink中拷贝而来    SubLinkType subLinkType;    /* see above */    /* The combining operators, transformed to an executable expression: */    //组合操作符,转换为可执行的表达式    Node       *testexpr;        /* OpExpr or RowCompareExpr expression tree */    List       *paramIds;        /* 参数IDs;IDs of Params embedded in the above */    /* Identification of the Plan tree to use: */    //Plan tree标识    int            plan_id;        /* Index (from 1) in PlannedStmt.subplans */    /* Identification of the SubPlan for EXPLAIN and debugging purposes: */    //EXPLAIN和debug目的的SubPlan标识    char       *plan_name;        /* A name assigned during planning */    /* Extra data useful for determining subplan's output type: */    //用于确定subplan输出类型的额外信息    Oid            firstColType;    /* subplan结果的第一个列类型;Type of first column of subplan result */    int32        firstColTypmod; /* 第一列的Typmod;Typmod of first column of subplan result */    Oid            firstColCollation;    /* 第一列的Collation;Collation of first column of subplan                                     * result */    /* Information about execution strategy: */    //执行阶段的相关信息    bool        useHashTable;    /* 是否使用哈希表存储子查询输出;true to store subselect output in a hash                                 * table (implies we are doing "IN") */    bool        unknownEqFalse; /* 如OK为T,如为未知则为F;快速处理null值;true if it's okay to return FALSE when the                                 * spec result is UNKNOWN; this allows much                                 * simpler handling of null values */    bool        parallel_safe;    /* 是否并行安全?is the subplan parallel-safe? */    /* Note: parallel_safe does not consider contents of testexpr or args */    /* Information for passing params into and out of the subselect: */    //用于给子查询传入和传出参数的信息    /* setParam and parParam are lists of integers (param IDs) */    //setParam和parParam是整型链表(param IDs)    List       *setParam;        /* initplan subqueries have to set these                                 * Params for parent plan */    List       *parParam;        /* indices of input Params from parent plan */    List       *args;            /* 以parParam值进行传递的表达式;exprs to pass as parParam values */    /* Estimated execution costs: */    //估算执行成本    Cost        startup_cost;    /* one-time setup cost */    Cost        per_call_cost;    /* cost for each subplan evaluation */} SubPlan;

SubLinkType
SubLink类型

/* * SubLink * * A SubLink represents a subselect appearing in an expression, and in some * cases also the combining operator(s) just above it.  The subLinkType * indicates the form of the expression represented: *    EXISTS_SUBLINK        EXISTS(SELECT ...) *    ALL_SUBLINK            (lefthand) op ALL (SELECT ...) *    ANY_SUBLINK            (lefthand) op ANY (SELECT ...) *    ROWCOMPARE_SUBLINK    (lefthand) op (SELECT ...) *    EXPR_SUBLINK        (SELECT with single targetlist item ...) *    MULTIEXPR_SUBLINK    (SELECT with multiple targetlist items ...) *    ARRAY_SUBLINK        ARRAY(SELECT with single targetlist item ...) *    CTE_SUBLINK            WITH query (never actually part of an expression) * For ALL, ANY, and ROWCOMPARE, the lefthand is a list of expressions of the * same length as the subselect's targetlist.  ROWCOMPARE will *always* have * a list with more than one entry; if the subselect has just one target * then the parser will create an EXPR_SUBLINK instead (and any operator * above the subselect will be represented separately). * ROWCOMPARE, EXPR, and MULTIEXPR require the subselect to deliver at most * one row (if it returns no rows, the result is NULL). * ALL, ANY, and ROWCOMPARE require the combining operators to deliver boolean * results.  ALL and ANY combine the per-row results using AND and OR * semantics respectively. * ARRAY requires just one target column, and creates an array of the target * column's type using any number of rows resulting from the subselect. * * SubLink is classed as an Expr node, but it is not actually executable; * it must be replaced in the expression tree by a SubPlan node during * planning. * * NOTE: in the raw output of gram.y, testexpr contains just the raw form * of the lefthand _expression (if any), and operName is the String name of * the combining operator.  Also, subselect is a raw parsetree.  During parse * analysis, the parser transforms testexpr into a complete boolean expression * that compares the lefthand value(s) to PARAM_SUBLINK nodes representing the * output columns of the subselect.  And subselect is transformed to a Query. * This is the representation seen in saved rules and in the rewriter. * * In EXISTS, EXPR, MULTIEXPR, and ARRAY SubLinks, testexpr and operName * are unused and are always null. * * subLinkId is currently used only for MULTIEXPR SubLinks, and is zero in * other SubLinks.  This number identifies different multiple-assignment * subqueries within an UPDATE statement's SET list.  It is unique only * within a particular targetlist.  The output column(s) of the MULTIEXPR * are referenced by PARAM_MULTIEXPR Params appearing elsewhere in the tlist. * * The CTE_SUBLINK case never occurs in actual SubLink nodes, but it is used * in SubPlans generated for WITH subqueries. */typedef enum SubLinkType{    EXISTS_SUBLINK,    ALL_SUBLINK,    ANY_SUBLINK,    ROWCOMPARE_SUBLINK,    EXPR_SUBLINK,    MULTIEXPR_SUBLINK,    ARRAY_SUBLINK,    CTE_SUBLINK                    /* for SubPlans only */} SubLinkType;

SubLink
SubLink结构体

typedef struct SubLink{    Expr        xpr;    SubLinkType subLinkType;    /* see above */    int            subLinkId;        /* ID (1..n); 0 if not MULTIEXPR */    Node       *testexpr;        /* outer-query test for ALL/ANY/ROWCOMPARE */    List       *operName;        /* originally specified operator name */    Node       *subselect;        /* subselect as Query* or raw parsetree */    int            location;        /* token location, or -1 if unknown */} SubLink;

二、源码解读

ExecScanSubPlan

/* * ExecScanSubPlan: default case where we have to rescan subplan each time * 默认情况下每次都不得不重新扫描subplan */static DatumExecScanSubPlan(SubPlanState *node,                ExprContext *econtext,                bool *isNull){    SubPlan    *subplan = node->subplan;//子计划    PlanState  *planstate = node->planstate;//计划运行期状态    SubLinkType subLinkType = subplan->subLinkType;//子链接类型    MemoryContext oldcontext;//原内存上下文    TupleTableSlot *slot;//元组slot    Datum        result;//结果指针    bool        found = false;    /* 如找到至少一个元组,则返回T;true if got at least one subplan tuple */    ListCell   *pvar;//临时变量    ListCell   *l;//临时变量    ArrayBuildStateAny *astate = NULL;//    /*     * MULTIEXPR subplans, when "executed", just return NULL; but first we     * mark the subplan's output parameters as needing recalculation.  (This     * is a bit of a hack: it relies on the subplan appearing later in its     * targetlist than any of the referencing Params, so that all the Params     * have been evaluated before we re-mark them for the next evaluation     * cycle.  But in general resjunk tlist items appear after non-resjunk     * ones, so this should be safe.)  Unlike ExecReScanSetParamPlan, we do     * *not* set bits in the parent plan node's chgParam, because we don't     * want to cause a rescan of the parent.     *     * MULTIEXPR处理逻辑     */    if (subLinkType == MULTIEXPR_SUBLINK)    {        EState       *estate = node->parent->state;        foreach(l, subplan->setParam)        {            int            paramid = lfirst_int(l);            ParamExecData *prm = &(estate->es_param_exec_vals[paramid]);            prm->execPlan = node;        }        *isNull = true;        return (Datum) 0;    }    /* Initialize ArrayBuildStateAny in caller's context, if needed */    //数组    if (subLinkType == ARRAY_SUBLINK)        astate = initArrayResultAny(subplan->firstColType,                                    CurrentMemoryContext, true);    /*     * We are probably in a short-lived expression-evaluation context. Switch     * to the per-query context for manipulating the child plan's chgParam,     * calling ExecProcNode on it, etc.     */    //切换上下文    oldcontext = MemoryContextSwitchTo(econtext->ecxt_per_query_memory);    /*     * Set Params of this plan from parent plan correlation values. (Any     * calculation we have to do is done in the parent econtext, since the     * Param values don't need to have per-query lifetime.)     */    //通过父计划相关值中设置子计划参数    Assert(list_length(subplan->parParam) == list_length(node->args));    forboth(l, subplan->parParam, pvar, node->args)    {        int            paramid = lfirst_int(l);        ParamExecData *prm = &(econtext->ecxt_param_exec_vals[paramid]);        prm->value = ExecEvalExprSwitchContext((ExprState *) lfirst(pvar),                                               econtext,                                               &(prm->isnull));        planstate->chgParam = bms_add_member(planstate->chgParam, paramid);    }    /*     * Now that we've set up its parameters, we can reset the subplan.     */    //执行ReScan    //Reset a plan node so that its output can be re-scanned.    ExecReScan(planstate);    /*     * For all sublink types except EXPR_SUBLINK and ARRAY_SUBLINK, the result     * is boolean as are the results of the combining operators. We combine     * results across tuples (if the subplan produces more than one) using OR     * semantics for ANY_SUBLINK or AND semantics for ALL_SUBLINK.     * (ROWCOMPARE_SUBLINK doesn't allow multiple tuples from the subplan.)     * NULL results from the combining operators are handled according to the     * usual SQL semantics for OR and AND.  The result for no input tuples is     * FALSE for ANY_SUBLINK, TRUE for ALL_SUBLINK, NULL for     * ROWCOMPARE_SUBLINK.     * 除EXPR_SUBLINK和ARRAY_SUBLINK外的所有sublink,结果是布尔值(组合运算符的结果).     * PG通过跨元组(如子计划产生多个元组)合并结果,对于ANY_SUBLINK使用OR语义,ALL_SUBLINK则使用AND语义.     * (ROWCOMPARE_SUBLINK不允许子计划返回多个元组)     * 从组合操作符中返回的NULL遵循SQL中的OR和AND语义.     * 如没有输入元组,ANY_SUBLINK为FALSE,ALL_SUBLINK为TRUE,ROWCOMPARE_SUBLINK为NULL.     *     * For EXPR_SUBLINK we require the subplan to produce no more than one     * tuple, else an error is raised.  If zero tuples are produced, we return     * NULL.  Assuming we get a tuple, we just use its first column (there can     * be only one non-junk column in this case).     * 对于EXPR_SUBLINK,需要subplan产生不超过一个元组,否则报错.如果没有元组产生,返回NULL.     * 假定获取到一个元组,则使用第一个列(这种情况下只有一个non-junk列).     *     * For ARRAY_SUBLINK we allow the subplan to produce any number of tuples,     * and form an array of the first column's values.  Note in particular     * that we produce a zero-element array if no tuples are produced (this is     * a change from pre-8.3 behavior of returning NULL).     * 对于ARRAY_SUBLINK,允许subplan产生任意数目的元组,使用第一个列值组成数组.     * 特别注意的是如没有元组产生则产生0个元素的数组(8.3以前是返回NULL).     */    result = BoolGetDatum(subLinkType == ALL_SUBLINK);//ALL为T,否则为F    *isNull = false;    for (slot = ExecProcNode(planstate);         !TupIsNull(slot);         slot = ExecProcNode(planstate))//循环获取元组,直至没有元组为NULL(即已完成)    {        //元组描述符        TupleDesc    tdesc = slot->tts_tupleDescriptor;        Datum        rowresult;//结果        bool        rownull;//是否为空?        int            col;//列计数器        ListCell   *plst;//临时变量        if (subLinkType == EXISTS_SUBLINK)//EXISTS        {            found = true;            result = BoolGetDatum(true);            break;        }        if (subLinkType == EXPR_SUBLINK)//EXPR表达式        {            /* cannot allow multiple input tuples for EXPR sublink */            if (found)                ereport(ERROR,                        (errcode(ERRCODE_CARDINALITY_VIOLATION),                         errmsg("more than one row returned by a subquery used as an expression")));            found = true;            /*             * We need to copy the subplan's tuple in case the result is of             * pass-by-ref type --- our return value will point into this             * copied tuple!  Can't use the subplan's instance of the tuple             * since it won't still be valid after next ExecProcNode() call.             * node->curTuple keeps track of the copied tuple for eventual             * freeing.             */            if (node->curTuple)                heap_freetuple(node->curTuple);            node->curTuple = ExecCopySlotHeapTuple(slot);            result = heap_getattr(node->curTuple, 1, tdesc, isNull);            /* keep scanning subplan to make sure there's only one tuple */            continue;        }        if (subLinkType == ARRAY_SUBLINK)//数组        {            Datum        dvalue;            bool        disnull;            found = true;            /* stash away current value */            Assert(subplan->firstColType == TupleDescAttr(tdesc, 0)->atttypid);            dvalue = slot_getattr(slot, 1, &disnull);            astate = accumArrayResultAny(astate, dvalue, disnull,                                         subplan->firstColType, oldcontext);            /* keep scanning subplan to collect all values */            continue;        }        /* cannot allow multiple input tuples for ROWCOMPARE sublink either */        if (subLinkType == ROWCOMPARE_SUBLINK && found)//行比较            ereport(ERROR,                    (errcode(ERRCODE_CARDINALITY_VIOLATION),                     errmsg("more than one row returned by a subquery used as an expression")));        found = true;//初始为T        /*         * For ALL, ANY, and ROWCOMPARE sublinks, load up the Params         * representing the columns of the sub-select, and then evaluate the         * combining expression.         * 对于ALL,ANY和ROWCOMPARE子链接,加载表示子查询列的Params,并解析组合表达式         */        col = 1;//列从1计数        foreach(plst, subplan->paramIds)//循环遍历子查询参数        {            int            paramid = lfirst_int(plst);            ParamExecData *prmdata;            prmdata = &(econtext->ecxt_param_exec_vals[paramid]);            Assert(prmdata->execPlan == NULL);            //获取参数值            prmdata->value = slot_getattr(slot, col, &(prmdata->isnull));            //下一个列            col++;        }        //解析表达式        rowresult = ExecEvalExprSwitchContext(node->testexpr, econtext,                                              &rownull);        if (subLinkType == ANY_SUBLINK)        {            //ANY : 使用OR语义组合            /* combine across rows per OR semantics */            if (rownull)                *isNull = true;            else if (DatumGetBool(rowresult))            {                result = BoolGetDatum(true);                *isNull = false;                break;            /* needn't look at any more rows */            }        }        else if (subLinkType == ALL_SUBLINK)        {            //ALL : 使用AND语义            /* combine across rows per AND semantics */            if (rownull)                *isNull = true;            else if (!DatumGetBool(rowresult))            {                result = BoolGetDatum(false);                *isNull = false;                break;            /* needn't look at any more rows */            }        }        else        {            /* must be ROWCOMPARE_SUBLINK */            //这里一定是ROWCOMPARE            result = rowresult;            *isNull = rownull;        }    }    MemoryContextSwitchTo(oldcontext);    if (subLinkType == ARRAY_SUBLINK)    {        /* We return the result in the caller's context */        //在调用者上下文中返回结果        result = makeArrayResultAny(astate, oldcontext, true);    }    else if (!found)    {        /*         * deal with empty subplan result.  result/isNull were previously         * initialized correctly for all sublink types except EXPR and         * ROWCOMPARE; for those, return NULL.         * subplan没有结果返回,设置result&isNull值         */        if (subLinkType == EXPR_SUBLINK ||            subLinkType == ROWCOMPARE_SUBLINK)        {            result = (Datum) 0;            *isNull = true;        }    }    //返回结果    return result;}

ExecProcNode
执行输入参数node的ExecProcNode方法。

/* ---------------------------------------------------------------- *        ExecProcNode * *        Execute the given node to return a(nother) tuple. * ---------------------------------------------------------------- */#ifndef FRONTENDstatic inline TupleTableSlot *ExecProcNode(PlanState *node){    if (node->chgParam != NULL) /* something changed? */        ExecReScan(node);        /* let ReScan handle this */    return node->ExecProcNode(node);}#endif

三、跟踪分析

执行SQL:

[pg12@localhost ~]$ psql -d testdbTiming is on.Expanded display is used automatically.psql (12.0)Type "help" for help.[local]:5432 pg12@testdb=# [local]:5432 pg12@testdb=# select * from tbl; id | value ----+-------  1 |     2(1 row)Time: 2.678 ms[local]:5432 pg12@testdb=# select count(*) from t_big_null;  count   ---------- 10000001(1 row)Time: 679.972 ms[local]:5432 pg12@testdb=# analyze tbl;ANALYZETime: 64.442 ms[local]:5432 pg12@testdb=# analyze t_big_null;ANALYZETime: 434.702 ms[local]:5432 pg12@testdb=# [local]:5432 pg12@testdb=# select pg_backend_pid(); pg_backend_pid ----------------          18758(1 row)Time: 1.990 ms[local]:5432 pg12@testdb=# select * from tbl a where a.id not in (select b.id from t_big_null b);

启动gdb跟踪

(gdb) b ExecScanSubPlanBreakpoint 1 at 0x73014b: file nodeSubplan.c, line 228.(gdb) cContinuing.Breakpoint 1, ExecScanSubPlan (node=0x1c89158, econtext=0x1c88990, isNull=0x1c88cad)    at nodeSubplan.c:228228        SubPlan    *subplan = node->subplan;(gdb) (gdb) n229        PlanState  *planstate = node->planstate;(gdb) 230        SubLinkType subLinkType = subplan->subLinkType;(gdb) 234        bool        found = false;    /* true if got at least one subplan tuple */

subplan变量值(结构体SubPlan)
其中testexpr是测试表达式,操作符是整型的等值比较,
左操作符是Var,数据表的第一个列,类型为23-int4(select * from pg_type where oid=23;)
有操作符是Param,执行期设置,对应参数索引编号为0,类型为23-int4
paramIds是链表,第一项ID值为0(int_value = 0)
子查询结果的第一个列类型为23-int4

(gdb) p *subplan$1 = {xpr = {type = T_SubPlan}, subLinkType = ANY_SUBLINK, testexpr = 0x1cb8790,   paramIds = 0x1cb8758, plan_id = 1, plan_name = 0x1cb8a80 "SubPlan 1",   firstColType = 23, firstColTypmod = -1, firstColCollation = 0, useHashTable = false,   unknownEqFalse = false, parallel_safe = true, setParam = 0x0, parParam = 0x0,   args = 0x0, startup_cost = 0, per_call_cost = 129155.31875000001}(gdb) p *subplan->testexpr$2 = {type = T_OpExpr}(gdb) p *(OpExpr *)subplan->testexpr$3 = {xpr = {type = T_OpExpr}, opno = 96, opfuncid = 65, opresulttype = 16,   opretset = false, opcollid = 0, inputcollid = 0, args = 0x1cb8868, location = 31}(gdb) set $expr=(OpExpr *)subplan->testexpr(gdb) p *$expr->args$4 = {type = T_List, length = 2, head = 0x1cb8840, tail = 0x1cb88d8}(gdb) p *$expr->args->head$5 = {data = {ptr_value = 0x1cb87e8, int_value = 30115816, oid_value = 30115816},   next = 0x1cb88d8}(gdb) p *(Node *)$expr->args->head->data.ptr_value$6 = {type = T_Var}(gdb) p *(Var *)$expr->args->head->data.ptr_value$7 = {xpr = {type = T_Var}, varno = 1, varattno = 1, vartype = 23, vartypmod = -1,   varcollid = 0, varlevelsup = 0, varnoold = 1, varoattno = 1, location = 26}(gdb) p *(Var *)$expr->args->head->next->data.ptr_value$8 = {xpr = {type = T_Param}, varno = 1, varattno = 0, vartype = 23, vartypmod = -1,   varcollid = 0, varlevelsup = 4294967295, varnoold = 2139062142, varoattno = 16,   location = 0}(gdb) p *(Param *)$expr->args->head->next->data.ptr_value$9 = {xpr = {type = T_Param}, paramkind = PARAM_EXEC, paramid = 0, paramtype = 23,   paramtypmod = -1, paramcollid = 0, location = -1}(gdb) ###[local]:5432 pg12@testdb=# select * from pg_proc where oid = 65; --> opfuncid = 65,函数-[ RECORD 1 ]---+-------oid             | 65proname         | int4eqpronamespace    | 11proowner        | 10prolang         | 12procost         | 1prorows         | 0provariadic     | 0prosupport      | -prokind         | fprosecdef       | fproleakproof    | tproisstrict     | tproretset       | fprovolatile     | iproparallel     | spronargs        | 2pronargdefaults | 0prorettype      | 16proargtypes     | 23 23proallargtypes  | proargmodes     | proargnames     | proargdefaults  | protrftypes     | prosrc          | int4eqprobin          | proconfig       | proacl          | Time: 6.253 ms###(gdb) p *subplan->paramIds$11 = {type = T_IntList, length = 1, head = 0x1cb8730, tail = 0x1cb8730}(gdb) p *subplan->paramIds->head$12 = {data = {ptr_value = 0x7f7f7f7f00000000, int_value = 0, oid_value = 0}, next = 0x0}(gdb)

planstate变量值,实际的执行节点是ExecMaterial

(gdb) p *planstate$15 = {type = T_MaterialState, plan = 0x1cb8900, state = 0x1c87da8,   ExecProcNode = 0x6f802a ,   ExecProcNodeReal = 0x720ecf , instrument = 0x0, worker_instrument = 0x0,   worker_jit_instrument = 0x0, qual = 0x0, lefttree = 0x1c88160, righttree = 0x0,   initPlan = 0x0, subPlan = 0x0, chgParam = 0x0, ps_ResultTupleDesc = 0x1c88580,   ps_ResultTupleSlot = 0x1c88698, ps_ExprContext = 0x0, ps_ProjInfo = 0x0,   scandesc = 0x1c88468, scanops = 0xc3e720 , outerops = 0x0,   innerops = 0x0, resultops = 0xc3e720 , scanopsfixed = true,   outeropsfixed = false, inneropsfixed = false, resultopsfixed = true, scanopsset = true,   outeropsset = false, inneropsset = false, resultopsset = true}(gdb)

subLinkType变量,类型为ANY_SUBLINK,把NOT IN转为ANY_SUBLINK

(gdb) p subLinkType$16 = ANY_SUBLINK

设置变量,执行相关检查等

(gdb) n237        ArrayBuildStateAny *astate = NULL;(gdb) 250        if (subLinkType == MULTIEXPR_SUBLINK)(gdb) 266        if (subLinkType == ARRAY_SUBLINK)(gdb) 275        oldcontext = MemoryContextSwitchTo(econtext->ecxt_per_query_memory);(gdb) 282        Assert(list_length(subplan->parParam) == list_length(node->args));(gdb) 284        forboth(l, subplan->parParam, pvar, node->args)(gdb) p subplan->parParam$17 = (List *) 0x0

执行ExecReScan,planstate->ExecProcNode实际为ExecMaterial

(gdb) n298        ExecReScan(planstate);(gdb) p *planstate$20 = {type = T_MaterialState, plan = 0x1cb8900, state = 0x1c87da8,   ExecProcNode = 0x720ecf , ExecProcNodeReal = 0x720ecf ,   instrument = 0x0, worker_instrument = 0x0, worker_jit_instrument = 0x0, qual = 0x0,   lefttree = 0x1c88160, righttree = 0x0, initPlan = 0x0, subPlan = 0x0, chgParam = 0x0,   ps_ResultTupleDesc = 0x1c88580, ps_ResultTupleSlot = 0x1c88698, ps_ExprContext = 0x0,   ps_ProjInfo = 0x0, scandesc = 0x1c88468, scanops = 0xc3e720 ,   outerops = 0x0, innerops = 0x0, resultops = 0xc3e720 ,   scanopsfixed = true, outeropsfixed = false, inneropsfixed = false,   resultopsfixed = true, scanopsset = true, outeropsset = false, inneropsset = false,   resultopsset = true}(gdb)

初始化result&isNull变量

(gdb) n321        result = BoolGetDatum(subLinkType == ALL_SUBLINK);(gdb) 322        *isNull = false;(gdb) p result$18 = 0

执行循环,slot从planstate->ExecProcNode(即ExecMaterial)中获取

(gdb) n324        for (slot = ExecProcNode(planstate);(gdb) 325             !TupIsNull(slot);(gdb) (gdb) p *slot$22 = {type = T_TupleTableSlot, tts_flags = 20, tts_nvalid = 0,   tts_ops = 0xc3e720 , tts_tupleDescriptor = 0x1c88580,   tts_values = 0x1c88708, tts_isnull = 0x1c88710, tts_mcxt = 0x1c87c90, tts_tid = {    ip_blkid = {bi_hi = 65535, bi_lo = 65535}, ip_posid = 0}, tts_tableOid = 0}(gdb) p slot->tts_values[0]$23 = 0(gdb)

获取slot的描述符

(gdb) 328            TupleDesc    tdesc = slot->tts_tupleDescriptor;(gdb) p tdesc$21 = (TupleDesc) 0x206e6f6900000000

判断subLink类型执行相关逻辑,循环paramIds,填充参数值

(gdb) n334            if (subLinkType == EXISTS_SUBLINK)(gdb) 341            if (subLinkType == EXPR_SUBLINK)(gdb) 367            if (subLinkType == ARRAY_SUBLINK)(gdb) 383            if (subLinkType == ROWCOMPARE_SUBLINK && found)(gdb) 388            found = true;(gdb) 395            col = 1;(gdb) 396            foreach(plst, subplan->paramIds)(gdb) p *slot$22 = {type = T_TupleTableSlot, tts_flags = 20, tts_nvalid = 0,   tts_ops = 0xc3e720 , tts_tupleDescriptor = 0x1c88580,   tts_values = 0x1c88708, tts_isnull = 0x1c88710, tts_mcxt = 0x1c87c90, tts_tid = {    ip_blkid = {bi_hi = 65535, bi_lo = 65535}, ip_posid = 0}, tts_tableOid = 0}(gdb) p *slot->tts_values[0]Cannot access memory at address 0x0(gdb) p slot->tts_values[0]$23 = 0(gdb) n398                int            paramid = lfirst_int(plst);(gdb) 401                prmdata = &(econtext->ecxt_param_exec_vals[paramid]);(gdb) p paramid$24 = 0(gdb) n402                Assert(prmdata->execPlan == NULL);(gdb) p *prmdata$25 = {execPlan = 0x0, value = 0, isnull = false}(gdb) n403                prmdata->value = slot_getattr(slot, col, &(prmdata->isnull));(gdb) n404                col++;(gdb) p *prmdata$26 = {execPlan = 0x0, value = 2, isnull = false}(gdb) n396            foreach(plst, subplan->paramIds)(gdb)

解析表达式,亦即解析表达式tbl.id = 2,获取结果

(gdb) 407            rowresult = ExecEvalExprSwitchContext(node->testexpr, econtext,(gdb) p *(Var *)$expr->args->head->data.ptr_value$28 = {xpr = {type = T_Var}, varno = 1, varattno = 1, vartype = 23, vartypmod = -1,   varcollid = 0, varlevelsup = 0, varnoold = 1, varoattno = 1, location = 26}(gdb)  p *(Param *)$expr->args->head->next->data.ptr_value$29 = {xpr = {type = T_Param}, paramkind = PARAM_EXEC, paramid = 0, paramtype = 23,   paramtypmod = -1, paramcollid = 0, location = -1}(gdb)  p econtext->ecxt_param_exec_vals[0]$30 = {execPlan = 0x0, value = 2, isnull = false}

由于tbl.id = 1,因此表达式解析结果为F

(gdb) n410            if (subLinkType == ANY_SUBLINK)(gdb) p rowresult$31 = 0(gdb)

判断subLink类型,如为ANY则进入相应逻辑

(gdb) n410            if (subLinkType == ANY_SUBLINK)(gdb) p rowresult$31 = 0(gdb) n413                if (rownull)(gdb) p rownull$32 = false(gdb) n415                else if (DatumGetBool(rowresult))(gdb)

结果不为T,因此再次循环执行,获取slot,这次的值应为3,因为3 <> 1,因此再次循环直至t_big_null中出现值1或完成扫描

326             slot = ExecProcNode(planstate))(gdb) ...(gdb) 403                prmdata->value = slot_getattr(slot, col, &(prmdata->isnull));(gdb) 404                col++;(gdb) p prmdata->value$36 = 3(gdb) p slot->tts_values[0]$37 = 3(gdb) ...

完成执行

(gdb) cContinuing.###[local]:5432 pg12@testdb=# select * from tbl a where a.id not in (select b.id from t_big_null b); id | value ----+-------(0 rows)Time: 3089703.031 ms (51:29.703)[local]:5432 pg12@testdb=# [local]:5432 pg12@testdb=# ###

调用栈

Breakpoint 1, ExecScanSubPlan (node=0x1c89158, econtext=0x1c88990, isNull=0x1c88cad)    at nodeSubplan.c:228228        SubPlan    *subplan = node->subplan;(gdb) bt#0  ExecScanSubPlan (node=0x1c89158, econtext=0x1c88990, isNull=0x1c88cad)    at nodeSubplan.c:228#1  0x000000000072fe5e in ExecSubPlan (node=0x1c89158, econtext=0x1c88990,     isNull=0x1c88cad) at nodeSubplan.c:90#2  0x00000000006e90a8 in ExecEvalSubPlan (state=0x1c88ca8, op=0x1c88d80,     econtext=0x1c88990) at execExprInterp.c:3783#3  0x00000000006e48b2 in ExecInterpExpr (state=0x1c88ca8, econtext=0x1c88990,     isnull=0x7fff7b9514f7) at execExprInterp.c:1484#4  0x00000000006e50b8 in ExecInterpExprStillValid (state=0x1c88ca8, econtext=0x1c88990,     isNull=0x7fff7b9514f7) at execExprInterp.c:1769#5  0x00000000006f9ec0 in ExecEvalExprSwitchContext (state=0x1c88ca8, econtext=0x1c88990,     isNull=0x7fff7b9514f7) at ../../../src/include/executor/executor.h:307#6  0x00000000006f9fb8 in ExecQual (state=0x1c88ca8, econtext=0x1c88990)    at ../../../src/include/executor/executor.h:376#7  0x00000000006fa37d in ExecScan (node=0x1c88878, accessMtd=0x72b84c ,     recheckMtd=0x72b8f1 ) at execScan.c:228#8  0x000000000072b93b in ExecSeqScan (pstate=0x1c88878) at nodeSeqscan.c:112#9  0x00000000006f8077 in ExecProcNodeFirst (node=0x1c88878) at execProcnode.c:445#10 0x00000000006ed3cd in ExecProcNode (node=0x1c88878)    at ../../../src/include/executor/executor.h:239#11 0x00000000006efbb3 in ExecutePlan (estate=0x1c87da8, planstate=0x1c88878,     use_parallel_mode=false, operation=CMD_SELECT, sendTuples=true, numberTuples=0,     direction=ForwardScanDirection, dest=0x1cbc7c0, execute_once=true) at execMain.c:1646#12 0x00000000006ed9df in standard_ExecutorRun (queryDesc=0x1be78b8,     direction=ForwardScanDirection, count=0, execute_once=true) at execMain.c:364#13 0x00000000006ed815 in ExecutorRun (queryDesc=0x1be78b8,     direction=ForwardScanDirection, count=0, execute_once=true) at execMain.c:308#14 0x00000000008f1010 in PortalRunSelect (portal=0x1c27d78, forward=true, count=0,     dest=0x1cbc7c0) at pquery.c:929#15 0x00000000008f0cae in PortalRun (portal=0x1c27d78, count=9223372036854775807,     isTopLevel=true, run_once=true, dest=0x1cbc7c0, altdest=0x1cbc7c0,     completionTag=0x7fff7b951890 "") at pquery.c:770#16 0x00000000008ead35 in exec_simple_query (    query_string=0x1bc1d88 "select * from tbl a where a.id not in (select b.id from t_big_null b);") at postgres.c:1215#17 0x00000000008eefa5 in PostgresMain (argc=1, argv=0x1bedf18,     dbname=0x1bedd60 "testdb", username=0x1bbeaa8 "pg12") at postgres.c:4236#18 0x0000000000845915 in BackendRun (port=0x1be3d30) at postmaster.c:4431#19 0x00000000008450f3 in BackendStartup (port=0x1be3d30) at postmaster.c:4122#20 0x000000000084132f in ServerLoop () at postmaster.c:1704#21 0x0000000000840be5 in PostmasterMain (argc=1, argv=0x1bbca60) at postmaster.c:1377#22 0x0000000000761469 in main (argc=1, argv=0x1bbca60) at main.c:228(gdb)

到此,相信大家对"PostgreSQL源码中NOT IN的作用是什么"有了更深的了解,不妨来实际操作一番吧!这里是网站,更多相关内容可以进入相关频道进行查询,关注我们,继续学习!

0