PostgreSQL Source Code Walkthrough (20) - Query Statements #5 (The Query Tree in Detail)

Posted: 2025-02-03 Author: 千家信息网 editor

千家信息网, last updated 2025-02-03


Query tree

Note: the rtindex field of a RangeTblRef is the Index (counting from 1) of a RangeTblEntry in the query's rtable list.
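The lookup described in the note can be sketched in a few lines of C. This is a minimal illustration, not PostgreSQL code: MiniRTE, MiniRangeTblRef and mini_rt_fetch are hypothetical stand-ins, and the range table is modelled as a plain array instead of a List. PostgreSQL's own rt_fetch macro performs the same "index minus one" step on the real list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for RangeTblEntry. */
typedef struct MiniRTE
{
    const char *refname;        /* name the entry is referenced by */
} MiniRTE;

/* Hypothetical stand-in for RangeTblRef: it stores an index, not a pointer. */
typedef struct MiniRangeTblRef
{
    int         rtindex;        /* 1-based position in the range table */
} MiniRangeTblRef;

/*
 * Resolve an rtindex against a range table stored as an array.
 * Valid rtindex values start at 1, so the array slot is rtindex - 1.
 * Returns NULL when the index is out of range.
 */
const MiniRTE *
mini_rt_fetch(const MiniRTE *rtable, size_t rtable_len, int rtindex)
{
    assert(rtable != NULL);
    if (rtindex < 1 || (size_t) rtindex > rtable_len)
        return NULL;
    return &rtable[rtindex - 1];
}
```

With a two-entry table, rtindex 1 resolves to the first entry and rtindex 0 resolves to nothing, which is why code that treats rtindex as a zero-based offset reads the wrong entry.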

2. Data structures

RangeTblEntry

/*--------------------
 * RangeTblEntry -
 *    A range table is a List of RangeTblEntry nodes.
 *
 *    A range table entry may represent a plain relation, a sub-select in
 *    FROM, or the result of a JOIN clause.  (Only explicit JOIN syntax
 *    produces an RTE, not the implicit join resulting from multiple FROM
 *    items.  This is because we only need the RTE to deal with SQL features
 *    like outer joins and join-output-column aliasing.)  Other special
 *    RTE types also exist, as indicated by RTEKind.
 *
 *    Note that we consider RTE_RELATION to cover anything that has a pg_class
 *    entry.  relkind distinguishes the sub-cases.
 *
 *    alias is an Alias node representing the AS alias-clause attached to the
 *    FROM expression, or NULL if no clause.
 *
 *    eref is the table reference name and column reference names (either
 *    real or aliases).  Note that system columns (OID etc) are not included
 *    in the column list.
 *    eref->aliasname is required to be present, and should generally be used
 *    to identify the RTE for error messages etc.
 *
 *    In RELATION RTEs, the colnames in both alias and eref are indexed by
 *    physical attribute number; this means there must be colname entries for
 *    dropped columns.  When building an RTE we insert empty strings ("") for
 *    dropped columns.  Note however that a stored rule may have nonempty
 *    colnames for columns dropped since the rule was created (and for that
 *    matter the colnames might be out of date due to column renamings).
 *    The same comments apply to FUNCTION RTEs when a function's return type
 *    is a named composite type.
 *
 *    In JOIN RTEs, the colnames in both alias and eref are one-to-one with
 *    joinaliasvars entries.  A JOIN RTE will omit columns of its inputs when
 *    those columns are known to be dropped at parse time.  Again, however,
 *    a stored rule might contain entries for columns dropped since the rule
 *    was created.  (This is only possible for columns not actually referenced
 *    in the rule.)  When loading a stored rule, we replace the joinaliasvars
 *    items for any such columns with null pointers.  (We can't simply delete
 *    them from the joinaliasvars list, because that would affect the attnums
 *    of Vars referencing the rest of the list.)
 *
 *    inh is true for relation references that should be expanded to include
 *    inheritance children, if the rel has any.  This *must* be false for
 *    RTEs other than RTE_RELATION entries.
 *
 *    inFromCl marks those range variables that are listed in the FROM clause.
 *    It's false for RTEs that are added to a query behind the scenes, such
 *    as the NEW and OLD variables for a rule, or the subqueries of a UNION.
 *    This flag is not used anymore during parsing, since the parser now uses
 *    a separate "namespace" data structure to control visibility, but it is
 *    needed by ruleutils.c to determine whether RTEs should be shown in
 *    decompiled queries.
 *
 *    requiredPerms and checkAsUser specify run-time access permissions
 *    checks to be performed at query startup.  The user must have *all*
 *    of the permissions that are OR'd together in requiredPerms (zero
 *    indicates no permissions checking).  If checkAsUser is not zero,
 *    then do the permissions checks using the access rights of that user,
 *    not the current effective user ID.  (This allows rules to act as
 *    setuid gateways.)  Permissions checks only apply to RELATION RTEs.
 *
 *    For SELECT/INSERT/UPDATE permissions, if the user doesn't have
 *    table-wide permissions then it is sufficient to have the permissions
 *    on all columns identified in selectedCols (for SELECT) and/or
 *    insertedCols and/or updatedCols (INSERT with ON CONFLICT DO UPDATE may
 *    have all 3).  selectedCols, insertedCols and updatedCols are bitmapsets,
 *    which cannot have negative integer members, so we subtract
 *    FirstLowInvalidHeapAttributeNumber from column numbers before storing
 *    them in these fields.  A whole-row Var reference is represented by
 *    setting the bit for InvalidAttrNumber.
 *
 *    securityQuals is a list of security barrier quals (boolean expressions),
 *    to be tested in the listed order before returning a row from the
 *    relation.  It is always NIL in parser output.  Entries are added by the
 *    rewriter to implement security-barrier views and/or row-level security.
 *    Note that the planner turns each boolean expression into an implicitly
 *    AND'ed sublist, as is its usual habit with qualification expressions.
 *--------------------
 */
typedef enum RTEKind
{
    RTE_RELATION,               /* ordinary relation reference */
    RTE_SUBQUERY,               /* subquery in FROM */
    RTE_JOIN,                   /* join */
    RTE_FUNCTION,               /* function in FROM */
    RTE_TABLEFUNC,              /* TableFunc(.., column list) */
    RTE_VALUES,                 /* VALUES (<exprlist>), (<exprlist>), ... */
    RTE_CTE,                    /* common table expr (WITH list element) */
    RTE_NAMEDTUPLESTORE         /* tuplestore, e.g. for AFTER triggers */
} RTEKind;

typedef struct RangeTblEntry
{
    NodeTag     type;

    RTEKind     rtekind;        /* see above */

    /*
     * XXX the fields applicable to only some rte kinds should be merged into
     * a union.  I didn't do this yet because the diffs would impact a lot of
     * code that is being actively worked on.  FIXME someday.
     */

    /*
     * Fields valid for a plain relation RTE (else zero):
     *
     * As a special case, RTE_NAMEDTUPLESTORE can also set relid to indicate
     * that the tuple format of the tuplestore is the same as the referenced
     * relation.  This allows plans referencing AFTER trigger transition
     * tables to be invalidated if the underlying table is altered.
     */
    Oid         relid;          /* OID of the relation */
    char        relkind;        /* relation kind (see pg_class.relkind) */
    struct TableSampleClause *tablesample;  /* sampling info, or NULL */

    /*
     * Fields valid for a subquery RTE (else NULL):
     */
    Query      *subquery;       /* the sub-query */
    bool        security_barrier;   /* is from security_barrier view? */

    /*
     * Fields valid for a join RTE (else NULL/zero):
     *
     * joinaliasvars is a list of (usually) Vars corresponding to the columns
     * of the join result.  An alias Var referencing column K of the join
     * result can be replaced by the K'th element of joinaliasvars --- but to
     * simplify the task of reverse-listing aliases correctly, we do not do
     * that until planning time.  In detail: an element of joinaliasvars can
     * be a Var of one of the join's input relations, or such a Var with an
     * implicit coercion to the join's output column type, or a COALESCE
     * expression containing the two input column Vars (possibly coerced).
     * Within a Query loaded from a stored rule, it is also possible for
     * joinaliasvars items to be null pointers, which are placeholders for
     * (necessarily unreferenced) columns dropped since the rule was made.
     * Also, once planning begins, joinaliasvars items can be almost anything,
     * as a result of subquery-flattening substitutions.
     */
    JoinType    jointype;       /* type of join */
    List       *joinaliasvars;  /* list of alias-var expansions */

    /*
     * Fields valid for a function RTE (else NIL/zero):
     *
     * When funcordinality is true, the eref->colnames list includes an alias
     * for the ordinality column.  The ordinality column is otherwise
     * implicit, and must be accounted for "by hand" in places such as
     * expandRTE().
     */
    List       *functions;      /* list of RangeTblFunction nodes */
    bool        funcordinality; /* is this called WITH ORDINALITY? */

    /*
     * Fields valid for a TableFunc RTE (else NULL):
     */
    TableFunc  *tablefunc;

    /*
     * Fields valid for a values RTE (else NIL):
     */
    List       *values_lists;   /* list of expression lists */

    /*
     * Fields valid for a CTE RTE (else NULL/zero):
     */
    char       *ctename;        /* name of the WITH list item */
    Index       ctelevelsup;    /* number of query levels up */
    bool        self_reference; /* is this a recursive self-reference? */

    /*
     * Fields valid for table functions, values, CTE and ENR RTEs (else NIL):
     *
     * We need these for CTE RTEs so that the types of self-referential
     * columns are well-defined.  For VALUES RTEs, storing these explicitly
     * saves having to re-determine the info by scanning the values_lists.
     * For ENRs, we store the types explicitly here (we could get the
     * information from the catalogs if 'relid' was supplied, but we'd still
     * need these for TupleDesc-based ENRs, so we might as well always store
     * the type info here).
     *
     * For ENRs only, we have to consider the possibility of dropped columns.
     * A dropped column is included in these lists, but it will have zeroes in
     * all three lists (as well as an empty-string entry in eref).  Testing
     * for zero coltype is the standard way to detect a dropped column.
     */
    List       *coltypes;       /* OID list of column type OIDs */
    List       *coltypmods;     /* integer list of column typmods */
    List       *colcollations;  /* OID list of column collation OIDs */

    /*
     * Fields valid for ENR RTEs (else NULL/zero):
     */
    char       *enrname;        /* name of ephemeral named relation */
    double      enrtuples;      /* estimated or actual from caller */

    /*
     * Fields valid in all RTEs:
     */
    Alias      *alias;          /* user-written alias clause, if any */
    Alias      *eref;           /* expanded reference names */
    bool        lateral;        /* subquery, function, or values is LATERAL? */
    bool        inh;            /* inheritance requested? */
    bool        inFromCl;       /* present in FROM clause? */
    AclMode     requiredPerms;  /* bitmask of required access permissions */
    Oid         checkAsUser;    /* if valid, check access as this role */
    Bitmapset  *selectedCols;   /* columns needing SELECT permission */
    Bitmapset  *insertedCols;   /* columns needing INSERT permission */
    Bitmapset  *updatedCols;    /* columns needing UPDATE permission */
    List       *securityQuals;  /* security barrier quals to apply, if any */
} RangeTblEntry;
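The comment on selectedCols, insertedCols and updatedCols says that column numbers are shifted by FirstLowInvalidHeapAttributeNumber before they are stored, because a bitmapset cannot hold negative members while system columns have negative attribute numbers. The sketch below illustrates only that offset trick. It is a sketch under stated assumptions: DemoColSet is a hypothetical 64-bit stand-in for Bitmapset, the helper names are invented, and the value -8 for the offset constant is an assumed example (the real constant is defined in the PostgreSQL headers and depends on the version).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* ASSUMED example value; the real FirstLowInvalidHeapAttributeNumber is version dependent. */
#define DEMO_FIRST_LOW_INVALID_ATTNO (-8)
/* InvalidAttrNumber is 0; its bit stands for a whole-row reference. */
#define DEMO_INVALID_ATTNO 0

/* Hypothetical stand-in for Bitmapset: one bit per shifted column number. */
typedef uint64_t DemoColSet;

/* Record that a column (user column > 0, system column < 0) needs a permission. */
DemoColSet
demo_add_col(DemoColSet set, int attno)
{
    int         bit = attno - DEMO_FIRST_LOW_INVALID_ATTNO;

    assert(bit >= 0 && bit < 64);   /* shifted number must be non-negative */
    return set | (UINT64_C(1) << bit);
}

/* Test membership using the same shift. */
bool
demo_has_col(DemoColSet set, int attno)
{
    int         bit = attno - DEMO_FIRST_LOW_INVALID_ATTNO;

    if (bit < 0 || bit >= 64)
        return false;
    return ((set >> bit) & 1) != 0;
}
```

Reading such a set back requires adding the same constant again, so a member cannot be interpreted as an attribute number directly.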

FromExpr/JoinExpr

/*
 * RangeTblRef - reference to an entry in the query's rangetable
 *
 * We could use direct pointers to the RT entries and skip having these
 * nodes, but multiple pointers to the same node in a querytree cause
 * lots of headaches, so it seems better to store an index into the RT.
 */
typedef struct RangeTblRef
{
    NodeTag     type;
    int         rtindex;
} RangeTblRef;

/*----------
 * JoinExpr - for SQL JOIN expressions
 *
 * isNatural, usingClause, and quals are interdependent.  The user can write
 * only one of NATURAL, USING(), or ON() (this is enforced by the grammar).
 * If he writes NATURAL then parse analysis generates the equivalent USING()
 * list, and from that fills in "quals" with the right equality comparisons.
 * If he writes USING() then "quals" is filled with equality comparisons.
 * If he writes ON() then only "quals" is set.  Note that NATURAL/USING
 * are not equivalent to ON() since they also affect the output column list.
 *
 * alias is an Alias node representing the AS alias-clause attached to the
 * join expression, or NULL if no clause.  NB: presence or absence of the
 * alias has a critical impact on semantics, because a join with an alias
 * restricts visibility of the tables/columns inside it.
 *
 * During parse analysis, an RTE is created for the Join, and its index
 * is filled into rtindex.  This RTE is present mainly so that Vars can
 * be created that refer to the outputs of the join.  The planner sometimes
 * generates JoinExprs internally; these can have rtindex = 0 if there are
 * no join alias variables referencing such joins.
 *----------
 */
typedef struct JoinExpr
{
    NodeTag     type;
    JoinType    jointype;       /* type of join */
    bool        isNatural;      /* Natural join? Will need to shape table */
    Node       *larg;           /* left subtree */
    Node       *rarg;           /* right subtree */
    List       *usingClause;    /* USING clause, if any (list of String) */
    Node       *quals;          /* qualifiers on join, if any */
    Alias      *alias;          /* user-written alias clause, if any */
    int         rtindex;        /* RT index assigned for join, or 0 */
} JoinExpr;

/*----------
 * FromExpr - represents a FROM ... WHERE ... construct
 *
 * This is both more flexible than a JoinExpr (it can have any number of
 * children, including zero) and less so --- we don't need to deal with
 * aliases and so on.  The output column set is implicitly just the union
 * of the outputs of the children.
 *----------
 */
typedef struct FromExpr
{
    NodeTag     type;
    List       *fromlist;       /* List of join subtrees */
    Node       *quals;          /* qualifiers on join, if any */
} FromExpr;
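Taken together, these three node types form a tree: a FromExpr holds any number of children, a JoinExpr holds exactly a left and a right subtree, and the leaves are RangeTblRef nodes that carry an rtindex. A minimal sketch of walking such a tree follows. All Demo* types and demo_collect_leaves are hypothetical simplifications (the child list is an array, and quals, alias and jointype are omitted); the sketch only shows how the nodes nest, not how PostgreSQL traverses them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node tags standing in for NodeTag values. */
typedef enum DemoTag { DEMO_RANGETBLREF, DEMO_JOINEXPR, DEMO_FROMEXPR } DemoTag;

/* Every node starts with its tag, so a DemoNode pointer can address any of them. */
typedef struct DemoNode { DemoTag tag; } DemoNode;

typedef struct DemoRangeTblRef { DemoNode node; int rtindex; } DemoRangeTblRef;

typedef struct DemoJoinExpr
{
    DemoNode    node;
    DemoNode   *larg;           /* left subtree */
    DemoNode   *rarg;           /* right subtree */
    int         rtindex;        /* RT index assigned for the join itself */
} DemoJoinExpr;

typedef struct DemoFromExpr
{
    DemoNode    node;
    DemoNode  **fromlist;       /* array of join subtrees */
    size_t      nitems;
} DemoFromExpr;

/*
 * Append the rtindex of every RangeTblRef leaf below n to out[], in
 * left-to-right order.  Returns the running leaf count, which may exceed
 * cap; entries beyond cap are counted but not stored.
 */
size_t
demo_collect_leaves(const DemoNode *n, int *out, size_t cap, size_t count)
{
    assert(out != NULL || cap == 0);
    if (n == NULL)
        return count;
    switch (n->tag)
    {
        case DEMO_RANGETBLREF:
            if (count < cap)
                out[count] = ((const DemoRangeTblRef *) n)->rtindex;
            return count + 1;
        case DEMO_JOINEXPR:
            {
                const DemoJoinExpr *j = (const DemoJoinExpr *) n;

                count = demo_collect_leaves(j->larg, out, cap, count);
                return demo_collect_leaves(j->rarg, out, cap, count);
            }
        case DEMO_FROMEXPR:
            {
                const DemoFromExpr *f = (const DemoFromExpr *) n;

                for (size_t i = 0; i < f->nitems; i++)
                    count = demo_collect_leaves(f->fromlist[i], out, cap, count);
                return count;
            }
    }
    return count;
}
```

For a FROM clause shaped like "t1 JOIN t2 ON ..., t3", the top-level FromExpr would have two children: a JoinExpr over two leaves, and a single leaf. The join's own rtindex identifies the RTE created for the join output and is not a leaf.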

TargetEntry

/*--------------------
 * TargetEntry -
 *     a target entry (used in query target lists)
 *
 * Strictly speaking, a TargetEntry isn't an expression node (since it can't
 * be evaluated by ExecEvalExpr).  But we treat it as one anyway, since in
 * very many places it's convenient to process a whole query targetlist as a
 * single expression tree.
 *
 * In a SELECT's targetlist, resno should always be equal to the item's
 * ordinal position (counting from 1).  However, in an INSERT or UPDATE
 * targetlist, resno represents the attribute number of the destination
 * column for the item; so there may be missing or out-of-order resnos.
 * It is even legal to have duplicated resnos; consider
 *      UPDATE table SET arraycol[1] = ..., arraycol[2] = ..., ...
 * The two meanings come together in the executor, because the planner
 * transforms INSERT/UPDATE tlists into a normalized form with exactly
 * one entry for each column of the destination table.  Before that's
 * happened, however, it is risky to assume that resno == position.
 * Generally get_tle_by_resno() should be used rather than list_nth()
 * to fetch tlist entries by resno, and only in SELECT should you assume
 * that resno is a unique identifier.
 *
 * resname is required to represent the correct column name in non-resjunk
 * entries of top-level SELECT targetlists, since it will be used as the
 * column title sent to the frontend.  In most other contexts it is only
 * a debugging aid, and may be wrong or even NULL.  (In particular, it may
 * be wrong in a tlist from a stored rule, if the referenced column has been
 * renamed by ALTER TABLE since the rule was made.  Also, the planner tends
 * to store NULL rather than look up a valid name for tlist entries in
 * non-toplevel plan nodes.)  In resjunk entries, resname should be either
 * a specific system-generated name (such as "ctid") or NULL; anything else
 * risks confusing ExecGetJunkAttribute!
 *
 * ressortgroupref is used in the representation of ORDER BY, GROUP BY, and
 * DISTINCT items.  Targetlist entries with ressortgroupref=0 are not
 * sort/group items.  If ressortgroupref>0, then this item is an ORDER BY,
 * GROUP BY, and/or DISTINCT target value.  No two entries in a targetlist
 * may have the same nonzero ressortgroupref --- but there is no particular
 * meaning to the nonzero values, except as tags.  (For example, one must
 * not assume that lower ressortgroupref means a more significant sort key.)
 * The order of the associated SortGroupClause lists determine the semantics.
 *
 * resorigtbl/resorigcol identify the source of the column, if it is a
 * simple reference to a column of a base table (or view).  If it is not
 * a simple reference, these fields are zeroes.
 *
 * If resjunk is true then the column is a working column (such as a sort key)
 * that should be removed from the final output of the query.  Resjunk columns
 * must have resnos that cannot duplicate any regular column's resno.  Also
 * note that there are places that assume resjunk columns come after non-junk
 * columns.
 *--------------------
 */
typedef struct TargetEntry
{
    Expr        xpr;
    Expr       *expr;           /* expression to evaluate */
    AttrNumber  resno;          /* attribute number (see notes above) */
    char       *resname;        /* name of the column (could be NULL) */
    Index       ressortgroupref;    /* nonzero if referenced by a sort/group
                                     * clause */
    Oid         resorigtbl;     /* OID of column's source table */
    AttrNumber  resorigcol;     /* column's number in source table */
    bool        resjunk;        /* set to true to eliminate the attribute from
                                 * final target list */
} TargetEntry;
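The comment recommends fetching target list entries by resno rather than by list position, because in an INSERT or UPDATE target list the two can differ. The sketch below shows why, using a hypothetical DemoTargetEntry and an array in place of the real List. demo_tle_by_resno is an invented name that mimics the idea behind get_tle_by_resno(); it is not that function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for TargetEntry. */
typedef struct DemoTargetEntry
{
    int         resno;          /* attribute number (may be out of order) */
    const char *resname;        /* column name, may be NULL */
    bool        resjunk;        /* working column, dropped from final output */
} DemoTargetEntry;

/*
 * Return the first entry whose resno matches, or NULL if there is none.
 * The search is by value, so it stays correct when resnos are missing,
 * out of order, or not equal to the entry's position.
 */
const DemoTargetEntry *
demo_tle_by_resno(const DemoTargetEntry *tlist, size_t ntargets, int resno)
{
    assert(tlist != NULL || ntargets == 0);
    for (size_t i = 0; i < ntargets; i++)
    {
        if (tlist[i].resno == resno)
            return &tlist[i];
    }
    return NULL;
}
```

For a target list such as one built from "INSERT INTO t (c, a) VALUES (...)", the entry for destination column 1 sits in the second position, so a positional fetch for "column 1" would return the entry for column 3 instead.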

OpExpr

/*
 * OpExpr - expression node for an operator invocation
 *
 * Semantically, this is essentially the same as a function call.
 *
 * Note that opfuncid is not necessarily filled in immediately on creation
 * of the node.  The planner makes sure it is valid before passing the node
 * tree to the executor, but during parsing/planning opfuncid can be 0.
 */
typedef struct OpExpr
{
    Expr        xpr;
    Oid         opno;           /* PG_OPERATOR OID of the operator */
    Oid         opfuncid;       /* PG_PROC OID of underlying function */
    Oid         opresulttype;   /* PG_TYPE OID of result value */
    bool        opretset;       /* true if operator returns set */
    Oid         opcollid;       /* OID of collation of result */
    Oid         inputcollid;    /* OID of collation that operator should use */
    List       *args;           /* arguments to the operator (1 or 2) */
    int         location;       /* token location, or -1 if unknown */
} OpExpr;
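The comment notes that opfuncid may still be 0 during parsing and planning and is only guaranteed valid before execution. One way to picture this is a lazy fill-in step that resolves the function OID from the operator OID the first time it is needed. The sketch below assumes a hypothetical DemoOpExpr and a tiny in-memory table in place of the pg_operator catalog. All OID values and function names in it are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int DemoOid;
#define DEMO_INVALID_OID 0u

/* Hypothetical, trimmed-down stand-in for OpExpr. */
typedef struct DemoOpExpr
{
    DemoOid     opno;           /* operator OID */
    DemoOid     opfuncid;       /* underlying function OID, 0 until resolved */
} DemoOpExpr;

/* Made-up operator-to-function mapping standing in for a catalog lookup. */
static const struct { DemoOid opno; DemoOid funcid; } demo_operators[] = {
    { 1001, 2001 },
    { 1002, 2002 },
};

/* Return the function OID for an operator, or DEMO_INVALID_OID if unknown. */
DemoOid
demo_lookup_opcode(DemoOid opno)
{
    for (size_t i = 0; i < sizeof(demo_operators) / sizeof(demo_operators[0]); i++)
    {
        if (demo_operators[i].opno == opno)
            return demo_operators[i].funcid;
    }
    return DEMO_INVALID_OID;
}

/* Fill in opfuncid only if it has not been set yet. */
void
demo_set_opfuncid(DemoOpExpr *op)
{
    assert(op != NULL);
    if (op->opfuncid == DEMO_INVALID_OID)
        op->opfuncid = demo_lookup_opcode(op->opno);
}
```

Because the fill-in is conditional, calling it repeatedly is harmless, and code that inspects an OpExpr before planning has to tolerate a zero opfuncid.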