PostgreSQL 源码解读(180)- 内核研发#4(如何实现自定义系统函数)
本节以实现Oracle中的add_months函数为例介绍如何通过改造内核实现自定义系统函数.
一、基础知识
在实现之前有必要先行介绍一些基础知识,包括Oid/函数注册等.
Oid
Oid即Object identifier,对象标识符,在PostgreSQL中,每个对象都一个Oid,系统表对象之间以Oid进行关联.
函数作为PostgreSQL中的一种对象,每个函数都存在Oid,通过查询pg_proc可获得相关信息:
postgres=# select oid,proname from pg_proc order by oid; oid | proname -------+---------------------------------------------- 31 | byteaout 33 | charout 34 | namein 35 | nameout 38 | int2in 39 | int2out 40 | int2vectorin 41 | int2vectorout 42 | int4in 43 | int4out 44 | regprocin 45 | regprocout 46 | textin 47 | textout 48 | tidin 49 | tidout 50 | xidin 51 | xidout 52 | cidin 53 | cidout 54 | oidvectorin 55 | oidvectorout 56 | boollt 57 | boolgt 60 | booleq--More--
函数注册
假设我们已经实现了一个自定义系统函数,比如add_months,PostgreSQL如何才能感知该函数的存在?答案是通过函数注册实现.
PostgreSQL在编译的时候,会用perl脚本根据预置的记录,生成src/backend/catalog/postgres.bki文件,该文件在initdb时被解析成一条条的SQL,插入到系统表中.因此自定义的系统函数,需要在通过initdb新建的数据库实例中才能被"感知".
二、实现步骤
有了上面的基础知识,接下来我们step by step的实现add_months自定义函数.
1.获取函数Oid
PostgreSQL提供了unused_oids工具用于快速检索未使用的Oid,该文件位于src/include/catalog目录下
find -name unused_oids./src/include/catalog/unused_oids[root@localhost pg11]# ./src/include/catalog/unused_oids2 - 93423 - 3436399639984001 - 40134142 - 41994217 - 45654572 - 49995017 - 50275029 - 59996015 - 6099610361056107 - 610961166122 - 9999
我们选择了Oid = 5100
2.注册函数
在文件pg_proc.dat中添加add_months函数
#src/include/catalog/pg_proc.dat...{ oid => '5100', descr => 'oracle-like add_months function', proname => 'add_months', provariadic => '0', proisstrict => 'f', prorettype => 'date', proargtypes => 'date int4', prosrc => 'add_months'},
该文件中的条目对应结构体Form_pg_proc
/* ---------------- * pg_proc definition. cpp turns this into * typedef struct FormData_pg_proc * pg_proc定义* ---------------- */CATALOG(pg_proc,1255,ProcedureRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(81,ProcedureRelation_Rowtype_Id) BKI_SCHEMA_MACRO{ /* procedure name */ //过程名称 NameData proname; /* OID of namespace containing this proc */ //系统OID Oid pronamespace BKI_DEFAULT(PGNSP); /* procedure owner */ //拥有者Owner Oid proowner BKI_DEFAULT(PGUID); /* OID of pg_language entry */ //实现语言调用接口,pg_language中的OID. //默认为12-internal,其他选项包括13-c语言,14-sql,13275-plpgsql Oid prolang BKI_DEFAULT(12); /* estimated execution cost */ //估算的执行成本,默认为1 float4 procost BKI_DEFAULT(1); /* estimated # of rows out (if proretset) */ //估算的结果行数,默认为0 float4 prorows BKI_DEFAULT(0); /* element type of variadic array, or 0 */ //可变数组参数元素类型,默认为0 Oid provariadic BKI_DEFAULT(0) BKI_LOOKUP(pg_type); /* transforms calls to it during planning */ //在计划期间的转换调用,默认为0 //可通过此列指定的函数来简化 regproc protransform BKI_DEFAULT(0) BKI_LOOKUP(pg_proc); /* see PROKIND_ categories below */ //详见下面的PROKIND_XXX char prokind BKI_DEFAULT(f); /* security definer */ //安全定义器 bool prosecdef BKI_DEFAULT(f); /* is it a leak-proof function? */ //弱认证函数?除了返回值,没有关系参数的信息被传播 bool proleakproof BKI_DEFAULT(f); /* strict with respect to NULLs? */ //NULLs的处理(严格还是不严格) bool proisstrict BKI_DEFAULT(t); /* returns a set? */ //返回集合?默认为F bool proretset BKI_DEFAULT(f); /* see PROVOLATILE_ categories below */ //详见下面的PROVOLATILE_XXX char provolatile BKI_DEFAULT(i); /* see PROPARALLEL_ categories below */ //详见下面的PROPARALLEL_XXX char proparallel BKI_DEFAULT(s); /* number of arguments */ /* Note: need not be given in pg_proc.dat; genbki.pl will compute it */ //参数个数 //注意:不需要在pg_proc.dat中指定,genbki.pl会自动计算 int16 pronargs; /* number of arguments with defaults */ //有默认值的参数个数 int16 pronargdefaults BKI_DEFAULT(0); /* OID of result type */ //结果类型OID Oid prorettype BKI_LOOKUP(pg_type); /* * variable-length fields start here, but we allow direct access to * proargtypes * 从这里开始为可变长字段,但我们运行直接访问原型类型 */ /* parameter types (excludes OUT params) */ //参数类型(剔除了OUT参数) //只包括输入参数(含INOUT和VARIADIC参数 oidvector proargtypes BKI_LOOKUP(pg_type);#ifdef CATALOG_VARLEN /* all param types (NULL if IN only) */ //所有参数类型(数组),包括所有参数(含OUT和INOUT参数) //如都为IN类型,则为NULL Oid proallargtypes[1] BKI_DEFAULT(_null_) BKI_LOOKUP(pg_type); /* parameter modes (NULL if IN only) */ //参数模式数组(如都为IN参数,则为NULL) // i表示IN参数 , o表示OUT参数, b表示INOUT参数, v表示VARIADIC参数, t表示TABLE参数 char proargmodes[1] BKI_DEFAULT(_null_); /* parameter names (NULL if no names) */ //参数名称数组(如无则为NULL) //这里的下标对应着proallargtypes而不是proargtypes中的位置 text proargnames[1] BKI_DEFAULT(_null_); /* list of expression trees for argument defaults (NULL if none) */ //参数默认值表达式树链表(如无则为NULL) //对应proargtypes pg_node_tree proargdefaults BKI_DEFAULT(_null_); /* types for which to apply transforms */ //应用变换的类型 Oid protrftypes[1] BKI_DEFAULT(_null_); /* procedure source text */ //过程实现文本(如为c,则可为函数名称) text prosrc BKI_FORCE_NOT_NULL; /* secondary procedure info (can be NULL) */ //第二个过程信息,即附加信息(可为NULL) text probin BKI_DEFAULT(_null_); /* procedure-local GUC settings */ //与过程相关的本地GUC设置 text proconfig[1] BKI_DEFAULT(_null_); /* access permissions */ //访问权限 aclitem proacl[1] BKI_DEFAULT(_null_);#endif} FormData_pg_proc;/* ---------------- * Form_pg_proc corresponds to a pointer to a tuple with * the format of pg_proc relation. * ---------------- */typedef FormData_pg_proc *Form_pg_proc;#ifdef EXPOSE_TO_CLIENT_CODE/* * Symbolic values for prokind column */#define PROKIND_FUNCTION 'f'#define PROKIND_AGGREGATE 'a'#define PROKIND_WINDOW 'w'#define PROKIND_PROCEDURE 'p'/* * Symbolic values for provolatile column: these indicate whether the result * of a function is dependent *only* on the values of its explicit arguments, * or can change due to outside factors (such as parameter variables or * table contents). NOTE: functions having side-effects, such as setval(), * must be labeled volatile to ensure they will not get optimized away, * even if the actual return value is not changeable. */#define PROVOLATILE_IMMUTABLE 'i' /* never changes for given input */#define PROVOLATILE_STABLE 's' /* does not change within a scan */#define PROVOLATILE_VOLATILE 'v' /* can change even within a scan *//* * Symbolic values for proparallel column: these indicate whether a function * can be safely be run in a parallel backend, during parallelism but * necessarily in the master, or only in non-parallel mode. */#define PROPARALLEL_SAFE 's' /* can run in worker or master */#define PROPARALLEL_RESTRICTED 'r' /* can run in parallel master only */#define PROPARALLEL_UNSAFE 'u' /* banned while in parallel mode *//* * Symbolic values for proargmodes column. Note that these must agree with * the FunctionParameterMode enum in parsenodes.h; we declare them here to * be accessible from either header. */#define PROARGMODE_IN 'i'#define PROARGMODE_OUT 'o'#define PROARGMODE_INOUT 'b'#define PROARGMODE_VARIADIC 'v'#define PROARGMODE_TABLE 't'#endif /* EXPOSE_TO_CLIENT_CODE */
通过perl脚本,PG会把该定义文件中的条目生成postgres.bki文件,查看Makefile文件(src/backend/catalog/Makefile)中的注释:
# bki-stamp records the last time we ran genbki.pl. We don't rely on# the timestamps of the individual output files, because the Perl script# won't update them if they didn't change (to avoid unnecessary recompiles).# Technically, this should depend on Makefile.global which supplies# $(MAJORVERSION); but then genbki.pl would need to be re-run after every# configure run, even in distribution tarballs. So depending on configure.in# instead is cheating a bit, but it will achieve the goal of updating the# version number when it changes.bki-stamp: genbki.pl Catalog.pm $(POSTGRES_BKI_SRCS) $(POSTGRES_BKI_DATA) $(top_srcdir)/configure.in $(PERL) -I $(catalogdir) $< --set-version=$(MAJORVERSION) $(POSTGRES_BKI_SRCS) touch $@
编译成功后,生成的src/backend/catalog/postgres.bki中包含了我们添加的条目(OID = 5100):
...insert OID = 5028 ( satisfies_hash_partition 11 10 12 1 0 2276 0 f f f f f i s 4 0 16 "26 23 23 2276" _null_ "{i,i,i,v}" _null_ _null_ _null_ satisfies_hash_partition _null_ _null_ _null_ )insert OID = 5100 ( add_months 11 10 12 1 0 0 0 f f f f f i s 2 0 1082 "1082 23" _null_ _null_ _null_ _null_ _null_ add_months _null_ _null_ _null_ )close pg_proc...
3.实现功能
在src/backend/utils/adt/date.c文件的最后添加逻辑实现.
该实现使用了Github开源项目orafce中的实现( 感谢开源! )
Datumadd_months(PG_FUNCTION_ARGS){ DateADT day = PG_GETARG_DATEADT(0); int n = PG_GETARG_INT32(1); int y, m, d; int days; DateADT result; div_t v; bool last_day; j2date(day + POSTGRES_EPOCH_JDATE, &y, &m, &d); last_day = (d == days_of_month(y, m)); v = div(y * 12 + m - 1 + n, 12); y = v.quot; if (y < 0) y += 1; /* offset because of year 0 */ m = v.rem + 1; days = days_of_month(y, m); if (last_day || d > days) d = days; result = date2j(y, m, d) - POSTGRES_EPOCH_JDATE; PG_RETURN_DATEADT (result);}intdays_of_month(int y, int m){ int month_days[] = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31}; int days; if (m < 0 || 12 < m) ereport(ERROR, (errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE), errmsg("date out of range"))); days = month_days[m - 1]; if (m == 2 && (y % 400 == 0 || (y % 4 == 0 && y % 100 != 0))) days += 1; /* February 29 in leap year */ return days;}
在头文件src/include/utils/date.h中添加函数声明
extern Datum add_months(PG_FUNCTION_ARGS);
4.编译安装
make cleanmakemake install
5.初始化数据库
initdb -D /data/pgsql/tmpdbpg_ctl start -D /data/pgsql/tmpdb
6.检查验证
postgres=# select add_months(current_date,12); add_months ------------ 2020-04-28(1 row)
DONE!
三、参考资料
Oid
orafce