千家信息网

What Does the ReadBuffer_common Function Do in PostgreSQL

Published: 2025-02-01. Author: 千家信息网 editor

This article explains what the ReadBuffer_common function does in PostgreSQL. It first introduces the data structures involved, then reads through the source code, and finally traces two scenarios with gdb: one where the requested block is not in the buffer pool and one where it is.

I. Data Structures

BufferDesc
Shared descriptor (state) data for a single shared buffer.

/*
 * Flags for buffer descriptors
 *
 * Note: TAG_VALID essentially means that there is a buffer hashtable
 * entry associated with the buffer's tag.
 */
#define BM_LOCKED               (1U << 22)  /* buffer header is locked */
#define BM_DIRTY                (1U << 23)  /* data needs writing */
#define BM_VALID                (1U << 24)  /* data is valid */
#define BM_TAG_VALID            (1U << 25)  /* tag is assigned */
#define BM_IO_IN_PROGRESS       (1U << 26)  /* read or write in progress */
#define BM_IO_ERROR             (1U << 27)  /* previous I/O failed */
#define BM_JUST_DIRTIED         (1U << 28)  /* dirtied since write started */
#define BM_PIN_COUNT_WAITER     (1U << 29)  /* have waiter for sole pin */
#define BM_CHECKPOINT_NEEDED    (1U << 30)  /* must write for checkpoint */
#define BM_PERMANENT            (1U << 31)  /* permanent buffer (not unlogged,
                                             * or init fork) */

/*
 *  BufferDesc -- shared descriptor/state data for a single shared buffer.
 *
 * Note: Buffer header lock (BM_LOCKED flag) must be held to examine or change
 * the tag, state or wait_backend_pid fields.  In general, buffer header lock
 * is a spinlock which is combined with flags, refcount and usagecount into
 * single atomic variable.  This layout allow us to do some operations in a
 * single atomic operation, without actually acquiring and releasing spinlock;
 * for instance, increase or decrease refcount.  buf_id field never changes
 * after initialization, so does not need locking.  freeNext is protected by
 * the buffer_strategy_lock not buffer header lock.  The LWLock can take care
 * of itself.  The buffer header lock is *not* used to control access to the
 * data in the buffer!
 *
 * It's assumed that nobody changes the state field while buffer header lock
 * is held.  Thus buffer header lock holder can do complex updates of the
 * state variable in single write, simultaneously with lock release (cleaning
 * BM_LOCKED flag).  On the other hand, updating of state without holding
 * buffer header lock is restricted to CAS, which insure that BM_LOCKED flag
 * is not set.  Atomic increment/decrement, OR/AND etc. are not allowed.
 *
 * An exception is that if we have the buffer pinned, its tag can't change
 * underneath us, so we can examine the tag without locking the buffer header.
 * Also, in places we do one-time reads of the flags without bothering to
 * lock the buffer header; this is generally for situations where we don't
 * expect the flag bit being tested to be changing.
 *
 * We can't physically remove items from a disk page if another backend has
 * the buffer pinned.  Hence, a backend may need to wait for all other pins
 * to go away.  This is signaled by storing its own PID into
 * wait_backend_pid and setting flag bit BM_PIN_COUNT_WAITER.  At present,
 * there can be only one such waiter per buffer.
 *
 * We use this same struct for local buffer headers, but the locks are not
 * used and not all of the flag bits are useful either. To avoid unnecessary
 * overhead, manipulations of the state field should be done without actual
 * atomic operations (i.e. only pg_atomic_read_u32() and
 * pg_atomic_unlocked_write_u32()).
 *
 * Be careful to avoid increasing the size of the struct when adding or
 * reordering members.  Keeping it below 64 bytes (the most common CPU
 * cache line size) is fairly important for performance.
 */
typedef struct BufferDesc
{
    BufferTag   tag;            /* ID of page contained in buffer */
    int         buf_id;         /* buffer's index number (from 0) */

    /* state of the tag, containing flags, refcount and usagecount */
    pg_atomic_uint32 state;

    int         wait_backend_pid;   /* backend PID of pin-count waiter */
    int         freeNext;       /* link in freelist chain */

    LWLock      content_lock;   /* to lock access to buffer contents */
} BufferDesc;
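The comment above says that, without the header lock, the state word may only be updated through compare-and-swap, and only while BM_LOCKED is clear. The following is a minimal sketch of that rule using C11 atomics rather than PostgreSQL's pg_atomic API. The names try_pin and refcount_of are hypothetical, and the 18-bit refcount width is an assumption taken from PostgreSQL's buf_internals.h for this era of the code; it does not appear in the listing.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag bit copied from the listing above. */
#define BM_LOCKED          (1U << 22)

/* Assumption: the low 18 bits of the state word hold the refcount. */
#define BUF_REFCOUNT_ONE   1U
#define BUF_REFCOUNT_MASK  ((1U << 18) - 1)

/*
 * Hypothetical helper: add one pin without taking the header lock.
 * Returns false if the pin could not be added within max_attempts tries,
 * for example because the header stayed locked the whole time.
 */
static bool
try_pin(_Atomic uint32_t *state, int max_attempts)
{
    uint32_t old_state;

    assert(max_attempts > 0);
    old_state = atomic_load(state);

    for (int i = 0; i < max_attempts; i++)
    {
        if (old_state & BM_LOCKED)
        {
            /* Someone holds the header lock: re-read and try again. */
            old_state = atomic_load(state);
            continue;
        }

        /* The CAS succeeds only if nobody changed or locked the word. */
        if (atomic_compare_exchange_weak(state, &old_state,
                                         old_state + BUF_REFCOUNT_ONE))
            return true;
        /* On failure old_state holds the fresh value; loop again. */
    }
    return false;
}

/* Extract the refcount from a state value. */
static uint32_t
refcount_of(uint32_t state_value)
{
    return state_value & BUF_REFCOUNT_MASK;
}
```

Calling try_pin on a state word of 0 raises the refcount to 1, while calling it on a word that has BM_LOCKED set leaves the word untouched and returns false. A plain atomic increment would not give that guarantee, which is why the comment forbids it.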

BufferTag
The buffer tag identifies which disk block the buffer contains.

/*
 * Buffer tag identifies which disk block the buffer contains.
 *
 * Note: the BufferTag data must be sufficient to determine where to write the
 * block, without reference to pg_class or pg_tablespace entries.  It's
 * possible that the backend flushing the buffer doesn't even believe the
 * relation is visible yet (its xact may have started before the xact that
 * created the rel).  The storage manager must be able to cope anyway.
 *
 * Note: if there's any pad bytes in the struct, INIT_BUFFERTAG will have
 * to be fixed to zero them, since this struct is used as a hash key.
 */
typedef struct buftag
{
    RelFileNode rnode;          /* physical relation identifier */
    ForkNumber  forkNum;
    BlockNumber blockNum;       /* blknum relative to begin of reln */
} BufferTag;
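The note about pad bytes matters because a hash key is compared byte by byte. The sketch below illustrates the idea with simplified stand-in types; DemoRelFileNode, DemoBufferTag, demo_init_tag and demo_tags_equal are hypothetical names, not PostgreSQL's definitions. The tag is zeroed before the fields are filled in, so that memcmp gives a stable answer even if the compiler inserts padding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-ins for the PostgreSQL types; illustrative only. */
typedef struct
{
    uint32_t spcNode;
    uint32_t dbNode;
    uint32_t relNode;
} DemoRelFileNode;

typedef struct
{
    DemoRelFileNode rnode;
    int32_t         forkNum;
    uint32_t        blockNum;
} DemoBufferTag;

/* Zero the whole struct first so any padding bytes are deterministic. */
static void
demo_init_tag(DemoBufferTag *tag, uint32_t spc, uint32_t db, uint32_t rel,
              int32_t forkNum, uint32_t blockNum)
{
    assert(tag != NULL);
    memset(tag, 0, sizeof(*tag));
    tag->rnode.spcNode = spc;
    tag->rnode.dbNode = db;
    tag->rnode.relNode = rel;
    tag->forkNum = forkNum;
    tag->blockNum = blockNum;
}

/* Byte-wise comparison, the way a generic hash table compares keys. */
static bool
demo_tags_equal(const DemoBufferTag *a, const DemoBufferTag *b)
{
    return memcmp(a, b, sizeof(*a)) == 0;
}
```

Two tags built from the same relation, fork and block number compare equal, and changing only the block number makes them differ.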

II. Source Code Walkthrough

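ReadBuffer_common holds the logic shared by all ReadBuffer variants. It works as follows:
1. Initialize variables and evaluate the basic conditions (is the relation being extended, isExtend? is this a temporary table, isLocalBuf?).
2. For a temporary table, call LocalBufferAlloc to get the buffer descriptor; otherwise call BufferAlloc.
Both calls also set the variable found, which reports whether the block was already in the cache.
3. If the block was found in the cache:
3.1 If the relation is not being extended, update the statistics, lock the buffer if the mode requires it, and return.
3.2 If the relation is being extended, get the block:
3.2.1 If PageIsNew returns false, raise an error.
3.2.2 For a local buffer (temporary table), adjust the flags by clearing BM_VALID.
3.2.3 For a shared buffer, clear BM_VALID and call StartBufferIO, repeating until StartBufferIO succeeds.
4. If the block was not found in the cache, get the block:
4.1 If the relation is being extended, zero-fill the buffer and call smgrextend.
4.2 Otherwise (an ordinary read):
4.2.1 If the mode is RBM_ZERO_AND_LOCK or RBM_ZERO_AND_CLEANUP_LOCK, zero-fill the buffer.
4.2.2 Otherwise read the block through smgr (the storage manager), track the I/O time if requested, and check the page for garbage data.
5. The relation has now been extended or the block has been read:
5.1 Lock the buffer if the mode requires it.
5.2 For a temporary table, adjust the flags; otherwise set BM_VALID, terminate the I/O, and wake up any waiting processes.
5.3 Update the statistics.
5.4 Return the buffer.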
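The branching in steps 3 and 4 can be summarized as a small decision function. This is a toy model for illustration only: DemoMode, DemoPath and classify_read are hypothetical names, and the real function does much more work along each path.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for ReadBufferMode; the values are illustrative. */
typedef enum
{
    MODE_NORMAL,
    MODE_ZERO_AND_LOCK,
    MODE_ZERO_AND_CLEANUP_LOCK,
    MODE_ZERO_ON_ERROR,
    MODE_NORMAL_NO_LOG
} DemoMode;

/* Which of the numbered steps ends up producing the page contents. */
typedef enum
{
    PATH_RETURN_CACHED,     /* step 3.1: already valid in the cache */
    PATH_EXTEND,            /* step 4.1, also reached after step 3.2 */
    PATH_ZERO_FILL,         /* step 4.2.1 */
    PATH_READ_FROM_DISK     /* step 4.2.2 */
} DemoPath;

static DemoPath
classify_read(bool found, bool is_extend, DemoMode mode)
{
    /* A cache hit on an ordinary read returns immediately. */
    if (found && !is_extend)
        return PATH_RETURN_CACHED;

    /*
     * Extension always goes through the extend path, even in the corner
     * case where a stale buffer was found for the new block.
     */
    if (is_extend)
        return PATH_EXTEND;

    /* The caller will overwrite the page, so no read is needed. */
    if (mode == MODE_ZERO_AND_LOCK || mode == MODE_ZERO_AND_CLEANUP_LOCK)
        return PATH_ZERO_FILL;

    return PATH_READ_FROM_DISK;
}
```

In this model, classify_read(false, false, MODE_NORMAL) yields PATH_READ_FROM_DISK, and classify_read(true, false, MODE_NORMAL) yields PATH_RETURN_CACHED.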

/*
 * ReadBuffer_common -- common logic for all ReadBuffer variants
 *
 * *hit is set to true if the request was satisfied from shared buffer cache.
 */
static Buffer
ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
                  BlockNumber blockNum, ReadBufferMode mode,
                  BufferAccessStrategy strategy, bool *hit)
{
    BufferDesc *bufHdr;         // buffer descriptor
    Block       bufBlock;       // the corresponding block
    bool        found;          // found in the buffer pool?
    bool        isExtend;       // extending the relation?
    bool        isLocalBuf = SmgrIsTemp(smgr);  // local buffer (temporary table)?

    *hit = false;

    /* Make sure we will have room to remember the buffer pin */
    ResourceOwnerEnlargeBuffers(CurrentResourceOwner);

    // P_NEW means the relation must be extended
    isExtend = (blockNum == P_NEW);

    TRACE_POSTGRESQL_BUFFER_READ_START(forkNum, blockNum,
                                       smgr->smgr_rnode.node.spcNode,
                                       smgr->smgr_rnode.node.dbNode,
                                       smgr->smgr_rnode.node.relNode,
                                       smgr->smgr_rnode.backend,
                                       isExtend);

    /* Substitute proper block number if caller asked for P_NEW */
    if (isExtend)
        blockNum = smgrnblocks(smgr, forkNum);

    if (isLocalBuf)
    {
        // local buffer (temporary table)
        bufHdr = LocalBufferAlloc(smgr, forkNum, blockNum, &found);
        if (found)
            pgBufferUsage.local_blks_hit++;
        else if (isExtend)
            pgBufferUsage.local_blks_written++;
        else if (mode == RBM_NORMAL || mode == RBM_NORMAL_NO_LOG ||
                 mode == RBM_ZERO_ON_ERROR)
            pgBufferUsage.local_blks_read++;
    }
    else
    {
        // shared buffer (not a temporary table)
        /*
         * lookup the buffer.  IO_IN_PROGRESS is set if the requested block is
         * not currently in memory.
         */
        bufHdr = BufferAlloc(smgr, relpersistence, forkNum, blockNum,
                             strategy, &found);
        if (found)
            pgBufferUsage.shared_blks_hit++;        // cache hit
        else if (isExtend)
            pgBufferUsage.shared_blks_written++;    // new block
        else if (mode == RBM_NORMAL || mode == RBM_NORMAL_NO_LOG ||
                 mode == RBM_ZERO_ON_ERROR)
            pgBufferUsage.shared_blks_read++;       // block read from disk
    }

    /* At this point we do NOT hold any locks. */

    /* if it was already in the buffer pool, we're done */
    if (found)
    {
        //------------- the buffer is already in the buffer pool
        if (!isExtend)
        {
            /* Just need to update stats before we exit */
            *hit = true;
            VacuumPageHit++;

            if (VacuumCostActive)
                VacuumCostBalance += VacuumCostPageHit;

            TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
                                              smgr->smgr_rnode.node.spcNode,
                                              smgr->smgr_rnode.node.dbNode,
                                              smgr->smgr_rnode.node.relNode,
                                              smgr->smgr_rnode.backend,
                                              isExtend,
                                              found);

            /*
             * In RBM_ZERO_AND_LOCK mode the caller expects the page to be
             * locked on return.
             */
            if (!isLocalBuf)
            {
                if (mode == RBM_ZERO_AND_LOCK)
                    LWLockAcquire(BufferDescriptorGetContentLock(bufHdr),
                                  LW_EXCLUSIVE);
                else if (mode == RBM_ZERO_AND_CLEANUP_LOCK)
                    LockBufferForCleanup(BufferDescriptorGetBuffer(bufHdr));
            }

            // map the descriptor to its Buffer number and return it
            //#define BufferDescriptorGetBuffer(bdesc) ((bdesc)->buf_id + 1)
            return BufferDescriptorGetBuffer(bufHdr);
        }

        /*
         * We get here only in the corner case where we are trying to extend
         * the relation but we found a pre-existing buffer marked BM_VALID.
         * This can happen because mdread doesn't complain about reads beyond
         * EOF (when zero_damaged_pages is ON) and so a previous attempt to
         * read a block beyond EOF could have left a "valid" zero-filled
         * buffer.  Unfortunately, we have also seen this case occurring
         * because of buggy Linux kernels that sometimes return an
         * lseek(SEEK_END) result that doesn't account for a recent write. In
         * that situation, the pre-existing buffer would contain valid data
         * that we don't want to overwrite.  Since the legitimate case should
         * always have left a zero-filled buffer, complain if not PageIsNew.
         */
        bufBlock = isLocalBuf ? LocalBufHdrGetBlock(bufHdr) : BufHdrGetBlock(bufHdr);
        if (!PageIsNew((Page) bufBlock))
            ereport(ERROR,
                    (errmsg("unexpected data beyond EOF in block %u of relation %s",
                            blockNum, relpath(smgr->smgr_rnode, forkNum)),
                     errhint("This has been seen to occur with buggy kernels; consider updating your system.")));

        /*
         * We *must* do smgrextend before succeeding, else the page will not
         * be reserved by the kernel, and the next P_NEW call will decide to
         * return the same page.  Clear the BM_VALID bit, do the StartBufferIO
         * call that BufferAlloc didn't, and proceed.
         */
        if (isLocalBuf)
        {
            /* Only need to adjust flags */
            uint32      buf_state = pg_atomic_read_u32(&bufHdr->state);

            Assert(buf_state & BM_VALID);
            buf_state &= ~BM_VALID;
            pg_atomic_unlocked_write_u32(&bufHdr->state, buf_state);
        }
        else
        {
            /*
             * Loop to handle the very small possibility that someone re-sets
             * BM_VALID between our clearing it and StartBufferIO inspecting
             * it.
             */
            do
            {
                uint32      buf_state = LockBufHdr(bufHdr);

                Assert(buf_state & BM_VALID);
                buf_state &= ~BM_VALID;
                UnlockBufHdr(bufHdr, buf_state);
            } while (!StartBufferIO(bufHdr, true));
        }
    }

    //------------- the buffer contents are not valid yet
    /*
     * if we have gotten to this point, we have allocated a buffer for the
     * page but its contents are not yet valid.  IO_IN_PROGRESS is set for it,
     * if it's a shared buffer.
     *
     * Note: if smgrextend fails, we will end up with a buffer that is
     * allocated but not marked BM_VALID.  P_NEW will still select the same
     * block number (because the relation didn't get any longer on disk) and
     * so future attempts to extend the relation will find the same buffer (if
     * it's not been recycled) but come right back here to try smgrextend
     * again.
     */
    Assert(!(pg_atomic_read_u32(&bufHdr->state) & BM_VALID));   /* spinlock not needed */

    bufBlock = isLocalBuf ? LocalBufHdrGetBlock(bufHdr) : BufHdrGetBlock(bufHdr);

    if (isExtend)
    {
        //-------- extend the relation
        /* new buffers are zero-filled */
        MemSet((char *) bufBlock, 0, BLCKSZ);
        /* don't set checksum for all-zero page */
        smgrextend(smgr, forkNum, blockNum, (char *) bufBlock, false);

        /*
         * NB: we're *not* doing a ScheduleBufferTagForWriteback here;
         * although we're essentially performing a write. At least on linux
         * doing so defeats the 'delayed allocation' mechanism, leading to
         * increased file fragmentation.
         */
    }
    else
    {
        //-------- ordinary read
        /*
         * Read in the page, unless the caller intends to overwrite it and
         * just wants us to allocate a buffer.
         */
        if (mode == RBM_ZERO_AND_LOCK || mode == RBM_ZERO_AND_CLEANUP_LOCK)
            MemSet((char *) bufBlock, 0, BLCKSZ);
        else
        {
            instr_time  io_start,       // start and elapsed time of the I/O
                        io_time;

            if (track_io_timing)
                INSTR_TIME_SET_CURRENT(io_start);

            // read the block through smgr (the storage manager)
            smgrread(smgr, forkNum, blockNum, (char *) bufBlock);

            if (track_io_timing)
            {
                INSTR_TIME_SET_CURRENT(io_time);
                INSTR_TIME_SUBTRACT(io_time, io_start);
                pgstat_count_buffer_read_time(INSTR_TIME_GET_MICROSEC(io_time));
                INSTR_TIME_ADD(pgBufferUsage.blk_read_time, io_time);
            }

            /* check for garbage data */
            if (!PageIsVerified((Page) bufBlock, blockNum))
            {
                // the page failed verification
                if (mode == RBM_ZERO_ON_ERROR || zero_damaged_pages)
                {
                    // warn and zero the page
                    ereport(WARNING,
                            (errcode(ERRCODE_DATA_CORRUPTED),
                             errmsg("invalid page in block %u of relation %s; zeroing out page",
                                    blockNum,
                                    relpath(smgr->smgr_rnode, forkNum))));
                    MemSet((char *) bufBlock, 0, BLCKSZ);
                }
                else
                    // raise an error
                    ereport(ERROR,
                            (errcode(ERRCODE_DATA_CORRUPTED),
                             errmsg("invalid page in block %u of relation %s",
                                    blockNum,
                                    relpath(smgr->smgr_rnode, forkNum))));
            }
        }
    }

    //--------- the relation has been extended or the block has been read
    /*
     * In RBM_ZERO_AND_LOCK mode, grab the buffer content lock before marking
     * the page as valid, to make sure that no other backend sees the zeroed
     * page before the caller has had a chance to initialize it.
     *
     * Since no-one else can be looking at the page contents yet, there is no
     * difference between an exclusive lock and a cleanup-strength lock. (Note
     * that we cannot use LockBuffer() or LockBufferForCleanup() here, because
     * they assert that the buffer is already valid.)
     */
    if ((mode == RBM_ZERO_AND_LOCK || mode == RBM_ZERO_AND_CLEANUP_LOCK) &&
        !isLocalBuf)
    {
        LWLockAcquire(BufferDescriptorGetContentLock(bufHdr), LW_EXCLUSIVE);
    }

    if (isLocalBuf)
    {
        /* Only need to adjust flags */
        uint32      buf_state = pg_atomic_read_u32(&bufHdr->state);

        buf_state |= BM_VALID;
        pg_atomic_unlocked_write_u32(&bufHdr->state, buf_state);
    }
    else
    {
        /* Set BM_VALID, terminate IO, and wake up any waiters */
        TerminateBufferIO(bufHdr, false, BM_VALID);
    }

    // update statistics
    VacuumPageMiss++;
    if (VacuumCostActive)
        VacuumCostBalance += VacuumCostPageMiss;

    TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
                                      smgr->smgr_rnode.node.spcNode,
                                      smgr->smgr_rnode.node.dbNode,
                                      smgr->smgr_rnode.node.relNode,
                                      smgr->smgr_rnode.backend,
                                      isExtend,
                                      found);

    // map the descriptor to its Buffer number and return it
    //#define BufferDescriptorGetBuffer(bdesc) ((bdesc)->buf_id + 1)
    return BufferDescriptorGetBuffer(bufHdr);
}
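The handling of a page that fails verification can be isolated into a small decision, sketched below under simplifying assumptions. DemoPageResult and demo_handle_page are hypothetical names, DEMO_BLCKSZ assumes the default 8 kB block size, and the sketch returns a status code where the real code calls ereport.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define DEMO_BLCKSZ 8192    /* assumption: default PostgreSQL block size */

typedef enum
{
    DEMO_PAGE_OK,       /* page passed verification, contents kept */
    DEMO_PAGE_ZEROED,   /* page failed, was zeroed (WARNING in the real code) */
    DEMO_PAGE_ERROR     /* page failed, read is rejected (ERROR in the real code) */
} DemoPageResult;

/*
 * Decide what to do with a freshly read page.
 * zero_on_error_mode stands for mode == RBM_ZERO_ON_ERROR.
 */
static DemoPageResult
demo_handle_page(char *page, bool verified, bool zero_on_error_mode,
                 bool zero_damaged_pages)
{
    assert(page != NULL);

    if (verified)
        return DEMO_PAGE_OK;

    if (zero_on_error_mode || zero_damaged_pages)
    {
        /* Replace the damaged contents with an all-zero page. */
        memset(page, 0, DEMO_BLCKSZ);
        return DEMO_PAGE_ZEROED;
    }

    /* Leave the contents alone and report the failure. */
    return DEMO_PAGE_ERROR;
}
```

The design point is that zeroing only happens when the caller or the zero_damaged_pages setting explicitly asks for it; by default a damaged page stops the read.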

III. Trace Analysis

Test scenario 1: the block is not in the buffer pool
Script:

16:42:48 (xdb@[local]:5432)testdb=# select * from t1 limit 10;

Start gdb and set a breakpoint

(gdb) b ReadBuffer_common
Breakpoint 1 at 0x876e28: file bufmgr.c, line 711.
(gdb) c
Continuing.

Breakpoint 1, ReadBuffer_common (smgr=0x2b7cce0, relpersistence=112 'p', forkNum=MAIN_FORKNUM, blockNum=0, mode=RBM_NORMAL,
    strategy=0x0, hit=0x7ffc7761dfab) at bufmgr.c:711
711     bool        isLocalBuf = SmgrIsTemp(smgr);
(gdb)

1. Initialize variables and evaluate the basic conditions (is the relation being extended, isExtend? is this a temporary table, isLocalBuf?)

(gdb) n
713     *hit = false;
(gdb)
716     ResourceOwnerEnlargeBuffers(CurrentResourceOwner);
(gdb)
718     isExtend = (blockNum == P_NEW);
(gdb)
720     TRACE_POSTGRESQL_BUFFER_READ_START(forkNum, blockNum,
(gdb)
728     if (isExtend)
(gdb)
731     if (isLocalBuf)
(gdb)
745         bufHdr = BufferAlloc(smgr, relpersistence, forkNum, blockNum,
(gdb)

2. Call BufferAlloc to get the buffer descriptor

(gdb)
747         if (found)
(gdb) p *bufHdr
$1 = {tag = {rnode = {spcNode = 1663, dbNode = 16402, relNode = 51439}, forkNum = MAIN_FORKNUM, blockNum = 0},
  buf_id = 108, state = {value = 2248409089}, wait_backend_pid = 0, freeNext = -2, content_lock = {tranche = 54, state = {
      value = 536870912}, waiters = {head = 2147483647, tail = 2147483647}}}
(gdb) p found
$2 = false
(gdb)
(gdb) n
750             pgBufferUsage.shared_blks_read++; --> update statistics
(gdb)
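The state value printed above, 2248409089, can be decoded by hand. The sketch below does so with the flag bits from the BufferDesc listing. The refcount and usage-count masks are assumptions taken from PostgreSQL's buf_internals.h for this era of the code (18 bits of refcount, then 4 bits of usage count), and the state_* helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Flag bits copied from the BufferDesc listing. */
#define BM_VALID             (1U << 24)
#define BM_TAG_VALID         (1U << 25)
#define BM_IO_IN_PROGRESS    (1U << 26)
#define BM_PERMANENT         (1U << 31)

/* Assumed layout of the remaining bits (not shown in the listing). */
#define BUF_REFCOUNT_MASK    ((1U << 18) - 1)
#define BUF_USAGECOUNT_SHIFT 18
#define BUF_USAGECOUNT_MASK  (0xFU << BUF_USAGECOUNT_SHIFT)
#define BUF_FLAG_MASK        0xFFC00000U

/* Number of pins currently held on the buffer. */
static uint32_t
state_refcount(uint32_t s)
{
    return s & BUF_REFCOUNT_MASK;
}

/* Usage counter consulted by the buffer replacement strategy. */
static uint32_t
state_usagecount(uint32_t s)
{
    return (s & BUF_USAGECOUNT_MASK) >> BUF_USAGECOUNT_SHIFT;
}

/* Only the flag bits, with the two counters masked out. */
static uint32_t
state_flags(uint32_t s)
{
    return s & BUF_FLAG_MASK;
}
```

Since 2248409089 = 2^31 + 2^26 + 2^25 + 2^18 + 1, the flags are BM_PERMANENT, BM_IO_IN_PROGRESS and BM_TAG_VALID, the usage count is 1 and the refcount is 1. BM_VALID is not set, which matches the comment that IO_IN_PROGRESS is set when the requested block is not yet in memory.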

4. The block was not found in the cache, so get the block

756     if (found)
(gdb)
856     Assert(!(pg_atomic_read_u32(&bufHdr->state) & BM_VALID));   /* spinlock not needed */
(gdb)
858     bufBlock = isLocalBuf ? LocalBufHdrGetBlock(bufHdr) : BufHdrGetBlock(bufHdr);
(gdb)
860     if (isExtend)
(gdb) p bufBlock
$4 = (Block) 0x7fe8c240e380

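4.2 This is an ordinary read
4.2.1 If the mode is RBM_ZERO_AND_LOCK or RBM_ZERO_AND_CLEANUP_LOCK, zero-fill the buffer
4.2.2 Otherwise read the block through smgr (the storage manager), track the I/O time if requested, and check the page for garbage data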

(gdb) p mode
$5 = RBM_NORMAL
(gdb)
(gdb) n
880         if (mode == RBM_ZERO_AND_LOCK || mode == RBM_ZERO_AND_CLEANUP_LOCK)
(gdb)
887             if (track_io_timing)
(gdb)
890             smgrread(smgr, forkNum, blockNum, (char *) bufBlock);
(gdb)
892             if (track_io_timing)
(gdb) p *smgr
$6 = {smgr_rnode = {node = {spcNode = 1663, dbNode = 16402, relNode = 51439}, backend = -1}, smgr_owner = 0x7fe8ee2bc7b8,
  smgr_targblock = 4294967295, smgr_fsm_nblocks = 4294967295, smgr_vm_nblocks = 4294967295, smgr_which = 0,
  md_num_open_segs = {1, 0, 0, 0}, md_seg_fds = {0x2b0dd78, 0x0, 0x0, 0x0}, next_unowned_reln = 0x0}
(gdb) p forkNum
$7 = MAIN_FORKNUM
(gdb) p blockNum
$8 = 0
(gdb) p (char *) bufBlock
$9 = 0x7fe8c240e380 "\001"
(gdb)

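5. The block has been read
5.1 Lock the buffer if the mode requires it
5.2 For a temporary table, adjust the flags; otherwise set BM_VALID, terminate the I/O, and wake up any waiting processes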

(gdb) n
901             if (!PageIsVerified((Page) bufBlock, blockNum))
(gdb)
932     if ((mode == RBM_ZERO_AND_LOCK || mode == RBM_ZERO_AND_CLEANUP_LOCK) &&
(gdb) n
938     if (isLocalBuf)
(gdb)
949         TerminateBufferIO(bufHdr, false, BM_VALID);
(gdb)

5.3 Update the statistics
5.4 Return the buffer

(gdb)
952     VacuumPageMiss++;
(gdb)
953     if (VacuumCostActive)
(gdb)
956     TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
(gdb)
964     return BufferDescriptorGetBuffer(bufHdr);
(gdb)
965 }
(gdb)

The returned buf is 109

(gdb)
ReadBufferExtended (reln=0x7fe8ee2bc7a8, forkNum=MAIN_FORKNUM, blockNum=0, mode=RBM_NORMAL, strategy=0x0) at bufmgr.c:666
666     if (hit)
(gdb)
668     return buf;
(gdb) p buf
$10 = 109
(gdb)
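The value 109 follows directly from buf_id = 108 in the descriptor printed earlier, because BufferDescriptorGetBuffer is defined as buf_id + 1. A minimal sketch of the mapping for shared buffers follows. DemoBuffer and the demo_* functions are hypothetical stand-ins, and the value 0 for an invalid buffer mirrors PostgreSQL's InvalidBuffer, which is not shown in this article.

```c
#include <assert.h>

typedef int DemoBuffer;         /* stand-in for PostgreSQL's Buffer type */

#define DEMO_INVALID_BUFFER 0   /* assumption: 0 is reserved for "no buffer" */

/* Shared buffer descriptors are numbered from 0, Buffer values from 1. */
static DemoBuffer
demo_descriptor_get_buffer(int buf_id)
{
    assert(buf_id >= 0);
    return buf_id + 1;
}

/* Inverse mapping, valid only for shared (positive) buffer numbers. */
static int
demo_buffer_get_id(DemoBuffer buf)
{
    assert(buf > DEMO_INVALID_BUFFER);
    return buf - 1;
}
```

Shifting by one keeps 0 free as a sentinel, so a caller can tell "no buffer" apart from the first descriptor.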

Test scenario 2: the block is already in the buffer pool
Run the same SQL statement again; this time the block has already been read into a buffer.

(gdb) del
Delete all breakpoints? (y or n) y
(gdb) c
Continuing.
^C
Program received signal SIGINT, Interrupt.
0x00007fe8ec448903 in __epoll_wait_nocancel () at ../sysdeps/unix/syscall-template.S:81
81  T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)
(gdb) b ReadBuffer_common
Breakpoint 2 at 0x876e28: file bufmgr.c, line 711.
(gdb)

The variable found is true

...
(gdb)
745         bufHdr = BufferAlloc(smgr, relpersistence, forkNum, blockNum,
(gdb)
747         if (found)
(gdb) p found
$11 = true
(gdb)
(gdb) n
748             pgBufferUsage.shared_blks_hit++;
(gdb)

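Execution enters the corresponding branch:
3. If the block was found in the cache:
3.1 If the relation is not being extended, update the statistics, lock the buffer if the mode requires it, and return.
3.2 If the relation is being extended, get the block:
3.2.1 If PageIsNew returns false, raise an error.
3.2.2 For a local buffer (temporary table), adjust the flags by clearing BM_VALID.
3.2.3 For a shared buffer, clear BM_VALID and call StartBufferIO, repeating until StartBufferIO succeeds.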

(gdb)
756     if (found)
(gdb)
758         if (!isExtend)
(gdb)
761             *hit = true;
(gdb)
762             VacuumPageHit++;
(gdb)
764             if (VacuumCostActive)
(gdb)
767             TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
(gdb)
779             if (!isLocalBuf)
(gdb)
781                 if (mode == RBM_ZERO_AND_LOCK)
(gdb)
784                 else if (mode == RBM_ZERO_AND_CLEANUP_LOCK)
(gdb)
788             return BufferDescriptorGetBuffer(bufHdr);
(gdb)
965 }
(gdb)
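Taken together, the two scenarios touch two different pgBufferUsage counters: the first run increments shared_blks_read and the second increments shared_blks_hit. The sketch below replays that bookkeeping with a simplified struct. DemoBufferUsage and demo_count_shared_access are hypothetical names, and mode_reads_disk stands for the check that mode is RBM_NORMAL, RBM_NORMAL_NO_LOG or RBM_ZERO_ON_ERROR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the shared-buffer part of pgBufferUsage. */
typedef struct
{
    long shared_blks_hit;
    long shared_blks_read;
    long shared_blks_written;
} DemoBufferUsage;

/* Mirror of the counter updates made right after BufferAlloc returns. */
static void
demo_count_shared_access(DemoBufferUsage *usage, bool found, bool is_extend,
                         bool mode_reads_disk)
{
    assert(usage != NULL);

    if (found)
        usage->shared_blks_hit++;       /* block was already cached */
    else if (is_extend)
        usage->shared_blks_written++;   /* new block added to the relation */
    else if (mode_reads_disk)
        usage->shared_blks_read++;      /* block has to come from disk */
}
```

Replaying scenario 1 (found is false) and then scenario 2 (found is true) against a zeroed struct leaves shared_blks_read and shared_blks_hit at 1 each and shared_blks_written at 0.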

