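PostgreSQL Source Code Analysis: External Storage Management

Except for in-memory databases and the like, a database ultimately stores its data persistently, which means flushing data from in-memory buffers to external storage. This article looks at the external storage management part of PostgreSQL. The source code is under src/backend/storage/smgr.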

README

It is worth reading src/backend/storage/smgr/README first. For a Chinese translation, see the article "postgres外存管理之smgr".

src/backend/storage/smgr/README

Storage Managers
================

In the original Berkeley Postgres system, there were several storage managers,
of which only the "magnetic disk" manager remains.  The "magnetic disk"
manager is itself seriously misnamed, because actually it supports any kind of
device for which the operating system provides standard filesystem operations; which
these days is pretty much everything of interest.  However, we retain the
notion of a storage manager switch in case anyone ever wants to reintroduce
other kinds of storage managers.  Removing the switch layer would save
nothing noticeable anyway, since storage-access operations are surely far
more expensive than one extra layer of C function calls.

In Berkeley Postgres each relation was tagged with the ID of the storage
manager to use for it.  This is gone.  It would be probably more reasonable
to associate storage managers with tablespaces, should we ever re-introduce
multiple storage managers into the system catalogs.

The files in this directory, and their contents, are

    smgr.c	The storage manager switch dispatch code.  The routines in
		this file call the appropriate storage manager to do storage
		accesses requested by higher-level code.  smgr.c also manages
		the file handle cache (SMgrRelation table).

    md.c	The "magnetic disk" storage manager, which is really just
		an interface to the kernel's filesystem operations.

Note that md.c in turn relies on src/backend/storage/file/fd.c.


Relation Forks
==============

Since 8.4, a single smgr relation can be comprised of multiple physical
files, called relation forks. This allows storing additional metadata like
Free Space information in additional forks, which can be grown and truncated
independently of the main data file, while still treating it all as a single
physical relation in system catalogs.

It is assumed that the main fork, fork number 0 or MAIN_FORKNUM, always
exists. Fork numbers are assigned in src/include/common/relpath.h.
Functions in smgr.c and md.c take an extra fork number argument, in addition
to relfilenode and block number, to identify which relation fork you want to
access. Since most code wants to access the main fork, a shortcut version of
ReadBuffer that accesses MAIN_FORKNUM is provided in the buffer manager for
convenience.
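
To make the fork concept from the README above more concrete, here is a minimal sketch of how a fork number can map to an on-disk file name. It assumes the default tablespace layout (base/<database OID>/<relfilenode>) and the usual fork suffixes (none for main, then _fsm, _vm, _init). All Demo* and demo_* names are hypothetical and are not PostgreSQL's actual API; the real logic lives in src/common/relpath.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical mirror of the fork numbering; the main fork is number 0. */
typedef enum DemoForkNumber
{
	DEMO_MAIN_FORKNUM = 0,
	DEMO_FSM_FORKNUM,			/* free space map */
	DEMO_VM_FORKNUM,			/* visibility map */
	DEMO_INIT_FORKNUM			/* init fork of unlogged relations */
} DemoForkNumber;

/* File name suffix for each fork; the main fork has none. */
static const char *const demo_fork_suffix[] = {"", "_fsm", "_vm", "_init"};

/*
 * Build the relative path of one fork of a relation, assuming the default
 * tablespace. Returns the value of snprintf().
 */
int
demo_fork_path(char *buf, size_t len, unsigned db_oid,
			   unsigned relfilenode, DemoForkNumber forknum)
{
	assert(forknum >= DEMO_MAIN_FORKNUM && forknum <= DEMO_INIT_FORKNUM);
	return snprintf(buf, len, "base/%u/%u%s",
					db_oid, relfilenode, demo_fork_suffix[forknum]);
}
```

With a database OID of 5 and a relfilenode of 16384, the main fork maps to base/5/16384 and the free space map fork to base/5/16384_fsm. Each fork is a separate file, which is why it can be extended or truncated independently.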

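Two key points from the README:

  • The "magnetic disk" manager is not limited to disks: it supports any device for which the operating system provides standard filesystem operations.
  • Although the magnetic disk manager is currently the only storage manager in PostgreSQL, the storage manager (smgr) layer is retained as an intermediate layer so that other kinds of storage managers can be introduced later.

Storage Manager

smgr.c implements the storage manager dispatch interface, which acts as an abstraction layer over storage management. All file system operations are dispatched from here. Let's look at the function declarations in smgr.h: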

extern void smgrinit(void);
extern SMgrRelation smgropen(RelFileNode rnode, BackendId backend);
extern bool smgrexists(SMgrRelation reln, ForkNumber forknum);
extern void smgrsetowner(SMgrRelation *owner, SMgrRelation reln);
extern void smgrclearowner(SMgrRelation *owner, SMgrRelation reln);
extern void smgrclose(SMgrRelation reln);
extern void smgrcloseall(void);
extern void smgrclosenode(RelFileNodeBackend rnode);
extern void smgrrelease(SMgrRelation reln);
extern void smgrreleaseall(void);
extern void smgrcreate(SMgrRelation reln, ForkNumber forknum, bool isRedo);
extern void smgrdosyncall(SMgrRelation *rels, int nrels);
extern void smgrdounlinkall(SMgrRelation *rels, int nrels, bool isRedo);
extern void smgrextend(SMgrRelation reln, ForkNumber forknum,
					   BlockNumber blocknum, char *buffer, bool skipFsync);
extern bool smgrprefetch(SMgrRelation reln, ForkNumber forknum,
						 BlockNumber blocknum);
extern void smgrread(SMgrRelation reln, ForkNumber forknum,
					 BlockNumber blocknum, char *buffer);
extern void smgrwrite(SMgrRelation reln, ForkNumber forknum,
					  BlockNumber blocknum, char *buffer, bool skipFsync);
extern void smgrwriteback(SMgrRelation reln, ForkNumber forknum,
						  BlockNumber blocknum, BlockNumber nblocks);
extern BlockNumber smgrnblocks(SMgrRelation reln, ForkNumber forknum);
extern BlockNumber smgrnblocks_cached(SMgrRelation reln, ForkNumber forknum);
extern void smgrtruncate(SMgrRelation reln, ForkNumber *forknum,
						 int nforks, BlockNumber *nblocks);
extern void smgrimmedsync(SMgrRelation reln, ForkNumber forknum);
extern void AtEOXact_SMgr(void);
extern bool ProcessBarrierSmgrRelease(void);

In other words, all interaction between the database and external storage goes through these interfaces.
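
The "switch" idea behind these interfaces can be illustrated with a table of function pointers. The sketch below is a simplified, self-contained model and not PostgreSQL's code: every Demo* and demo_* name is hypothetical, the stand-in "md" manager keeps blocks in memory instead of files, and the block size is deliberately tiny. In the real smgr.c, a comparable table (smgrsw, an array of f_smgr structs) has md.c's functions as its only entry.

```c
#include <assert.h>
#include <string.h>

#define DEMO_BLCKSZ  16			/* tiny demo block size, not the real BLCKSZ */
#define DEMO_NBLOCKS 4

typedef unsigned int DemoBlockNumber;

/* One entry per storage manager: a set of function pointers. */
typedef struct DemoSmgr
{
	const char *name;
	void		(*read) (DemoBlockNumber blkno, char *buffer);
	void		(*write) (DemoBlockNumber blkno, const char *buffer);
} DemoSmgr;

/* Stand-in "md" manager: stores blocks in memory rather than in files. */
static char demo_md_blocks[DEMO_NBLOCKS][DEMO_BLCKSZ];

static void
demo_md_read(DemoBlockNumber blkno, char *buffer)
{
	memcpy(buffer, demo_md_blocks[blkno], DEMO_BLCKSZ);
}

static void
demo_md_write(DemoBlockNumber blkno, const char *buffer)
{
	memcpy(demo_md_blocks[blkno], buffer, DEMO_BLCKSZ);
}

/* The switch table; adding a storage manager means adding an entry. */
static const DemoSmgr demo_smgrsw[] = {
	{"md", demo_md_read, demo_md_write},
};

#define DEMO_NSMGR ((int) (sizeof(demo_smgrsw) / sizeof(demo_smgrsw[0])))

/* A relation handle records which storage manager serves it. */
typedef struct DemoRelation
{
	int			smgr_which;
} DemoRelation;

/* Dispatch functions: higher-level code calls these, never the manager. */
void
demo_smgrread(const DemoRelation *reln, DemoBlockNumber blkno, char *buffer)
{
	assert(reln->smgr_which >= 0 && reln->smgr_which < DEMO_NSMGR);
	assert(blkno < DEMO_NBLOCKS);
	demo_smgrsw[reln->smgr_which].read(blkno, buffer);
}

void
demo_smgrwrite(const DemoRelation *reln, DemoBlockNumber blkno,
			   const char *buffer)
{
	assert(reln->smgr_which >= 0 && reln->smgr_which < DEMO_NSMGR);
	assert(blkno < DEMO_NBLOCKS);
	demo_smgrsw[reln->smgr_which].write(blkno, buffer);
}
```

A caller writes a block with demo_smgrwrite and reads it back with demo_smgrread without knowing which manager handles the request. The extra indirection costs one function-pointer call, which matches the README's point that the switch layer is negligible next to the cost of the storage access itself.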
