Overview
This article introduces the common GlusterFS commands, using distributed volumes and replicated volumes as examples.
Appending --xml to a command makes it return its result in XML format.
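For example, the peer list shown below can be retrieved as XML (output omitted), which is convenient for scripts to parse:
gluster peer status --xml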
Node Management
View the peer list
gluster peer status
[root@cvknode1 b1]# gluster peer status
Number of Peers: 2
Hostname: 192.168.0.111
Uuid: 5c07e6de-ba88-414b-9b87-83dd242fc36b
State: Peer in Cluster (Connected)
Hostname: 192.168.0.116
Uuid: a56e341d-09fe-4ad5-9d46-ea193a3b94c9
State: Peer in Cluster (Connected)
Add a peer
gluster peer probe 192.168.0.111
gluster peer probe 192.168.0.116
[root@cvknode1 b1]# gluster peer probe 192.168.0.111
peer probe: success.
Remove a peer
gluster peer detach 192.168.0.116
[root@cvknode1 b1]# gluster peer detach 192.168.0.116
All clients mounted through the peer which is getting detached need to be remounted using one of the other active peers in the trusted storage pool to ensure client gets notification on any changes done on the gluster configuration and if the same has been done do you want to proceed? (y/n) y
peer detach: success
Volume Management
The goal is to create a distributed volume and a replicated volume across three nodes.
1. Distributed volume architecture
Files are distributed across the bricks by hash, so each node holds a different subset of the files.
2. Replicated volume architecture
Every node holds an identical copy of each file.
Operations on other volume types are similar; explore them on your own.
3. Another commonly used type is the distributed-replicated volume. It combines the advantages of distributed and replicated volumes and suits high-performance, reliable storage when there are many nodes.
Files are distributed by hash, but each file also has replica copies; a creation sketch follows.
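As a minimal sketch (the volume name testdr and the brick paths /brick3/dr and /brick4/dr are hypothetical), a distributed-replicated volume is created by supplying a brick count that is a multiple of the replica count; gluster groups consecutive bricks into replica sets and distributes files across those sets:
gluster volume create testdr replica 2 transport tcp 192.168.0.107:/brick3/dr 192.168.0.111:/brick3/dr 192.168.0.116:/brick3/dr 192.168.0.107:/brick4/dr force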
Create a volume
gluster volume create testvol 192.168.0.107:/brick1/b1 force
With the force parameter, any directory can be used as a brick; otherwise only a mount point is accepted.
The default volume type is distributed.
A volume must be started after creation before it can be accessed.
[root@cvknode1 glusterfs]# mkdir -p /brick1/b1
[root@cvknode1 glusterfs]# gluster volume create testvol 192.168.0.107:/brick1/b1 force
volume create: testvol: success: please start the volume to access data
To create a replicated volume, use the following command:
gluster volume create testrep replica 2 transport tcp 192.168.0.107:/brick2/rep 192.168.0.111:/brick2/rep force
[root@cvknode1 ~]# gluster volume create testrep replica 2 transport tcp 192.168.0.107:/brick2/rep 192.168.0.111:/brick2/rep force
volume create: testrep: success: please start the volume to access data
Start a volume
gluster volume start testvol
[root@cvknode1 glusterfs]# gluster volume start testvol
volume start: testvol: success
Expand a volume
gluster volume add-brick testvol 192.168.0.111:/brick1/b1 force
gluster volume add-brick testvol 192.168.0.116:/brick1/b1 force
[root@cvknode1 b1]# gluster volume add-brick testvol 192.168.0.111:/brick1/b1 force
volume add-brick: success
For a replicated volume, the replica count must be specified again when adding bricks.
gluster volume add-brick testrep replica 3 192.168.0.116:/brick2/rep force
[root@cvknode1 ~]# gluster volume add-brick testrep replica 3 192.168.0.116:/brick2/rep force
volume add-brick: success
Add an arbiter brick to a replicated volume
gluster volume add-brick testrep replica 3 arbiter 1 192.168.0.116:/brick2/rep force
The arbiter brick does not store file data, so it effectively prevents split-brain without increasing storage cost.
[root@cvknode1 ~]# gluster volume add-brick testrep replica 3 arbiter 1 192.168.0.116:/brick2/rep force
volume add-brick: success
View volumes
View all volumes
gluster volume info
View a specific volume
gluster volume info testvol
[root@cvknode1 b1]# gluster volume info testvol
Volume Name: testvol
Type: Distribute
Volume ID: 43f3ddfe-d7a3-48ef-8225-950eccad0067
Status: Started
Snapshot Count: 0
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: 192.168.0.107:/brick1/b1
Brick2: 192.168.0.111:/brick1/b1
Options Reconfigured:
transport.address-family: inet
storage.fips-mode-rchecksum: on
nfs.disable: on
View the status of a specific volume
gluster volume status testvol
Appending the detail parameter to the command returns more detailed per-node information.
If a brick's Online column shows N, that brick has a fault; you can try the gluster volume replace-brick command to replace it with a temporary brick and later swap it back.
[root@cvknode1 b1]# gluster volume status testvol
Status of volume: testvol
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick 192.168.0.107:/brick1/b1 49153 0 Y 3634518
Brick 192.168.0.111:/brick1/b1 49153 0 Y 2692944
Brick 192.168.0.116:/brick1/b1 49153 0 Y 2921430
Task Status of Volume testvol
------------------------------------------------------------------------------
Task : Rebalance
ID : 0aef9493-3827-4687-bbe7-9c822c24d56c
Status : completed
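As a hedged example of the detail form mentioned above (output omitted here), the following additionally reports per-brick disk space, inode counts and file-system information:
gluster volume status testvol detail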
Mount
mount -t glusterfs 192.168.0.107:/testvol /brick1/mpt/
Any storage node's IP address can be used.
[root@cvknode111 b1]# mkdir -p /brick1/mpt
[root@cvknode111 b1]# mount -t glusterfs 192.168.0.107:/testvol /brick1/mpt/
[root@cvknode111 b1]# df -h | grep brick1
192.168.0.107:/testvol 349G 90G 245G 27% /brick1/mpt
Applications should read and write files only through the mount point.
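To make the mount persistent across reboots and tolerant of the listed server being down, an /etc/fstab entry along these lines can be used (a sketch; adjust the IPs and paths to your environment, and note that backup-volfile-servers only affects volfile fetching at mount time):
192.168.0.107:/testvol /brick1/mpt glusterfs defaults,_netdev,backup-volfile-servers=192.168.0.111:192.168.0.116 0 0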
Shrink a volume
gluster volume remove-brick testvol 192.168.0.116:/brick1/b1 start
[root@cvknode1 b1]# gluster volume remove-brick testvol 192.168.0.116:/brick1/b1 start
It is recommended that remove-brick be run with cluster.force-migration option disabled to prevent possible data corruption. Doing so will ensure that files that receive writes during migration will not be migrated and will need to be manually copied after the remove-brick commit operation. Please check the value of the option and update accordingly.
Do you want to continue with your current cluster.force-migration settings? (y/n) y
volume remove-brick start: success
ID: c148220a-ad12-4d87-9ccb-e91b0c84ca6d
[root@cvknode1 b1]# gluster volume remove-brick testvol 192.168.0.116:/brick1/b1 status
Node Rebalanced-files size scanned failures skipped status run time in h:m:s
--------- ----------- ----------- ----------- ----------- ----------- ------------ --------------
192.168.0.116 4 833.5MB 5 0 0 in progress 0:00:22
The estimated time for rebalance to complete will be unavailable for the first 10 minutes.
As the prompt above shows, gluster recommends running remove-brick with the cluster.force-migration option disabled; otherwise files that are being written during migration may be corrupted by the forced migration. With the option disabled, files that receive writes during migration are skipped, and after the remove-brick commit you must manually copy those files from the removed brick back to the surviving nodes.
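Assuming your GlusterFS release exposes the cluster.force-migration option named in that prompt, its current value can be checked and, if desired, disabled before starting; a hedged sketch:
gluster volume get testvol cluster.force-migration
gluster volume set testvol cluster.force-migration off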
Removing a brick is an asynchronous task; once it completes, run the commit command to finish the operation.
gluster volume remove-brick testvol 192.168.0.116:/brick1/b1 commit
[root@cvknode1 b1]# gluster volume remove-brick testvol 192.168.0.116:/brick1/b1 commit
volume remove-brick commit: success
Check the removed bricks to ensure all files are migrated.
If files with data are found on the brick path, copy them via a gluster mount point before re-purposing the removed brick.
For a replicated volume, the replica count must be specified again when removing bricks.
gluster volume remove-brick testrep replica 2 192.168.0.116:/brick2/rep force
[root@cvknode1 ~]# gluster volume remove-brick testrep replica 2 192.168.0.116:/brick2/rep force
Remove-brick force will not migrate files from the removed bricks, so they will no longer be available on the volume.
Do you want to continue? (y/n) y
volume remove-brick commit force: success
Rebalance
gluster volume rebalance testvol start force
Use this when brick changes have left files unevenly distributed. It mainly applies to distributed volumes; the command recomputes the hash layout and migrates files accordingly.
[root@cvknode1 b1]# gluster volume rebalance testvol start force
volume rebalance: testvol: success: Rebalance on testvol has been started successfully. Use rebalance status command to check status of the rebalance process.
ID: 268d10ef-10aa-4bb2-88e0-6b5fbe87fe00
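As the message above suggests, rebalance runs in the background and its progress can be checked with the status sub-command, for example:
gluster volume rebalance testvol status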
Query volume options
gluster volume get testvol all
[root@cvknode1 b1]# gluster volume get testvol all
Option Value
------ -----
cluster.lookup-unhashed on
cluster.lookup-optimize on
cluster.min-free-disk 10%
cluster.min-free-inodes 5%
cluster.rebalance-stats off
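Because the full listing contains a large number of options, it is usually piped through grep; a simple sketch:
gluster volume get testvol all | grep min-free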
Modify volume options
gluster volume set testvol nfs.disable off
nfs.disable can be replaced with any other option name.
[root@cvknode1 b1]# gluster volume set testvol nfs.disable off
Gluster NFS is being deprecated in favor of NFS-Ganesha Enter "yes" to continue using Gluster NFS (y/n) y
volume set: success
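To revert an option to its default value, gluster also provides a reset sub-command; for example:
gluster volume reset testvol nfs.disable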
Stop a volume
gluster volume stop testvol
[root@cvknode1 b1]# gluster volume stop testvol
Stopping volume will make its data inaccessible. Do you want to continue? (y/n) y
volume stop: testvol: success
Delete a volume
gluster volume delete testvol
After deletion the volume can no longer be mounted, but the data in the bricks remains.
[root@cvknode1 b1]# gluster volume delete testvol
Deleting volume will erase all information about the volume. Do you want to continue? (y/n) y
volume delete: testvol: success
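If a removed brick directory is to be reused for a new volume, the leftover data and the gluster extended attributes on it (the same attributes shown in the troubleshooting section below) normally have to be cleared first; a hedged sketch using the brick path from the examples above:
setfattr -x trusted.glusterfs.volume-id /brick1/b1
setfattr -x trusted.gfid /brick1/b1
rm -rf /brick1/b1/.glusterfs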
Replace a brick
gluster volume replace-brick testrep 192.168.0.107:/brick3/rep 192.168.0.107:/brick3/backup commit force
[root@cvknode1 /]# gluster volume replace-brick testrep 192.168.0.107:/brick3/rep 192.168.0.107:/brick3/backup commit force
volume replace-brick: success: replace-brick commit force operation successful
Troubleshooting
1. Data inconsistent between the bricks of a replicated volume
Run a command that reads every file under the mount point to trigger self-heal:
find /brick2/mpt/ -type f -print0 | xargs -0 head -c1
[root@cvknode111 mpt]# find /brick2/mpt/ -type f -print0 | xargs -0 head -c1
==> /brick2/mpt/H3C_Workspace_App-E2008-win64.zip <==
P
==> /brick2/mpt/20250613153708fe675a.key <==
+
==> /brick2/mpt/H3C_Learningspace_App-E2008-linux-x64.zip <==
P
==> /brick2/mpt/npp.8.7.8.Installer.x64.exe <==
M
==> /brick2/mpt/2.json <==
{
==> /brick2/mpt/center-share/FileZilla_3.67.0_win64-setup.exe <==
M
==> /brick2/mpt/H3C_Learningspace_App-E2008-win32.zip <==
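Alternatively, the self-heal daemon can be triggered and monitored directly; a hedged example, assuming the replicated volume name used earlier:
gluster volume heal testrep
gluster volume heal testrep info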
2. Repairing a damaged brick of a replicated volume
(1) On a healthy node, read the gluster extended attributes
getfattr -d -m . -e hex /brick2/rep/
[root@cvknode1 mpt]# getfattr -d -m . -e hex /brick2/rep/
getfattr: Removing leading '/' from absolute path names
# file: brick2/rep/
trusted.afr.testrep-client-2=0x000000000000000000000000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.mdata=0x0100000000000000000000000068591de8000000003a1917160000000068591de8000000003a19171600000000685914d30000000016496137
trusted.glusterfs.volume-id=0xc040d52ab1b4425980dcaac2253a8da8
(2) On the faulty node, set the same extended attributes
[root@cvknode1 brick2]# setfattr -n trusted.gfid -v 0x00000000000000000000000000000001 rep/
[root@cvknode1 brick2]# setfattr -n trusted.glusterfs.volume-id 0xc040d52ab1b4425980dcaac2253a8da8 rep/
setfattr: 0xc040d52ab1b4425980dcaac2253a8da8: No such file or directory
[root@cvknode1 brick2]# setfattr -n trusted.glusterfs.volume-id -v 0xc040d52ab1b4425980dcaac2253a8da8 rep/
[root@cvknode1 brick2]# setfattr -n trusted.glusterfs.mdata -v 0x0100000000000000000000000068591de8000000003a1917160000000068591de8000000003a19171600000000685914d30000000016496137 rep/
[root@cvknode1 brick2]# setfattr -n trusted.afr.testrep-client-2 -v 0x000000000000000000000000 rep/
(3) Restart the glusterd service
[root@cvknode1 brick2]# service glusterd restart
Redirecting to /bin/systemctl restart glusterd.service
3. Split-brain
(1) Check whether the volume contains split-brain files. If any entry is reported as "Is in split-brain", delete the affected file. If you need to keep a split-brain file, go into the brick path, copy the file out, and copy it back to its original path after the storage pool has been repaired.
gluster volume heal <name> info
[root@cvknode1 /]# gluster volume heal cluster_0-vms-learningstorage-glusterfs-courseImages info
Brick 192.168.0.107:/vms/learningstorage/glusterfs/brick
Status: Connected
Number of entries: 0
Brick 192.168.0.111:/vms/learningstorage/glusterfs/brick
Status: Connected
Number of entries: 0
Brick 192.168.0.116:/vms/learningstorage/glusterfs/brick
Status: Connected
Number of entries: 0
(2) Go to the brick directory on the affected node and back up the split-brain files.
(3) Run mkdir -p /vms/temp to create the temporary directory /vms/temp.
(4) Run mount -t glusterfs IP:/VolumeName /vms/temp to mount the GlusterFS volume, where IP is the address of any healthy node in the volume and VolumeName is the volume name.
(5) Go to the /vms/temp directory and delete the split-brain files.
(6) Unmount with umount /vms/temp.
(7) Copy the backed-up files to the corresponding path under the mount point.
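Recent GlusterFS releases can also resolve split-brain directly from the CLI by picking a winning copy, which avoids the manual delete-and-copy steps above; a hedged sketch in which the file path and source brick are placeholders:
gluster volume heal <name> split-brain latest-mtime <path-to-file>
gluster volume heal <name> split-brain source-brick 192.168.0.107:/vms/learningstorage/glusterfs/brick <path-to-file>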