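Symptoms
One node in the MongoDB replica set is in an abnormal state (it cannot serve reads). Production traffic is not affected for now, because the other two nodes are still serving reads and writes.
Investigation
Key error log entries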
2023-02-27T00:07:56.510+0800 I ACCESS [conn119] SCRAM-SHA-1 authentication failed for __system on local from client 10.0.3.3 ;
AuthenticationFailed It is not possible to authenticate as the __system user on servers started without a --keyFile parameter
2023-02-27T00:07:57.845+0800 I REPL [ReplicationExecutor] Error in heartbeat request to 10.x.x.x:3708;
Unauthorized not authorized on admin to execute command { replSetHeartbeat: "xxxxx", pv: 1, v: 5, from: "10.x.x.x:3708", fromId: 2, checkEmpty: false }
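The replica set members hit an authentication failure during heartbeat checks. Checking the mongod processes on the other two nodes showed that both were started with the --auth parameter and only this node was not, even though the local mongodb.conf file does contain the auth-related parameters and the keyfile reference.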
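The process check described above can be sketched in Python. This is only a minimal illustration, assuming a Linux host where /proc/<pid>/cmdline is readable; the helper names read_cmdline and auth_flags are hypothetical and are not part of MongoDB or any of its tooling.

```python
from pathlib import Path


def read_cmdline(pid):
    """Return the argv of a running process, read from /proc (Linux only)."""
    raw = Path(f"/proc/{pid}/cmdline").read_bytes()
    return [arg.decode() for arg in raw.split(b"\0") if arg]


def auth_flags(argv):
    """Report which auth-related mongod options appear on a command line.

    Only the command line is inspected. Options set inside a config file
    are not visible here, so "config" merely records whether a config
    file was passed at all (-f or --config).
    """
    # Normalise "--keyFile=/path" to "--keyFile" before comparing.
    names = {arg.split("=", 1)[0] for arg in argv}
    return {
        "auth": "--auth" in names,
        "keyFile": "--keyFile" in names,
        "config": bool(names & {"-f", "--config"}),
    }
```

Calling auth_flags(read_cmdline(pid)) on each node, with pid being that node's mongod process ID, makes it easy to spot a member that was started with neither the flags nor a config file.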
Resolution
Specify --auth and --keyFile manually
Restart mongod
The direct start failed and the service exited abnormally. The relevant log entry follows:
2023-02-27T12:03:45.995+0800 I ACCESS permissions on /home/xxx/mongodb-3.0.5/_package/keyfile are too open
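The permissions on the keyfile were too open. After setting them to 600, the node started normally, and the replica set's health returned to normal as well.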
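The permission fix above can be checked and applied with a short Python sketch. It assumes a Unix-like system and that mongod rejects a keyfile when any group or other permission bit is set, which matches the "too open" message in the log. The function names keyfile_too_open and restrict_keyfile are hypothetical helpers introduced for illustration.

```python
import os
import stat


def keyfile_too_open(path):
    """Return True if group or other have any permission bit set on path."""
    mode = os.stat(path).st_mode
    return bool(mode & (stat.S_IRWXG | stat.S_IRWXO))


def restrict_keyfile(path):
    """Equivalent of `chmod 600 path`: read and write for the owner only."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
```

Running keyfile_too_open on the keyfile path before starting mongod, and restrict_keyfile when it returns True, would catch this failure before the service exits.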
References
https://stackoverflow.com/questions/14789622/mongodb-keyfile-too-open-permissions
http://pe-kay.blogspot.com/2016/02/update-existing-mongodb-replica-set-to.html