生产环境网关模块偶发的 OutOfDirectMemoryError 错误排查起来困难且曲折,2021-02-05号也出现过此问题,起初以为是 JVM 堆内存过小 (当时是 2g) 导致,后调整到8g(2月5号调整)。但是经过上次调整后5月7号又出现此问题,于是猜测可能是由于网关模块存在内存泄露导致。
症状
报错详情
网关模块偶现 OutOfDirectMemoryError 错误,两次问题出现相隔大概 3 个月。两次发生的时机都是正在大批量接收数据 (大约 500w),TPS 60 左右,网关服务波动不大,完全能扛住,按理不应该出现此错误。
详细报错信息如下:
2021-05-06 13:44:18|WARN |[reactor-http-epoll-5]|[AbstractChannelHandlerContext.java : 311]|An exception 'io.netty.util.internal.OutOfDirectMemoryError: failed to allocate 16384 byte(s) of direct memory (used: 8568993562, max: 8589934592)' [enable DEBUG level for full stacktrace] was thrown by a user handler's exceptionCaught() method while handling the following exception:
io.netty.util.internal.OutOfDirectMemoryError: failed to allocate 16384 byte(s) of direct memory (used: 8568993562, max: 8589934592)
at io.netty.util.internal.PlatformDependent.incrementMemoryCounter(PlatformDependent.java:754)
at io.netty.util.internal.PlatformDependent.allocateDirectNoCleaner(PlatformDependent.java:709)
at io.netty.buffer.UnpooledUnsafeNoCleanerDirectByteBuf.allocateDirect(UnpooledUnsafeNoCleanerDirectByteBuf.java:30)
at io.netty.buffer.UnpooledDirectByteBuf.<init>(UnpooledDirectByteBuf.java:64)
at io.netty.buffer.UnpooledUnsafeDirectByteBuf.<init>(UnpooledUnsafeDirectByteBuf.java:41)
at io.netty.buffer.UnpooledUnsafeNoCleanerDirectByteBuf.<init>(UnpooledUnsafeNoCleanerDirectByteBuf.java:25)
at io.netty.buffer.UnsafeByteBufUtil.newUnsafeDirectByteBuf(UnsafeByteBufUtil.java:625)
at io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:359)
at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:187)
at io.netty.buffer.AbstractByteBufAlloc