Description
Search before asking
- I searched in the issues and found nothing similar.
Read release policy
- I understand that unsupported versions don't get bug fixes. I will attempt to reproduce the issue on a supported version of Pulsar client and Pulsar broker.
Version
all released versions
Minimal reproduce step
Problem description:

When the system is under load, an exception such as `Time-out elapsed while acquiring enough permits on the memory limiter to read from ledger [ledgerid], [topic], estimated read size [read size] bytes for [dispatcherMaxReadBatchSize] entries (check managedLedgerMaxReadsInFlightSizeInMB)` will happen when `managedLedgerReadEntryTimeoutSeconds` isn't set.

The current solution expects that `managedLedgerReadEntryTimeoutSeconds` has been set. In addition, the solution is inefficient, since retries happen in a tight loop. It's also possible that the ordering of reads gets mixed up, since any later call can grab the permits needed for the next read.
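The failure mode described above can be illustrated with a simplified sketch. This is not Pulsar's actual `InflightReadsLimiter` code; the class and method names here are assumptions chosen for illustration. The key points are the tight retry loop and the lack of fairness: whichever thread happens to poll at the right moment wins the permits.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of the problematic blocking-acquire pattern
// (an assumption for illustration, not Pulsar's actual implementation).
public class BlockingPermitAcquire {
    private long availablePermits;

    public BlockingPermitAcquire(long permits) {
        this.availablePermits = permits;
    }

    private synchronized boolean tryAcquire(long permits) {
        if (permits <= availablePermits) {
            availablePermits -= permits;
            return true;
        }
        return false;
    }

    public synchronized void release(long permits) {
        availablePermits += permits;
    }

    // Retries in a tight loop until the deadline. Any thread that polls at the
    // right moment wins the permits, so read ordering is not preserved, and the
    // spinning burns CPU while the system is already overloaded.
    public void acquireBlocking(long permits, long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (!tryAcquire(permits)) {
            if (System.nanoTime() >= deadline) {
                throw new IllegalStateException(
                        "Time-out elapsed while acquiring enough permits");
            }
            // tight retry loop: no queueing, no fairness
        }
    }
}
```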
What did you expect to see?
The `managedLedgerMaxReadsInFlightSizeInMB` limit should also work when `managedLedgerReadEntryTimeoutSeconds` isn't set.
What did you see instead?
Timeouts could happen when `managedLedgerReadEntryTimeoutSeconds` isn't set. In addition, "fairness" is missing for the waiting read requests, which could cause starvation-type issues when the system is overloaded.
Anything else?
`InflightReadsLimiter` should be refactored to return a `CompletableFuture` so that the logic can be handled asynchronously and reactively. There should be a queue so that the next waiting acquire call holds back later calls until a sufficient number of permits is available.
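A minimal sketch of the proposed direction, assuming a FIFO queue of waiters and a future-returning acquire (the class shape and method names here are assumptions, not Pulsar's actual API):

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;

// Hypothetical asynchronous in-flight reads limiter sketch.
// acquire() never blocks and never times out: it returns a future that is
// completed by release() in FIFO order, which gives fairness and avoids the
// tight retry loop.
public class AsyncInflightReadsLimiter {
    private static final class Waiter {
        final long permits;
        final CompletableFuture<Void> future;
        Waiter(long permits, CompletableFuture<Void> future) {
            this.permits = permits;
            this.future = future;
        }
    }

    private final long maxPermits;
    private long availablePermits;
    // FIFO queue: the oldest waiting acquire is always served first.
    private final Queue<Waiter> queue = new ArrayDeque<>();

    public AsyncInflightReadsLimiter(long maxPermits) {
        this.maxPermits = maxPermits;
        this.availablePermits = maxPermits;
    }

    // Completes immediately if no one is waiting and permits are available;
    // otherwise joins the queue. A non-empty queue blocks later callers even
    // when permits are free, so waiting reads cannot be starved or reordered.
    public synchronized CompletableFuture<Void> acquire(long permits) {
        if (queue.isEmpty() && permits <= availablePermits) {
            availablePermits -= permits;
            return CompletableFuture.completedFuture(null);
        }
        CompletableFuture<Void> future = new CompletableFuture<>();
        queue.add(new Waiter(permits, future));
        return future;
    }

    // Returns permits and hands them to queued waiters, head first.
    // A production version would complete the futures outside the lock
    // (e.g. on an executor) to avoid running callbacks while synchronized.
    public synchronized void release(long permits) {
        availablePermits = Math.min(maxPermits, availablePermits + permits);
        while (!queue.isEmpty() && queue.peek().permits <= availablePermits) {
            Waiter head = queue.poll();
            availablePermits -= head.permits;
            head.future.complete(null);
        }
    }

    public synchronized long availablePermits() {
        return availablePermits;
    }
}
```

With this shape, a caller chains the actual ledger read onto the returned future (for example `limiter.acquire(size).thenRun(() -> readFromLedger(...))`) and calls `release(size)` when the read completes, instead of blocking a thread until a timeout expires.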
Are you willing to submit a PR?
- I'm willing to submit a PR!