Showing content from https://github.com/StackExchange/StackExchange.Redis/issues/2664 below:

Thread pool starvation caused by sync waiting inside PhysicalBridge · Issue #2664 · StackExchange/StackExchange.Redis · GitHub

Here is the code that starts a dedicated backlog processing thread; its comments explain the decision not to use the thread pool:

    // Start the backlog processor; this is a bit unorthodox, as you would *expect* this to just
    // be Task.Run; that would work fine when healthy, but when we're falling on our face, it is
    // easy to get into a thread-pool-starvation "spiral of death" if we rely on the thread-pool
    // to unblock the thread-pool when there could be sync-over-async callers. Note that in reality,
    // the initial "enough" of the back-log processor is typically sync, which means that the thread
    // we start is actually useful, despite thinking "but that will just go async and back to the pool"
    var thread = new Thread(s => ((PhysicalBridge)s!).ProcessBacklogAsync().RedisFireAndForget())
    {
        IsBackground = true, // don't keep process alive (also: act like the thread-pool used to)
        Name = "StackExchange.Redis Backlog", // help anyone looking at thread-dumps
    };
    thread.Start(this);

However, once the await on ProcessBridgeBacklogAsync completes asynchronously, the rest of the method (including subsequent loop iterations) not only continues on a thread pool thread but also blocks that thread in a synchronous wait:

    private async Task ProcessBacklogAsync()
    {
        _backlogStatus = BacklogStatus.Starting;
        try
        {
            while (!isDisposed)
            {
                if (!_backlog.IsEmpty)
                {
                    // TODO: vNext handoff this backlog to another primary ("can handle everything") connection
                    // and remove any per-server commands. This means we need to track a bit of whether something
                    // was server-endpoint-specific in PrepareToPushMessageToBridge (was the server ref null or not)
                    await ProcessBridgeBacklogAsync().ForAwait();
                }

                // The cost of starting a new thread is high, and we can bounce in and out of the backlog a lot.
                // So instead of just exiting, keep this thread waiting for 5 seconds to see if we got another backlog item.
                _backlogStatus = BacklogStatus.SpinningDown;
                // Note this is happening *outside* the lock
                var gotMore = _backlogAutoReset.WaitOne(5000);
                if (!gotMore)
                {
                    break;
                }
            }
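The thread hop described above can be reproduced outside the library. The following is a minimal sketch, not StackExchange.Redis code: `DedicatedThreadDemo`, `WorkAsync`, and `Run` are hypothetical names, and `Task.Delay(50).ConfigureAwait(false)` stands in for `ProcessBridgeBacklogAsync().ForAwait()` on the assumption that `ForAwait()` behaves like `ConfigureAwait(false)`. It records whether the code is running on a thread pool thread before and after the first await that actually yields.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class DedicatedThreadDemo
{
    // Hypothetical stand-in for ProcessBacklogAsync.
    private static async Task<(bool before, bool after)> WorkAsync()
    {
        // Synchronous prefix: still on the thread that called us.
        bool before = Thread.CurrentThread.IsThreadPoolThread;

        // The delay is not complete yet, so the method yields here and the
        // dedicated thread returns from its delegate and exits.
        await Task.Delay(50).ConfigureAwait(false);

        // Continuation: resumed by whichever thread completed the delay.
        bool after = Thread.CurrentThread.IsThreadPoolThread;
        return (before, after);
    }

    public static (bool before, bool after) Run()
    {
        var tcs = new TaskCompletionSource<(bool, bool)>();
        var thread = new Thread(() =>
        {
            // Fire-and-forget, mirroring how the issue's code starts the async method.
            WorkAsync().ContinueWith(t => tcs.SetResult(t.Result));
        })
        {
            IsBackground = true,
            Name = "Demo dedicated thread",
        };
        thread.Start();
        return tcs.Task.GetAwaiter().GetResult();
    }

    public static void Main()
    {
        var (before, after) = Run();
        Console.WriteLine($"before await: on pool thread = {before}");
        Console.WriteLine($"after await:  on pool thread = {after}");
    }
}
```

With no SynchronizationContext installed on the dedicated thread, the second flag is expected to be true, which is why a blocking call such as `WaitOne(5000)` placed after the await ends up holding a pool thread rather than the dedicated one.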

In our case, we have several relatively large Redis clusters with tens to hundreds of nodes, and we sometimes see the majority of thread pool threads blocked inside instances of PhysicalBridge.

So we have the best of both worlds: hundreds of expensive worker thread creations and thread pool starvation 🙂
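For illustration only, here is one way a "wait up to 5 seconds for more work" step could be written without blocking a thread. This is a sketch under stated assumptions, not the library's implementation or its fix: `AsyncSignal` is a hypothetical helper built on `SemaphoreSlim`, whose `WaitAsync(int)` overload returns a `Task<bool>` that completes with false on timeout.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical auto-reset-style signal that can be awaited instead of blocked on.
public sealed class AsyncSignal
{
    // Max count of 1: repeated Set() calls collapse into a single pending signal.
    private readonly SemaphoreSlim _signal = new SemaphoreSlim(0, 1);

    public void Set()
    {
        try
        {
            _signal.Release();
        }
        catch (SemaphoreFullException)
        {
            // Already signaled; nothing more to do.
        }
    }

    // Completes with true if signaled, false if the timeout elapsed first.
    public Task<bool> WaitAsync(int millisecondsTimeout) =>
        _signal.WaitAsync(millisecondsTimeout);
}

public static class AsyncSignalDemo
{
    public static async Task<(bool first, bool second)> RunAsync()
    {
        var signal = new AsyncSignal();
        signal.Set();
        bool first = await signal.WaitAsync(5000).ConfigureAwait(false); // consumes the pending signal
        bool second = await signal.WaitAsync(100).ConfigureAwait(false); // nothing pending: times out
        return (first, second);
    }

    public static void Main()
    {
        var (first, second) = RunAsync().GetAwaiter().GetResult();
        Console.WriteLine($"first wait signaled: {first}, second wait signaled: {second}");
    }
}
```

While such a wait is pending, no thread is held, so hundreds of idle bridges would not each pin a pool thread during their spin-down window. Whether this fits the library's locking and disposal logic is a separate question that the sketch does not address.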
