Hi, we recently observed that when Redis is not accessible to our microservices, they stop responding to requests. This was quite a shock, as we have always treated Redis as a service that is not guaranteed to be available. We therefore rely heavily on short timeouts, and when Redis is unavailable we always send requests upstream to the source of truth.
However, some time ago we started using the FireAndForget flag for requests that do not need to complete (best effort). This was ideal for cache invalidation and similar use cases.
It seems that if we queue such a request while there is a network partition to the Redis instance, or while a failover is in progress, then all subsequent requests to Redis, on every thread using the same connection multiplexer, hang forever (until the Redis instance is back and accessible).
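To make the fallback concrete, here is a minimal sketch of the pattern described above. `CacheAside`, `readCache` and `readSource` are hypothetical names used only for illustration; in our services `readCache` would wrap a Redis call with short timeouts and `readSource` would call the source of truth.

```csharp
using System;
using System.Threading.Tasks;

public static class CacheAside
{
    // Hypothetical helper: readCache stands in for a Redis lookup,
    // readSource for a call to the source of truth.
    public static async Task<T> GetAsync<T>(
        Func<Task<(bool Found, T Value)>> readCache,
        Func<Task<T>> readSource)
    {
        try
        {
            var cached = await readCache();
            if (cached.Found)
            {
                return cached.Value;
            }
        }
        catch (Exception)
        {
            // Any cache failure (timeout, connection error) is treated as a miss,
            // so the request falls through to the source of truth.
        }

        return await readSource();
    }
}
```

This only helps if the cache call actually completes or fails within its timeout, which is exactly what stops happening in the scenario below.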
Check this simple test:
[Fact]
public async Task IssueReproduction()
{
    // Connection to a fake service
    var connection = await ConnectionMultiplexer.ConnectAsync(new ConfigurationOptions
    {
        EndPoints = new EndPointCollection(new List<EndPoint> { new DnsEndPoint("localhost", 1111) }),
        AsyncTimeout = 500,
        SyncTimeout = 500,
        ConnectTimeout = 500,
        ConnectRetry = 0,
        AbortOnConnectFail = false
    });

    // Timeouts work when we try to get something from Redis
    var stopwatch = Stopwatch.StartNew();
    try
    {
        await connection.GetDatabase().StringGetAsync("fake-key");
    }
    catch (RedisTimeoutException) { }
    stopwatch.Elapsed.Should().BeLessThan(TimeSpan.FromSeconds(2));

    // They also work when we try to set something
    stopwatch.Restart();
    try
    {
        await connection.GetDatabase().StringSetAsync("fake-key", "value", TimeSpan.FromHours(1), flags: CommandFlags.None);
    }
    catch (RedisTimeoutException) { }
    stopwatch.Elapsed.Should().BeLessThan(TimeSpan.FromSeconds(2));

    // They still work when we get something again
    stopwatch.Restart();
    try
    {
        await connection.GetDatabase().StringGetAsync("fake-key");
    }
    catch (RedisTimeoutException) { }
    stopwatch.Elapsed.Should().BeLessThan(TimeSpan.FromSeconds(2));

    // They work for the first fire-and-forget set
    stopwatch.Restart();
    try
    {
        await connection.GetDatabase().StringSetAsync("fake-key", "value", TimeSpan.FromHours(1), flags: CommandFlags.FireAndForget);
    }
    catch (RedisTimeoutException) { }
    stopwatch.Elapsed.Should().BeLessThan(TimeSpan.FromSeconds(2));

    // BUT after the first fire-and-forget set, timeouts are no longer respected
    // and all future requests hang forever
    await connection.GetDatabase().StringGetAsync("fake-key");
    throw new Exception("THIS WILL NOT BE THROWN!");
}
We are on the latest version of StackExchange.Redis (assembly 2.6.96).
Is it the right approach to use FireAndForget mode to speed up such invalidations/updates, when we hope they will succeed but do not want to block the parent request waiting for the Redis response? As I said, this is a best-effort attempt to set something in the cache in the background.
Is it expected that this disrupts timeouts for every subsequent request?
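In the meantime, one caller-side stopgap we could imagine is bounding every awaited Redis call ourselves, independently of the library's own timeouts. The sketch below is an assumption on our side, not a fix for the underlying behaviour: `TaskGuards.WithTimeout` is a hypothetical helper, not part of StackExchange.Redis, and the abandoned operation keeps running in the background after the guard gives up.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TaskGuards
{
    // Hypothetical helper: waits for the operation for at most `limit`,
    // then throws TimeoutException without cancelling the operation itself.
    public static async Task<T> WithTimeout<T>(Task<T> operation, TimeSpan limit)
    {
        using var cts = new CancellationTokenSource();
        var delay = Task.Delay(limit, cts.Token);

        var finished = await Task.WhenAny(operation, delay);
        if (finished != operation)
        {
            throw new TimeoutException($"Operation did not complete within {limit}.");
        }

        // Stop the timer, then propagate the result or the original exception.
        cts.Cancel();
        return await operation;
    }
}
```

It would be used like `await TaskGuards.WithTimeout(connection.GetDatabase().StringGetAsync("fake-key"), TimeSpan.FromMilliseconds(500));`. On .NET 6 and later, the built-in `Task.WaitAsync(TimeSpan)` covers the same need.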