A RetroSearch Logo

Home - News ( United States | United Kingdom | Italy | Germany ) - Football scores

Search Query:

Showing content from https://github.com/StackExchange/StackExchange.Redis/issues/2265 below:

Endpoints marked as not selectable even though Connection is established on AWS ElasticCache clustered mode enabled · Issue #2265 · StackExchange/StackExchange.Redis · GitHub

We are using ElasticCache on AWS with clustered mode enabled with two shards and two nodes in each (primary and replica).
When calling
connection = ConnectionMultiplexer.Connect(configurationOptions, Console.Out);
in many cases endpoints are marked as not selectable (DidNotRespond flag) in second iteration when endpoints are discovered from cluster even though their status is always ConnectedEstablished.
The heartbeat timer will clear that flag, but we need to do a dirty workaround


var attemptsLeft = 5;
var delay = TimeSpan.FromSeconds(1);
do
{
   var status = connection.GetStatus();

   if (!status.Contains("not in use"))
     break;

logger.LogWarning("Not all servers  selectable. Waiting for {Delay} for connection. Attempts {AttemptsLeft} left", delay, attemptsLeft);
Thread.Sleep(delay);
} while (attemptsLeft-- > 0);

to wait for servers to be selectable. We need primary endpoint to be connected and selectable as soon as ConnectionMultiplexer is created otherwise we will get errors like "Command cannot be issued to a replica: "

StackExchange.Redis.RedisConnectionException: ProtocolFailure on redishangfire.barnehage.testaws.visma.com:6379/Interactive, Writing/ReadAsync, last: HMSET, origin: WriteMessageToServerInsideWriteLock, outstanding: 1, last-read: 0s ago, last-write: 0s ago, keep-alive: 60s, state: ConnectedEstablished, mgr: 10 of 10 available, in: 0, last-heartbeat: never, last-mbeat: 5s ago, global: 5s ago, v: 2.6.66.47313
StackExchange.Redis.RedisCommandException: Command cannot be issued to a replica: HMSET {hangfire}:recurring-job:IDocumentJobService.CreateDefaultTemplatesAsync
at StackExchange.Redis.PhysicalBridge.WriteMessageToServerInsideWriteLock(PhysicalConnection connection, Message message) in /_/src/StackExchange.Redis/PhysicalBridge.cs:line 1483

Waiting less than a second clears that flag and everything start to work.
Is there any reason behind to mark endpoints as not selectable when connection is already established?

else
{
   server.SetUnselectable(UnselectableFlags.DidNotRespond);
   log?.WriteLine($"  {Format.ToString(server)}: Did not respond");
}
  case (int)State.ConnectedEstablishing:
  case (int)State.ConnectedEstablished:
      var tmp = physical;
      if (tmp != null)
      {
          if (state == (int)State.ConnectedEstablished)
          {
             Interlocked.Exchange(ref connectTimeoutRetryCount, 0);
             tmp.BridgeCouldBeNull?.ServerEndPoint?.ClearUnselectable(UnselectableFlags.DidNotRespond);
          }

Interesting part that increasing connectTimeout does not help at all.

16:07:23.6432: Connecting (sync) on .NET 6.0.9 (StackExchange.Redis: v2.6.66.47313)
16:07:23.6599: redishangfire.my.domain.com:6379,password=*****,ssl=True,sslProtocols=Tls,abortConnect=False
16:07:23.6617: redishangfire.my.domain.com:6379/Interactive: Connecting...
16:07:23.6694: redishangfire.my.domain.com:6379: BeginConnectAsync
16:07:23.6712: 1 unique nodes specified (with tiebreaker)
16:07:23.6714: redishangfire.my.domain.com:6379: OnConnectedAsync init (State=Connecting)
16:07:23.6717: Allowing 1 endpoint(s) 00:00:05 to respond...
16:07:23.6723: Awaiting 1 available task completion(s) for 5000ms, IOCP: (Busy=0,Free=1000,Min=2,Max=1000), WORKER: (Busy=2,Free=32765,Min=2,Max=32767), POOL: (Threads=4,QueuedItems=0,CompletedItems=332)
16:07:23.6782: Configuring TLS
16:07:24.0147: TLS connection established successfully using protocol: Tls
16:07:24.0546: redishangfire.my.domain.com:6379/Interactive: Connected 
16:07:24.0553: redishangfire.my.domain.com:6379: Server handshake
16:07:24.0556: redishangfire.my.domain.com:6379: Authenticating (password)
16:07:24.0645: redishangfire.my.domain.com:6379: Setting client name: ip-10-48-101-89(SE.Redis-v2.6.66.47313)
16:07:24.0662: redishangfire.my.domain.com:6379: Auto-configuring...
16:07:24.0673: redishangfire.my.domain.com:6379: Requesting tie-break (Key=__Booksleeve_TieBreak)...
16:07:24.0676: redishangfire.my.domain.com:6379/Interactive: Writing: GET __Booksleeve_TieBreak
16:07:24.0679: redishangfire.my.domain.com:6379: Sending critical tracer (handshake): ECHO
16:07:24.0679: redishangfire.my.domain.com:6379/Interactive: Writing: ECHO
16:07:24.0681: redishangfire.my.domain.com:6379: Flushing outbound buffer
16:07:24.0681: redishangfire.my.domain.com:6379: OnEstablishingAsync complete
16:07:24.0682: redishangfire.my.domain.com:6379: Starting read
16:07:24.0710: redishangfire.my.domain.com:6379: Auto-configured (INFO) role: replica
16:07:24.0713: redishangfire.my.domain.com:6379: Auto-configured (INFO) version: 6.2.6
16:07:24.0718: redishangfire.my.domain.com:6379: Auto-configured (INFO) server-type: cluster
16:07:24.0756: Response from redishangfire.my.domain.com:6379/Interactive / GET __Booksleeve_TieBreak: Error: MOVED 6202 rec1234abcdefgh-0001-001.rec1234abcdefgh.620ymr.euw1.cache.amazonaws.com:6379
16:07:24.0758: Response from redishangfire.my.domain.com:6379/Interactive / ECHO: BulkString: 16 bytes
16:07:24.0759: redishangfire.my.domain.com:6379: OnConnectedAsync completed (From command: ECHO)
16:07:24.0769: All 1 available tasks completed cleanly, IOCP: (Busy=0,Free=1000,Min=2,Max=1000), WORKER: (Busy=6,Free=32761,Min=2,Max=32767), POOL: (Threads=8,QueuedItems=1,CompletedItems=1016)
16:07:24.0774: Endpoint summary:
16:07:24.0775:   redishangfire.my.domain.com:6379: Endpoint is ConnectedEstablished
16:07:24.0775:   redishangfire.my.domain.com:6379: Returned with success as Cluster replica (Source: From command: ECHO)
16:07:24.2786: rec1234abcdefgh-0002-001.rec1234abcdefgh.620ymr.euw1.cache.amazonaws.com:6379: OnConnectedAsync init (State=Connecting)
16:07:24.2788: rec1234abcdefgh-0001-001.rec1234abcdefgh.620ymr.euw1.cache.amazonaws.com:6379: OnConnectedAsync init (State=ConnectedEstablishing)
16:07:24.2789: rec1234abcdefgh-0002-002.rec1234abcdefgh.620ymr.euw1.cache.amazonaws.com:6379: OnConnectedAsync init (State=Connecting)
16:07:24.2790: rec1234abcdefgh-0001-002.rec1234abcdefgh.620ymr.euw1.cache.amazonaws.com:6379: OnConnectedAsync init (State=Connecting)
16:07:24.2791: Allowing 4 endpoint(s) 00:00:04.3930000 to respond...
16:07:24.2791: Awaiting 4 available task completion(s) for 4393ms, IOCP: (Busy=0,Free=1000,Min=2,Max=1000), WORKER: (Busy=3,Free=32764,Min=2,Max=32767), POOL: (Threads=8,QueuedItems=3,CompletedItems=1026)
16:07:28.6746: Not all available tasks completed cleanly (from ReconfigureAsync#1284, timeout 4393ms), IOCP: (Busy=0,Free=1000,Min=2,Max=1000), WORKER: (Busy=2,Free=32765,Min=2,Max=32767), POOL: (Threads=8,QueuedItems=0,CompletedItems=1079)
16:07:28.6754:   Server[0] (rec1234abcdefgh-0002-001.rec1234abcdefgh.620ymr.euw1.cache.amazonaws.com:6379) Status: WaitingForActivation (inst: 11, qs: 0, in: 0, qu: 11, aw: False, in-pipe: -1, out-pipe: -1, bw: Inactive, rs: ReadAsync. ws: Idle)
16:07:28.6755:   Server[1] (rec1234abcdefgh-0001-001.rec1234abcdefgh.620ymr.euw1.cache.amazonaws.com:6379) Status: WaitingForActivation (inst: 11, qs: 0, in: 0, qu: 11, aw: False, in-pipe: -1, out-pipe: -1, bw: Inactive, rs: ReadAsync. ws: Idle)
16:07:28.6755:   Server[2] (rec1234abcdefgh-0002-002.rec1234abcdefgh.620ymr.euw1.cache.amazonaws.com:6379) Status: WaitingForActivation (inst: 12, qs: 0, in: 0, qu: 12, aw: False, in-pipe: -1, out-pipe: -1, bw: Inactive, rs: ReadAsync. ws: Idle)
16:07:28.6755:   Server[3] (rec1234abcdefgh-0001-002.rec1234abcdefgh.620ymr.euw1.cache.amazonaws.com:6379) Status: WaitingForActivation (inst: 12, qs: 0, in: 0, qu: 12, aw: False, in-pipe: -1, out-pipe: -1, bw: Inactive, rs: ReadAsync. ws: Idle)
16:07:28.6755: Endpoint summary:
16:07:28.6755:   rec1234abcdefgh-0002-001.rec1234abcdefgh.620ymr.euw1.cache.amazonaws.com:6379: Endpoint is ConnectedEstablished
16:07:28.6756:   rec1234abcdefgh-0001-001.rec1234abcdefgh.620ymr.euw1.cache.amazonaws.com:6379: Endpoint is ConnectedEstablished
16:07:28.6756:   rec1234abcdefgh-0002-002.rec1234abcdefgh.620ymr.euw1.cache.amazonaws.com:6379: Endpoint is ConnectedEstablished
16:07:28.6756:   rec1234abcdefgh-0001-002.rec1234abcdefgh.620ymr.euw1.cache.amazonaws.com:6379: Endpoint is ConnectedEstablished
16:07:28.6756:   rec1234abcdefgh-0002-001.rec1234abcdefgh.620ymr.euw1.cache.amazonaws.com:6379: Did not respond
16:07:28.6756:   rec1234abcdefgh-0001-001.rec1234abcdefgh.620ymr.euw1.cache.amazonaws.com:6379: Did not respond
16:07:28.6756:   rec1234abcdefgh-0002-002.rec1234abcdefgh.620ymr.euw1.cache.amazonaws.com:6379: Did not respond
16:07:28.6756:   rec1234abcdefgh-0001-002.rec1234abcdefgh.620ymr.euw1.cache.amazonaws.com:6379: Did not respond
16:07:28.6756: Cluster: 16384 of 16384 slots covered
16:07:28.6757: Endpoint Summary:
16:07:28.6757:   redishangfire.my.domain.com:6379: Cluster v6.2.6, replica; keep-alive: 00:01:00; int: ConnectedEstablished; sub: ConnectedEstablished, 1 active
16:07:28.6760:   redishangfire.my.domain.com:6379: int ops=13, qu=0, qs=0, qc=0, wr=0, socks=1, non-pref=1; sub ops=4, qu=0, qs=0, qc=0, wr=0, subs=1, socks=1
16:07:28.6786:   redishangfire.my.domain.com:6379: Circular op-count snapshot; int: 0+13=13 (1.30 ops/s; spans 10s); sub: 0+4=4 (0.40 ops/s; spans 10s)
16:07:28.6786:   rec1234abcdefgh-0002-001.rec1234abcdefgh.620ymr.euw1.cache.amazonaws.com:6379: Cluster v6.2.6, primary; keep-alive: 00:01:00; int: ConnectedEstablished; sub: n/a; not in use: DidNotRespond
16:07:28.6787:   rec1234abcdefgh-0002-001.rec1234abcdefgh.620ymr.euw1.cache.amazonaws.com:6379: int ops=11, qu=0, qs=0, qc=0, wr=0, socks=1; sub ops=0, qu=0, qs=0, qc=0, wr=0
16:07:28.6787:   rec1234abcdefgh-0002-001.rec1234abcdefgh.620ymr.euw1.cache.amazonaws.com:6379: Circular op-count snapshot; int: 0+11=11 (1.10 ops/s; spans 10s); sub:
16:07:28.6787:   rec1234abcdefgh-0001-001.rec1234abcdefgh.620ymr.euw1.cache.amazonaws.com:6379: Cluster v6.2.6, primary; keep-alive: 00:01:00; int: ConnectedEstablished; sub: n/a; not in use: DidNotRespond
16:07:28.6788:   rec1234abcdefgh-0001-001.rec1234abcdefgh.620ymr.euw1.cache.amazonaws.com:6379: int ops=11, qu=0, qs=0, qc=0, wr=0, socks=1; sub ops=0, qu=0, qs=0, qc=0, wr=0
16:07:28.6788:   rec1234abcdefgh-0001-001.rec1234abcdefgh.620ymr.euw1.cache.amazonaws.com:6379: Circular op-count snapshot; int: 0+11=11 (1.10 ops/s; spans 10s); sub:
16:07:28.6788:   rec1234abcdefgh-0002-002.rec1234abcdefgh.620ymr.euw1.cache.amazonaws.com:6379: Cluster v6.2.6, replica; keep-alive: 00:01:00; int: ConnectedEstablished; sub: n/a; not in use: DidNotRespond
16:07:28.6788:   rec1234abcdefgh-0002-002.rec1234abcdefgh.620ymr.euw1.cache.amazonaws.com:6379: int ops=12, qu=0, qs=0, qc=0, wr=0, socks=1; sub ops=0, qu=0, qs=0, qc=0, wr=0
16:07:28.6788:   rec1234abcdefgh-0002-002.rec1234abcdefgh.620ymr.euw1.cache.amazonaws.com:6379: Circular op-count snapshot; int: 0+12=12 (1.20 ops/s; spans 10s); sub:
16:07:28.6788:   rec1234abcdefgh-0001-002.rec1234abcdefgh.620ymr.euw1.cache.amazonaws.com:6379: Cluster v6.2.6, replica; keep-alive: 00:01:00; int: ConnectedEstablished; sub: n/a; not in use: DidNotRespond
16:07:28.6788:   rec1234abcdefgh-0001-002.rec1234abcdefgh.620ymr.euw1.cache.amazonaws.com:6379: int ops=12, qu=0, qs=0, qc=0, wr=0, socks=1; sub ops=0, qu=0, qs=0, qc=0, wr=0
16:07:28.6789:   rec1234abcdefgh-0001-002.rec1234abcdefgh.620ymr.euw1.cache.amazonaws.com:6379: Circular op-count snapshot; int: 0+12=12 (1.20 ops/s; spans 10s); sub:
16:07:28.6789: Sync timeouts: 0; async timeouts: 0; fire and forget: 0; last heartbeat: -1s ago
16:07:28.6789: Starting heartbeat...
16:07:28.6792: Total connect time: 5,036 ms        

RetroSearch is an open source project built by @garambo | Open a GitHub Issue

Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo

HTML: 3.2 | Encoding: UTF-8 | Version: 0.7.4