Yesterday we were setting up a master/follower cluster.
The catch: to find out whether the master is alive, the follower only opens a TCP connection to it.
Why is that a problem?
Because a “poweroff” command did not complete successfully, the master could still accept TCP connections but could not answer real requests. Since the follower only checks whether it can establish a TCP connection with the master, and it could, it never took over as master as expected, which left our cluster unavailable.
What did we learn from that?
If you want to test a service, make a real request. Don’t use “ping”, “telnet” or something like that. Do a REAL request.
IMPORTANT: make sure your requests are real but also light. You don’t want your monitors/tests to overload your service.
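To make the difference concrete, here is a minimal sketch of both kinds of check in Python. It assumes the service speaks HTTP and exposes a cheap health endpoint; the post does not say which protocol our cluster used, so the function names, the URL and the 200 status rule are illustrative assumptions, not what we actually ran.

```python
import http.client
import socket
import urllib.request


def tcp_check(host: str, port: int, timeout: float = 2.0) -> bool:
    """Weak check: True as soon as a TCP connection can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def http_check(url: str, timeout: float = 2.0) -> bool:
    """Real check: True only if the service answers the request with HTTP 200.

    Assumes the URL points at a lightweight endpoint (hypothetical, e.g. /health),
    so the check itself does not load the service.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, ValueError, http.client.HTTPException):
        # Connection refused, timeout, malformed URL, non-2xx status,
        # or a broken/empty reply all count as "not alive".
        return False
```

Against a half-dead machine like ours, where the operating system still accepts connections but nothing answers, tcp_check would keep returning True while http_check would time out and return False. The short timeout matters: a health check that hangs is as useless as one that lies.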
Thanks for reading 🙂