On Mon, Mar 21, 2011 at 5:48 PM, Sam Roberts
<vieuxtech@gmail.com> wrote:
On Mon, Mar 21, 2011 at 7:27 AM, Valerio Schiavoni wrote:
> The failures I'd like to detect are in particular the crash of the server.
> Ideally, I'd like the client to detect it as soon as possible.
It's impossible to detect without continuously sending traffic to the
server, and TCP doesn't do this when there is no outstanding data to
send. Even then, an intermediate problem in the routing (congestion,
for example) is not distinguishable from a crashed server.
> It seems a feature of TCP, keepalive messages, is meant to help in this
> scenario.
The default keepalive timeout is 2 hours, and it isn't configurable
(not portably, anyhow).
If you google for "TCP keepalive socket option" you should find
discussion of why this feature is not what you want.
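To illustrate the "not portably" caveat, here is a minimal Python sketch. SO_KEEPALIVE itself is widely available, but the timer options (TCP_KEEPIDLE, TCP_KEEPINTVL, TCP_KEEPCNT) are Linux names that other systems may not define, so the code only applies the ones the platform exposes. The function name enable_keepalive and the default values are assumptions for illustration, not recommendations.

```python
import socket

def enable_keepalive(sock, idle=60, interval=10, count=3):
    """Turn on TCP keepalive and tune its timers where the platform allows.

    Returns the names of the timer options that were actually applied.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    applied = []
    # These option names come from Linux; elsewhere they may be missing
    # or rejected, in which case the system default stays in effect.
    for name, value in (("TCP_KEEPIDLE", idle),
                        ("TCP_KEEPINTVL", interval),
                        ("TCP_KEEPCNT", count)):
        opt = getattr(socket, name, None)
        if opt is None:
            continue
        try:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
            applied.append(name)
        except OSError:
            pass  # option defined but refused by this platform
    return applied
```

Even with the timers tuned, this only probes an idle connection, so it does not replace an application-level check.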
> Is this approach the better/most-common way to detect tcp connection
> failures?
If you need fast response to peer machine crashes, you need your
application protocol to support repeatedly sending some kind of
null/keepalive message, and to implement timeouts. Note that this will
cause false positives if there is an intermediate router problem.
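A rough sketch of such an application-level check, in Python. The PING/PONG byte strings, the function name peer_alive, and the 2-second default are all hypothetical; a real protocol would define its own null message and pick timeouts to suit the network.

```python
import socket

PING = b"PING\n"  # hypothetical null/keepalive message
PONG = b"PONG\n"  # hypothetical reply the peer is expected to send

def peer_alive(sock, timeout=2.0):
    """Send one heartbeat and wait up to `timeout` seconds per read for the reply.

    Returns False on timeout, connection error, or orderly close. A False
    result can also mean trouble on the path between the hosts, not
    necessarily a crashed peer.
    """
    sock.settimeout(timeout)
    try:
        sock.sendall(PING)
        reply = b""
        while len(reply) < len(PONG):
            chunk = sock.recv(len(PONG) - len(reply))
            if not chunk:  # peer closed the connection
                return False
            reply += chunk
    except OSError:  # covers timeouts, resets, broken pipes
        return False
    return reply == PONG
```

The client would call peer_alive() on a timer whenever the connection is otherwise idle, and treat one or more consecutive failures as a dead peer. Requiring several misses before giving up trades detection speed for fewer false positives.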
Cheers,
Sam