On Jan 17, 2011, at 1:08 PM, Valerio Schiavoni wrote:

> Hello Drake,
> 
> On Mon, Jan 17, 2011 at 6:51 PM, Drake Wilson <drake@begriffli.ch> wrote:
>>> - as soon as the server crashes and the connection is somehow broken,
>>> the client must do something
>> 
>> You can't do that.  There is no way to detect that on the wire if you
>> really want "as soon as".
> 
> That is the reason why I wanted to use keep-alive messages, as they
> sounded like an out-of-the-box solution that I could rely upon.
> 
>> TCP-level keepalives are not readily configurable at the application
>> level in many common operating systems.  There is no standard socket
>> option that I know of to set the timing from that side; I've only ever
>> seen it done at the OS configuration level.
> 
> OK then. I will look up the default values used by the TCP stack on
> Linux and OS X, out of curiosity.
> 
>> I wouldn't use them if there's another choice.  If you're designing
>> the application protocol, provide a mechanism for application-level
>> keepalives instead.
> 
> I am implementing a protocol [1] that relies on TCP to detect failures
> of nodes, where 'an open TCP connection' is set up to a small set of
> nodes. What I wanted was to detect when these connections are broken,
> without any other heartbeat strategy. I guess I will fall back to
> a heartbeat approach after all.
> 
> 1 - http://docs.di.fc.ul.pt/jspui/bitstream/10455/2981/1/07-13.pdf
> 
> Thanks for the suggestions.
> valerio
> 

Keepalives don't only detect failures of nodes; they detect failures of the path to the other TCP peer.  More accurately, they're really "make deads": if you have a TCP connection that's otherwise idle (with no outstanding, unACK'ed data), why would you want to tear down the association because of a transient network failure?  I routinely have idle SSH sessions open to hosts for weeks, and I like that they survive my DSL line bouncing in the middle of the night when I don't notice or care.
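
For reference, a minimal sketch (in Python, for illustration) of turning on TCP keepalives for one socket. Only SO_KEEPALIVE is portable. The timing options used here (TCP_KEEPIDLE, TCP_KEEPINTVL, TCP_KEEPCNT) are platform extensions, so the sketch applies them only where the platform exposes them. The function name and the default timings are assumptions made for this example.

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, count=5):
    """Turn on TCP keepalive probes for `sock`.

    The timing values are illustrative defaults, not recommendations.
    """
    # Portable part: ask the TCP stack to probe an idle connection.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # Non-portable part: per-socket timing, applied only if the
    # platform's socket module defines the option.
    tuning = (
        ("TCP_KEEPIDLE", idle),       # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval),  # seconds between probes
        ("TCP_KEEPCNT", count),       # unanswered probes before giving up
    )
    for name, value in tuning:
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
```

Even with the timings tuned, these probes are answered by the remote TCP stack, so they say nothing about whether the application behind it is still doing useful work.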

If your application needs to know whether its peer application is available, then you ought to implement a mechanism that detects that, end-to-end.  You can't rely on a keepalive to do that: the peer application might be hung and not reading data from its socket, yet the connection will not be torn down by the TCP keepalive mechanism, because the remote TCP stack will happily respond, just as it will respond to zero-window probes.
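
One way such an end-to-end check could look is an application-level ping that only the peer application itself can answer. The sketch below is in Python for illustration; the PING/PONG wire format and the function names are hypothetical and not taken from any particular protocol.

```python
import socket

PING = b"PING\n"  # hypothetical wire format, assumed for this sketch
PONG = b"PONG\n"


def peer_alive(sock, timeout=2.0):
    """Send one application-level ping.

    Returns True only if the peer *application* replies with PONG
    within `timeout` seconds.
    """
    sock.settimeout(timeout)
    try:
        sock.sendall(PING)
        reply = b""
        while len(reply) < len(PONG):
            chunk = sock.recv(len(PONG) - len(reply))
            if not chunk:  # peer closed the connection
                return False
            reply += chunk
        return reply == PONG
    except OSError:  # covers timeouts, resets and broken pipes
        return False


def answer_pings(sock):
    """Peer side: reply to every PING until the connection closes."""
    buf = b""
    while True:
        data = sock.recv(64)
        if not data:
            return
        buf += data
        while PING in buf:
            buf = buf[buf.index(PING) + len(PING):]
            sock.sendall(PONG)
```

If the peer process is hung, its TCP stack still accepts and acknowledges the PING, but no PONG ever comes back, so peer_alive() returns False once the timeout expires. That is the case a TCP keepalive cannot see.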

Louis Mamakos