
On Thu, Sep 11, 2014 at 03:38:23PM -0400, Sean Conner wrote:
> It was thus said that the Great William Ahern once stated:
> > On Thu, Sep 11, 2014 at 09:50:29AM -0400, Sean Conner wrote:
> > > It was thus said that the Great Andrew Starks once stated:
> > <snip>
> > > > Are there interesting differences between how you think about / deal with
> > > > error handling in C/C++ vs. Lua?  Accepting that the mechanisms are
> > > > different, do you do more "try to recover" code in C?
> > > 
> > >   I don't program in C++, so I don't use exceptions.  In regard to error
> > > handling, I tend to handle errors in Lua like I do in C.  And I don't really
> > > do "recover" code in C.  [2][3]
> > > 
> > 
> > What do you mean by recover? Surely you handle errors like EMFILE and ENOMEM
> > without exiting the process.
> 
>   How would *you* handle ENOMEM?  

The same way I handle EMFILE. Unwind to a steady state.

If I have 5,000 open connections, and connection 5,001 hits ENOMEM (maybe
the client sent me a huge JSON file that, after parsing, exploded to a
gigabyte), why would I sabotage the other 5,000 connections?

FWIW, I also disable swap on my servers in addition to disabling overcommit.
I'm not going to leave quality of service to chance. In many cases excessive
swapping is even worse than the process aborting. If I don't have the
resources to perform a particular task, why would I let that kill every
other task?
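
Concretely, on Linux that means something like the following (the ratio is a
per-box judgment call):

    # /etc/sysctl.conf -- strict commit accounting instead of overcommit
    vm.overcommit_memory = 2
    vm.overcommit_ratio = 100

plus swapoff -a and removing the swap entries from /etc/fstab.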

The classic example is Adobe Photoshop. Can you imagine how upset artists
would be if Photoshop killed itself whenever it ran out of memory? If you
had 20 pictures open and were applying some effects, and on the 15th image
it ran out of memory and destroyed all your work.... There would be no
excuse for that. Any developer would call that a bug, not a feature.

>   I wrote a bit on this a few years ago:
> 
> 	http://boston.conman.org/2009/12/01.2

Your example is something of a strawman, because even the original is broken
in several ways. For one thing, it leaks the listen object. So you weren't
really trying to write useful code in the first place.

Also, you're using a global epoll descriptor. I point that out because, yes,
in many cases it can make sense to bail on recoverable errors (e.g.
command-line programs, forked server models where you have one process per
session, etc), but rarely in _library_ code, and library code shouldn't be
using global objects. Context matters. I only object to the attitude that
applications should bail on ENOMEM no matter the context. It takes skill to
deal with ENOMEM efficiently, and that skill will never be acquired if one
takes the attitude from the outset that it's not practical to do so. Nobody
ever said programming was easy--dealing with ENOMEM sometimes requires
patterns you wouldn't otherwise use, but eventually you grow to appreciate
them.

In any event, your example represents the worst case. Even in a large
application, code like that would most likely be among the most complex. As
I said before, using RAII patterns you confine and isolate logic like that
to only a handful of places. Then if you strive to write failure-free
interfaces as much as possible--such as using data structures that don't use
dynamic allocation--you reduce even further the amount of error checking
sprinkled around your code.
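
In C that pattern looks something like this (a sketch; the session type and
its members are hypothetical). Every construction failure funnels into one
label, and the destructor is written to tolerate partially built objects:

    #include <errno.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/socket.h>

    struct session {
        int fd;
        char *buf;
    };

    static void session_close(struct session *s)
    {
        if (s == NULL)
            return; /* safe on objects that never finished construction */
        free(s->buf);
        if (s->fd != -1)
            close(s->fd);
        free(s);
    }

    static int session_open(struct session **sp)
    {
        struct session *s;
        int error;

        if ((s = calloc(1, sizeof *s)) == NULL)
            return errno;
        s->fd = -1; /* so session_close() knows there's nothing to close */

        if ((s->fd = socket(AF_INET, SOCK_STREAM, 0)) == -1) {
            error = errno;
            goto fail;
        }
        if ((s->buf = malloc(4096)) == NULL) {
            error = errno;
            goto fail;
        }

        *sp = s;
        return 0;
    fail:
        session_close(s); /* one unwind path for every failure */
        return error;
    }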

>   And depending upon the context, ENOMEM may be a case 3 (per the post) or
> case 4.  I generally just stop processing if I get ENOMEM since I usually
> have no idea how to handle the situation.

How do you handle EMFILE?
 
>   Then again, I rarely encounter ENOMEM since Linux does overcommit by
> default, so I'll see a SIGSEGV long before ENOMEM 8-P

Even on a Linux system with overcommit enabled, the process might still have
a per-process or per-user memory limit, or per-process mmap limit (modern
allocators use mmap extensively).
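
Such limits are also handy for testing unwind paths (a sketch; the numbers
are arbitrary). With an address-space cap, malloc fails with ENOMEM no
matter the overcommit setting:

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/resource.h>

    int main(void)
    {
        struct rlimit rl = { 64UL << 20, 64UL << 20 }; /* 64MB cap */
        char *p;

        if (setrlimit(RLIMIT_AS, &rl) != 0) {
            perror("setrlimit");
            return EXIT_FAILURE;
        }

        if ((p = malloc(128UL << 20)) == NULL) /* bigger than the cap */
            perror("malloc"); /* "Cannot allocate memory" */

        free(p);
        return 0;
    }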

> > In C it helps if you use failure-free data structures as much as possible,
> > which is why I never use list or tree implementations that do dynamic node
> > allocation. Otherwise, you need to maintain additional state and write
> > additional logic to handle partial rewinding of application state.
> 
>   So what do you do in C++?  Is the std::vector (however it's spelled, I'm
> not a C++ programmer) failure-free?  Or the other default data structures
> you get in C++ (or Boost)?  

1) I don't use C++. If I want lots of bells and whistles I tend to prefer
   languages like Lua. The cost/benefit is better, IMO. In any event,
   dealing with ENOMEM can sometimes be easier in C++ because of
   automatically invoked destructors.

2) Sometimes I choose to use a red-black tree rather than an array. For
   example, in my JSON C library with path-based querying, I use trees to
   represent arrays. Each JSON object already has the tree node as a member,
   which means once I allocate the JSON object structure, I know I won't
   need any more dynamic allocation. I also allocate string nodes with one
   allocation. Using such simple strategies I do at least 3x fewer
   allocations than typical, naive JSON libraries, which means it's 3x
   easier for me to handle the ENOMEM case. And not coincidentally my JSON
   library is faster than most, even though it's much more sophisticated
   from a feature perspective, and I never really invested much effort into
   performance. (A sketch of the layout follows this list.)

3) I never said it was always possible to avoid dynamic allocation. I use
   quite a lot of dynamic allocation--for example, growable ring buffers for
   opaque byte data. The goal is to minimize and simplify. There's a huge
   landscape of options between handling ENOMEM in a naive manner and
   bailing on ENOMEM. (A ring buffer sketch also follows the list.)

4) Another strategy, along the lines of RAII, is to pre-allocate slots. For
   example, in order to use epoll effectively you want to minimize as much
   as possible adding and removing events from the kernel queue (this is why
   Zed Shaw's experiment comparing epoll with poll was flawed, and why he's
   unaware that his Mongrel web server could scale better than it does--I
   was recently asked to investigate Mongrel scaling issues). Rather than
   using callbacks when closing a descriptor (because I personally hate
   callbacks), I use a strategy of delayed closure of my descriptors---I
   wait to close until the caller has notified me that the event loop has
   been updated and the old descriptor removed. I must keep old descriptors
   on a pending list, but it's too late to allocate the pending slot at the
   moment I need to insert it: if that allocation failed with ENOMEM, I'd
   have no choice but to abort the process. So before creating a descriptor
   I make sure there's a free slot available in case I need to discard it.
   As a bonus, this strategy puts all that resource acquisition logic in one
   place, rather than creating so-called spaghetti code everywhere. (The
   last sketch below shows the idea.)
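
Regarding (2), the layout is roughly this (hypothetical declarations, not
lifted from my actual library). The tree linkage is embedded in the value
and the string payload shares its allocation, so there's exactly one malloc
per node:

    #include <stddef.h>
    #include <stdlib.h>
    #include <string.h>

    struct rbnode {
        struct rbnode *left, *right, *parent;
        int color;
    };

    struct value {
        struct rbnode rb;  /* embedded: no separate node allocation */
        size_t length;
        char string[];     /* payload lives in the same allocation */
    };

    static struct value *value_newstring(const char *src, size_t len)
    {
        struct value *v;

        if ((v = malloc(offsetof(struct value, string) + len + 1)) == NULL)
            return NULL; /* the one and only failure point for this node */

        memset(&v->rb, 0, sizeof v->rb);
        v->length = len;
        memcpy(v->string, src, len);
        v->string[len] = '\0';

        return v;
    }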
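
For (3), growing the ring buffer is the only allocating operation, and it
reports failure instead of hiding it (again a sketch):

    #include <errno.h>
    #include <stdlib.h>
    #include <string.h>

    struct ring {
        unsigned char *data;
        size_t size; /* capacity in bytes */
        size_t head; /* read offset */
        size_t used; /* bytes currently stored */
    };

    /* Grow toward `need' bytes of capacity. On failure the old buffer is
     * untouched and the caller gets ENOMEM back--no abort, no torn state. */
    static int ring_grow(struct ring *r, size_t need)
    {
        size_t newsize = (r->size > 0)? r->size : 64;
        unsigned char *p;

        while (newsize < need)
            newsize *= 2;

        if ((p = realloc(r->data, newsize)) == NULL)
            return ENOMEM;

        /* un-wrap any region that straddled the old end of the buffer */
        if (newsize > r->size && r->head + r->used > r->size)
            memcpy(&p[r->size], p, r->head + r->used - r->size);

        r->data = p;
        r->size = newsize;

        return 0;
    }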
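
And (4) in miniature (hypothetical and stripped of the epoll bookkeeping):
the pending-list slot is paid for before the descriptor exists, so the
deferred close can never fail:

    #include <errno.h>
    #include <stdlib.h>
    #include <unistd.h>

    struct slot {
        struct slot *next;
        int fd;
    };

    struct loop {
        struct slot *freelist; /* pre-allocated, one per live descriptor */
        struct slot *pending;  /* descriptors awaiting removal from epoll */
    };

    /* Reserve the slot *before* opening the descriptor; this is the only
     * step that can fail, and at this point there's nothing to unwind. */
    static int loop_reserve(struct loop *lp)
    {
        struct slot *s;

        if ((s = malloc(sizeof *s)) == NULL)
            return ENOMEM;
        s->next = lp->freelist;
        lp->freelist = s;

        return 0;
    }

    /* Deferring a close cannot fail: the slot already exists. */
    static void loop_defer(struct loop *lp, int fd)
    {
        struct slot *s = lp->freelist; /* non-NULL by construction */

        lp->freelist = s->next;
        s->fd = fd;
        s->next = lp->pending;
        lp->pending = s;
    }

    /* Called once the caller says the event loop dropped the old fd. */
    static void loop_reap(struct loop *lp)
    {
        while (lp->pending != NULL) {
            struct slot *s = lp->pending;

            lp->pending = s->next;
            close(s->fd);
            free(s);
        }
    }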