On Tue, Sep 23, 2014 at 03:58:13PM -0300, Thiago L. wrote:
> On 23/09/14 03:54 PM, William Ahern wrote:
<snip>
> >For one thing, that the library user wants to handle events through a
> >callback (push rather than pull). Depending on how the object acquires
> >input, that the event dispatcher will loop independently rather than being
> >controlled in a step-by-step fashion by the library user.
> >
> >Google "push pull parsing". It's generally considered that pull parsers are
> >much easier to integrate into applications. For example,
> >
> >	http://docs.oracle.com/cd/E19316-01/819-3669/bnbdy/index.html
> >
> So just use raw sockets...?

All my implementations for things like HTTP, SMTP, and even JSON (the
lexer), whether in C, Lua, or Perl, have these two methods:

  parser:parse(data) --> parses chunk of input octet stream
  parser:get() --> returns next buffered event object

If I also support composing a stream, they'll have these as well:

  composer:put(event) --> composes event object onto octet stream buffer
  composer:compose() --> returns next chunk of octet stream from buffer
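
A minimal sketch of that interface in Lua, assuming a toy protocol of
CRLF-terminated lines. The constructors new_parser and new_composer and the
event fields (type, text) are made up for illustration; they aren't taken
from any real library.

```lua
-- Hypothetical pull parser: parse() only buffers, get() only hands out events.
local Parser = {}
Parser.__index = Parser

local function new_parser()
  return setmetatable({ pending = "", events = {} }, Parser)
end

-- Feed one chunk of the octet stream. Every complete CRLF-terminated line
-- becomes a buffered event; a trailing partial line waits for more input.
function Parser:parse(data)
  self.pending = self.pending .. data
  while true do
    local s, e = self.pending:find("\r\n", 1, true)
    if not s then break end
    table.insert(self.events, { type = "line", text = self.pending:sub(1, s - 1) })
    self.pending = self.pending:sub(e + 1)
  end
end

-- Return the next buffered event, or nil if more input is needed.
function Parser:get()
  return table.remove(self.events, 1)
end

-- Hypothetical composer: the mirror image of the parser.
local Composer = {}
Composer.__index = Composer

local function new_composer()
  return setmetatable({ chunks = {} }, Composer)
end

-- Serialize one event onto the output buffer.
function Composer:put(event)
  table.insert(self.chunks, event.text .. "\r\n")
end

-- Return the next chunk of output, or nil if the buffer is empty.
function Composer:compose()
  return table.remove(self.chunks, 1)
end

-- The caller decides when to feed input and when to pull events.
local p = new_parser()
p:parse("HELO exam")               -- partial line: nothing to pull yet
assert(p:get() == nil)
p:parse("ple.org\r\nQUIT\r\n")
assert(p:get().text == "HELO example.org")
assert(p:get().text == "QUIT")
```

Neither object touches a socket; the caller moves bytes in and out however
it likes.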

They don't do any I/O internally, although I'll usually also add a more
convenient, optional, higher-level API which handles socket I/O (e.g. the
get method is overloaded with one which first does the socket read and
parse operations). But with the low-level API, I or somebody else can use any
other socket library. It also lets me easily integrate the library into
an application regardless of whether it's threaded or non-blocking,
callback-based or stateful.
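
The optional higher-level layer could look roughly like the sketch below.
It assumes only that the low-level object has parse and get methods; wrap,
the read function, and the toy newline parser are hypothetical stand-ins
(read would be whatever your socket library gives you).

```lua
-- wrap(parser, read): `read` is any function returning the next chunk of
-- input, or nil at end of stream. The parser itself never does I/O.
local function wrap(parser, read)
  local session = {}
  -- Higher-level get(): pull an event, reading and parsing more input
  -- only when the parser has nothing buffered.
  function session:get()
    while true do
      local ev = parser:get()
      if ev then return ev end
      local chunk = read()
      if not chunk then return nil end
      parser:parse(chunk)
    end
  end
  return session
end

-- Toy low-level parser for the demo: one event per newline-terminated line.
local toy = { buf = "", queue = {} }
function toy:parse(data)
  self.buf = (self.buf .. data):gsub("([^\n]*)\n", function(line)
    table.insert(self.queue, line)
    return ""
  end)
end
function toy:get()
  return table.remove(self.queue, 1)
end

-- Stand-in for a socket: hands out pre-split chunks, then nil.
local chunks = { "fir", "st\nsec", "ond\n" }
local function read()
  return table.remove(chunks, 1)
end

local session = wrap(toy, read)
assert(session:get() == "first")
assert(session:get() == "second")
assert(session:get() == nil)
```

Swapping in a different socket library, or a non-blocking event loop, only
means supplying a different read function or calling parse and get directly.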

More importantly, it also makes it easier to integrate the library into
another library. So, for example, I can tie my low-level HTTP parser into a
higher-level HTTP/MIME session library, or some other HTTP-based protocol
module. Source/sink callbacks are superficially nice, but when you try
combining two separate pieces of software written by different developers
(which may be past-you and future-you) which both use source/sink callbacks,
it becomes quite ugly combining them at the boundary.[1]

Also, it makes it _much_ easier to do regression testing! Running regression
tests on libraries which do socket I/O is a pain in the butt. Even if they
offer source/sink callbacks, it's still pretty ugly compared to something
like

	parser:parse("GET / HTTP/1.1\r\nContent-Type: text/plain\r\n\r\n")
	assert(parser:get().protocol == 1.1)
	assert(parser:get().subtype == "plain")

Doing things in a way which puts the caller 100% in charge of control flow
is admittedly more tedious to implement than simply exposing source/sink
callbacks at the interface. But it puts all the complexity where it belongs:
in a black box. The logic does get more complex, since you must often resort
to explicit state machines: either switch-based or, as has been pointed out
to me before on this list (by Dirk Laurie, I think), functional-style,
tail-recursive continuation passing.[2] But because it's easier to write
regression tests, your production code will often be less buggy and easier
to maintain.
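
To make the tail-call style concrete, here is a rough sketch for a made-up
"key=value;" format. Each state is a function that consumes what it can and
tail-calls the next state; when input runs out it saves itself as the
continuation and returns. The format, the states table, and new_parser are
all invented for illustration.

```lua
local states = {}

-- State: reading a key, terminated by "=".
function states.key(p)
  local k, rest = p.buf:match("^([^=]*)=(.*)$")
  if not k then
    p.state = states.key       -- out of input: resume here next time
    return
  end
  p.key, p.buf = k, rest
  return states.value(p)       -- proper tail call, no stack growth
end

-- State: reading a value, terminated by ";".
function states.value(p)
  local v, rest = p.buf:match("^([^;]*);(.*)$")
  if not v then
    p.state = states.value
    return
  end
  table.insert(p.events, { key = p.key, value = v })
  p.buf = rest
  return states.key(p)
end

local Parser = {}
Parser.__index = Parser

local function new_parser()
  return setmetatable({ buf = "", events = {}, state = states.key }, Parser)
end

-- The public interface stays the same; the state machine is hidden behind it.
function Parser:parse(data)
  self.buf = self.buf .. data
  return self.state(self)
end

function Parser:get()
  return table.remove(self.events, 1)
end

local p = new_parser()
p:parse("a=1;b")               -- second pair is cut off mid-key
assert(p:get().value == "1")
assert(p:get() == nil)
p:parse("=2")                  -- still waiting for the ";"
assert(p:get() == nil)
p:parse(";")
assert(p:get().key == "b")
```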


[1] Every developer goes through a phase of creating "glue" libraries which
they try to use in all their own software, but which few others will use.
Take, for example, http://w3.impa.br/~diego/software/luasocket/ltn12.html.
It's an obviously great idea, tried-and-true (the interface protocol is a
superset of the above), but getting everybody to use that library in
particular is like herding cats. Also, such grand solutions can be too
abstract--i.e. they make easy problems easier but hard problems more
difficult. In any event, if you use the method above, you can easily include
an LTN12 wrapper interface; it just doesn't need to be a dependency.
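
A sketch of what such a wrapper might look like, following the LTN12
conventions as I understand them: a source is a function returning the next
chunk or nil at end of data, and a sink is a function taking (chunk, err)
and returning a true value on success. parser_sink, composer_source, and the
toy objects are hypothetical names, and nothing here requires LuaSocket.

```lua
-- Expose parser:parse() as an LTN12-style sink. Events are still pulled
-- with parser:get() by whoever owns the parser.
local function parser_sink(parser)
  return function(chunk, err)
    if chunk == nil then return 1 end   -- end of data: nothing to feed
    parser:parse(chunk)
    return 1
  end
end

-- Expose composer:compose() as an LTN12-style source.
local function composer_source(composer)
  return function()
    return composer:compose()           -- nil signals end of data
  end
end

-- Toy stand-ins for the demo: the "parser" just records what it was fed.
local toy_composer = { out = { "GET / HT", "TP/1.1\r\n", "\r\n" } }
function toy_composer:compose()
  return table.remove(self.out, 1)
end

local toy_parser = { seen = {} }
function toy_parser:parse(data)
  table.insert(self.seen, data)
end
function toy_parser:get()
  return table.remove(self.seen, 1)
end

-- Hand-rolled pump loop: move chunks from source to sink until the
-- source runs dry, passing the final nil through to the sink.
local source, sink = composer_source(toy_composer), parser_sink(toy_parser)
repeat
  local chunk = source()
  assert(sink(chunk))
until chunk == nil

assert(toy_parser:get() == "GET / HT")
assert(toy_parser:get() == "TP/1.1\r\n")
```

If LuaSocket is available, the same pair should plug into
ltn12.pump.all(source, sink) in place of the hand-rolled loop.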

[2] This is similar to callback dispatching, but it's _hidden_ in the
implementation. And it doesn't need to be generic, can be more strongly
typed, and can be refactored without worrying about breaking external
interfaces.