It was thus said that the Great pocomane once stated:
> Sean, if you stopped to write the API specification due to this
> discussion, please don't. Continue your remarkable work.

  Thanks.  I've been wanting this to finish up before I continue.  There is
a possible change to the directory API, and I've been reading up on the
Windows IOCP and there are ... issues there.  Sigh.

> On Tue, Feb 4, 2020 at 6:08 PM Sean Conner wrote:
> >
> >   With select() and poll(), the list of file descriptors is kept on the user
> > side; with epoll() and kqueue(), the list of file descriptors is kept on the
> > kernel side.  The effect is that epoll()/kqueue() return a list of file
> > descriptors that have events.  This is another reason I'm asking for a
> > citation for handling in a specific order.
> Ok, no citation so just trash all the rest as proposal. But I would
> ask you to answer to the following anyway, since there is something I
> am missing, and I want to understand.
> You reported how the data is returned from epoll/kqueue. After that,
> you have to build the lua table from scratch, right? Why build a lua
> table with fd/whatever as key is different from built a lua table with
> the same fd/whatever in the array part?

  I'll describe how select() and epoll() work (since they are quite
different in their approach), and then tell how I implemented the backend. 
All the gory source code is available here:

  With that ...

  select() takes five parameters:

	n         - integer, the highest-numbered file descriptor in any of
		    the sets, plus one
	readset	  - bitvector---each bit represents a file descriptor.  If
		    bit 0 is set, then file descriptor 0 is in use; if bit
		    3, then file descriptor 3.  Checks if file descriptors
		    are ready for reading.
	writeset  - bitvector (same as above) Checks if file descriptors
		    are ready for writing.
	exceptset - bitvector (same as above) Checks if there is priority
		    data or errors (not all OSes support both options---it's
		    quite messy really).
	timeout   - struct timeval for timing out the call.

(except for n, the rest are actually pointers and can be NULL if you aren't
interested in using that parameter)

  So to use, you typically clear the set, then set each bit per file
descriptor in the appropriate set, then call select().  select() will modify
the sets passed in.  On input, if a bit is set for a file descriptor, the
appropriate action (read, write, except) is checked, and if the condition
exists, the bit remains set; if a bit is clear in input, no condition for
that file descriptor is checked.  Example (sans error checking):

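	fd_set readset;
	fd_set writeset;
	int    high;
	int    rc;

	/* select() overwrites the sets, so rebuild them before each call */
	FD_ZERO(&readset);
	FD_ZERO(&writeset);
	FD_SET(fd1,&readset);
	FD_SET(fd2,&readset);
	FD_SET(fd2,&writeset);
	high = fd1 > fd2 ? fd1 : fd2;
	/* there could be more */

	/* first parameter is the highest file descriptor plus one */
	rc = select(high + 1,&readset,&writeset,NULL,NULL); /* no except, no timeout */

	/* more than one condition can be true after a single call */
	if (FD_ISSET(fd1,&readset))
	  { /* handle reading fd1 */ }
	if (FD_ISSET(fd2,&readset))
	  { /* handle reading fd2 */ }
	if (FD_ISSET(fd2,&writeset))
	  { /* handle writing fd2 */ }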

  The limits here are the size of the bit vector (typically 1,024 bits,
could be bigger) and having to scan through the vectors for every file
descriptor to check if there's an event (across three different bit
vectors).
  epoll() is actually a few functions: epoll_create() (to create the set),
epoll_ctl() (to add and remove file descriptors) and epoll_wait() (to get
the list of events).  You first create the set, which returns a file
descriptor.  You then add, modify or remove file descriptors with the
epoll_ctl() call, and then use epoll_wait() to wait for events.  close() is
used to clean up and remove the set.  Example (sans error checking):

	int set;
	set = epoll_create(1); /* size is only a hint, but it must be greater than 0 */

	/* need only do this once; epoll_ctl() takes a pointer to the event */
	epoll_ctl(set,EPOLL_CTL_ADD,fd1,&(struct epoll_event) {.events = EPOLLIN,             .data.fd = fd1 });
	epoll_ctl(set,EPOLL_CTL_ADD,fd2,&(struct epoll_event) {.events = EPOLLIN | EPOLLOUT , .data.fd = fd2 });
	/* there could be more */

	struct epoll_event events[10];
	int    count;

	count = epoll_wait(set,events,10,-1); /* up to 10 events, no timeout */
	for (int i = 0 ; i < count ; i++)
	{
	  if (events[i].data.fd == fd1)
	  {
	    /* handle reading fd1 since we only want read events */
	  }
	  else if (events[i].data.fd == fd2)
	  {
	    if ((events[i].events & EPOLLIN) == EPOLLIN)
	    {
	      /* handle reading from fd2 */
	    }
	    if ((events[i].events & EPOLLOUT) == EPOLLOUT)
	    {
	      /* handle write events from fd2 */
	    }
	  }
	}

  epoll_wait() will return *up to* 10 events---it could return fewer.  And
the returned list only contains items that have an event, so there will be
at least one of the given events assigned per file descriptor (hence
we *know* that fd1 is ready for reading without checking for the event,
since we're only interested in one type of event for fd1).  Here, the kernel
is keeping track of all the file descriptors per event set, unlike select()
where it's the user code keeping track.

  Also, notice the comment "need only do this once"---yes, once you add a
file descriptor to an epoll set, the kernel is *already* tracking it. 
Keeping an array userside could be done, but it wastes space with a needless
copy of file descriptors.  Hmmm ... (I'm holding a thought here, I'll get
back to it in a bit).

  So, wrapping this in Lua.  Luasocket, luaposix and my own
all wrap a socket up into a userdata, so we can do sock:method(...) instead
of socket.method(sock,...).  So already we need a way to extract the
underlying file descriptor from the socket userdata.  Also, one can open
devices with, which return a userdata that wraps a standard C
FILE* object, inside of which is a file descriptor.  That's why I defined
(currently, with a bad name) :_tofd()---to extract the underlying file
descriptor without having to know the underlying userdata structure.
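
  For the FILE* case, the extraction step itself can be as small as a call
to POSIX fileno().  A minimal sketch, where stream_tofd() is a hypothetical
name; a real binding would first have to pull the FILE* out of the Lua
userdata, which is not shown here:

```c
#define _POSIX_C_SOURCE 200809L   /* for fileno() under strict C modes */

#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical backend for a :_tofd() method on a userdata wrapping a
 * standard C FILE *.  Returns the underlying file descriptor. */
int stream_tofd(FILE *fp)
{
  assert(fp != NULL);
  return fileno(fp);
}
```

  On a typical POSIX system stream_tofd(stdin) gives 0 and
stream_tofd(stdout) gives 1.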

  But once I get the underlying file descriptor (or handle under
Windows---keep that in mind), I *still* need a way to map the file
descriptor *back* to the userdata object.  I mean, I could go the easy way
out (as an implementor) and have the user be responsible for obtaining the
underlying file descriptor and map it back, but I reject that approach:

	1) it's not Luaish
	2) it's a pain for the user

  I've mentioned this before about barely wrapping C functions---I mean,
which would you rather use?

	local assoc =
	{
	  [fd1:_tofd()] = fd1,
	  [fd2:_tofd()] = fd2,
	}

	local list = set:events()
	for _,item in ipairs(list) do
	  local obj = assoc[item.obj]
	  if obj == fd1 then
	    -- handle fd1
	  elseif obj == fd2 then
	    -- handle fd2
	  end
	end

or

	local list = set:events()
	for _,item in ipairs(list) do
	  if item.obj == fd1 then
	    -- handle fd1
	  elseif item.obj == fd2 then
	    -- handle fd2
	  end
	end

  I personally prefer the second.  And to do the second, the implementation
needs to associate the underlying file descriptor with the userdata object. 
And if the implementation has that, it's just as easy to associate any
arbitrary value with the underlying file descriptor, like a function [2].
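
  On the C side, one way to keep that association is to let the kernel
carry it.  The sketch below assumes Linux epoll and stores a pointer in
epoll_event.data.ptr instead of the bare file descriptor.  struct watcher,
watch(), dispatch() and count_event() are names made up for illustration; a
Lua binding would more likely keep a registry reference to the userdata or
function rather than a C function pointer:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical per-descriptor record handed back by the kernel. */
struct watcher
{
  int    fd;
  void (*handler)(struct watcher *w,unsigned events);
  int    calls;                  /* only here so the sketch can be checked */
};

/* Example handler: just counts how often it was called. */
void count_event(struct watcher *w,unsigned events)
{
  (void)events;
  w->calls++;
}

/* Register w->fd once; the kernel returns w to us through data.ptr. */
int watch(int set,struct watcher *w,unsigned events)
{
  assert(w != NULL && w->handler != NULL);
  struct epoll_event ev = { .events = events , .data.ptr = w };
  return epoll_ctl(set,EPOLL_CTL_ADD,w->fd,&ev);
}

/* Wait once and dispatch.  Returns the number of events, -1 on error. */
int dispatch(int set,int timeout_ms)
{
  struct epoll_event events[10];
  int                count = epoll_wait(set,events,10,timeout_ms);

  for (int i = 0 ; i < count ; i++)
  {
    struct watcher *w = events[i].data.ptr;  /* no fd-to-object lookup */
    w->handler(w,events[i].events);
  }
  return count;
}
```

  With this shape the loop never compares file descriptors at all; whatever
was registered comes straight back with the event.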

  Okay, now back to my thought---originally I had:

	list = set:events()

to get back a list of events to loop through.  But then the implementation
does the loop twice.  Not great.  I'm wondering if perhaps:

	for event in set:events(timeout) do

  Nice idea, and it removes iterating through the results twice, but it
makes error checking and timeout notification problematic.  Hmm ... perhaps
something like:

	iter,err = set:events(timeout)
	if not iter then
	  -- handle error
	end

	for event in iter do
	  -- handle loop
	end

  That could work ... I'll have to think about this.  Thanks.
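
  A rough idea of what the backend for such an iterator might look like in
C, assuming Linux epoll.  This is a sketch of the idea only, not the
implementation: struct event_iter, events_begin() and events_next() are
hypothetical names.  The point is that the error is reported before the
loop starts, and the results are then walked exactly once:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/epoll.h>
#include <unistd.h>

#define MAX_EVENTS 10

/* Hypothetical iterator state: the raw results plus a cursor. */
struct event_iter
{
  struct epoll_event events[MAX_EVENTS];
  int                count;
  int                next;
};

/* Wait for events.  Returns 0 on success (a count of 0 means the call
 * timed out) or an errno value on failure. */
int events_begin(int set,int timeout_ms,struct event_iter *it)
{
  assert(it != NULL);
  it->next  = 0;
  it->count = epoll_wait(set,it->events,MAX_EVENTS,timeout_ms);
  if (it->count < 0)
  {
    int err   = errno;
    it->count = 0;            /* a failed wait yields an empty iteration */
    return err;
  }
  return 0;
}

/* Hand out one event per call; NULL ends the loop. */
struct epoll_event *events_next(struct event_iter *it)
{
  return it->next < it->count ? &it->events[it->next++] : NULL;
}
```

  A timeout shows up here as a successful call followed by a loop that runs
zero times, which may or may not be the notification the caller wants.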

> 1) Tie myself to complex frameworks that can handle different type of
> events. As a real world example (lua-less, it was C), we worked on a
> machine control that got inputs from tcp and electrical signals
> (proprietary ADC/DAC board, we migrated to modbus only later).

  Hmm.  How did the proprietary ADC/DAC board work software-wise?  Did you
read directly from the board or did you open it as a device via the file
system API?  If the latter, then yes, that just fits in with the normal flow
of events (data is ready to read, read the data).  If you read it directly
from the hardware, then the event loop was probably something like:

	rc = select(timeout); /* or poll() or epoll() or whatever for TCP */
	/* handle select() */
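
  Spelled out a bit more, and still only a guess at the shape of such a
loop: wait on the TCP descriptor with a timeout, then poll the board no
matter what select() reported.  loop_once(), board_poll_fn and fake_board()
are hypothetical names, and fake_board() merely stands in for reading the
hardware directly:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical stand-in for reading the board directly; returns nonzero
 * when a new sample is available. */
typedef int (*board_poll_fn)(int *sample);

int fake_board(int *sample)
{
  *sample = 42;               /* pretend the hardware always has data */
  return 1;
}

/* One pass of the loop.  Returns a bit mask:
 * 1 = descriptor readable, 2 = board sample available. */
int loop_once(int tcpfd,board_poll_fn board,int timeout_ms,int *sample)
{
  fd_set         readset;
  struct timeval tv     = { .tv_sec  = timeout_ms / 1000 ,
                            .tv_usec = (timeout_ms % 1000) * 1000 };
  int            result = 0;

  assert(tcpfd >= 0 && tcpfd < FD_SETSIZE && timeout_ms >= 0);
  FD_ZERO(&readset);
  FD_SET(tcpfd,&readset);

  /* the timeout bounds how long the board can go unpolled */
  if (select(tcpfd + 1,&readset,NULL,NULL,&tv) > 0 && FD_ISSET(tcpfd,&readset))
    result |= 1;              /* caller would now read from the socket */

  if (board(sample))
    result |= 2;              /* caller would now handle the sample */

  return result;
}
```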

  A quick search on modbus reveals it's a serial protocol defined over
something similar to RS-232, so in that case, the system probably presents
the data through the OS equivalent of a file descriptor, much like the TCP
connection.

  At least, that's my guess as to how it worked.

> 2) Use a language that provide an ubiquitous competitive multi
> threading mechanism. We wrote several rest services in Go. it is very
> simple thanks to its thread model.

  That's a variation on 1, only the framework is the runtime of the
language.

> An extedable event system would be a third option (a sort of modular
> variation of the first one).
> > especially with the comment about libuv being a generic event
> > handling library, when it just wraps the APIs I'm already talking about.
> Sorry for the misunderstanding: libuv is not "Generic" or
> "Extendable". However I think it is not just a "Select/epoll" wrapper
> neither. At least this is what I found in its documentation: they use
> some blocking IO through a thread pool [1].

  The diagram shows epoll()/kqueue()/event ports/IOCP at the bottom.  Normal
event driven programming.  Given there's a flow chart of the I/O loop, this
tells me that libuv is more a framework than a simple library, and I suspect
they do this because of Solaris event ports and Windows IOCP (more about
that in another message).

  Also, blocking I/O is related to files:  "Unlike network I/O, there are
no platform-specific file I/O primitives libuv could rely on, so the current
approach is to run blocking file I/O operations in a thread pool."

> > > > > - Object oriented programming
> > > >
> > > >   Why?
> > >
> > > I think that a Standard Library /API should provide oop in order to be
> > > taken seriously. I know that constructors, inheritance, private data
> > > and so on can be implemented in 30 lines of lua. But the people just
> > > expect that a Standard Library /API has that 30 lines wired inside.
> >
> >   I think we'll have to agree to disagree.  I think OOP has been oversold
> > and most of the examples used to teach it have done more harm and encourage
> > bad design.
> And I have to disagree to agree to disagree. Yeee... to write this
> sentence made my day. In short I agree with your view about oop:
> probably, I should have added "in order to be taken seriously by
> people (not me)".

  I recently came across this quote:

	A good computer language lets developers write good software, not
	novice developers write any software.

  I'm not sure how that applies, but I think it fits.


> [1] at bottom of the page. I
> did not read the libuv sources, I used it a little, I may have misread
> something. However the rest I wrote would be the same since libuv was
> not essential for the discussion.

[2]	Such as my own event framework: