I verified that the code works perfectly on Linux with Lua 5.1, so the bug is presumably in the interaction with FreeBSD. I'm going to look into it.

Jim Mellander wrote:
Hi Diego:

Thanks for your continuing assistance. My complaint is not the timing as such, but the time it takes to reliably receive all the data: I've had to run a tight loop polling all the sockets, because select does not seem to report the status of every socket accurately.

I've changed flood.lua to send to only 2 sockets, with server.lua still binding to all 5. Here server.lua only sees the first socket:

1144856783 Binding to host '127.0.0.1' and port 8081...
1144856783 Waiting for packets on 127.0.0.1:8081...
1144856783 Binding to host '127.0.0.1' and port 8082...
1144856783 Waiting for packets on 127.0.0.1:8082...
1144856783 Binding to host '127.0.0.1' and port 8083...
1144856783 Waiting for packets on 127.0.0.1:8083...
1144856783 Binding to host '127.0.0.1' and port 8084...
1144856783 Waiting for packets on 127.0.0.1:8084...
1144856783 Binding to host '127.0.0.1' and port 8085...
1144856783 Waiting for packets on 127.0.0.1:8085...
1144856784 Packet received from 127.0.0.1/2568 1 bytes
1144856784 Packet received from 127.0.0.1/2568 1 bytes
1144856785 Packet received from 127.0.0.1/2568 1 bytes
1144856785 Packet received from 127.0.0.1/2568 1 bytes
1144856786 Packet received from 127.0.0.1/2568 1 bytes
1144856786 Packet received from 127.0.0.1/2568 1 bytes


This, along with the previous info (#5 not being set when 5 sockets are in use), suggests that the status of the *last* socket in the select is not being returned accurately. The run above was made with SOCKET_POLL defined when building luasocket, causing it to use the poll system call. Running 'top' shows the process in 'poll' most of the time, as expected (LUASOCKET_DEBUG was also defined).

(Time passes)

OK, luasocket recompiled with SOCKET_POLL undefined; top shows the process in 'select' most of the time, with the same behavior:

1144857761 Binding to host '127.0.0.1' and port 8081...
1144857761 Waiting for packets on 127.0.0.1:8081...
1144857761 Binding to host '127.0.0.1' and port 8082...
1144857761 Waiting for packets on 127.0.0.1:8082...
1144857761 Binding to host '127.0.0.1' and port 8083...
1144857761 Waiting for packets on 127.0.0.1:8083...
1144857761 Binding to host '127.0.0.1' and port 8084...
1144857761 Waiting for packets on 127.0.0.1:8084...
1144857761 Binding to host '127.0.0.1' and port 8085...
1144857761 Waiting for packets on 127.0.0.1:8085...
1144857761 Packet received from 127.0.0.1/3304 1 bytes
1144857761 Packet received from 127.0.0.1/3304 1 bytes
1144857762 Packet received from 127.0.0.1/3304 1 bytes
1144857762 Packet received from 127.0.0.1/3304 1 bytes
1144857763 Packet received from 127.0.0.1/3304 1 bytes
1144857763 Packet received from 127.0.0.1/3304 1 bytes
1144857764 Packet received from 127.0.0.1/3304 1 bytes
1144857764 Packet received from 127.0.0.1/3304 1 bytes
1144857765 Packet received from 127.0.0.1/3304 1 bytes
1144857765 Packet received from 127.0.0.1/3304 1 bytes
1144857766 Packet received from 127.0.0.1/3304 1 bytes
1144857766 Packet received from 127.0.0.1/3304 1 bytes

It looks like an off-by-one error. BTW, I note (or think I do, anyway) that defining SOCKET_POLL only changes the timeout function for an individual socket (socket_waitfd); the actual Lua select function always uses select regardless.
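
The difference between waiting on a single socket with poll and with select can be sketched roughly as follows. This is a minimal illustration only, not luasocket's actual socket_waitfd: the helper name wait_readable and the USE_POLL macro are hypothetical stand-ins for this sketch.

```c
#include <poll.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper (not luasocket code): wait up to timeout_ms for fd
 * to become readable. Returns 1 if readable, 0 on timeout, -1 on error.
 * USE_POLL is an assumed build flag standing in for SOCKET_POLL. */
int wait_readable(int fd, int timeout_ms)
{
#ifdef USE_POLL
    struct pollfd pfd;
    int ret;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;
    ret = poll(&pfd, 1, timeout_ms);
    if (ret < 0)
        return -1;
    return (ret > 0 && (pfd.revents & POLLIN)) ? 1 : 0;
#else
    fd_set rfds;
    struct timeval tv;
    int ret;

    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    /* For a single descriptor, nfds is simply fd + 1. */
    ret = select(fd + 1, &rfds, NULL, NULL, &tv);
    if (ret < 0)
        return -1;
    return (ret > 0 && FD_ISSET(fd, &rfds)) ? 1 : 0;
#endif
}
```

Either branch answers the same question for one descriptor, so switching between them would not be expected to change how a multi-socket select behaves.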

As a dreadful hack I changed:
ret = select(n, rfds, wfds, efds, t >= 0.0 ? &tv: NULL);

to

ret = select(1024, rfds, wfds, efds, t >= 0.0 ? &tv: NULL);

in socket_select.c to see if that would make a difference, to no avail.
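
For reference, the first argument to select() is the highest descriptor number plus one, not a count of sockets, so passing 1024 over-scans but should otherwise be harmless. Below is a minimal sketch of a multi-socket readability check that is independent of luasocket. It assumes AF_UNIX datagram socket pairs as stand-ins for the UDP sockets in server.lua, and the function name count_readable and the MAX_PAIRS cap are made up for the example.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

#define MAX_PAIRS 16   /* arbitrary cap for this sketch */

/* Create npairs datagram socket pairs, queue one byte on the first nsend
 * of them, then ask select() which receiving ends are readable.
 * Returns the number of receiving ends flagged readable, or -1 on error. */
int count_readable(int npairs, int nsend)
{
    int rx[MAX_PAIRS], tx[MAX_PAIRS];
    int created = 0, maxfd = -1, ready = -1, i;
    fd_set rfds;
    struct timeval tv;

    assert(npairs > 0 && npairs <= MAX_PAIRS);
    assert(nsend >= 0 && nsend <= npairs);

    for (i = 0; i < npairs; i++) {
        int sv[2];
        if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
            goto done;
        rx[i] = sv[0];
        tx[i] = sv[1];
        created++;
    }

    /* Queue one byte on the first nsend pairs only. */
    for (i = 0; i < nsend; i++)
        if (send(tx[i], "x", 1, 0) != 1)
            goto done;

    FD_ZERO(&rfds);
    for (i = 0; i < npairs; i++) {
        FD_SET(rx[i], &rfds);
        if (rx[i] > maxfd)
            maxfd = rx[i];
    }

    tv.tv_sec = 1;
    tv.tv_usec = 0;
    /* First argument is the highest descriptor plus one, not a count. */
    if (select(maxfd + 1, &rfds, NULL, NULL, &tv) < 0)
        goto done;

    /* Check every descriptor; do not stop at the first one that is set. */
    ready = 0;
    for (i = 0; i < npairs; i++)
        if (FD_ISSET(rx[i], &rfds))
            ready++;

done:
    for (i = 0; i < created; i++) {
        close(rx[i]);
        close(tx[i]);
    }
    return ready;
}
```

Calling count_readable(5, 2) mirrors the flood.lua/server.lua setup above: five receivers are watched and two have data pending, so a correct select should flag exactly those two.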

Running truss on the running 'lua server.lua' gives in a recurring pattern:

select(0x400,0xbfbff630,0xbfbff5b0,0x0,0xbfbff510) = 1 (0x1)
gettimeofday(0xbfbfd624,0x0)                     = 0 (0x0)
recvfrom(0x3,0xbfbfd6a0,0x2000,0x0,0xbfbff6a0,0xbfbfd68c) ERR#35 'Resource temporarily unavailable'
gettimeofday(0xbfbfd494,0x0)                     = 0 (0x0)
select(0x4,0xbfbfd590,0x0,0x0,0xbfbfd500)        = 1 (0x1)
recvfrom(0x3,0xbfbfd6a0,0x2000,0x0,0xbfbff6a0,0xbfbfd68c) = 1 (0x1)
gettimeofday(0xbfbff638,0x0)                     = 0 (0x0)
write(1,0x80ad000,55)                            = 55 (0x37)
gettimeofday(0xbfbff504,0x0)                     = 0 (0x0)
gettimeofday(0xbfbff4b4,0x0)                     = 0 (0x0)
select(0x400,0xbfbff630,0xbfbff5b0,0x0,0xbfbff510) = 1 (0x1)
gettimeofday(0xbfbfd624,0x0)                     = 0 (0x0)
recvfrom(0x3,0xbfbfd6a0,0x2000,0x0,0xbfbff6a0,0xbfbfd68c) ERR#35 'Resource temporarily unavailable'


I note that select() only returns 1, indicating that only 1 socket is readable, and that the first recvfrom() attempt fails, but the next select() and recvfrom() succeed.
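
ERR#35 in the truss output is EAGAIN on FreeBSD ('Resource temporarily unavailable'), which is what a non-blocking receive reports when no datagram is queued. A minimal sketch reproducing that condition in isolation follows. It assumes an AF_UNIX datagram socket pair rather than the UDP sockets used in the test, and the function name is invented for the example.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns 1 if a non-blocking recv() on an empty datagram socket fails
 * with EAGAIN/EWOULDBLOCK, 0 if it behaves otherwise, -1 on setup error. */
int empty_recv_would_block(void)
{
    int sv[2], flags, result;
    char buf[16];
    ssize_t n;

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
        return -1;

    /* Put the receiving end into non-blocking mode. */
    flags = fcntl(sv[0], F_GETFL, 0);
    if (flags < 0 || fcntl(sv[0], F_SETFL, flags | O_NONBLOCK) < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    /* Nothing has been sent, so there is nothing to read. */
    n = recv(sv[0], buf, sizeof buf, 0);
    result = (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) ? 1 : 0;

    close(sv[0]);
    close(sv[1]);
    return result;
}
```

Seeing this error right after select() reported a readable socket suggests the receive was attempted on a descriptor other than the one select had flagged, though the trace alone does not prove that.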

I'll dig further, but I find it hard to believe that it is a FreeBSD bug (I'm a bit biased, since the BSD networking code was developed here at LBL). Perhaps there's a difference in how the expected parameters are interpreted.



Diego Nehab wrote:

Hi,

The timing seems faster than expected, and the 5th expected datasource is not seen. Does this give any insight into the problem?



Actually, I guess the timing is OK, since we're sleeping for .5 secs after sending all packets, but the 5th datasource is being skipped.



Your original complaint was about the timing, right? Can you revert to
the original LuaSocket and test if the timing is correct?

BTW - I have luasocket compiled to use the poll syscall instead of select, in hopes of fixing my problem, but so far, no luck...



I have no experience with BSD, but I don't expect select to be broken.

Regards,
Diego.



--
Jim Mellander
Incident Response Manager
Computer Protection Program
Lawrence Berkeley National Laboratory
(510) 486-7204

Your fortune for today is:

Never face facts; if you do you'll never get up in the morning.
		-- Marlo Thomas