G'day,

[As always, sorry about not properly threading this message; it is a
side-effect of always working from the digest.]

If you are on a system that has the utility 'tr', and you can perhaps
have both the source and destination as normal files for a while, then
there may be simple solutions (replace SET1 with the characters/octets
that you want to keep):

    1. tr, keeping SET1 and discarding non-members:

           tr --complement --delete 'SET1' <SOURCE >FILTERED

    2. tr, keeping SET1 and using e.g. '?' for all unwanted octets:

           tr --complement 'SET1' '?' <SOURCE >FILTERED

These have the benefit of using a C program, which also can buffer the
input and output to make I/O more efficient, so, in isolation, it is
likely to be quite fast.

These 'tr' examples require SET1 to be well-behaved in the presence of
the system shell; otherwise, characters such as single-quote ('\'') and
NUL ('\0') may result in undesirable behaviour.
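
[A minimal sketch of one way to make SET1 safe for the shell before
building a command line in Lua.  The helper name shell_quote is
hypothetical, and a POSIX-style shell is assumed.  It wraps the string
in single quotes and rewrites each embedded single quote as '\''.
NUL is rejected outright, since a command-line argument cannot carry
it.]

```lua
-- Hypothetical helper: quote a string for a POSIX-style shell.
local function shell_quote(s)
  -- A command-line argument cannot contain NUL, so refuse it early.
  assert(not s:find("\0", 1, true), "NUL cannot appear in a shell argument")
  -- Close the quote, emit an escaped quote, reopen the quote.
  local escaped = s:gsub("'", "'\\''")
  return "'" .. escaped .. "'"
end

print(shell_quote("a-z'0-9"))
```

Note that this only protects SET1 from the shell; 'tr' itself still
interprets ranges, backslash escapes and character classes within the
set.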

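However, integrating 'tr' with Lua (presumably using io.popen()) may
pose difficulties: io.popen() gives you a pipe in one direction only,
so you can either feed the command's input or read its output, but
not both.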

If you need to stick to plain Lua, then SOURCE contents would
need to be written out to a temporary file, then io.popen() would give
you 'tr's output without needing another file, but you would need to
clean up the temporary file on exit (including error cases).  The
additional file activity may kill any benefit; such overheads may be
lower if the temporary file is located on a RAMdisk.
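
[A rough sketch of the temporary-file approach in plain Lua, assuming
a POSIX-style shell, a 'tr' on the PATH, and a Lua build where
io.popen() is available.  The function name tr_keep is hypothetical.
It uses the short options -c and -d, which are the short spellings of
--complement and --delete, and uses pcall() so that the temporary
file is removed in error cases as well.]

```lua
-- Hypothetical helper: single-quote a string for a POSIX-style shell.
local function shell_quote(s)
  local escaped = s:gsub("'", "'\\''")
  return "'" .. escaped .. "'"
end

-- Keep only the octets of `data` that are members of the tr set `set1`.
local function tr_keep(data, set1)
  local tmpname = os.tmpname()
  local ok, result = pcall(function()
    -- Write SOURCE out to the temporary file.
    local f = assert(io.open(tmpname, "wb"))
    assert(f:write(data))
    f:close()
    -- Let the shell redirect the file into tr; read tr's output.
    local cmd = "tr -cd " .. shell_quote(set1) .. " < " .. shell_quote(tmpname)
    local p = assert(io.popen(cmd, "r"))
    local out = p:read("*a")
    p:close()
    return out
  end)
  os.remove(tmpname)          -- clean up on success and on error
  if not ok then error(result, 0) end
  return result
end

print(tr_keep("a1b2c3\n", "a-z"))
```

The pcall() wrapper covers errors raised inside the function; it does
not cover the script being killed by a signal, so a stray temporary
file is still possible in that case.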

--

Another possibility is to run 'tr' as a separate process:

The "luaposix" LuaRock provides the primitives needed to set this up:
posix.pipe(), posix.fork(), posix.close(), posix.dup2() and
posix.execp().  These allow you to both feed in data and retrieve
output without needing temporary files, with the bonus that command-line
arguments are not exposed to shell magic-character mechanisms.  (Problems
with NUL may still exist, since tr's command-line interface is not
8-bit-clean.)

You would also need posix.poll(), posix.read() and posix.write() to
keep the script/child process link alive, plus posix.wait() to reap
the child; this last call gives you final process information (e.g.
termination status).
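
[A minimal sketch of the plumbing described above.  It is not
PosixExec.lua.  It assumes a reasonably recent luaposix that provides
the posix.unistd and posix.sys.wait submodules, and a 'tr' on the
PATH; the function name tr_keep is hypothetical.  To stay short it
omits the poll() loop: it writes all of the input before reading any
output, so it is only safe for inputs small enough to fit in the pipe
buffers.  Larger inputs need the interleaved poll/read/write handling
mentioned above, or the two processes can deadlock.]

```lua
local unistd  = require("posix.unistd")    -- assumed: recent luaposix
local syswait = require("posix.sys.wait")

-- Keep only the octets of `data` that are members of the tr set `set1`.
-- Sketch only: safe for small inputs, since there is no poll() loop.
local function tr_keep(data, set1)
  local in_r, in_w   = unistd.pipe()   -- parent writes, child reads
  local out_r, out_w = unistd.pipe()   -- child writes, parent reads
  local pid = assert(unistd.fork())
  if pid == 0 then
    -- Child: wire the pipes to stdin/stdout, then become tr.
    unistd.dup2(in_r, 0)
    unistd.dup2(out_w, 1)
    unistd.close(in_r);  unistd.close(in_w)
    unistd.close(out_r); unistd.close(out_w)
    unistd.execp("tr", { "-cd", set1 })  -- no shell involved
    unistd._exit(127)                    -- reached only if exec failed
  end
  -- Parent: close the ends that belong to the child.
  unistd.close(in_r)
  unistd.close(out_w)
  unistd.write(in_w, data)
  unistd.close(in_w)                     -- EOF tells tr to finish
  local chunks = {}
  while true do
    local chunk = unistd.read(out_r, 4096)
    if not chunk or #chunk == 0 then break end
    chunks[#chunks + 1] = chunk
  end
  unistd.close(out_r)
  -- Reap the child and pick up its termination status.
  local _, reason, status = syswait.wait(pid)
  return table.concat(chunks), reason, status
end

print(tr_keep("a1b2c3\n", "a-z"))
```

Because SET1 is passed as an element of the argument table rather
than through a shell, a single quote in it needs no escaping here.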

I've written a script, called PosixExec.lua, that I use in conjunction
with the luaposix rock to handle most of the gory details hinted at
above.  It is derived from the demonstration program for
posix.fork() given in the luaposix documentation.  An ancient version
of this script has been released in the past.  If there is really
strong interest in this script (which could arguably be a Rock in its
own right, or maybe become an add-on for the LuaPosix Rock), then
I'm happy to release it.

cheers,

sur-behoffski
programmer, Grouse Software