Hello all,

I have written a simple preprocessor for Lua programs.
Its purpose is to allow me to write more compact Lua
programs without actually changing the compiler.  Besides
loving compactness, I guess I am the kind of person that is
susceptible to the fascination of symbolic (as opposed to
verbal) writing.  And, as I like the syntactical variety
in program texts, I believe that it is better achieved by
using distinctive symbols as much as possible instead of
words.  Hence my attempt at syntactical heresy for Lua.
It is indeed very simple -- mostly, nothing but a
one-to-one token replacement.

I believe that using rslua (`rs' because I dare call it
`rationalized syntax' instead of `heretic', but I only do
that in private :)) lets me keep the program text in fewer
lines and somewhat less indented, especially when there are
anonymous functions and coroutines.

Let me state clearly that I know that:
  -- most Lua programmers like Lua's syntax just the way
     it is;
  -- for many, the concrete syntax is not even important;
  -- for many of those who would like to see changes in the
     syntax, what I present here is not what they themselves
     would want.
So I am not trying to show anyone `the right way', but just
to share a predilection of mine.  Comments and suggestions
are welcome.

Here are two examples, taken from the Lua distribution and
transcribed to rslua.

The first one is the factorial example expressed through
the Y combinator:

Y = \(g)
      @ a = \(f) >> f(f) .
      >> a(\(f)
             >> g(\(x) @ c=f(f) >> c(x) .) .) .

F = \(f) >> \(n) if n==0 : >> 1 // >> n*f(n-1) . . .

factorial = Y(F)

\ test(x)
  io.write(x,"! = ",factorial(x),"\n")
.

for n=0,16 ! test(n) .
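
For comparison, here is what the rslua text above should turn into
once every token is replaced according to the table given further
down.  This is an expansion done by hand, not the output of the
actual sed script, and it assumes that the body of test is closed
by a final `.'.

```lua
-- Hand expansion of the rslua Y-combinator example into plain Lua.
-- Only tokens are swapped; the layout is kept as in the rslua text.
Y = function(g)
      local a = function(f) return f(f) end
      return a(function(f)
             return g(function(x) local c=f(f) return c(x) end) end) end

F = function(f) return function(n) if n==0 then return 1 else return n*f(n-1) end end end

factorial = Y(F)

function test(x)
  io.write(x,"! = ",factorial(x),"\n")
end

for n=0,16 do test(n) end
```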

Next comes the sieve of Eratosthenes with coroutines:

\ gen(n) >> :cw:(\() for i=2,n ! :cy:(i) . .) .

\ filter(p,g)  >> :cw:(
    \()  !? .t !
           @ n = g()
           if n==? : >>.
           if :mm:(n,p)~=0 : :cy:(n) . . .) .

N = N | 1000
x = gen(N)
!? .t !
  @ n = x()
  if n==? : >.
  x = filter(n,x)
.
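
A hand expansion of the sieve into plain Lua may help when reading
the rslua text.  Two things in it are assumptions and not part of
the original: the alias mod (math.mod is the Lua 5.0 name; later
versions call it math.fmod), and the wrapper function sieve, a
hypothetical name that packages the driver loop so that the primes
are collected in a table.

```lua
-- math.mod is the name used in the post (Lua 5.0); newer Lua versions
-- provide math.fmod instead.  This alias is an addition for portability.
local mod = math.mod or math.fmod

-- \ gen(n) >> :cw:(\() for i=2,n ! :cy:(i) . .) .
function gen(n)
  return coroutine.wrap(function() for i=2,n do coroutine.yield(i) end end)
end

-- Passes on only those values from generator g that p does not divide.
function filter(p,g)
  return coroutine.wrap(function()
    while true do
      local n = g()
      if n==nil then return end
      if mod(n,p)~=0 then coroutine.yield(n) end
    end
  end)
end

-- Hypothetical wrapper around the driver loop of the example:
-- it stores each prime in a table instead of discarding it.
function sieve(N)
  local primes = {}
  local x = gen(N)
  while true do
    local n = x()
    if n==nil then break end
    table.insert(primes, n)
    x = filter(n,x)
  end
  return primes
end

print(table.concat(sieve(30), " "))
```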

The preprocessor is a sed script.  Here is what it does.

Characters and character combinations are replaced as

:     /?      //    !?     !!      >?     !   .    @
then  elseif  else  while  repeat  until  do  end  local

>.         !>.
break end  do break end

>>      >>.         !>>.           \
return  return end  do return end  function

&    |   ~    .f     .t    .pi      ?
and  or  not  false  true  math.pi  nil
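
The one-to-one part of the replacement can be sketched in Lua
itself.  This is a minimal sketch, not the sed script: it only
honours whitespace as a delimiter, so a token glued to other
characters (as in `n==?') is left alone.  The names tokens and
replace_tokens are made up for the illustration.

```lua
-- Replacement table copied from the lists above.
local tokens = {
  [":"] = "then", ["/?"] = "elseif", ["//"] = "else",
  ["!?"] = "while", ["!!"] = "repeat", [">?"] = "until",
  ["!"] = "do", ["."] = "end", ["@"] = "local",
  [">."] = "break end", ["!>."] = "do break end",
  [">>"] = "return", [">>."] = "return end", ["!>>."] = "do return end",
  ["\\"] = "function",
  ["&"] = "and", ["|"] = "or", ["~"] = "not",
  [".f"] = "false", [".t"] = "true", [".pi"] = "math.pi", ["?"] = "nil",
}

-- Replace every whitespace-delimited word that is a known token.
-- string.gsub keeps the original word when the table has no entry for it.
function replace_tokens(line)
  return (line:gsub("%S+", tokens))
end

print(replace_tokens("!? .t !"))
print(replace_tokens("if n == ? : >."))
```

Handling the extra delimiters described below (such as `==' before
`?') would need more than a lookup on whole words.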

Also, sequences of the form `<name> <op>=' (with optional
spaces after <name>) are replaced by `<name> = <name> <op>',
where <op> is one of the following:

+   -   *   /   ^   ..   &   |

(This implies that & and | change to `and' and `or', so
that e.g. `x |= 1' becomes `x = x or 1'.)

Sequences of the form `++<name>', `--<name>' and `~~<name>'
are replaced by `<name> = <name>+1', `<name> = <name>-1' and
`<name> = not <name>', respectively.
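
These two rules can be sketched with Lua patterns.  What follows is
an illustration under simplifying assumptions, not a port of the sed
script: <name> must be preceded by whitespace (or start of line),
`++<name>' and friends must also be followed by whitespace (or end
of line), and the function name expand_shorthand is invented here.
It relies on the %f frontier pattern, so it needs Lua 5.1 or later.

```lua
-- Expand `<name> <op>=', `++<name>', `--<name>' and `~~<name>' on one line.
function expand_shorthand(line)
  -- Pad with spaces so that start and end of line count as whitespace.
  local s = " " .. line .. " "
  local name = "([%a_][%w_]*)"
  -- %f[%S] matches only where a run of non-whitespace begins.
  s = s:gsub("%f[%S]" .. name .. "%s*([%+%-%*/%^])=", "%1 = %1 %2")
  s = s:gsub("%f[%S]" .. name .. "%s*%.%.=", "%1 = %1 ..")
  s = s:gsub("%f[%S]" .. name .. "%s*&=", "%1 = %1 and")
  s = s:gsub("%f[%S]" .. name .. "%s*|=", "%1 = %1 or")
  -- %f[%s] matches only where a run of whitespace begins.
  s = s:gsub("%f[%S]%+%+" .. name .. "%f[%s]", "%1 = %1+1")
  s = s:gsub("%f[%S]%-%-" .. name .. "%f[%s]", "%1 = %1-1")
  s = s:gsub("%f[%S]~~" .. name .. "%f[%s]", "%1 = not %1")
  return s:sub(2, -2)                 -- drop the padding again
end

print(expand_shorthand("x += 1"))
print(expand_shorthand("s ..= t"))
print(expand_shorthand("++i"))
```

Since only the text up to and including `=' is rewritten, whatever
follows it is carried over unchanged.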

And sequences of the form `:<name>:' are replaced by names
of library functions as follows:

:cc: (coroutine.create), :cr: (coroutine.resume),
:cy: (coroutine.yield), :cw: (coroutine.wrap), and
:cs: (coroutine.status).

:ma: (math.abs), :mm: (math.mod), :_: (math.floor),
:^: (math.ceil), :\/: (math.min), :/\: (math.max),
:ms: (math.sin), :mc: (math.cos), :mt: (math.tan),
:mas: (math.asin), :mac: (math.acos), :mat: (math.atan),
:mat2: (math.atan2), :mq: (math.sqrt), :mp: (math.pow),
:me: (math.exp), :ml: (math.log), and :ml0: and :m10:
(math.log10).
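
The `:<name>:' rule amounts to a table lookup.  The sketch below
assumes nothing about the delimiters (it replaces any known
`:<name>:' wherever it occurs), lists only part of the table, and
uses the invented names abbrev and expand_names.

```lua
-- Part of the abbreviation table from the lists above.
local abbrev = {
  cc = "coroutine.create", cr = "coroutine.resume", cy = "coroutine.yield",
  cw = "coroutine.wrap",   cs = "coroutine.status",
  ma = "math.abs", mm = "math.mod", mq = "math.sqrt",
  ["_"] = "math.floor", ["^"] = "math.ceil",
  ["\\/"] = "math.min", ["/\\"] = "math.max",
}

-- Replace :<name>: by the full function name; unknown names are kept,
-- because string.gsub leaves the match alone when the lookup yields nil.
function expand_names(line)
  return (line:gsub(":([^%s:]+):", abbrev))
end

print(expand_names("if :mm:(n,p)~=0 : :cy:(n) ."))
```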

All the sequences mentioned are replaced only if they are
preceded and followed by either whitespace (i.e. space,
tab, or start/end of line) or one of the following
specific characters or sequences:
  -- \, .f, .t, ?, and .pi may be preceded by one of ,;([{
  -- all of the above but \ can also be preceded by == or ~=
  -- additionally, before .pi there may be +, -, *, /, ^,
     <, >, <=, or >=
  -- ., .f, .t, ?, .pi, as well as ++<name>, --<name>, and
     ~~<name>, may be followed by one of ,;)]}
  -- .pi can also be followed by one of +, -, *, /, ^,
     <, >, ==, and ~=
  -- \ may be followed by (
  -- :<name>: may be followed by (

I tried to choose the replacement rules, including
delimiters, according to what seemed reasonable to me in
the context of the real syntax of Lua.

If anyone is interested, I can post the sed script that
implements the preprocessor.  It is actually shorter than
my explanation here. :)

Using rslua raises a few issues:

One must be careful to properly separate tokens that are
to be replaced, and to avoid replacement where not wanted.

The expression following <op>= may have to be enclosed in
parentheses, or the result may differ from what is expected:
`x *= a+b' becomes `x = x * a+b', not `x = x * (a+b)'.

The following two are more serious, but can be addressed
by using awk (or why not Lua) instead of sed to implement
the preprocessor.

In += etc. only simple names can be dealt with (no dots,
no colons).

The preprocessor will make replacements even inside string
constants, if there is a match.  That breaks the program.
(The same holds for comments, but this is less of a problem.)
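
As a rough idea of how a Lua implementation could leave string
constants alone, here is a sketch that applies a transformation
only to the parts of a line outside quoted strings.  It is an
assumption-laden illustration: it handles only "..." and '...'
literals that start and end on the same line, knows nothing about
[[...]] strings or comments, and the names outside_strings and demo
are invented.

```lua
-- Apply transform to the code parts of a line, copying string
-- literals through unchanged.
function outside_strings(line, transform)
  local out, pos, n = {}, 1, #line
  while pos <= n do
    local q = line:find("[\"']", pos)        -- next opening quote
    if not q then
      out[#out+1] = transform(line:sub(pos)) -- rest of line is code
      break
    end
    out[#out+1] = transform(line:sub(pos, q-1))
    -- Scan to the matching quote, stepping over backslash escapes.
    local quote, i = line:sub(q, q), q + 1
    while i <= n and line:sub(i, i) ~= quote do
      if line:sub(i, i) == "\\" then i = i + 1 end
      i = i + 1
    end
    out[#out+1] = line:sub(q, i)             -- literal, copied verbatim
    pos = i + 1
  end
  return table.concat(out)
end

-- Tiny stand-in for the real replacement step: only `?' and `.'.
local function demo(code)
  return (code:gsub("%S+", { ["?"] = "nil", ["."] = "end" }))
end

print(outside_strings('print("a ? b") .', demo))
```

Each code fragment is transformed separately, so a quote character
acts as a token boundary here, which the sed rules do not allow.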