On Sun, 2005-01-30 at 04:35, Gunnar Zötl wrote:

> 
> I proposed a change to the reader some time ago, that would allow the
> implementation of a proper macro system on top of lua, without an
> external preprocessor, and it could be written in lua. Think scheme's
> syntax-rules, but in a manner that would be somewhat easier to grasp.
> This would allow for any kind of syntactic sugar you could think of.
> As I understand it, the basics of this are already there in 5.1.

Not really -- as I understand it, it works at the lexer level,
but almost all useful syntactic sugar requires parsing.

There is a grammar for Lua in the manual, but it is
only good enough to generate Lua programs: it is far
too ambiguous to parse with.

I have a Lua grammar which verifies Lua syntax,
and it is considerably more complex (and still has
around 5 ambiguities). I include it below.

To support syntactic sugar, two changes are needed.
First, instead of returning (), which is the Felix equivalent of
Lua's nil, each action would have to return a string representing
the nonterminal parsed -- in this parser that would just
be a pretty printer for Lua code.

Second, you would need to be able to add new grammar
productions whose action code generates strings
of the base language (so the Lua parser would not see the
sugar).
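
As a rough illustration of those two changes, here is a minimal Python sketch
(not the Felix/Elkhound setup described in this post). It parses a toy
assignment language in which every action returns the pretty-printed text of
what it parsed, and a hypothetical SUGAR table adds `+=` and `-=` productions
whose actions emit plain base-language text. The names tokenize, Parser, SUGAR
and desugar are all invented for this example.

```python
import re

# Names/numbers in group 1, operators and parentheses in group 2.
TOKEN = re.compile(r"\s*(?:(\d+|[A-Za-z_]\w*)|(\+=|-=|[=+\-()]))")

def tokenize(src):
    """Split a toy statement into name, number, and operator tokens."""
    src = src.rstrip()
    tokens, pos = [], 0
    while pos < len(src):
        m = TOKEN.match(src, pos)
        if not m:
            raise SyntaxError(f"bad input at offset {pos}")
        tokens.append(m.group(1) or m.group(2))
        pos = m.end()
    return tokens

# Hypothetical extension table: each sugar operator maps to an action
# that returns a string of the base language.
SUGAR = {
    "+=": lambda name, rhs: f"{name} = {name} + ({rhs})",
    "-=": lambda name, rhs: f"{name} = {name} - ({rhs})",
}

class Parser:
    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        tok = self.peek()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        self.pos += 1
        return tok

    def atom(self):
        tok = self.take()
        if tok == "(":
            inner = self.exp()
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return f"({inner})"
        if tok.isidentifier() or tok.isdigit():
            return tok
        raise SyntaxError(f"unexpected token {tok!r}")

    def exp(self):
        # exp: atom (('+' | '-') atom)* ; the action returns the printed text.
        text = self.atom()
        while self.peek() in ("+", "-"):
            op = self.take()
            text = f"{text} {op} {self.atom()}"
        return text

    def stat(self):
        name = self.take()
        if not name.isidentifier():
            raise SyntaxError(f"expected a name, got {name!r}")
        op = self.take()
        if op == "=":            # base production: printed unchanged
            return f"{name} = {self.exp()}"
        if op in SUGAR:          # added production: printed as base language
            return SUGAR[op](name, self.exp())
        raise SyntaxError(f"unexpected token {op!r}")

def desugar(src):
    """Return base-language text for one toy statement."""
    parser = Parser(tokenize(src))
    out = parser.stat()
    if parser.peek() is not None:
        raise SyntaxError("trailing tokens")
    return out
```

Calling desugar("x += y - 1") yields text that contains only the base `=`
form, which is the point: whatever consumes the output never sees the sugar.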

On top of that, to make the grammar work, you'd need
a parser engine -- for this grammar I'm using some
OCaml code to translate the Felix grammar to an Elkhound
grammar, and Elkhound to translate that to C++.

This could then be wrapped as a function, and patched
into the core interpreter so that source strings are
filtered through the parser before Lua sees them.

This grammar is GLR but close to LALR. An LL grammar
may be easier to extend. But 'user defined extensions'
to Lua are not going to be all that easy, because the
language itself is much more complex than you might
think: Lua is NOT as simple as, say, Scheme or Forth.

The bottom line is: FORGET IT. If you want OO sugars
added to Lua, hacking the existing parser is by
far the easiest way to do it.

** As mentioned, about 5 ambiguities remain,
and I basically gave up trying to resolve them.
GLR tolerates ambiguity, and for the purpose of
syntax checking the actual parse tree doesn't matter.
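
To see why ambiguity is harmless for pure syntax checking, here is a small
Python sketch. It is not GLR, just a memoized brute-force count over token
spans, and it uses a deliberately ambiguous toy grammar, E -> E '+' E | 'n',
rather than anything from the Lua grammar below. The function names
count_parses and is_valid are made up for the example.

```python
from functools import lru_cache

def count_parses(tokens):
    """Count parse trees of `tokens` under E -> E '+' E | 'n'."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def trees(lo, hi):
        # Number of distinct ways tokens[lo:hi] can be derived from E.
        if hi - lo == 1:
            return 1 if tokens[lo] == "n" else 0
        total = 0
        # Try every '+' that leaves at least one token on each side.
        for mid in range(lo + 1, hi - 1):
            if tokens[mid] == "+":
                total += trees(lo, mid) * trees(mid + 1, hi)
        return total

    return trees(0, len(tokens)) if tokens else 0

def is_valid(tokens):
    """A syntax checker only asks whether at least one parse exists."""
    return count_parses(tokens) > 0
```

For the input n + n + n there are two parse trees, but is_valid gives the same
answer it would give if there were exactly one; only a consumer that needs a
specific tree has to care which one is chosen.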

*** Writing a parser in Lua will not be easy:
the language isn't particularly well suited to that.

**** A simple solution is to allow only
specific kinds of extensions, e.g.:

(a) add a new statement, using existing
nonterminals -- can probably be done with
a hook

(b) add new operators -- probably can be done
by just modifying a table of operators
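
Option (b) can be sketched with a table-driven precedence-climbing parser.
This is a Python illustration only, not how the Lua parser is actually
organised: the OPERATORS table, its precedence numbers and the parse function
are assumptions made for the example, with the ordering chosen to mirror the
expression layers in the grammar below. Tokens are taken as a ready-made list.

```python
# Hypothetical operator table: token -> (precedence, right_associative).
OPERATORS = {
    "or": (1, False), "and": (2, False),
    "<": (3, False), "==": (3, False),
    "..": (4, True),
    "+": (5, False), "-": (5, False),
    "*": (6, False), "/": (6, False),
    "^": (8, True),
}

def parse(tokens, operators=OPERATORS):
    """Parse binary expressions into a fully parenthesised string."""
    pos = 0

    def atom():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            inner = expr(0)
            if pos >= len(tokens) or tokens[pos] != ")":
                raise SyntaxError("expected ')'")
            pos += 1
            return inner
        if tok == ")" or tok in operators:
            raise SyntaxError(f"unexpected token {tok!r}")
        return tok

    def expr(min_prec):
        nonlocal pos
        left = atom()
        # Consume operators that bind at least as tightly as min_prec.
        while pos < len(tokens) and tokens[pos] in operators:
            op = tokens[pos]
            prec, right_assoc = operators[op]
            if prec < min_prec:
                break
            pos += 1
            # Right-associative operators let the same level recur on the right.
            rhs = expr(prec if right_assoc else prec + 1)
            left = f"({left} {op} {rhs})"
        return left

    result = expr(0)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return result

# Adding an operator is a one-line change to a copy of the table.
EXTENDED = dict(OPERATORS)
EXTENDED["//"] = (6, False)  # invented operator, same level as '*' and '/'
```

With the default table, "a // b + c" is a syntax error; with EXTENDED it
parses, and no parsing code had to change. New statements, as in (a), need
more than this, since they change which nonterminal is being parsed.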


-------------------------------------------
  comment "Top level Lua grammar";

  nonterm block : unit =
    | clause* => ()
  ;

  nonterm clause : unit =
    | stat SEMICOLON => ()
    | stat => ()
  ;

  nonterm stat : unit = 
    | varlist1 EQ explist1  => ()
    | functioncall  => ()
    | DO block END  => ()
    | WHILE exp DO block END  => ()
    | REPEAT block UNTIL exp  => ()
    | IF exp THEN block elseif_clause* else_clause? END  => ()
    | RETURN explist1? => ()
    | BREAK => ()
    | FOR NAME EQ exp COMMA_exp COMMA_exp? DO block END  => ()
    | FOR namelist IN explist1 DO block END  => ()
    | FUNCTION funcname funcbody  => ()
    | LOCAL FUNCTION NAME funcbody  => ()
    | LOCAL namelist init? => ()
  ;

  nonterm elseif_clause : unit =
    | ELSEIF exp THEN block => ()
  ;

  nonterm else_clause : unit =
    | ELSE block => ()
  ;

  nonterm funcname : unit = 
    | NAME DOT_NAME* COLON_NAME? => ()
  ;

  nonterm DOT_NAME : unit =
    | DOT NAME => ()
  ;

  nonterm COLON_NAME : unit =
    | COLON NAME => ()
  ;

  nonterm COMMA_NAME : unit =
    | COMMA NAME => ()
  ;

  nonterm COMMA_vari : unit =
    | COMMA vari => ()
  ;

  nonterm varlist1 : unit = 
    | vari COMMA_vari* => ()
  ;

  nonterm namelist : unit = 
    | NAME COMMA_NAME* => ()
  ;

  nonterm init : unit = 
    | EQ explist1 => ()
  ;

  nonterm COMMA_exp : unit =
    | COMMA exp => ()
  ;

  nonterm explist1 : unit = 
    | exp COMMA_exp* => ()
  ;

  nonterm args : unit =
    | LPAREN explist1 RPAREN => ()
    | LPAREN RPAREN => ()
    | tableconstructor  => ()
    | STRING  => ()
  ;

  nonterm function : unit =
    | FUNCTION funcbody => ()
  ;

  nonterm funcbody : unit = 
    | LPAREN parlist1 RPAREN block END => ()
    | LPAREN RPAREN block END => ()
  ;

  nonterm parlist1 : unit = 
    | namelist COMMA DOTDOTDOT => ()
    | namelist => ()
    | DOTDOTDOT => ()
  ;

  nonterm tableconstructor : unit = 
    | LBRACE fieldlist RBRACE => () 
    | LBRACE RBRACE => ()
  ;

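  // helper so the repetition uses * like the other lists,
  // instead of the manual's {...} EBNF braces
  nonterm fieldsep_field : unit =
    | fieldsep field => ()
  ;

  nonterm fieldlist : unit = 
    | field fieldsep_field* fieldsep? => ()
  ;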

  nonterm field : unit = 
    | LSQ exp RSQ EQ exp  => ()
    | NAME EQ exp  => ()
    | exp  => ()
  ;

  nonterm fieldsep : unit = 
    | COMMA => ()
    | SEMICOLON => ()
  ;

  // expressions

  nonterm exp : unit =
    | exp OR lland => ()
    | lland => ()
  ;

  nonterm lland : unit =
    | lland AND comparison => ()
    | comparison => ()
  ;
    
  nonterm comparison : unit =
    | comparison LT cat => ()
    | comparison LE cat => ()
    | comparison GT cat => ()
    | comparison GE cat => ()
    | comparison EQEQ cat => ()
    | comparison NE cat => ()
    | cat => ()
  ;
  
  nonterm cat : unit =
    | sum DOTDOT cat => ()
    | sum => ()
  ;
  
  nonterm sum : unit =
    | sum PLUS factor => ()
    | sum MINUS factor => ()
    | factor => ()
  ;
  
  nonterm factor : unit =
    | factor STAR unary => ()
    | factor SLASH unary => ()
    | unary => ()
  ;

  nonterm unary : unit = 
    | MINUS unary => ()
    | TILDE unary => ()
    | NOT unary => ()
    | power => ()
  ;
  
  nonterm power : unit =
    | atom CARET power => ()
    | atom  => ()
  ;
  
  nonterm atom : unit =
    | NIL => ()
    | FALSE => ()
    | TRUE => ()
    | NUMBER => ()
    | STRING => ()
    | tableconstructor => ()
    | function => ()
    | prefixexp => ()
  ;

  nonterm prefixexp : unit =
    | vari => ()
    | functioncall  => ()
    | LPAREN exp RPAREN => ()
  ;

  nonterm vari : unit = 
    | NAME  => ()
    | prefixexp LSQ exp RSQ => ()
    | prefixexp DOT_NAME => ()
  ;


  nonterm functioncall : unit =
    | prefixexp args  => ()
    | prefixexp COLON_NAME args => ()
  ;


-- 
John Skaller, mailto:skaller@users.sf.net
voice: 061-2-9660-0850, 
snail: PO BOX 401 Glebe NSW 2037 Australia
Checkout the Felix programming language http://felix.sf.net