I would look for cases where one specialized instruction of the LPEG virtual
machine takes more space than the two or three more general instructions that
could replace it.

For example, there is an ISet instruction for character sets. When you write an
ordered choice like:


> a = lpeg.P'x' + lpeg.P'y'
> a:print()
[]
00: set [(78-79)]
09: end 

The pattern "a" is compiled to a single character set instruction, and its size
is 9 (the next instruction starts at offset 09).

If there is some pattern like:
> a = (lpeg.P'x' + 'y') *  smallpattern

It would become smaller if you rewrite it like:
> a = 'x' * smallpattern + 'y' * smallpattern
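
Before applying this rewrite it may be worth checking that both forms accept
the same input. The following is only a sketch: "smallpattern" is a
hypothetical stand-in (a single digit), and the inputs are made up for
illustration.

```lua
local lpeg = require "lpeg"

-- Hypothetical stand-in for "smallpattern": a single decimal digit.
local smallpattern = lpeg.R"09"

-- Original form: a choice between two single characters, then smallpattern.
local with_set = (lpeg.P"x" + "y") * smallpattern

-- Rewritten form: smallpattern is repeated in each alternative.
local distributed = lpeg.P"x" * smallpattern + lpeg.P"y" * smallpattern

-- Both forms should return the same result (end position or nil) on every input.
for _, input in ipairs{"x1", "y2", "z3", "x", ""} do
  assert(with_set:match(input) == distributed:match(input))
end
```

Whether the rewritten form is actually smaller depends on the size of
smallpattern, so comparing the a:print() output of both forms is the safest
way to decide.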

If you factor out a common prefix, you can also get
a smaller pattern. Try:

> a = lpeg.P"aXYZ" + "aLMN" + "aFGH"
> a:print()

and

> a = lpeg.P"a" * (lpeg.P"XYZ" + "LMN" + "FGH")
> a:print()
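
The two patterns above should match exactly the same strings. A minimal
sketch to confirm that, assuming the lpeg module is installed (the test
inputs are made up for illustration):

```lua
local lpeg = require "lpeg"

-- Every alternative repeats the leading "a".
local unfactored = lpeg.P"aXYZ" + "aLMN" + "aFGH"

-- The common prefix "a" is matched once, before the choice.
local factored = lpeg.P"a" * (lpeg.P"XYZ" + "LMN" + "FGH")

-- Both patterns should return the same result (end position or nil).
for _, input in ipairs{"aXYZ", "aLMN", "aFGH", "aXYQ", "XYZ", "a"} do
  assert(unfactored:match(input) == factored:match(input))
end
```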


Sérgio






----- Original message ----
From: Norman Ramsey <nr@cs.tufts.edu>
To: lua@bazar2.conectiva.com.br
Cc: nr@cs.tufts.edu
Sent: Thursday, November 27, 2008 21:52:27
Subject: LPEG -- getting clobbered by pattern too big!

I'm writing an LPEG grammar for a moderately simple assembly language.
Unfortunately the very last instruction results in an LPEG
error: my pattern is too big!  Apparently Roberto is using
16-bit signed integers to count offsets in the abstract machine,
so my pattern is limited to 32K elements.   I'm probably doing
lots of stupid things, but it sure is annoying because my
grammar (appended) is not that big.

Any hints on how to reduce the number of elements in an LPEG pattern?
I'm not even sure what 'elements' are at the source level...
and I doubt anybody wants to read my 800-line assembler!

The grammar is below; all hints are welcome!


Norman


============= UM Macro Assembler input grammar ============
  <comment> ::= from # or // to end-of-line
 <reserved> ::= if | a | goto | new | array | nand | xor
             |  inactivate | input | output | in | program | using
             |  off | here | halt | words | push | pop | on | off | stack
    <ident> ::= identifier as in C, except <reserved> or <reg>
    <label> ::= <ident>
      <reg> ::= rNN, where NN is any decimal number
        <k> ::= <hex-literal> | <decimal-literal> | <character-literal>
   <lvalue> ::= <reg> | a[<reg>][<rvalue>]
   <rvalue> ::= <reg> | a[<reg>][<rvalue>]
             |  <k> | <label> | <label> + <k> | <label> - <k>
    <relop> ::= != | == | <s | >s | <=s | >=s
    <binop> ::= + | - | * | / | nand | & | '|' | xor | mod
     <unop> ::= - | ~
    <instr> ::= <lvalue> := <rvalue>
             |  <lvalue> := <rvalue> <binop> <rvalue>
             |  <lvalue> := <unop> <rvalue>
             |  <lvalue> := new array (<rvalue> words)
             |  inactivate a[<reg>]
             |  <lvalue> := input()
             |  output <rvalue>
             |  output <string-literal>
             |  goto *<reg> in program a[<reg>]
             |  halt
             |  goto <rvalue> [linking <reg>]
             |  if (<rvalue> <relop> <rvalue>) <lvalue> := <rvalue>
             |  if (<rvalue> <relop> <rvalue>) goto <rvalue>
             |  push <rvalue> on  stack <reg>
             |  pop  <lvalue> off stack <reg>
<directive> ::= .section <ident>
             |  .data <label> [(+|-) <k>]
             |  .data <k>
             |  .space <k>
             |  .string <string-literal>
             |  .zero <reg> | .zero off               // identify zero register
             |  .temps <reg> {, <reg>} | .temps off   // temporary regs
     <line> ::= {<label>:} [<instr> [using <reg> {, <reg>}] | <directive>]
  <program> ::= {<line> (<comment> | newline | ;)}


