It was thus said that the Great albertmcchan once stated:
> local C, P = lpeg.C, lpeg.P
> 
> -- my attempt for lua pattern "(.+)and(.*)"
> local lpeg_pat = C((P(1) - 'and')^1) * 'and' * C(P(1)^0)
> local re_pat = re.compile "{ (. ! 'and')* . } 'and' {.*}"
> 
> -- lua pattern "(.*)and(.*)"
> lpeg_pat = C((P(1) - 'and')^0) * 'and' * C(P(1)^0)
> 
> -- what is lpeg re equivalent code ?

  That was weird.  I would expect

	C((P(1) - 'and')^1) * 'and' * C(P(1)^0)

to map to:

	{ ( . ! 'and' )+ } 'and' { .* }

but it didn't: the two compile to different parse trees.  I went so far as
to recompile LPeg with debugging so I could dump them.  The LPeg version,
which works, dumped out:

	[]
	seq
	  seq
	    capture kind: 'simple'  key: 0
	      seq
	        seq
	          not
	            seq
	              char 'a'
	              seq
	                char 'n'
	                char 'd'
	          any
	        rep
	          seq
	            not
	              seq
	                char 'a'
	                seq
	                  char 'n'
	                  char 'd'
	            any
	    seq
	      char 'a'
	      seq
	        char 'n'
	        char 'd'
	  capture kind: 'simple'  key: 0
	    rep
	      any

  While the re code dumped out as:

	[]
	seq
	  seq
	    capture kind: 'simple'  key: 0
	      seq
	        seq
	          any	-- !!!!!!
	          not
	            seq
	              char 'a'
	              seq
	                char 'n'
	                char 'd'
	        rep
	          seq
	            any
	            not
	              seq
	                char 'a'
	                seq
	                  char 'n'
	                  char 'd'
	    seq
	      char 'a'
	      seq
	        char 'n'
	        char 'd'
	  capture kind: 'simple'  key: 0
	    rep
	      any

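  The difference is marked with "-- !!!!!!":  in the re version the "any"
comes before the "not", while LPeg's P(1) - 'and' tests the "not" first and
only then consumes a character.  I then tried: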

	{ (! 'and' . )+  } 'and' {.*}

(NOTE:  I swapped the order in the first bit, so the predicate comes before
the '.') and that produced the same code as the LPeg version.  So to answer
your question, the re equivalent of your "(.*)and(.*)" pattern would be:

	{ (! 'and' .)* } 'and' {.*}
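
  For anyone who wants to check this without rebuilding LPeg, here is a
minimal sketch that runs the patterns side by side.  It assumes the stock
lpeg and re modules are installed; the sample string and the variable names
are just mine for illustration.

```lua
local lpeg = require "lpeg"
local re   = require "re"
local C, P = lpeg.C, lpeg.P

-- P(1) - 'and' behaves like -P'and' * P(1): the predicate is tested first.
local lpeg_plus = C((P(1) - 'and')^1) * 'and' * C(P(1)^0)

-- re versions with the predicate first (these match the LPeg code) ...
local re_plus  = re.compile "{ (!'and' .)+ } 'and' {.*}"
local re_star  = re.compile "{ (!'and' .)* } 'and' {.*}"

-- ... and the version I first expected to work, with '.' first.
local re_wrong = re.compile "{ (. !'and')+ } 'and' {.*}"

local s = "salt and pepper"            -- illustrative input

print(lpeg_plus:match(s))              --> "salt " and " pepper"
print(re_plus:match(s))                --> "salt " and " pepper"
print(re_star:match("and pepper"))     --> "" and " pepper"
print(re_wrong:match(s))               --> nil
```

  The last one fails because (. !'and') refuses to consume the space right
before "and", so the capture stops one character short and the literal
'and' that follows cannot match there.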

  -spc