It was thus said that the Great Sean Conner once stated:
> > 
> > Also, unless I misunderstood something or something has changed since I
> > last looked at it, LPeg cannot do anything requiring backtracking, so there
> > are things matchable by patterns but unmatchable by LPeg. Possible example
> > being "capture the three characters between a question mark and a dollar
> > sign, immediately before the dollar sign", for which the (untested) pattern
> > "%?.-([^$][^$][^$])%$" should match easily, but AFAICT, there is no
> > equivalent LPeg pattern since the only way is to match up to the $ and then
> > back up three characters.
> 
>   Yes it can:
> 
> 	qd = lpeg.P"?"    -- match the question mark
> 	   * lpeg.P(1)^3  -- at least three characters, could be more
> 	   * lpeg.Cmt(	  -- at match time
>                   lpeg.P"$", -- when we hit the dollar sign, call this
>                   function(subject,position)
> 	            -- position is one past the dollar sign,
> 	            -- so we return the three characters prior to the 
> 	            -- dollar sign (position-4 to position-2 of subject)
>  	            return position,subject:sub(position-4,position-2)
> 	          end
> 	     )

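  I realized the code has two bugs.  The three characters matched should
*not* match the dollar sign, and lpeg.P(1)^3 never backtracks, so it eats
the dollar sign before lpeg.Cmt() ever sees it.  Easiest fix is to do the
check in the function and use that check to stop the skipping:

	dollar = lpeg.Cmt(lpeg.P"$",function(subject,position)
	           -- position is one past the dollar sign
	           local ret = subject:sub(position-4,position-2)
	           if not ret:find("$",1,true) then
	             return position,ret
	           end
	         end
	       )
	qd     = lpeg.P"?"
	       * lpeg.P(3)              -- at least three characters
	       * (lpeg.P(1) - dollar)^0 -- skip until the check passes
	       * dollar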

OR, you could fix it:

	no_dollar = lpeg.R("\0#","%\255") -- look ma!  No $
	qd        = lpeg.P"?"
	          * (lpeg.P(1) - (no_dollar * no_dollar * no_dollar * lpeg.P"$"))^0
	          * lpeg.C(no_dollar * no_dollar * no_dollar)
	          * lpeg.P"$"
          
and get rid of the call to lpeg.Cmt().  Okay, this one uses:

	lpeg.R()	- define ranges of characters to match
	lpeg.C()	- return pattern matches as a capture
	patt1 - patt2	- match patt1 if patt2 does not match
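
  A small sketch of why the subtraction matters, assuming LPeg is installed
and loadable with require "lpeg" (the names any, dollar, greedy, bounded and
spelled are only for illustration).  LPeg repetition never gives characters
back, so a bare lpeg.P(1)^0 runs to the end of the subject and a "$" placed
after it can never match.  Subtracting a pattern makes each step first check
that the pattern does *not* match at the current position:

```lua
local lpeg = require "lpeg"

local any    = lpeg.P(1)
local dollar = lpeg.P"$"

-- greedy: any^0 consumes the whole subject, including the "$",
-- and LPeg does not back up to let the trailing dollar match
local greedy  = any^0 * dollar

-- bounded: each step consumes one character only if it is not a "$"
local bounded = (any - dollar)^0 * dollar

-- the same thing spelled with the not-predicate; -dollar consumes nothing
local spelled = (-dollar * any)^0 * dollar

print(lpeg.match(greedy,  "abc$"))  -- nil
print(lpeg.match(bounded, "abc$"))  -- 5
print(lpeg.match(spelled, "abc$"))  -- 5
```

  With no captures in the pattern, lpeg.match() returns the position just
past the match, hence the 5.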

  That pattern should take care of:
  
  	?$$$$$123$	- returns 123
  	?$$$$		- returns nil
  	?123$		- returns 123
  	?987654123$	- returns 123
  	?12$		- returns nil
  	?12$$		- returns nil
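
  One way to check the list above is a small script along these lines.  It
is only a sketch: it assumes LPeg can be loaded with require "lpeg", and the
three and cases names are mine, added to keep it short.

```lua
local lpeg = require "lpeg"

local no_dollar = lpeg.R("\0#","%\255")           -- every byte except "$"
local three     = no_dollar * no_dollar * no_dollar
local qd        = lpeg.P"?"
                * (lpeg.P(1) - (three * lpeg.P"$"))^0 -- skip until 3 non-$ then $
                * lpeg.C(three)                       -- capture those three
                * lpeg.P"$"

-- subject, expected capture (nil means the match should fail)
local cases =
{
  { "?$$$$$123$"  , "123" },
  { "?$$$$"       , nil   },
  { "?123$"       , "123" },
  { "?987654123$" , "123" },
  { "?12$"        , nil   },
  { "?12$$"       , nil   },
}

for _,case in ipairs(cases) do
  local subject,expected = case[1],case[2]
  local got = lpeg.match(qd,subject)
  print(subject,got,got == expected and "ok" or "MISMATCH")
end
```

  lpeg.match() is anchored at the start of the subject, which is fine here
since every test string begins with the question mark.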

  -spc (Think of 'patt1 - patt2' as look ahead ... )