• Subject: Re: Lpeg recursive patterns, part II
• From: Wim Langers <wim.langers@...>
• Date: Tue, 22 Feb 2011 07:39:56 +0100

Thanks, Tony, for pointing the problem out to me! (Back to the drawing board...)

Cheers,

Wim

On Mon, Feb 21, 2011 at 11:21 PM, Tony Finch wrote:
On 21 Feb 2011, at 21:22, Wim Langers <wim.langers@adrias.biz> wrote:

local PATTERN = {
'Shapes';
_Str = C(R('az') ^ 1),

Shape = V('_Str') * P('::') * V('_Str') * P('()') * P(' ') ^ 0 / function(key,id) return 'shape'..key..id end,

Note that this pattern eats all spaces following the shape expression.

Shapes = P('shapes') * (P('(') * Ct(V('Shapes_')) * P(')')) / function(t) return t end,

This function capture is redundant!

Shapes_ = V('Shape') * P(' ')^1 * V('Shapes_') + V('Shape')

This pattern's first branch cannot match, because all the spaces after the first Shape have already been eaten, and there are none left for the P(' ')^1. Unlike regexes, PEGs have limited backtracking: they will not backtrack over a repetition and try a shorter match. (This is why they are less prone to unexpectedly bad performance.) So you must never follow a repetition with a pattern that needs to match text the repetition has already consumed.
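
The effect can be reproduced in isolation with a minimal sketch. It assumes the lpeg module is installed and loadable with require; the names greedy and fixed are made up for this illustration and are not part of the grammar above.

```lua
local lpeg = require('lpeg')
local P = lpeg.P

-- P(' ')^0 consumes every space it can and never gives any back,
-- so the P(' ') that follows finds 'b' instead of a space and fails.
local greedy = P('a') * P(' ')^0 * P(' ') * P('b')

-- Matching the separator with a single repetition avoids the problem.
local fixed = P('a') * P(' ')^1 * P('b')

print(greedy:match('a  b'))  --> nil
print(fixed:match('a  b'))   --> 5 (position just after the match)
```

A regex engine would retry the repetition with one space fewer and succeed on the first pattern; LPeg does not, which is the behaviour that breaks the first branch of Shapes_.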

A good way to design lpeg patterns for conventional languages is to define patterns for the lexical tokens that include white space, and ignore space at the higher syntactic levels. This makes the pattern look more like a typical context free grammar. Ensure you consistently match space always at the start or always at the end of each token.
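
As a rough sketch of that approach applied to the grammar above (not a definitive rewrite): every lexical token eats its own trailing spaces, and the syntactic rules never mention space at all. The helper token and the names space and name are assumptions introduced for this example, and the sketch also accepts spaces between the parts of a shape, which the original pattern did not.

```lua
local lpeg = require('lpeg')
local P, R, C, Ct, V = lpeg.P, lpeg.R, lpeg.C, lpeg.Ct, lpeg.V

-- Optional white space; only the token helper below refers to it.
local space = P(' ')^0

-- Hypothetical helper: wrap a pattern so it also consumes trailing spaces.
local function token(p)
  return p * space
end

-- Lexical level: a lower-case name, captured without its trailing spaces.
local name = token(C(R('az')^1))

-- Syntactic level: no explicit space handling anywhere.
local grammar = P{
  'Shapes',
  Shapes = space * token(P('shapes')) * token(P('('))
         * Ct(V('Shape')^1) * token(P(')')) * P(-1),
  Shape  = name * token(P('::')) * name * token(P('()'))
         / function(key, id) return 'shape' .. key .. id end,
}

local t = grammar:match('shapes(circle::one() square::two())')
print(t[1], t[2])  --> shapecircleone  shapesquaretwo
```

Because space is always matched at the end of a token (plus once at the very start of the input), no rule has to guess whether the previous rule left any spaces behind, and the list of shapes becomes a plain repetition instead of a recursive rule with a separator.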

Tony.
--
f.anthony.n.finch  <dot@dotat.at>  http://dotat.at/