Glenn McAllister wrote:
I'm having trouble figuring out how to create a grammar that lets me replace escape sequences in a string, when the unescaped version of the string triggers another pattern.

I eventually figured this out, so I decided to reply to my own post in case someone else runs into the same problem. The bit I was missing, even though it's stressed often enough, is how PEGs can be composed bit by bit.


Let's start with the relevant PEG, which hopefully is correct in the first place:

Template    <- Chunk+ End
Chunk       <- Literal / Action / Newline
Newline     <- '\n' / '\r'
Action      <- ActionStart (!ActionEnd .)+ ActionEnd
ActionStart <- !'\\' '$'
ActionEnd   <- '$'
Literal     <- (!(Action / Newline) .)+
End         <- !.

Fixing the PEG to allow for escapes:

Literal <- (!(Action / Newline) (Escape / .))+

I'm not sure how to write the Escape rule in a PEG, since there is no notion of substitution in a raw PEG. The LPeg version is below.
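
One way to stay close to PEG notation is LPeg's re module, which extends PEG syntax with captures: {~ ~} is a substitution capture and p -> 'string' replaces a match with a string. The following is only a sketch, assuming LPeg 1.x (for the {| |} table capture). The names template and eol are made up for this example; eol is passed in through the definitions table because re string literals have no escape sequences. Literal tests for ActionStart rather than a whole Action, to mirror the LPeg grammar, and Newline is left uncaptured.

```lua
local lpeg = require "lpeg"
local re = require "re"

-- `template` and `eol` are illustrative names, not from the original post.
local template = re.compile([[
    Template    <- {| Chunk+ |} !.
    Chunk       <- Literal / Action / Newline
    Newline     <- %eol
    Action      <- ActionStart { (!ActionEnd .)+ } ActionEnd
    ActionStart <- '$'
    ActionEnd   <- '$'
    Escape      <- '\$' -> '$'
    Literal     <- {~ (!(ActionStart / Newline) (Escape / .))+ ~}
]], { eol = lpeg.S"\n\r" })

-- The Lua string below is: text1 tex\$t2 $action1$
local result = template:match("text1 tex\\$t2 $action1$")
print(result[1])   -- text1 tex$t2   (with a trailing space)
print(result[2])   -- action1
```

Because Escape is tried before the catch-all `.` inside Literal, the backslash and the dollar sign are consumed together and the dollar never gets a chance to start an action.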


Basically I want the following:

1) "text1 $action1$ text2"  -> { 'text1', 'action1', 'text2' }
2) "text1\ntext2\ntext3"    -> { 'text1', 'text2', 'text3' }
3) "text1 tex\$t2"          -> { 'text1', 'tex$t2' }

Here's what I have so far, which will do 1) and 2), but obviously not 3):

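local lpeg = require "lpeg"

-- Inside a grammar table, rules refer to each other through lpeg.V;
-- a bare name such as Chunk would just be a nil global.
local grammar = {
    "Template",
    Template = lpeg.Ct(lpeg.V'Chunk'^1) * -1,
    Chunk = lpeg.V'Literal' + lpeg.V'Action' + lpeg.V'Newline',
    Newline = lpeg.C(lpeg.S'\n\r'),
    ActionStart = lpeg.P'$' - lpeg.P'\\',
    ActionEnd = lpeg.P'$',
    Action = lpeg.V'ActionStart' * lpeg.C((1 - lpeg.V'ActionEnd')^1) * lpeg.V'ActionEnd',
    Literal = lpeg.C((1 - (lpeg.V'ActionStart' + lpeg.V'Newline'))^1)
}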
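
The reason 3) cannot work with this grammar: in LPeg, a - b means "match a, provided b does not match at the same position" (it is equivalent to -b * a). That is a lookahead at the current character, not a look behind, so lpeg.P'$' - lpeg.P'\\' behaves like a plain lpeg.P'$' and still fires on the dollar sign of an escaped \$. A small check, with variable names invented for the illustration:

```lua
local lpeg = require "lpeg"

local ActionStart = lpeg.P'$' - lpeg.P'\\'

-- The subject is the two characters \$ ; start matching at the '$' (position 2).
local subject = "\\$"
local with_subtraction = ActionStart:match(subject, 2)
local plain_dollar = lpeg.P'$':match(subject, 2)

-- Both match the '$' and return the position just past it.
print(with_subtraction, plain_dollar)   -- 3   3
```

So the backslash has to be dealt with before ActionStart is ever tried, which means handling the escape inside Literal.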

Here's the updated version:

-- I'm going to have more escapes in the future...
local escapes = {
    ['\\$'] = '$'
}

local grammar = {
    -- Template, Chunk, Newline, Action, ActionStart, ActionEnd as before
    Escape = (lpeg.P'\\' * lpeg.S'$') / escapes,
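    -- Escape is tried first, so '\$' is consumed as a unit and replaced by '$'
    Literal = lpeg.Cs(((lpeg.V'Escape' + 1) - (lpeg.V'ActionStart' + lpeg.V'Newline'))^1)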
}
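
For anyone who wants to paste and run it, here is a sketch of the whole grammar assembled from the fragments above, with the rule references written through lpeg.V. The names template, r1, r2 and r3 are only for this example, and input 3) is taken to be "text1 tex\$t2".

```lua
local lpeg = require "lpeg"
local V = lpeg.V

local escapes = {
    ['\\$'] = '$'
}

-- `template` is an illustrative name for the compiled grammar.
local template = lpeg.P{
    "Template",
    Template = lpeg.Ct(V'Chunk'^1) * -1,
    Chunk = V'Literal' + V'Action' + V'Newline',
    Newline = lpeg.C(lpeg.S'\n\r'),
    ActionStart = lpeg.P'$' - lpeg.P'\\',
    ActionEnd = lpeg.P'$',
    Action = V'ActionStart' * lpeg.C((1 - V'ActionEnd')^1) * V'ActionEnd',
    -- Query capture: the matched text '\$' is looked up in `escapes`
    Escape = (lpeg.P'\\' * lpeg.S'$') / escapes,
    -- Substitution capture: each Escape match is replaced by its capture value
    Literal = lpeg.Cs(((V'Escape' + 1) - (V'ActionStart' + V'Newline'))^1)
}

local r1 = template:match("text1 $action1$ text2")
local r2 = template:match("text1\ntext2\ntext3")
local r3 = template:match("text1 tex\\$t2")   -- the subject is: text1 tex\$t2

print(#r1, r1[2])   -- 3   action1
print(#r2)          -- 5
print(#r3, r3[1])   -- 1   text1 tex$t2
```

Two things to note about the results. The literals keep their surrounding spaces, so r1 is { 'text1 ', 'action1', ' text2' }. And because Newline is wrapped in lpeg.C, the newline characters appear as their own entries in r2; dropping the lpeg.C from Newline gives the three-entry result from 2).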


It basically comes down to me not knowing how to do the substitution in the Literal definition. I looked at the CSV example, but it wasn't shedding a lot of light for me.

Digging into Leg (http://leg.luaforge.net/) helped me with this. Leg is doing an awful lot more than what I need atm, but it's a great place to look for examples of how to do complicated PEG acrobatics.

--
Glenn McAllister     <glenn@somanetworks.com>      +1 416 348 1594
SOMA Networks, Inc.  http://www.somanetworks.com/  +1 416 977 1414

  Asking a writer what he thinks about criticism is like asking a
  lamppost what it feels about dogs.
                                                    - John Osborne