It was thus said that the Great Gabriel Bertilson once stated:
> Here's a solution using Cmt. As Andrew pointed out, Cb just inserts a
> capture into the list of captures returned by the current pattern, it
> doesn't match anything.
> 
> Cg(1, "char") * Cmt(C(1) * Cb'char', function (_, _, char1, char2)
> return char1 == char2 end)^0
> 
> It matches one character, labels it as "char", then matches further
> characters if they are equal to "char". To use it on "aaabbbcccd" in
> the Lua interpreter (with LPeg functions available as variables):
> 
> > patt = Cg(1, "char") * Cmt(C(1) * Cb'char', function (_, _, cur, prev) return cur == prev end)^0
> > (C(patt)^1):match "aaabbbcccd"
> aaa    bbb    ccc    d

  And it can be extended to UTF-8 by matching whole characters instead of
single bytes:

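local lpeg = require "lpeg"
local R, C, Cg, Cb, Cmt = lpeg.R, lpeg.C, lpeg.Cg, lpeg.Cb, lpeg.Cmt

-- One UTF-8 character: an ASCII byte (NUL excluded), or a lead byte
-- followed by one or more continuation bytes.  This is a loose match;
-- it does not validate the sequence length.
local char = R"\1\127"
           + R"\194\244" * R"\128\191"^1

-- Label the first character "char", then keep consuming characters
-- for as long as each one equals it.
local seq  = Cg(char,'char')
           * Cmt(
                  C(char) * Cb'char',
                  function(_,_,cur,prev)
                    return cur == prev
                  end
             )^0

-- Capture each run of identical characters as one string.
local patt = C(seq)^1

print(patt:match "aaabbbcccd")
print(patt:match "###aaaa©©©©####bbbbb####")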
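
  For checking the results, here is a rough plain-Lua sketch (no LPeg) that
should give the same grouping on well-formed UTF-8 input. The helper name
runs() is made up for this example, and the string pattern is only my
approximation of the "char" rule above. Unlike the LPeg version, gmatch
skips bytes that do not fit the pattern instead of stopping there.

```lua
-- Hypothetical helper: split a string into runs of identical UTF-8
-- characters using plain Lua string patterns.
local function runs(s)
  local out = {}
  local prev
  -- Same loose character rule as above: an ASCII byte (no NUL), or a
  -- lead byte followed by any continuation bytes.
  for ch in s:gmatch("[\1-\127\194-\244][\128-\191]*") do
    if ch == prev then
      out[#out] = out[#out] .. ch   -- same character: extend current run
    else
      out[#out + 1] = ch            -- different character: start a new run
      prev = ch
    end
  end
  return out
end

print(table.concat(runs("aaabbbcccd"), " "))
-- "\194\169" is the UTF-8 encoding of the copyright sign
print(table.concat(runs("##\194\169\194\169\194\169#"), " "))
```

  The first line should print "aaa bbb ccc d", which is the grouping I would
expect from the LPeg pattern as well.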

  -spc