LPeg: parsing text with wikilinks

lua-l archive


Subject: LPeg: parsing text with wikilinks
From: Александр Машин <alex.mashin@...>
Date: Mon, 25 May 2015 01:47:13 +0700


Dear all,

I am trying to write a parser that would process some wikitext, i.e. text containing wikilinks of the form [[page|(optional alias)]] (but not escaped links such as [[:page]]).

I want it to return the passed string itself (with some alterations), a list of the pages referred to, and the symbol that is likely to be the list separator in the string passed.


I wrote the following grammar:

    wikitext      <- {| { {| ( prefix? wikilink )+ |} tail } |}
    wikilink      <- unescapedopen page alias? close
    page          <- { ( !close !pipe . )+ }
    alias         <- pipe ( !close . )*
    tail          <- .*
    prefix        <- ( separator / ( !unescapedopen . ) )+
    open          <- "[["
    unescapedopen <- open !escape
    close         <- "]]"
    pipe          <- "|"
    escape        <- ":"
    separator     <- {:separator: [,;*#] :} space*
    space         <- %s

After applying it (with re.match) to the example line "Perhaps, [[Peter|Simon]], or [[Paul]], so they say", I got:


table {
    1 = Perhaps, [[Peter|Simon]], or [[Paul]], so they say
    2 = table {
        1 = Peter
        2 = Paul
        separator = ,
    }
}
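
For reference, the run described above can be reproduced with a short script. This is only a sketch: it assumes LPeg and its bundled "re" module are installed, and the variable names (wikitext, subject, result) are chosen here for illustration. The grammar is compiled once with re.compile instead of being passed to re.match on every call.

```lua
-- Sketch only: assumes LPeg and its bundled "re" module are installed.
local re = require "re"

-- The grammar from the post. A level-1 long string ([=[ ... ]=]) is used
-- because the grammar itself contains "[[" and "]]".
local wikitext = re.compile [=[
    wikitext      <- {| { {| ( prefix? wikilink )+ |} tail } |}
    wikilink      <- unescapedopen page alias? close
    page          <- { ( !close !pipe . )+ }
    alias         <- pipe ( !close . )*
    tail          <- .*
    prefix        <- ( separator / ( !unescapedopen . ) )+
    open          <- "[["
    unescapedopen <- open !escape
    close         <- "]]"
    pipe          <- "|"
    escape        <- ":"
    separator     <- {:separator: [,;*#] :} space*
    space         <- %s
]=]

local subject = "Perhaps, [[Peter|Simon]], or [[Paul]], so they say"
local result = wikitext:match(subject)

print(result[1])            -- the whole line
print(result[2][1])         -- Peter
print(result[2][2])         -- Paul
print(result[2].separator)  -- ,
```

The inner table holds the page names at positions 1 and 2, and the named group capture ends up in its "separator" field.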

This is close to what I want.

However, there are some issues:

1) Can I make the outer table's indices strings: ['full'] instead of [1], ['items'] instead of [2]? I experimented with named group captures, but without success.

2) Can the number of nested captures be reduced?

3) Most importantly: I want a string constant (e.g. "Name::") to be inserted after every <unescapedopen> found, and the first capture, which returns the whole line, should contain this constant: "...[[Name::Paul]]..." rather than "...[[Paul]]...". This may have something to do with substitution captures; I tried them but could not make it work. Can it be done at all?
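
To make the behaviour asked for in 3) concrete, here is a minimal sketch of a substitution capture. It is not part of the grammar above: it is a reduced, hypothetical grammar (the rule name "text" and the variables tagger and tagged are illustrative), and it assumes LPeg with its "re" module. It only rewrites the line and does not collect pages or the separator. The idea is that inside {~ ... ~} an empty match given a string capture ('' -> 'Name::') inserts that string at its position.

```lua
-- Sketch only: assumes LPeg and its bundled "re" module are installed.
local re = require "re"

-- {~ ... ~} is a substitution capture: the matched text is returned with
-- every nested capture replaced by its value. The empty pattern '' with the
-- string capture -> 'Name::' therefore inserts "Name::" right after an
-- unescaped "[[".
local tagger = re.compile [=[
    text     <- {~ ( wikilink / . )* ~}
    wikilink <- open !escape ( '' -> 'Name::' ) ( !close . )+ close
    open     <- "[["
    close    <- "]]"
    escape   <- ":"
]=]

local tagged = tagger:match("Perhaps, [[Peter|Simon]], or [[:Help]] and [[Paul]]")
print(tagged)
-- Perhaps, [[Name::Peter|Simon]], or [[:Help]] and [[Name::Paul]]
```

The escaped link [[:Help]] is left untouched because the wikilink rule fails at !escape and the text is then consumed character by character.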


Alexander Mashin

