lua-users home
lua-l archive

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]


On Fri, Feb 24, 2017 at 4:28 AM, Roberto Ierusalimschy
<roberto@inf.puc-rio.br> wrote:
>> Basically, regular expressions are incapable of retaining
>> relationships between specific entities in the input string across
>> distance. Every step of evaluation can be done knowing only what part
>> of the regular expression has been evaluated so far and where you're
>> currently sitting in the input string. If any more information than
>> that is necessary, a regexp can't do it.
>
> This is true for real regular expressions, not for what Perl and other
> languages call "regular expressions". For instance, back references
> clearly violate what real regular expressions can do. In particular,
> because of back references, Perl's regexs are NP-complete: you can code
> any NP problem as a regex plus a subject string (in polynomial time) and
> Perl will solve the problem (not in polynomial time :-).
>
>     http://perl.plover.com/NPC/NPC-3SAT.html
>
> -- Roberto

You are of course correct, but I have one objection, and one follow-up.

The objection: The transformation into an input string and regexp
requires work that cannot be done by a regexp match, or even a single
regexp replacement. This doesn't invalidate the proof in and of itself
but it also doesn't exactly play into the challenge.

The follow-up: While it's true that PCRE is strictly more powerful
than a finite automaton (that is, the languages it can express are
broader than the regular languages), it is not as powerful as a
pushdown automaton (that is, the languages it can express are not as
broad as the context-free languages).

In layman's terms, this means that there are languages that LPeg can
parse that PCRE cannot. (For example, PCRE cannot parse the balanced
parentheses language without calling out to Perl code.) And of course,
there are languages that LPeg cannot parse that arbitrary freeform Lua
code can.

/s/ Adam