Yes, but the construct (.-) can still match more characters when that is needed to reach the first required space.
All the leading 'a' characters then become part of the match of (.-).
The difference from (.*) shows up when there is an ambiguity about how much of the subject the repetition should take: '.-' starts with zero characters and takes one more only when the rest of the pattern fails to match, so it stops at the first position where the rest succeeds; '.*' first takes as many characters as it can and then gives them back one at a time until the rest succeeds.
This can make '-' faster than '*' in some patterns, because it never scans past the first position where the rest matches. Both items match zero or more characters, and both retry when what follows fails; they differ only in the order of the attempts ('*' tries the longest repetition first, '-' the shortest). Basically, '-' stops at the first occurrence of what follows and '*' at the last one; when what follows can match at only one place in the subject, '-' and '*' give the same result.
Here the fixed subpattern to match is " x ". It occurs only once in the input "aaa x bbb", so (.-) and (.*) before it are equivalent for that input.
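
A small check of the above, as a sketch only (the variable names are mine, not from the thread; any Lua 5.1+ interpreter should do):

```lua
local s = "aaa x bbb"

-- With nothing after it, the lazy item is satisfied by zero characters.
local alone = s:match("(.-)")
print(alone == "")          --> true

-- Followed by " x ", it has to grow until the rest of the pattern matches.
local lazy_a, lazy_b = s:match("(.-) x (.*)")
print(lazy_a, lazy_b)       --> aaa and bbb

-- The greedy form gives the same captures, since " x " occurs only once.
local greedy_a, greedy_b = s:match("(.*) x (.*)")
print(greedy_a, greedy_b)   --> aaa and bbb
```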

You would, however, see a difference with the input "aaa x bbb x ccc":
- with '(.-) x (.*)', the first capture would be 'aaa' and the second one would be 'bbb x ccc'
- with '(.*) x (.*)', the first capture would be 'aaa x bbb' and the second one would be 'ccc'
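
The two cases can be tried side by side; this is just an illustrative snippet with variable names of my own choosing:

```lua
local s = "aaa x bbb x ccc"

-- Lazy first capture: stops at the first " x ".
local lazy_a, lazy_b = s:match("(.-) x (.*)")
print(lazy_a, lazy_b)       --> "aaa" and "bbb x ccc"

-- Greedy first capture: stops at the last " x ".
local greedy_a, greedy_b = s:match("(.*) x (.*)")
print(greedy_a, greedy_b)   --> "aaa x bbb" and "ccc"
```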


On Fri, 22 Nov 2019 at 18:27, Roberto Ierusalimschy <roberto@inf.puc-rio.br> wrote:
> ```
> hippi@vas:~$ lua5.3
> Lua 5.3.3  Copyright (C) 1994-2016 Lua.org, PUC-Rio
> > print(('aaa x bbb'):match'(.-) x (.*)')
> aaa   bbb
> ```
> (lj 2.0.5 does this as well)
>
> i would expect that the 1st capture should be empty without a `^` at
> the beginning of the pattern, as the manuals (up to lua 5.4) say:
> "a single character class followed by '-', which also matches zero or
> more repetitions of characters in the class. Unlike '*', these
> repetition items will always match the shortest possible sequence;"

string.match looks for the *first* match. The resulting match starts
at position 1, while a match with an empty capture (no 'a's) starts at
position 4; therefore it is not the first one.

-- Roberto
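
Roberto's point about the first match can be checked with string.find, which also returns the start and end positions before the captures. This is only a sketch; the optional third argument (the init position) is used here to force the search to begin at position 4, and the variable names are illustrative:

```lua
local s = "aaa x bbb"
local pat = "(.-) x (.*)"

-- Searching from the start: the first match begins at position 1,
-- so (.-) has to cover "aaa".
local s1, e1, a1, b1 = s:find(pat)
print(s1, e1, a1, b1)   --> 1, 9, "aaa", "bbb"

-- Searching from position 4 (the space before "x"): now the match
-- begins at 4 and (.-) is empty.
local s2, e2, a2, b2 = s:find(pat, 4)
print(s2, e2, a2, b2)   --> 4, 9, "", "bbb"
```

The match with the empty capture exists, but it starts later in the subject, so an unanchored search never reaches it.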