[Date Prev][Date Next][Thread Prev][Thread Next]
[Date Index]
[Thread Index]
- Subject: Re: LPEG and the order of alternatives in ordered choice
- From: Paul K <paul@...>
- Date: Thu, 11 Sep 2014 04:01:42 +0000
> you shoud try lpeg.P('aaaaaaa'). the S of lpeg.S stands for ``set``.
of course:
aabb = lpeg.P('aa')+lpeg.P('bb'),
bbaa = lpeg.P('bb')+lpeg.P('aa'),
ab = lpeg.P('a')+lpeg.P('b'),
ba = lpeg.P('b')+lpeg.P('a'),
aaaabbbb = lpeg.P('aaaa')+lpeg.P('bbbb'),
bbbbaaaa = lpeg.P('bbbb')+lpeg.P('aaaa'),
aabaaa = lpeg.P('aab')+lpeg.P('aaa'),
aaaaab = lpeg.P('aaa')+lpeg.P('aab'),
produces (averaged over 20 iterations on the same string):
aaaabbbb 0.69475
bbbbaaaa 0.66295
aabaaa 1.4071
aaaaab 1.0251
aabb 0.8695
bbaa 0.86225
ab 0.27405
ba 0.28245
I think only the difference between aab+aaa and aaa+aab is noticeable;
the rest are probably test fluctuations. aab+aaa difference is
understandable because of backtracking in aab. It seems like the
alternatives are efficiently explored as long as they have short
unique prefixes.
There is probably some overhead with exploring failed alternatives,
but it seems to be small and the result is probably dominated by other
aspects of the matching.
What's interesting is that 'aa' is faster than 'aaa', but slower than
'aaaa' (I tested several times and the effect is the same). 'a' is
still the fastest, but it's probably because of being a special case.
Paul.