lua-users home
lua-l archive

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]


On 09/11/2014 12:01 PM, Paul K wrote:
you shoud try lpeg.P('aaaaaaa'). the S of lpeg.S stands for ``set``.
of course:

   aabb = lpeg.P('aa')+lpeg.P('bb'),
   bbaa = lpeg.P('bb')+lpeg.P('aa'),
   ab = lpeg.P('a')+lpeg.P('b'),
   ba = lpeg.P('b')+lpeg.P('a'),
   aaaabbbb = lpeg.P('aaaa')+lpeg.P('bbbb'),
   bbbbaaaa = lpeg.P('bbbb')+lpeg.P('aaaa'),
   aabaaa = lpeg.P('aab')+lpeg.P('aaa'),
   aaaaab = lpeg.P('aaa')+lpeg.P('aab'),

produces (averaged over 20 iterations on the same string):

aaaabbbb 0.69475
bbbbaaaa 0.66295
aabaaa 1.4071
aaaaab 1.0251
aabb 0.8695
bbaa 0.86225
ab 0.27405
ba 0.28245

I think only the difference between aab+aaa and aaa+aab is noticeable;
the rest are probably test fluctuations. aab+aaa difference is
understandable because of backtracking in aab. It seems like the
alternatives are efficiently explored as long as they have short
unique prefixes.

There is probably some overhead with exploring failed alternatives,
but it seems to be small and the result is probably dominated by other
aspects of the matching.

What's interesting is that 'aa' is faster than 'aaa', but slower than
'aaaa' (I tested several times and the effect is the same). 'a' is
still the fastest, but it's probably because of being a special case.

Paul.

mhhh, very interesting results. I can't explain it though. I read the paper on lpeg
quite long ago. but I just glimpsed at the source code. besides, lpeg has evoled
a lot since then. so I don't really understand the VM implementation details.

maybe I could read the source code again this afternoon(I am in GMT+0800 timezone).