lua-users home
lua-l archive

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]


On 09/11/2014 10:28 AM, Paul K wrote:
Hi All,

Let's say I have ordered choice with mutually exclusive alternatives;
this means that the order of alternatives doesn't matter from the
result perspective, but it may matter from the performance perspective
if one of the alternatives is matched before others.

I came up with a test to test this:

local str = ('a'):rep(100000000)
-- start clock
for i = 1, 50 do
   lpeg.match((lpeg.S('a')+lpeg.S('1'))^0, str)
end
-- stop clock

I thought that by matching 'aaaa...' against ('a' or '1') vs. ('1' or
'a') I should see some difference because of checking two alternatives
in one case and only one in the other one, but the results differ only
by 1-2%.

Increasing the number of failed alternatives to 5 (S('1')+...S('5'))
only changes the overall difference to 2-5%; changing to R('09')
produces similar results.

Is this expected or is my test flawed in some way?

The reason why I started to look into this was that I have a complex
grammar and several ordered choices and I thought it may make sense to
organize the alternatives in the order of their probability of
appearance (assuming they are mutually exclusive and not a subset of
each other), but I was expecting a larger improvement.

Paul.

IIRC, LPEG implements single-char patterns using a dedicated OPCODE.
`lpeg.S` and `lpeg.R` (and maybe the lpeg.P with a single char) all compiles
to the same OPCODE (the operands differs, of course), and the `+`
operator would compute the union of the two, resulting a single new one.

I suppose for patterns other than of single character, the order really
should affect the perfomance. could you test it again using complex patterns?