[Date Prev][Date Next][Thread Prev][Thread Next]
[Date Index]
[Thread Index]
- Subject: Re: LPEG and the order of alternatives in ordered choice
- From: Peng Zhicheng <pengzhicheng1986@...>
- Date: Thu, 11 Sep 2014 10:50:07 +0800
On 09/11/2014 10:28 AM, Paul K wrote:
Hi All,
Let's say I have ordered choice with mutually exclusive alternatives;
this means that the order of alternatives doesn't matter from the
result perspective, but it may matter from the performance perspective
if one of the alternatives is matched before others.
I came up with a test to test this:
local str = ('a'):rep(100000000)
-- start clock
for i = 1, 50 do
lpeg.match((lpeg.S('a')+lpeg.S('1'))^0, str)
end
-- stop clock
I thought that by matching 'aaaa...' against ('a' or '1') vs. ('1' or
'a') I should see some difference because of checking two alternatives
in one case and only one in the other one, but the results differ only
by 1-2%.
Increasing the number of failed alternatives to 5 (S('1')+...S('5'))
only changes the overall difference to 2-5%; changing to R('09')
produces similar results.
Is this expected or is my test flawed in some way?
The reason why I started to look into this was that I have a complex
grammar and several ordered choices and I thought it may make sense to
organize the alternatives in the order of their probability of
appearance (assuming they are mutually exclusive and not a subset of
each other), but I was expecting a larger improvement.
Paul.
IIRC, LPEG implements single-char patterns using a dedicated OPCODE.
`lpeg.S` and `lpeg.R` (and maybe the lpeg.P with a single char) all compiles
to the same OPCODE (the operands differs, of course), and the `+`
operator would compute the union of the two, resulting a single new one.
I suppose for patterns other than of single character, the order really
should affect the perfomance. could you test it again using complex patterns?