> Well, the substitution feature sure looks nice. But when I
> experimented with it, I found that it's rather slow and nearly
> unusable when you've got a large number of substitutions in a
> long string (~1M). It first creates a giant capture table for
> every matching position and only then goes on to construct the
> resulting string.

I tried it over a 4M string (the Bible, changing all lower-case letters)
and it ran in a reasonable time. Certainly slower than gsub (0.35s for
gsub, 0.81s for the substitution capture), but quite usable in my view.
Maybe a memory constraint?
(It even beats gsub when it does not make too many substitutions, e.g.
changing the upper-case letters in the Bible.)
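
A comparison of this kind can be sketched roughly as follows. This is not
the script used for the numbers above: the input text, its size, and the
helper name `timed` are assumptions made for illustration, and timings
will vary by machine.

```lua
local lpeg = require "lpeg"

-- Hypothetical test input; any large text would do.
local subject = ("In the beginning God created the heaven and the earth.\n"):rep(5000)

-- Substitution capture: upper-case every lower-case letter, copy the rest.
local to_upper = lpeg.Cs((lpeg.R("az") / string.upper + 1)^0)

-- Hypothetical helper: run f once, return its result and the CPU time used.
local function timed(f)
  local start = os.clock()
  local result = f()
  return result, os.clock() - start
end

local by_lpeg, t_lpeg = timed(function () return to_upper:match(subject) end)
-- The extra parentheses drop gsub's second result (the substitution count).
local by_gsub, t_gsub = timed(function () return (subject:gsub("[a-z]", string.upper)) end)

print(("gsub: %.2fs  lpeg.Cs: %.2fs"):format(t_gsub, t_lpeg))
```

Both calls should produce the same string; only the time differs.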


> When I proposed the substitution operator, I was more thinking
> along the lines of a one-pass streaming approach.

I know. This implementation is slower, but it seems more useful,
because it combines with other captures. Both the CSV example
and re.lua use substitutions embedded in other captures to avoid
post-processing.
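
For illustration, here is a sketch along the lines of the CSV example in
the LPeg documentation. The substitution capture sits inside the pattern
for a quoted field, so doubled quotes are already collapsed by the time
the enclosing table capture collects the fields. The sample input line is
made up.

```lua
local lpeg = require "lpeg"

-- A field is either quoted (with "" standing for a literal quote)
-- or a run of characters without commas, newlines, or quotes.
local field = '"' * lpeg.Cs(((lpeg.P(1) - '"') + lpeg.P'""' / '"')^0) * '"'
            + lpeg.C((1 - lpeg.S',\n"')^0)

-- A record is a comma-separated list of fields, collected into a table.
local record = lpeg.Ct(field * (',' * field)^0) * (lpeg.P'\n' + -1)

local t = record:match('a,"b ""c"" d",e')
for i, v in ipairs(t) do print(i, v) end
```

The second field arrives with its quotes already unescaped; no separate
pass over the table is needed.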


> Umm, you want to completely remove them? I found named captures
> rather convenient. And how would I decide which of the captures
> matched in an alternative without labels? Maybe I misunderstood.

We could use the same techniques we use in other parser tools like
yacc. We can have different semantic actions for each alternative, so
that the final results are compatible (e.g., "Primary" in re.lua). Or
else we can insert a label telling which alternative matched. This label
would come before the value, instead of being its key:

    {l = v}  --> {l, v}

(One big problem with labels as keys is that they do not "scale" to
repetitions, so we must have some kind of box for each alternative anyway.)
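
A minimal sketch of the label-before-value idea, using only existing LPeg
captures (a constant capture for the label, a table capture as the box).
The grammar and the label names "num" and "name" are made up for
illustration.

```lua
local lpeg = require "lpeg"
local P, R, C, Cc, Ct = lpeg.P, lpeg.R, lpeg.C, lpeg.Cc, lpeg.Ct

local space = P" "^0

-- Each alternative boxes its result as {label, value}.
local number = Ct(Cc"num"  * C(R"09"^1))
local name   = Ct(Cc"name" * C(R("az", "AZ")^1))

local item = (number + name) * space
local list = Ct(item^0) * -1

local t = list:match("12 abc 7")
for _, box in ipairs(t) do print(box[1], box[2]) end
```

Because every item gets its own box, the same alternative can match any
number of times in a repetition without one value overwriting another,
which is what a label used as a key could not do.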

-- Roberto