- Subject: Re: How to prevent catastrophic regex?
- From: stepa alimov <stepa.alimov.93@...>
- Date: Thu, 11 Jun 2020 18:23:21 +0300
It runs slower than the Lua string library on ASCII input. But you could probably use it to validate or filter the input first, and then do the actual work with the faster Lua string library.
I'm working on its performance. The gsub and gmatch functions are the slowest, because they recalculate the codepoint array for the string on each iteration.
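To illustrate the kind of fix this points at, here is a minimal sketch that decodes a string into a codepoint array once and reuses it on later calls. It is not the library's actual code: the names `codepoints` and `cache` are hypothetical, and it assumes Lua 5.3 or later for the standard `utf8` library.

```lua
-- Hypothetical memoizing decoder; requires Lua 5.3+ (utf8 library).
-- Weak values let a cached array be collected once nothing else uses it.
local cache = setmetatable({}, { __mode = "v" })

local function codepoints(s)
  local cps = cache[s]
  if not cps then
    cps = {}
    -- utf8.codes raises an error if s is not valid UTF-8.
    for _, cp in utf8.codes(s) do
      cps[#cps + 1] = cp
    end
    cache[s] = cps
  end
  return cps
end
```

With something like this, a gsub or gmatch loop could fetch the array once before iterating and index into it, instead of decoding the subject again on every step.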
One way to improve the library's performance would be to write an AST generator, since that would allow it to generate more specific code.
The string package has regular expression matching, which can be exploited to cause catastrophic backtracking (see https://www.regular-expressions.info/catastrophic.html). Check the following example, which takes 25 seconds to run.
start = os.time()
s = string.gsub('aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa', 'a*a*a*a*a*a*a*b', 'X')
print('ran in ', os.difftime(os.time(), start), ' second(s)')
I'd like to find a way to prevent this kind of "attack" on a system that executes user-provided Lua scripts. The system uses a custom allocator that limits how much memory can be allocated, and a hook that calls lua_error at the next executed instruction if script evaluation takes too long.
Unfortunately, neither of those helps, because the code above doesn't need much memory to run, and the debug hook sees the whole gsub call as a single instruction.
So the only solutions I see are to not expose string.gsub() to users, or to run the scripts in a sandbox that can be forcibly terminated after some time. Has the Lua community figured out a better solution? Have I missed something in how hooks should be used, or some native Lua way of "bounding" execution?
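One possible middle ground, sketched here only as an illustration, is to expose a guarded wrapper instead of the raw string.gsub(). The wrapper below rejects long subjects and patterns with many repetition operators before calling the real function. The names `safe_gsub` and `count_quantifiers` and both limits are hypothetical, and the check is a heuristic: it bounds the worst case rather than eliminating backtracking.

```lua
-- Hypothetical limits; tune them for the actual workload.
local MAX_SUBJECT_LEN = 1024
local MAX_QUANTIFIERS = 3

-- Roughly count the repetition operators (*, +, -) in a Lua pattern.
local function count_quantifiers(pattern)
  -- Drop escaped characters (%x) so an escaped '*' is not counted.
  local stripped = pattern:gsub("%%.", "")
  -- Drop character sets such as [a-z] so '-' inside a set is not counted.
  stripped = stripped:gsub("%b[]", "")
  local _, n = stripped:gsub("[%*%+%-]", "")
  return n
end

-- Returns string.gsub's results, or nil plus a message if a limit is hit.
local function safe_gsub(subject, pattern, repl, max_n)
  if #subject > MAX_SUBJECT_LEN then
    return nil, "subject too long"
  end
  if count_quantifiers(pattern) > MAX_QUANTIFIERS then
    return nil, "pattern has too many quantifiers"
  end
  return string.gsub(subject, pattern, repl, max_n)
end
```

The sandbox would then install safe_gsub in place of string.gsub in the environment given to user scripts. The count errs on the side of rejecting, so a literal '-' outside a set is counted as well.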
Best regards, Stepan Gennadyevich Alimov.