- Subject: Re: Lua 5.4.0 beta announcement
- From: "Soni \"They/Them\" L." <fakedme@...>
- Date: Thu, 3 Oct 2019 18:19:42 -0300
On 2019-10-03 6:05 p.m., Roberto Ierusalimschy wrote:
> the only concern I have is over existing usage of e.g.
> utf8.codes(s:gsub(...)). it would probably be beneficial to make utf8.codes
> accept a start index before the lax switch, or otherwise enforce that the
> lax switch is not a number. (the start index is more appealing imo.)
Accepting a start index would not help in that case, would it?
While it would produce wrong results, that would probably be better than
producing unsafe results. Consider a sequence of gsubs that remove bad
sequences. (Yes, you aren't supposed to do it like this, but people do
things like this all the time. For proper security you should always
operate on decoded data, where all the arguments about overlong encodings
and the like being bad for security can be thrown out the window, but that
doesn't stop people from doing it anyway. I could rant about this all day.)
Anyway, I digress. Someone has such a sequence of gsubs, and then they
switch to the new version. Now some things that were invalid are suddenly
valid, because the second return value from gsub (the substitution count)
is being taken as a true value for the lax switch and enabling unsafe
mode, so some of the things they were getting rid of are suddenly going
through.
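To make the scenario concrete, here is a minimal sketch, assuming a Lua 5.4
interpreter where utf8.codes takes the optional lax argument. The helper
iterates_cleanly and the "%c" pattern are made up for illustration; the
test string is "ok" followed by an encoded surrogate (U+D800), which strict
mode is expected to reject.

```lua
-- Sketch only: assumes Lua 5.4, where utf8.codes(s [, lax]) exists.
local s = "ok\xED\xA0\x80"  -- "ok" followed by an encoded surrogate (U+D800)

-- Hypothetical helper: true if iterating with utf8.codes raises no error.
-- All arguments are forwarded unchanged, including any extra gsub result.
local function iterates_cleanly(...)
  local args = table.pack(...)
  return (pcall(function()
    for _ in utf8.codes(table.unpack(args, 1, args.n)) do end
  end))
end

-- gsub returns (string, count); here the count lands in the lax slot.
-- Any number is a true value in Lua, even 0, so lax mode gets switched on.
local leaky = iterates_cleanly(s:gsub("%c", ""))

-- Extra parentheses keep only the first result, so strict mode stays on.
local strict = iterates_cleanly((s:gsub("%c", "")))

print(leaky, strict)
```

On 5.4 I would expect the first call to pass and the second to fail with an
invalid UTF-8 error. Wrapping the gsub call in parentheses is the fix that
works today without any change to utf8.codes.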
In this case, I think it'd be better to just crash or produce broken but
safe results than let invalid UTF-8 through. But maybe that's just me.