- Subject: Re: Feature request: plain option for gsub
- From: Roberto Ierusalimschy <roberto@...>
- Date: Thu, 21 Aug 2014 11:21:02 -0300
> I think this already demonstrates my point. Coming up with a regex
> that is safe and escapes everything is not trivial.
>
> [...]
> >
> > On my system, '%p' does not match '[+$^]', so '%p' should become '[%p+$^]'.
This seems like a bug in his system (or else he is using some weird
locale...). '%p' corresponds to 'ispunct', and the C standard says this:

    In an implementation that uses the seven-bit US ASCII character set, the
    printing characters are those whose values lie from 0x20 (space) through
    0x7E (tilde);
    [...]
    In the "C" locale, ispunct returns true for every printing character for
    which neither isspace nor isalnum is true.

So '+', '$', and '^' must all be punctuation characters (and therefore
match '%p').
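
One way to check this claim on a given system is to ask the C library
directly. The following is only a minimal sketch: `all_punct` is a
hypothetical helper name, and it assumes the program runs in the default
"C" locale, which is what a C program starts in unless it calls setlocale.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Returns 1 if every character in `chars` satisfies ispunct(), else 0.
   `all_punct` is a hypothetical helper, not part of Lua or libc. */
static int all_punct(const char *chars) {
    assert(chars != NULL);
    for (size_t i = 0; chars[i] != '\0'; i++) {
        /* Cast to unsigned char: passing a negative value to ispunct()
           is undefined behavior. */
        if (!ispunct((unsigned char)chars[i])) {
            return 0;
        }
    }
    return 1;
}
```

Calling all_punct("+$^") should return 1 on a conforming libc in the "C"
locale. A result of 0 would point to the kind of libc or locale problem
described above.
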
If you assume a correct libC and a sane locale, '%p' is all you need.
-- Roberto
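
For illustration, the escaping that '%p' makes possible (in Lua, something
like s:gsub("%p", "%%%0")) can be sketched in C on top of ispunct. This is
an assumption-laden sketch, not code from the thread: `escape_pattern` is
a hypothetical name, the caller is assumed to supply a large enough
buffer, and the "C" locale is assumed.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Copies `src` into `dst`, prefixing every ispunct() character with '%'.
   `dst` must have room for at least 2 * strlen(src) + 1 bytes.
   Returns the length of the escaped string, excluding the terminator.
   `escape_pattern` is a hypothetical name used only for illustration. */
static size_t escape_pattern(const char *src, char *dst) {
    assert(src != NULL && dst != NULL);
    size_t j = 0;
    for (size_t i = 0; src[i] != '\0'; i++) {
        /* Cast to unsigned char to keep the ispunct() call well defined. */
        if (ispunct((unsigned char)src[i])) {
            dst[j++] = '%';
        }
        dst[j++] = src[i];
    }
    dst[j] = '\0';
    return j;
}
```

With this sketch, the input "a+b" is written out as "a%+b". Letters and
digits pass through unchanged, and every punctuation character, including
'%' itself, gains a '%' prefix.
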