- Subject: Problem matching on [.]
- From: Nick Gammon <nick@...>
- Date: Tue, 15 May 2007 09:16:16 +1000
Hello Lua users,
I have been using Lua for a few years now and thought I knew its
regular expressions pretty well, but this one caught me out:
> print (string.match ("luaXusers", "lua[.]users"))
I was trying to match any character between "lua" and "users", but the
match fails and returns nil. Now I know this works:
> print (string.match ("luaXusers", "lua.users"))
However, I put the dot inside the brackets to make it more obvious to
the reader that I was matching any character and not just a dot (as
in lua.users), which a casual read of the regular expression might
suggest.
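For reference, the difference between the two patterns can be sketched as below. This assumes a stock Lua 5.1 interpreter; the subject string "lua.users" is added here purely for illustration.

```lua
-- Inside a char-set, "." is a literal period, not "all characters".
print(string.match("luaXusers", "lua[.]users"))  --> nil
print(string.match("lua.users", "lua[.]users"))  --> lua.users

-- Outside a char-set, "." matches any single character.
print(string.match("luaXusers", "lua.users"))    --> luaXusers
print(string.match("lua.users", "lua.users"))    --> lua.users
```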
Referring to the documentation in Programming In Lua (2nd edition), I
see this (page 180):
The following table lists all character classes:
. all characters
%c control characters
... and so on ...
Thus, the character "." is defined as a "character class".
Moving on to page 181, the book says:
"A char-set allows you to create your own character classes,
combining different character classes and single characters between
square brackets."
Thus the regular expression "[.]" should match any single character.
It consists of a char-set, and inside the char-set is a character
class, namely ".". If you want to match a period, you should really
use this: "[%.]".
After all, the documentation states that a "." is a "magic character"
and should be escaped with a "%" in order to have its natural meaning.
I acknowledge that changing Lua to do this may break a whole heap of
regular expressions currently in use, but perhaps the documentation
could be clarified to state that a period inside a char-set is
"itself" and not "all characters".
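A short sketch of the behaviour such a clarification would need to describe, again assuming stock Lua 5.1. The "[%s%S]" set at the end is only one possible way to spell "any character" explicitly inside brackets, not something taken from the book.

```lua
-- "[.]" and "[%.]" behave the same: both match only a literal period.
print(string.match("lua.users", "lua[%.]users"))    --> lua.users
print(string.match("luaXusers", "lua[%.]users"))    --> nil

-- The %-classes, by contrast, keep their meaning inside a char-set.
print(string.match("luaXusers", "lua[%a]users"))    --> luaXusers

-- Assumption: a class plus its complement as an explicit "any character" set.
print(string.match("luaXusers", "lua[%s%S]users"))  --> luaXusers
```

So only the "%"-prefixed classes combine inside brackets; the "." class does not.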
- Nick Gammon