- Subject: Re: Problems with: Most efficient way to recognize a line by its first three characters
- From: Tom N Harris <telliamed@...>
- Date: Wed, 12 Mar 2014 15:28:25 -0400
On Wednesday, March 12, 2014 03:11:22 PM I wrote:
> On Wednesday, March 12, 2014 07:49:22 PM email@example.com wrote:
> > Everything works fine...except the recognition of the lines
> > beginning with '§'...
> > And I have not a single idea, why...
> That is not one "character". Or rather, UTF-8 encodes it as multiple bytes
> and Lua only knows about bytes. So when you do `line:sub(1,3)` you're only
> going to get '§7' without the last byte.
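A minimal sketch of the byte counting described above, assuming the data is UTF-8. The sample line is made up (the real file's contents are not shown in this thread), and the bytes are written as decimal escapes so the result does not depend on the editor's encoding:

```lua
-- '§' encoded as UTF-8 is two bytes: 0xC2 0xA7 (decimal 194 167).
local section_utf8 = "\194\167"

-- Hypothetical sample line; the real data is not shown in the thread.
local line = section_utf8 .. "7x rest of the line"

print(#section_utf8)                          -- byte length, not character count
print(line:sub(1, 3) == section_utf8 .. "7")  -- three bytes: '§' plus one more byte
print(line:sub(1, 4) == section_utf8 .. "7x") -- a fourth byte is needed for '§7x'
```

So with UTF-8 data, a prefix that looks like three characters on screen needs `line:sub(1,4)` to be captured whole.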
A further thought: your pattern matching has to correspond to the encoding of
the data, not to whatever encoding your editor is using. You can either change
your editor's encoding to match, or be safer and write out the character codes
in the string. '§' in ISO-8859-1 is the single byte 0xA7, so in a Lua string
that would be '\167'.
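As a rough sketch of the "write out the character codes" approach: the helper name `starts_with`, the constant names, and the sample lines below are hypothetical and only for illustration. The UTF-8 variant is included for comparison, on the assumption that the data might be in either encoding:

```lua
-- '§' as raw bytes, independent of the editor's encoding.
local SECTION_LATIN1 = "\167"      -- ISO-8859-1: 0xA7
local SECTION_UTF8   = "\194\167"  -- UTF-8: 0xC2 0xA7

-- Hypothetical helper: true if `line` begins with the bytes in `prefix`.
-- Using #prefix avoids hard-coding a byte count such as 3.
local function starts_with(line, prefix)
  return line:sub(1, #prefix) == prefix
end

-- Made-up sample lines, one per encoding.
local latin1_line = SECTION_LATIN1 .. "7a some data"
local utf8_line   = SECTION_UTF8 .. "7a some data"

print(starts_with(latin1_line, SECTION_LATIN1 .. "7a"))  -- encodings agree
print(starts_with(utf8_line, SECTION_UTF8 .. "7a"))      -- encodings agree
print(starts_with(utf8_line, SECTION_LATIN1 .. "7a"))    -- encodings differ
```

Comparing against `#prefix` bytes instead of a fixed `sub(1,3)` means the same check works whichever encoding the prefix was written in, as long as it matches the encoding of the data.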