- Subject: Re: proposal for reading individual characters from strings faster
- From: Sean Conner <sean@...>
- Date: Sat, 3 May 2014 16:30:07 -0400
It was thus said that the Great Tim Hill once stated:
>
> However, no “simple” feature comes without hidden costs. The back-quote
> syntax appears to isolate source code from character coding issues, but
> does it? One approach is to always assume UTF-8 encoding, which is
> consistent across platforms, but may differ from the local encoding. This
> means that `a` ~= string.byte("a") on (say) EBCDIC platforms. Another
> approach is to use the local platform encoding, but this also doesn’t work
> since the locale at compile time may differ from the locale at run-time
> (even if the code is run directly after compile).
It can even change at runtime!
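A quick sketch of what that means in practice (the locale name below is a
guess; which locales actually exist depends entirely on the host system):

    -- string.upper() defers to the C library's toupper(), which is
    -- locale-dependent, and os.setlocale() can swap locales mid-run.
    print(os.setlocale(nil))          -- current locale, "C" by default
    print(("\233"):upper())           -- "C" locale: byte 0xE9 is untouched
    os.setlocale("fr_FR.ISO8859-1")   -- hypothetical name; varies by OS
    print(("\233"):upper())           -- now 0xE9 (e-acute) may upcase to 0xC9

The same byte in the same string means two different things before and
after the setlocale() call, so a byte value baked in at compile time can't
be trusted.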
One project I've been working on [1] involves parsing email [2], which
requires a lot of character-set manipulation (not dealt with in [2]). The
collection of emails I pull from uses at least a dozen character sets, if
not more.
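To give a flavor of [2] without the full grammar (this is a stripped-down
sketch, not the code at that link; the real thing deals with folding,
comments, MIME encoded words and so on), splitting a "Name: value" header
line with LPeg looks roughly like:

    local lpeg = require "lpeg"
    local P, R, S, C = lpeg.P, lpeg.R, lpeg.S, lpeg.C

    -- field name: simplified here to alphanumerics and '-'
    local name   = C((R("az","AZ","09") + P"-")^1)
    -- field value: everything up to the end of the line
    local value  = C((1 - P"\r\n")^0)
    local header = name * P":" * S(" \t")^0 * value

    print(header:match("Subject: reading characters from strings"))
    --> Subject	reading characters from strings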
-spc
[1] Long term, when I get around to it, not really important, but a fun
diversion. That type of project.
[2] Obligatory email header parsing code:
https://github.com/spc476/LPeg-Parsers/blob/master/email.lua