http://www.lua.org/source/5.1/lstring.c.html

Lua strings are technically an array (or sequence) of (char)s.

They are read from files the same way, as raw binary data:
http://www.lua.org/source/5.1/liolib.c.html#read_line

Quite commonly they are cast to unsigned chars:
http://www.lua.org/source/5.1/lstrlib.c.html#uchar
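
A small sketch of what that cast means when seen from the Lua side: string.byte reports each byte as a non-negative number, even on platforms where the C char type is signed. The sample string is made up for illustration, and the values shown assume 8-bit chars.

```lua
-- A string holding three arbitrary bytes, written with decimal escapes.
local s = "\0\128\255"

-- The length counts bytes, including the embedded zero byte.
print(#s)                             -- 3

-- string.byte returns each byte as a non-negative number:
-- here the values 0, 128 and 255.
print(string.byte(s, 1, 3))

-- string.char is the inverse mapping from byte values to a string.
print(string.char(0, 128, 255) == s)  -- true
```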

What (char) is depends on the system where Lua was compiled. On nearly
all systems today it is an 8-bit value (whether plain char is signed
varies by compiler and platform), but C only guarantees that it has at
least 8 bits, not exactly 8.

"\\" is just escape syntax that the parser understands. It isn't present
in the core once the code has been compiled and is running.
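
A minimal sketch of that point, assuming an ASCII-compatible character set (so the backslash is byte 92): the escape exists only in the source text, and the resulting string holds a single byte.

```lua
-- In source code the escape "\\" denotes a single backslash byte.
local s = "\\"

print(#s)                    -- 1: only one byte is stored
print(string.byte(s))        -- 92 on ASCII-compatible systems

-- The same one-byte string built without any escape syntax.
print(s == string.char(92))  -- true

-- Other escapes behave the same way: "\n" is one byte, "a\\b" is three.
print(#"\n", #"a\\b")
```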

On Tue, Dec 21, 2010 at 9:20 AM, Dirk Laurie <dpl@sun.ac.za> wrote:
> On Mon, Dec 20, 2010 at 06:02:31PM +0200, Greg Falcon wrote:
>> Your point about multibyte characters is well taken, but:
>>
>> On Sun, Dec 19, 2010 at 5:32 PM, Tony Finch <dot@dotat.at> wrote:
>> > On 19 Dec 2010, at 22:19, Greg Falcon <veloso@verylowsodium.com> wrote:
>> >>
>> >> A subtle point here:  This snippet from the manual is talking about
>> >> the *character* at s[1], and Lua doesn't have a character type.
>> >
>> > It says character but it means octet.
>>
>> It probably means character in the C "char" sense.  "Octet" is not an
>> appropriate word to use for this concept in portable C programs, since
>> chars/bytes in standard C are allowed to be wider than 8 bits in
>> standard-conforming implementations.
>>
>
> Definition:
>
>    A string is a Lua value consisting of a sequence of bytes but
>    having no other structure, mainly used to represent other values
>    in a human-readable way.
>
> In particular, a string is not a table, therefore also not an array,
> and its entries are bytes, not characters or anything else.  (Although
> one tacitly assumes that there exist useful mappings between strings and
> sequences of characters, e.g. between the four-character sequence "\\"
> and the one-byte string consisting of the byte encoding of a backslash.)
>
> The k-th byte of a string is just that, a byte.  The notion of "the k-th
> character of a string" is useful in text processing applications, but
> it is not a Lua notion.  Lua does not have a type "character".  It is
> therefore quite impossible to make s[k] mean "the k-th character of s".
>
> One could, though, make s[k] mean "a one-byte string consisting of the
> k-th byte of s", i.e.
>    s[k] == string.char(string.byte(s,k,k))
> Then it is obvious that "s[3]='c'" is nonsense, since
>    string.char(string.byte(s,3,3)) = 'c'
> is nonsense.
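
For what it is worth, the reading of s[k] suggested above can be tried out with the metatable that all strings share. This is only a sketch: it assumes you are willing to replace __index on that shared metatable, which affects every string in the program, and it keeps the string library as a fallback so that method calls such as s:upper() keep working.

```lua
-- Assumption: changing the metatable shared by all strings is acceptable.
local mt = getmetatable("")

mt.__index = function(s, k)
  if type(k) == "number" then
    -- The definition quoted above: a one-byte string made from the k-th byte.
    return string.char(string.byte(s, k, k))
  end
  -- Anything else falls back to the string library, so s:upper() still works.
  return string[k]
end

local s = "abc"
print(s[3])                 -- c
print(s[3] == s:sub(3, 3))  -- true

-- Assignment is still rejected: strings have no __newindex metamethod.
local ok = pcall(function() s[3] = "x" end)
print(ok)                   -- false
```

Reading works because indexing a string consults __index; writing fails because nothing handles the assignment, which matches the remark that "s[3]='c'" has no sensible meaning.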
>
> Dirk
>
>
>