- Subject: Re: Is it possible to add utf-8 lua source file support in lua 5.2?
- From: Quae Quack <quae@...>
- Date: Mon, 27 Sep 2010 20:00:32 +1000
On 27 September 2010 19:19, Robert Raschke <rtrlists@googlemail.com> wrote:
>
> On Mon, Sep 27, 2010 at 3:16 AM, Pan Shi Zhu <pan.shizhu@gmail.com> wrote:
>>
>> Lua has no problem supporting utf-8 file without BOM.
>>
>> According to POSIX standard, you should *not* add bom to utf-8 file.
>>
>> So utf-8 with BOM is not a standard file format.
>>
>> BTW: gnu gcc does not support utf-8+bom source file either.
>>
>
> Unfortunately, MS insists on being completely inconsistent, adding the BOM
> in some tools, and stripping it in others. Complete nightmare.
>
> I once added this to lauxlib.c function luaL_loadfile():
>
> --- lauxlib-orig.c Mon Sep 27 10:15:59 2010
> +++ lauxlib.c Mon Sep 27 10:16:28 2010
> @@ -565,6 +565,21 @@
> if (lf.f == NULL) return errfile(L, "open", fnameindex);
> }
> c = getc(lf.f);
> +
> + /* vvv RTR vvv: Check for UTF-8 BOM ef bb bf */
> + if (c == 0xef) {
> + if (getc(lf.f) == 0xbb && getc(lf.f) == 0xbf) {
> + /* do nothing, we've skipped the BOM and just continue with normal processing */
> + } else {
> + /* wasn't the UTF8 BOM, so reset everything again */
> + fclose(lf.f);
> + lf.f = fopen(filename, "r"); /* reopen */
> + if (lf.f == NULL) return errfile(L, "open", fnameindex); /* unable to reopen file */
> + }
> + c = getc(lf.f);
> + }
> + /* ^^^ RTR ^^^: Check for UTF-8 BOM ef bb bf */
> +
> if (c == '#') { /* Unix exec. file? */
> lf.extraline = 1;
> while ((c = getc(lf.f)) != EOF && c != '\n') ; /* skip first line */
>
>
> It's been good enough for me for a while.
>
> Robby
>
>
Why not use ungetc?
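
For illustration, here is a rough sketch of what that could look like, avoiding the close and reopen. One caveat: ISO C only guarantees a single character of pushback, so ungetc alone covers the case where the first byte is not 0xEF, but not the case where two or three bytes have already been consumed. The sketch falls back to rewind() there. Note that skip_utf8_bom is a hypothetical helper, not anything in lauxlib.c, and it assumes a seekable stream positioned at the start of the file.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not part of lauxlib.c): skip a leading UTF-8 BOM
 * (EF BB BF) if one is present.
 * Returns 1 if a BOM was skipped, 0 otherwise.
 * Assumption: f is seekable and positioned at the start of the file. */
static int skip_utf8_bom(FILE *f) {
  int c;
  assert(f != NULL);
  c = getc(f);
  if (c != 0xEF) {
    /* Only one byte was read, so the single guaranteed pushback is enough. */
    if (c != EOF) ungetc(c, f);
    return 0;
  }
  if (getc(f) == 0xBB && getc(f) == 0xBF)
    return 1;  /* BOM consumed; the next read returns the first real byte */
  /* Two or three bytes were consumed. Pushing back more than one character
   * is not portable, so go back to the start instead of reopening. */
  rewind(f);
  return 0;
}
```

After the call, the caller would do its usual c = getc(f) and carry on with the '#' check. For a non-seekable stream, the bytes already read would have to be buffered and handed to the reader instead.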