- Subject: Re: [PROPOSAL 5.4] alternate way of format/pack/unpack (in the string library)
- From: Frank Kastenholz <fkastenholz@...>
- Date: Fri, 26 May 2017 13:35:15 -0400
Hi
What is the problem that this is meant to solve?
Embedding formatting instructions in strings is not a problem in itself; modern C compilers, for example, check printf() format strings at compile time and report format/argument mismatches. If it can be done in C, it can be done in Lua/LuaJIT, no?
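As a rough sketch of what such a check could look like on the Lua side, the snippet below counts the conversions in a format string and compares that with the number of arguments. The names `count_format_args` and `checked_format` are made up for illustration, and the pattern only covers the conversions `string.format` accepts in Lua 5.3. A linter could apply the same counting to literal format strings before the program runs; here it is shown as a runtime wrapper only to keep the example self-contained.

```lua
-- Hypothetical helper: count the arguments a format string expects.
local function count_format_args(fmt)
  -- Drop literal "%%" first so it is not mistaken for a conversion.
  local stripped = fmt:gsub("%%%%", "")
  local n = 0
  -- "%", optional flags, width, precision, then one conversion letter.
  for _ in stripped:gmatch("%%[%-+ #0]*%d*%.?%d*[cdiouxXaAeEfgGqs]") do
    n = n + 1
  end
  return n
end

-- Hypothetical wrapper: reject a count mismatch before formatting.
local function checked_format(fmt, ...)
  local expected, got = count_format_args(fmt), select("#", ...)
  if expected ~= got then
    error(("format expects %d argument(s), got %d"):format(expected, got), 2)
  end
  return fmt:format(...)
end

print(checked_format("x = %d y = %d", 10, 20)) --> x = 10 y = 20
print(pcall(checked_format, "x = %d y = %d", 10)) -- false, plus the error message
```

This only checks the argument count; checking argument types would additionally need knowledge of the values, which a static tool would have only for literals.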
Frank
> On May 26, 2017, at 11:32 AM, François Perrad <francois.perrad@gadz.org> wrote:
>
> With Lua 5.3, the `string` library embeds 3 minilanguages:
> - a text formatting minilanguage in `format` function
> - a regexp minilanguage in `find`, `gmatch`, `gsub` and `match` functions
> - a binary pack/unpacking minilanguage in `pack`, `packsize` and `unpack` functions
>
> These minilanguages are embedded in Lua strings which are interpreted only at runtime.
> Before runtime, there is neither a syntax check nor an argument type check.
> They are outside the Lua grammar.
> They are not friendly with JIT optimization.
>
> The text formatting minilanguage is based on that of C's `sprintf`.
> This proposal is based on its C++ replacement, the iostream library.
> The strict replacement is the output stream, but with this model, it is easy to add the input stream counterpart.
> And the binary pack/unpack could be unified in this model.
> Two minilanguages are replaced by method chaining on a new userdata representing a string buffer.
>
> printf("x = %d y = %d", 10, 20); -- C
> string.format("x = %d y = %d", 10, 20) -- Lua 5.0
> ("x = %d y = %d"):format(10, 20) -- Lua 5.1
> cout << "x = " << 10 << " y = " << 20; -- C++
> string.buffer():put'x = ':put(10):put' y = ':put(20) -- proposal
>
> string.format("x = %#x", 200) --> "x = 0xc8"
> string.buffer():hex():showbase(true):put'x = ':put(200)
>
> string.format("pi = %.4f", math.pi) --> "pi = 3.1416"
> string.buffer():put'pi = ':fixed(true):precision(4):put(math.pi)
>
> d = 5; m = 11; y = 1990
> string.format("%02d/%02d/%04d", d, m, y) --> "05/11/1990"
> string.buffer():fill'0':width(2):put(d):put'/':width(2):put(m):put'/':width(4):put(y)
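
To make the chaining model above concrete, here is a minimal pure-Lua sketch of a buffer with `put`, `fill` and `width`. It is not the attached patch: `new_buffer` and the field names are hypothetical, the real proposal is a userdata built on `luaL_Buffer`, and the assumption that `width` applies only to the next `put` is borrowed from C++ stream behaviour.

```lua
-- Hypothetical pure-Lua stand-in for the proposed userdata (not the patch).
local Buffer = {}
Buffer.__index = Buffer

local function new_buffer()
  return setmetatable({ parts = {}, fillchar = " ", w = 0 }, Buffer)
end

-- Each setter returns self so that calls can be chained.
function Buffer:fill(ch) self.fillchar = ch; return self end
function Buffer:width(n) self.w = n; return self end

function Buffer:put(v)
  local s = tostring(v)
  if #s < self.w then
    s = self.fillchar:rep(self.w - #s) .. s -- right-justify with the fill character
  end
  self.w = 0 -- assumption: width affects only the next item, as in C++
  self.parts[#self.parts + 1] = s
  return self
end

Buffer.__tostring = function(self) return table.concat(self.parts) end

local d, m, y = 5, 11, 1990
local sb = new_buffer():fill'0':width(2):put(d):put'/':width(2):put(m):put'/':width(4):put(y)
print(tostring(sb)) --> 05/11/1990
```

Collecting the pieces in a table and joining them once with `table.concat` avoids repeated string concatenation, which is roughly the job `luaL_Buffer` does on the C side.
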
>
> The implementation defines a new userdata based on `luaL_Buffer` from the Lua/C API. `string.buffer` is the constructor.
> The method names come from C++: `put`, `precision`, `width`, `fill`, `left`, `right`, `internal`, `dec`, `oct`, `hex`, `fixed`, `scientific`, `showbase`, `showpoint`, `showpos`, `uppercase`, `endl`, `ends`.
> And the methods `__tostring`, `len` & `add` come from Lua.
>
> As the conversion from `int` to `char` makes sense only in C, the format "%c" must be rewritten with an explicit call to `string.char`
> string.format("%c", 0x41) --> 'A'
> string.buffer():put(string.char(0x41))
>
> And the feature of format "%q" is supplied by a new function `string.repl` (named like in Python)
> string.format("%q", 'a string with "quotes"') --> "a string with \"quotes\""
> string.repl('a string with "quotes"')
>
> This userdata supplies some input methods: `get`, `getline`, `pos`.
> The interface of `get` resembles that of Lua's `io.read`.
> local sb = string.buffer'05/11/1990'
> print(sb:get'i') --> 5
> assert(sb:get(1) == '/')
> print(sb:get'i') --> 11
> assert(sb:get(1) == '/')
> print(sb:get'i') --> 1990
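
The input side can be sketched the same way. The reader below is a hypothetical pure-Lua approximation (`new_reader` and the `at` field are invented names) that supports only the two forms used in the example above: `get'i'` for an integer and `get(n)` for `n` characters. Returning nil at the end of input is an assumption modelled on `io.read`.

```lua
-- Hypothetical pure-Lua reader mimicking the proposed `get` (not the patch).
local Reader = {}
Reader.__index = Reader

local function new_reader(s)
  return setmetatable({ s = s, at = 1 }, Reader)
end

function Reader:get(what)
  if what == "i" then
    -- Anchored match at the current position; "()" captures the position
    -- just past the digits.
    local digits, next_at = self.s:match("^%s*([%-+]?%d+)()", self.at)
    if not digits then return nil end
    self.at = next_at
    return tonumber(digits)
  elseif type(what) == "number" then
    if self.at > #self.s then return nil end -- end of input
    local chunk = self.s:sub(self.at, self.at + what - 1)
    self.at = self.at + #chunk
    return chunk
  end
  error("unsupported format: " .. tostring(what), 2)
end

local sb = new_reader'05/11/1990'
print(sb:get'i') --> 5
assert(sb:get(1) == '/')
print(sb:get'i') --> 11
assert(sb:get(1) == '/')
print(sb:get'i') --> 1990
```
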
>
> In order to replace `string.pack` & `string.unpack`, this userdata supplies these methods:
> `pack`, `packsize`, `unpack`, `little`, `align`, `int`, `num`, `str`.
>
> string.pack("iii", 3, -27, 450)
> string.buffer():int'int':pack(3):pack(-27):pack(450)
>
> string.pack("i7", 1 << 54)
> string.buffer():int(7):pack(1 << 54)
>
> string.pack("s1", "hello")
> string.buffer():str('prefix', 1):pack'hello'
>
> string.pack("<i2 i2", 500, 24)
> string.buffer():little(true):int(2):pack(500):pack(24)
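
For comparison, the chained pack calls above can be approximated in pure Lua 5.3 by keeping the current endianness and integer size as state and delegating each item to `string.pack`. This is a sketch under assumptions, not the patch: `new_packer` is a hypothetical name, `little(false)` is assumed to mean big-endian, and alignment and the `num`/`str` item types are left out.

```lua
-- Hypothetical pure-Lua approximation of the chained pack API (needs Lua 5.3+).
local Packer = {}
Packer.__index = Packer

local function new_packer()
  -- "=" is native endianness and "i" is a native int, as in string.pack.
  return setmetatable({ parts = {}, endian = "=", item = "i" }, Packer)
end

function Packer:little(on)
  self.endian = on and "<" or ">" -- assumption: false selects big-endian
  return self
end

function Packer:int(size)
  -- int'int' selects the native int; int(n) selects an n-byte integer.
  self.item = (size == "int") and "i" or ("i" .. size)
  return self
end

function Packer:pack(v)
  self.parts[#self.parts + 1] = string.pack(self.endian .. self.item, v)
  return self
end

Packer.__tostring = function(self) return table.concat(self.parts) end

local p = new_packer():little(true):int(2):pack(500):pack(24)
assert(tostring(p) == string.pack("<i2 i2", 500, 24))
```

Packing item by item gives the same bytes as a single format string here because `string.pack` applies no alignment unless the format asks for it with "!".
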
>
> pack/unpack were introduced only in Lua 5.3, so I think they could be deprecated in Lua 5.4.
> For historical reasons, `format` cannot be deprecated.
> `format` is good for small things, and the new way is good for serious things.
> In the same way, the regex minilanguage was not deprecated/replaced by LPeg.
>
> Attached is a patch of lstrlib.c against Lua 5.3.4.
>
> François
>
> <0001-experiment-string.buffer.patch>