Hi

What is the problem that this is meant to solve?
Embedding formatting instructions in strings is not in itself a problem; modern C compilers, for example, check printf() format strings at compile time and report format/argument mismatches (GCC and Clang do this under -Wformat).  If it can be done in C, it can be done in Lua/LuaJIT, no?
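
A rough sketch of such a check in plain Lua follows. It runs at call time rather than at compile time, covers only a subset of the directives, and is deliberately stricter than string.format (which coerces numeric strings and ignores surplus arguments). The name `checked_format` is hypothetical; nothing like it ships with Lua or LuaJIT.

```lua
-- Hypothetical helper: validate a format string against its arguments,
-- then delegate to string.format. Only common directives are handled.
local expected = {
  d = "number", i = "number", u = "number", o = "number",
  x = "number", X = "number", c = "number",
  e = "number", E = "number", f = "number", g = "number", G = "number",
  s = "any", q = "any",
}

local function checked_format(fmt, ...)
  local nargs = select("#", ...)
  local args = { ... }
  local index = 0
  -- Each match is "%" + optional flags/width/precision + one conversion
  -- character; "%%" yields the conversion character "%".
  for conv in fmt:gmatch("%%[-+ #0]*%d*%.?%d*(.)") do
    if conv ~= "%" then
      index = index + 1
      local want = expected[conv]
      if not want then
        error(("unsupported conversion '%%%s'"):format(conv), 2)
      end
      if index > nargs then
        error(("missing argument #%d for '%%%s'"):format(index, conv), 2)
      end
      if want == "number" and type(args[index]) ~= "number" then
        error(("argument #%d: expected number for '%%%s', got %s")
              :format(index, conv, type(args[index])), 2)
      end
    end
  end
  if index < nargs then
    error(("too many arguments: format uses %d, got %d"):format(index, nargs), 2)
  end
  return string.format(fmt, ...)
end

print(checked_format("x = %d  y = %d", 10, 20))   --> x = 10  y = 20
print(pcall(checked_format, "x = %d", "ten"))     -- false, plus an error message
```

The same scan could in principle be run by a linter over literal format strings in source code, which is closer to what the C compilers do.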

Frank


> On May 26, 2017, at 11:32 AM, François Perrad <francois.perrad@gadz.org> wrote:
> 
> With Lua 5.3, the `string` library embeds 3 minilanguages:
>  - a text formatting minilanguage in `format` function
>  - a pattern-matching (regexp-like) minilanguage in `find`, `gmatch`, `gsub` and `match` functions
>  - a binary pack/unpacking minilanguage in `pack`, `packsize` and `unpack` functions
> 
> These minilanguages are embedded in Lua strings which are interpreted only at runtime.
> Before runtime, there is neither a syntax check nor an argument type check.
> They are outside the Lua grammar.
> They are not friendly with JIT optimization.
> 
> The text formatting minilanguage is based on that of C's `sprintf`.
> This proposal is based on its C++ replacement, the iostream library.
> The strict replacement is the output stream, but with this model, it is easy to add the input stream counterpart.
> And the binary pack/unpack could be unified in this model.
> Two minilanguages are replaced by method chaining on a new userdata representing a string buffer.
> 
>    printf("x = %d  y = %d", 10, 20);                           -- C
>    string.format("x = %d  y = %d", 10, 20)                     -- Lua 5.0
>    ("x = %d  y = %d"):format(10, 20)                           -- Lua 5.1
>    cout << "x = " << 10 << "  y = " << 20;                     -- C++
>    string.buffer():put'x = ':put(10):put'  y = ':put(20)       -- proposal
> 
>    string.format("x = %#x", 200)                               --> "x = 0xc8"
>    string.buffer():hex():showbase(true):put'x = ':put(200)
> 
>    string.format("pi = %.4f", math.pi)                         --> "pi = 3.1416"
>    string.buffer():put'pi = ':fixed(true):precision(4):put(math.pi)
> 
>    d = 5; m = 11; y = 1990
>    string.format("%02d/%02d/%04d", d, m, y)                    --> "05/11/1990"
>    string.buffer():fill'0':width(2):put(d):put'/':width(2):put(m):put'/':width(4):put(y)
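
To make the chaining semantics above concrete, here is a pure-Lua sketch that emulates a few of the proposed output methods on top of string.format. It is not the attached patch: `new_buffer` is a hypothetical stand-in for `string.buffer`, and the rule that `width` applies only to the next `put` (as in C++ iostreams) while the other settings persist is an assumption inferred from the date example.

```lua
-- Hypothetical emulation of part of the proposed interface; not the patch.
local Buffer = {}
Buffer.__index = Buffer

local function new_buffer()
  return setmetatable({
    parts = {}, base = "d", show_base = false,
    is_fixed = false, prec = 6, fill_char = " ", next_width = 0,
  }, Buffer)
end

-- Each setter records a formatting state and returns self for chaining.
function Buffer:hex()        self.base = "x";       return self end
function Buffer:showbase(on) self.show_base = on;   return self end
function Buffer:fixed(on)    self.is_fixed = on;    return self end
function Buffer:precision(n) self.prec = n;         return self end
function Buffer:fill(c)      self.fill_char = c;    return self end
function Buffer:width(n)     self.next_width = n;   return self end

function Buffer:put(value)
  local text
  if type(value) == "number" and self.is_fixed then
    -- Simplification: fixed notation is applied to every number.
    text = string.format("%." .. self.prec .. "f", value)
  elseif type(value) == "number" and value % 1 == 0 then
    local flags = (self.show_base and self.base == "x") and "#" or ""
    text = string.format("%" .. flags .. self.base, value)
  else
    text = tostring(value)
  end
  -- Left-pad to the requested width; assumed to reset after one put.
  text = string.rep(self.fill_char, self.next_width - #text) .. text
  self.next_width = 0
  self.parts[#self.parts + 1] = text
  return self
end

Buffer.__tostring = function(self) return table.concat(self.parts) end

local d, m, y = 5, 11, 1990
print(tostring(new_buffer():fill'0':width(2):put(d):put'/'
                 :width(2):put(m):put'/':width(4):put(y)))   --> 05/11/1990
```

Every setting that the format string expresses in one or two characters becomes a piece of mutable state on the buffer, which is the trade-off being debated.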
> 
> The implementation defines a new userdata based on `luaL_Buffer` from the Lua/C API. `string.buffer` is the constructor.
> The name of methods comes from C++: `put`, `precision`, `width`, `fill`, `left`, `right`, `internal`, `dec`, `oct`, `hex`, `fixed`, `scientific`, `showbase`, `showpoint`, `showpos`, `uppercase`, `endl`, `ends`.
> And the methods `__tostring`, `len` & `add` come from Lua.
> 
> As the conversion `int` to `char` makes sense only in C, the format "%c" must be rewritten with an explicit call to `string.char`:
>    string.format("%c", 0x41)   --> 'A'
>    string.buffer():put(string.char(0x41))
> 
> And the functionality of the "%q" format is supplied by a new function `string.repl` (named like in Python):
>    string.format("%q", 'a string with "quotes"')               --> "a string with \"quotes\""
>    string.repl('a string with "quotes"')
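
For reference, the behaviour these two rewrites have to preserve can be checked with stock Lua alone. The snippet below uses only string.format, string.char and load (loadstring on Lua 5.1); `string.repl` belongs to the patch and is not called here.

```lua
-- "%c" and string.char produce the same one-character string.
assert(string.format("%c", 0x41) == string.char(0x41))

-- "%q" produces a Lua string literal that reads back as the original.
local original = 'a string with "quotes"'
local quoted = string.format("%q", original)
local chunk = (loadstring or load)("return " .. quoted)
local reloaded = chunk()

print(quoted)                  --> "a string with \"quotes\""
print(reloaded == original)    --> true
```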
> 
> This userdata supplies some input methods: `get`, `getline`, `pos`.
> The interface of `get` looks like the Lua `io.read`.
>    local sb = string.buffer'05/11/1990'
>    print(sb:get'i')            --> 5
>    assert(sb:get(1) == '/')
>    print(sb:get'i')            --> 11
>    assert(sb:get(1) == '/')
>    print(sb:get'i')            --> 1990
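
The input side can be emulated in a few lines of pure Lua, which helps pin down what `get` is expected to do. This is a sketch under assumptions, not the patch: `new_reader` is a hypothetical stand-in for `string.buffer'...'`, only `get'i'`, `get(n)` and `pos` are covered, and returning nil at end of input is a guess modelled on `io.read`.

```lua
-- Hypothetical emulation of the proposed input methods; not the patch.
local Reader = {}
Reader.__index = Reader

local function new_reader(text)
  return setmetatable({ text = text, position = 1 }, Reader)
end

-- Current read position (1-based index of the next unread character).
function Reader:pos() return self.position end

function Reader:get(what)
  if what == "i" then
    -- Optional minus sign and decimal digits, anchored at the read position.
    local digits, next_pos = self.text:match("^(-?%d+)()", self.position)
    if not digits then return nil end
    self.position = next_pos
    return tonumber(digits)
  elseif type(what) == "number" then
    -- Read up to `what` characters; nil once the input is exhausted.
    local chunk = self.text:sub(self.position, self.position + what - 1)
    if chunk == "" and what > 0 then return nil end
    self.position = self.position + #chunk
    return chunk
  end
  error("unsupported format: " .. tostring(what), 2)
end

local sb = new_reader("05/11/1990")
print(sb:get("i"), sb:get(1), sb:get("i"), sb:get(1), sb:get("i"))
--> 5       /       11      /       1990
```

With stock Lua the same three fields can be extracted in one call: `("05/11/1990"):match("(%d+)/(%d+)/(%d+)")`.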
> 
> In order to replace `string.pack` & `string.unpack`, this userdata supplies these methods:
> `pack`, `packsize`, `unpack`, `little`, `align`, `int`, `num`, `str`.
> 
>    string.pack("iii", 3, -27, 450)
>    string.buffer():int'int':pack(3):pack(-27):pack(450)
> 
>    string.pack("i7", 1 << 54)
>    string.buffer():int(7):pack(1 << 54)
> 
>    string.pack("s1", "hello")
>    string.buffer():str('prefix', 1):pack'hello'
> 
>    string.pack("<i2 i2", 500, 24)
>    string.buffer():little(true):int(2):pack(500):pack(24)
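
The pack examples above can be emulated on Lua 5.3 or later by translating each chained call into a one-value call to string.pack. Again this is a sketch and not the patch: `new_packer` is a hypothetical name, only `little`, `int`, `str('prefix', n)` and `pack` are covered, and treating `little(false)` as big-endian is an assumption. It relies on string.pack using no alignment padding by default, so packing values one at a time gives the same bytes as packing them together.

```lua
-- Hypothetical emulation of the proposed pack methods; requires Lua 5.3+.
local Packer = {}
Packer.__index = Packer

local function new_packer()
  -- "=" selects native endianness; "i" is a native-size int.
  return setmetatable({ parts = {}, endian = "=", option = "i" }, Packer)
end

function Packer:little(on)
  self.endian = on and "<" or ">"   -- assumption: false means big-endian
  return self
end

function Packer:int(size)
  -- 'int' means the native C int; a number means an integer of that many bytes.
  self.option = (size == "int") and "i" or ("i" .. size)
  return self
end

function Packer:str(kind, size)
  assert(kind == "prefix", "only length-prefixed strings are sketched")
  self.option = "s" .. size         -- string preceded by a size-byte length
  return self
end

function Packer:pack(value)
  self.parts[#self.parts + 1] = string.pack(self.endian .. self.option, value)
  return self
end

Packer.__tostring = function(self) return table.concat(self.parts) end

local bytes = tostring(new_packer():little(true):int(2):pack(500):pack(24))
print(#bytes)                                        --> 4
print(bytes == string.pack("<i2 i2", 500, 24))       --> true
```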
> 
> pack/unpack were introduced only in Lua 5.3, so I think they could be deprecated in Lua 5.4.
> For historical reasons, `format` cannot be deprecated.
> `format` is good for small things, and the new way is good for serious things.
> In the same way, the regex minilanguage was not deprecated/replaced by LPeg.
> 
> Attached is a patch of lstrlib.c against Lua 5.3.4.
> 
> François
> 
> <0001-experiment-string.buffer.patch>