- Subject: C-isms: \0 bytes & Lua's stdlib (Re: [ANN] Lua 5.3.4 (rc3) now available)
- From: nobody <nobody+lua-list@...>
- Date: Fri, 27 Jan 2017 01:57:28 +0100
On 2017-01-25 05:54, nobody wrote:
> (IIRC there were a bunch of other things you only get from the C API
> sections. Standalone interpreter-only users who don't "speak" C are
> unlikely to read those as it's mostly incomprehensible gibberish for
> them. I had a list somewhere but can't find it right now. =/)
While I didn't find that list, I found another note on C-isms in Lua.
Somewhere, the manual says "Lua is 8-bit clean: strings can contain any
8-bit value, including embedded zeros ('\0'). Lua is also
encoding-agnostic; it makes no assumptions about the contents of a string."
Further mentions of \0 in the non-C-API part are:
* string.len counts embedded zeros
* frontier pattern matches start/end of string as \0
* string.format's '%s' with modifiers "should not" contain \0
(is that "should not but may" or "must not" or...?)
...and that's it, so someone who doesn't know C could not be expected to
expect any problems. In reality, a whole bunch of functions behave
unexpectedly when strings contain \0. (Or: behave just like you'd
expect if you know C.)
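For contrast, here is a minimal sketch of what "8-bit clean" does cover on the pure-Lua side. The asserted lines hold for any Lua string. The two string.format calls are only printed, not asserted, because their behavior with \0 depends on the Lua version (this sketch assumes a 5.3 interpreter).

```lua
-- Pure-Lua string operations keep embedded zeros intact.
local s = "foo\0bar"

assert(#s == 7)                   -- the length operator counts the \0
assert(string.len(s) == 7)        -- as documented for string.len
assert(s:byte(4) == 0)            -- the zero is an ordinary byte
assert(s:sub(5) == "bar")         -- data after the zero is still there
assert(s ~= "foo\0baz")           -- comparison looks past the zero

-- '%s' without modifiers vs. with a modifier: behavior is version-dependent,
-- so just show what this interpreter does.
print(#string.format("%s", s))
print(pcall(string.format, "%10s", s))
```

The problems listed below all appear at the point where a string leaves Lua and is handed to a C function that expects a zero-terminated string.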
So I started to go through all functions on the Lua side... got bored in
the package & io libraries and then put it aside. I initially started
with 5.3.3 some time ago, and have now checked the diffs & some functions
from each category, so it should be up-to-date with 5.3.4-rc3 (but I'm
not 100% certain).
This is a long, (nearly) complete list (some stuff in io.* and package.*
is probably missing), intended mostly for reference, not for an
immediate fix-everything-rampage. Some might be worth a change to the
manual, for some it might actually be easier to improve the function,
but most are probably irrelevant. Do keep in mind that things have
always behaved like this (as far as I bothered to check) and according
to the list archive, no one ever ran into this – so it's not all that
urgent to do anything about this. (There was one thread on \0 bytes and
io.lines, but nothing else about the Lua side.)
To follow along with the examples, create files 'foo' and 'foo.lua'
containing the code 'error "wrong file"'.
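One possible way to create the two files from Lua itself is sketched below. The trailing newline is an assumption; it makes each file 19 bytes long, which matches the f:seek result shown in the option parser section.

```lua
-- Create 'foo' and 'foo.lua' in the current directory.
-- Both contain a single line that raises an error when the file is run.
local content = 'error "wrong file"\n'   -- trailing newline is an assumption

for _, name in ipairs { "foo", "foo.lua" } do
  local f = assert(io.open(name, "w"))
  f:write(content)
  f:close()
end
```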
# assert/error
> assert( false, "foo\0bar" )
stdin:1: foo
stack traceback: [...]
> error "foo\0bar"
stdin:1: foo
stack traceback: [...]
> debug.traceback( "foo\0bar" )
foo
stack traceback: [...]
(This might bite you: If you include a bad value in the error message
to print it out, part of the message may be dropped. Then again, I
actually used this behavior once to throw "<user-friendly
message>\0<long data dump>" up & get only the message printed. But that
can also be done much more cleanly with a table that has a __tostring
metamethod.)
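A minimal sketch of the table-with-__tostring approach follows. The helper name rich_error and the field names message and dump are hypothetical, chosen only for illustration. It relies on error accepting any value as the error object and on tostring honoring the __tostring metamethod.

```lua
-- Error object that prints as a short message but carries extra data.
-- 'rich_error', 'message' and 'dump' are hypothetical names.
local rich_error_mt = {
  __tostring = function(self) return self.message end,
}

local function rich_error(message, dump)
  return setmetatable({ message = message, dump = dump }, rich_error_mt)
end

-- Raise it, then catch it to inspect both parts.
local ok, err = pcall(error, rich_error("user-friendly message", "long data dump"))

print(ok)              -- false: the call raised an error
print(tostring(err))   -- only the short message
print(err.dump)        -- the data is still available to code that wants it
```

Unlike the "\0" trick, nothing here depends on where a C function happens to stop reading the string.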
# debug library
> debug.getinfo( 0, "\0n" )
--> empty table
> function nop() end
> debug.sethook( error, "\0crl" )
> for i = 1, 100 do nop() end
-- ...nothing...
(Irrelevant / invalid argument / acceptable undefined behavior.)
> io.open("x","w"):write("print'a';debug.debug();print'b'"):close()
true
> os.execute([[printf "print 'foo\0bar'\ncont\n" | lua x]])
a
lua_debug> (debug command):1: unfinished string near <eof>
lua_debug> b
true exit 0
# option parser
> collectgarbage "count\0to infinity please"
23.501953125
> f = io.open "foo" ; f:seek( "end\0this nonsense!", 0 )
19
> f:setvbuf( "no\0please stop!" )
true
(Funny but duh, who cares? Fixed arguments are documented, rest is
undefined.)
# loading files
> dofile "foo\0bar"
foo:1: wrong file
stack traceback: [...]
> loadfile "foo\0bar" ( )
foo:1: wrong file
stack traceback: [...]
> require "foo\0bar"
./foo.lua:1: wrong file
stack traceback: [...]
> package.loaded["foo\0bar"] = "ok"
> require "foo\0bar"
./foo.lua:1: wrong file
stack traceback: [...]
> oldpath = package.path ; package.path = oldpath .. ';./foo\0?.lua'
> require "unrelated"
./foo:1: wrong file
> package.path = "./bar\0?.lua;" .. oldpath ; require "foo"
stdin:1: module 'foo' not found:
no field package.preload['foo']
no file './bar'
no file '/usr/local/lib/lua/5.3/foo.so'
[...]
-- it skipped the rest of package.path and continued with cpath!
> package.path = oldpath
> package.preload["foo\0bar"] = "ok" ; require "foo\0bar"
./foo.lua:1: wrong file
stack traceback: [...]
-- package.searchers[k]: same problem
> package.searchpath( "foo\0bar", "./?" )
./foo
> package.searchpath( "foo", "/?;\0?;./?" )
nil
no file '/foo'
-- probably same behavior for package.cpath, package.loadlib
-- (I did not bother compiling a test lib)
# I/O library
(If you know C, you know that the underlying OS functions treat "\0" as
end of string, so you know this cannot work.)
> io.input("foo\0bar"):read'a'
error "wrong file"
> for l in io.lines "foo\0bar" do print( l ) end
error "wrong file"
> io.open("foo\0bar"):read'a'
error "wrong file"
> io.output "foo\0bar" :write 'error "overwrote wrong file!"' :flush()
> dofile "foo"
foo:1: overwrote wrong file!
stack traceback: [...]
> io.popen "echo 'foo\0bar'"
] sh: -c: line 0: unexpected EOF while looking for matching `''
] sh: -c: line 1: syntax error: unexpected end of file
file (0x6da7a0)
-- and maybe some more... getting bored
-- :read "l\0xyzzy" also works as "l"
# OS functions
(Same as above, if you know C...)
> os.execute "echo 'foo\0bar'"
] sh: -c: line 0: unexpected EOF while looking for matching `''
] sh: -c: line 1: syntax error: unexpected end of file
nil exit 1
> os.date"%Y\0garbage"
2016
> os.getenv "SHELL\0garbage"
/bin/bash
> os.remove "foo\0bar"
true
> dofile "foo"
cannot open foo: No such file or directory
stack traceback: [...]
-- OUCH!!! So there should probably be a note somewhere (os library, or
-- maybe further up because it also affects the io lib) and then maybe
-- an extra reminder on this function?
> os.setlocale 'C\0 at its best'
C
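Until the manual or the functions change, a script that handles untrusted file names could guard itself on the Lua side. The following is only a sketch: check_no_nul and safe_remove are hypothetical helpers, not part of any Lua library, and the same wrapping would have to be repeated for io.open, dofile and so on.

```lua
-- Reject strings with embedded zeros before they reach C functions
-- that would silently truncate them. Both helpers are hypothetical.
local function check_no_nul(s, what)
  local pos = string.find(s, "\0", 1, true)   -- plain find, no patterns
  if pos then
    error(string.format("%s contains \\0 at byte %d", what or "string", pos), 2)
  end
  return s
end

local function safe_remove(path)
  return os.remove(check_no_nul(path, "path"))
end

-- This raises an error instead of deleting the file 'foo'.
print(pcall(safe_remove, "foo\0bar"))
```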
Phew… seems that was everything. (Anything not mentioned as broken or
skipped works as advertised.)
-- nobody