lua-users home
lua-l archive

It was thus said that the Great David Given once stated:
> I want to write out strings to a file and read them back in again, in an
> ASCII-safe way. I'm writing them out with string.format("%q"), because
> it's cheap and easy[*], but now I need to read them back in again.
> 
> Actually going through the string and parsing the escapes seems like a
> lot of work. A faster but much scarier alternative is to wrap the string
> with 'return (<string>' and compile and execute it in a restricted
> environment.
> 
> Given that the interpreter already contains code to safely turn an
> escaped string into a Lua string, there must be a better way --- what is it?

  I turn to LPeg for any parsing job.  Here's an example that does what you
want:

local lpeg   = require "lpeg"

local digit  = lpeg.R"09"
local escape = lpeg.P[[\]] / ""	-- handle escape codes
             * (
                 (digit * digit^-2)	-- handle \nnn, return actual byte
                 / function(c)
                     return string.char(tonumber(c))
                   end
                 + lpeg.P"\n" / "\n"	-- handle 'continue to next line' escape
                 + lpeg.P[[\]] / [[\]]	-- backslash (%q writes \ as \\)
                 + lpeg.P'"'  / '"'	-- double quote
                 + lpeg.P"'"  / "'"	-- single quote
                 + lpeg.P"a"  / "\a"	-- other escapes
                 + lpeg.P"b"  / "\b"
                 + lpeg.P"t"  / "\t"
                 + lpeg.P"n"  / "\n"
                 + lpeg.P"v"  / "\v"
                 + lpeg.P"f"  / "\f"
                 + lpeg.P"r"  / "\r"
               )
local line   = lpeg.P'"' 	-- must start with a quote
             * lpeg.Cs( ( escape + (lpeg.P(1) - lpeg.S[["\]]))^0 )
             * lpeg.P'"'	-- must end with a quote

test = [["o\"ne\1two\3th\"ree\
four\9five\127hello"]]

x = line:match(test)
print(test) print()
print(x)    print()
y = string.format("%q",x)
print(y)    print()

  The real work happens in the lpeg.Cs() call, a substitution capture that
walks through the string and replaces each escape sequence with the
character it stands for.  Speed-wise, I expect this to be similar to using
the Lua parser, since LPeg compiles the pattern into a program for its own
VM.
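
  For comparison, the Lua-parser route from the question could look roughly
like the sketch below.  Treat it as a sketch only: unquote is a made-up
helper name, the shape check is my own assumption about what %q emits, and
it is not a real sandbox.  It compiles "return <string>" with an empty
environment, using loadstring/setfenv on Lua 5.1 and the four-argument
load on 5.2 and later.

```lua
-- Hypothetical helper: decode a string written by string.format("%q", s)
-- by letting the Lua compiler parse it as a string literal.
local function unquote(quoted)
  -- Accept only one double-quoted literal: after dropping every
  -- backslash pair from the body, no quote or backslash may remain.
  if #quoted < 2 or quoted:sub(1, 1) ~= '"' or quoted:sub(-1) ~= '"' then
    return nil, "not a quoted string"
  end
  local rest = quoted:sub(2, -2):gsub("\\.", "")
  if rest:find('["\\]') then
    return nil, "not a single string literal"
  end

  local chunk, err
  if setfenv then                       -- Lua 5.1 / LuaJIT
    chunk, err = loadstring("return " .. quoted)
    if chunk then setfenv(chunk, {}) end
  else                                  -- Lua 5.2 and later
    chunk, err = load("return " .. quoted, "=unquote", "t", {})
  end
  if not chunk then return nil, err end
  return chunk()
end

-- Round trip: quote with %q, then decode again and compare.
local original = 'one\\two "three"\nfour\tfive'
local decoded  = unquote(string.format("%q", original))
print(decoded == original)
```

  Anything that is not a single quoted literal, such as '"a" .. "b"', gets
nil plus a message back instead of being executed.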

  -spc