(In response to discussions, especially Daurnimator and "David Given"...
apologies for message-thread breakage...)

> I want to write out strings to a file and read them back in again, in an
> ASCII-safe way. I'm writing them out with string.format("%q"), because
> it's cheap and easy[*], but now I need to read them back in again.

In one current project, I write the data as an executable script (after
all, Lua partially derives from SOL, a BibTeX-like data-description
language)... and simply require() it in order to recover the data.
However, this is an agreement amongst friends, whereas you're looking for
a bullet-proof parser where the source may not be trusted.
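
For illustration, here is a rough sketch of reading such a data script back
with an empty environment instead of a plain require().  The name load_data
and the "=data" chunk name are my own inventions for this example.  It is
emphatically not the bullet-proof parser asked for: the chunk can still loop
forever, and string methods stay reachable through the string metatable, so
memory can still be exhausted.

```lua
-- Hypothetical helper: run a data script with no globals visible to it.
-- Returns the script's first return value, or nil plus an error message.
local function load_data(src)
  local env = {}                       -- empty environment for the chunk
  local chunk, err
  if setfenv then                      -- Lua 5.1 / LuaJIT
    -- Precompiled chunks start with ESC (byte 27); refuse them.
    if src:byte(1) == 27 then
      return nil, "binary chunk refused"
    end
    chunk, err = loadstring(src, "=data")
    if chunk then setfenv(chunk, env) end
  else                                 -- Lua 5.2 and later
    chunk, err = load(src, "=data", "t", env)   -- "t": text chunks only
  end
  if not chunk then return nil, err end
  local ok, result = pcall(chunk)
  if not ok then return nil, result end
  return result
end
```

A script such as "return { CPUInfo = [[...]] }" then yields its table, while
a script that reaches for os or io fails inside the pcall.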

In my project, I'm "fingerprinting" a machine:

      -- Obtain raw-text general information about this machine
      M.CPUInfo        = assert(ReadFile("/proc/cpuinfo"))
      M.MemInfo        = assert(ReadFile("/proc/meminfo"))
      M.PCIInfo        = assert(PE.exec_quietly("lspci"))
      M.USBInfo        = assert(PE.exec_quietly("lsusb"))
      M.IfConfigInfo   = assert(PE.exec_quietly("ifconfig"))
      M.FSTabInfo      = assert(ReadFile("/etc/fstab"))
      M.PartitionsInfo = assert(ReadFile("/proc/partitions"))

(PE.exec and PE.exec_quietly are value-added versions of luaposix's
pipes x 3/fork/dup2 x 3/fd close x 3/execp, much of it shamelessly
stolen from an earlier version of luaposix.)
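
ReadFile is not shown in the excerpt above.  A minimal stand-in might look
like the following; this is an assumption about its behaviour, inferred only
from the way it is wrapped in assert() (contents on success, nil plus a
message on failure).

```lua
-- Hypothetical stand-in for ReadFile: return the whole file as one string,
-- or nil plus an error message so that assert() reports the failure.
local function ReadFile(path)
  local f, err = io.open(path, "rb")
  if not f then
    return nil, err
  end
  local data = f:read("*a")   -- "*a" reads everything that remains
  f:close()
  return data
end
```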

[The disk-partition information comes in handy as I usually boot off
of a "live" Linux CD/DVD, modified to have Lua and all the necessary
scripts in place to work neatly.  However, device names (sda/sdb etc)
can be jumbled, so I needed to be able to identify partitions
unambiguously.  Anyway...]

The raw-text output is quoted verbatim, and is accompanied by the results
of quite a lot of effort to characterise the disk drives (e.g. MBR,
msdos/gpt partition table, partition types/UUIDs and filesystem types,
integrated with information from fstab such as mount point, backup pass,
mount options, etc.).

I wanted the output of these utilities to appear verbatim, and so
decided to use:
      "[[" .. "]]" quoting; except where I found either of those
                markers in the raw text; and so had to try ...
      "[=[" .. "]=]" quoting; except where I found either of those
                markers in the raw text; and had to try ...
      "[==[" .. "]==]" quoting; except where I found either of
                those markers in the raw text; and had to try ...

You can see where this is heading.
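
The escalation above can be sketched as a small function.  The name
long_quote is hypothetical, and this is one possible approach rather than the
code from my project.  It also checks the text with one extra "]" appended,
because a string ending in "]" (or "]=", and so on) would otherwise combine
with the closing marker and end the literal early.  One caveat, as far as I
know: the Lua lexer converts "\r\n" and "\r" inside long strings to "\n", so
text containing carriage returns will not come back byte-for-byte.

```lua
-- Hypothetical helper: quote s as a Lua long-string literal, raising the
-- bracket level until neither marker occurs in the text.
local function long_quote(s)
  -- Probe with a trailing "]" so that a closer completed by our own
  -- closing bracket is detected as well.
  local probe = s .. "]"
  local level = 0
  while true do
    local eq = ("="):rep(level)
    local open, close = "[" .. eq .. "[", "]" .. eq .. "]"
    if not probe:find(close, 1, true) and not s:find(open, 1, true) then
      -- The lexer skips a newline directly after the opener, so emit one
      -- unconditionally; a leading newline in s then survives.
      return open .. "\n" .. s .. close
    end
    level = level + 1   -- terminates: markers eventually outgrow the text
  end
end
```

For example, long_quote("a]]b") gives a literal delimited by "[=[" and "]=]",
and loading "return " .. long_quote(s) returns s again (carriage returns
aside).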

Perhaps "%Q" could be added as a string.format / pattern-match specifier
that emits (and matches) a long-string literal, choosing the long-bracket
level dynamically for each string.  This would satisfy the "ASCII-safe"
needs above.

Some mechanism for bailing out if the specifiers get too long
('[' .. ("="):rep(999999999) .. '[' ?!) would be needed, given that the
string is untrusted.
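
A sketch of what such a bail-out might look like on the reading side, in
plain Lua rather than as a real "%Q" specifier.  The names read_long_string
and MAX_LEVEL, and the limit of 64, are arbitrary choices for this example.
Only a bounded prefix of the input is inspected when looking for the opener,
so an absurdly long run of "=" is rejected without being copied.

```lua
local MAX_LEVEL = 64   -- hypothetical upper bound on the bracket level

-- Hypothetical reader: parse one long-string literal starting at pos.
-- Returns the contents and the position just past the closer, or nil
-- plus an error message.
local function read_long_string(text, pos)
  pos = pos or 1
  -- Look only at enough characters for "[", MAX_LEVEL "=" signs and "[".
  local head = text:sub(pos, pos + MAX_LEVEL + 1)
  local eq = head:match("^%[(=*)%[")
  if not eq then
    return nil, "no long-bracket opener within the level limit"
  end
  local start = pos + #eq + 2
  -- Like the Lua lexer, drop a newline directly after the opener
  -- (only the plain "\n" case is handled in this sketch).
  if text:sub(start, start) == "\n" then
    start = start + 1
  end
  local s, e = text:find("]" .. eq .. "]", start, true)   -- plain find
  if not s then
    return nil, "unterminated long string"
  end
  return text:sub(start, s - 1), e + 1
end
```

The returned position lets the caller continue parsing whatever follows the
literal.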

Any comments?