It was thus said that the Great Dirk Laurie once stated:
> On Sat, 8 Dec 2018 at 08:17, Sean Conner wrote:
> >
> > It was thus said that the Great Sean Conner once stated:
> >   So even if I were to use lpeg.Cmt() to force evaluation of all nested
> > captures, I'm still not guaranteed to get what I want (I think---I tried and
> > no, it still didn't work, but I would like to hear from Roberto if I'm
> > interpreting this correctly).
> 1. Is this a question about what Cb, Cg and Cmt are supposed to do or
> a challenge to achieve your task?

  Yes and maybe, respectively.

> 2. Are you aware that Lua without Lpeg can do that task effortlessly?

  Support for mailcap has to deal with the following examples:

	/usr/bin/foo %s %t %{opt}
	/usr/bin/foo %s %t
	/usr/bin/foo %t %{foo} %{bar}

"%s" is replaced with a filename if it exists; otherwise the data is to be
piped in via stdin.  The "%{tag}" stuff relates to content type options, as
explained in the man page:

	If the command field contains "%{" followed by a parameter name and
	a closing "}", then all those characters will be replaced by the
	value of the named parameter, if any, from the Content-type header.

  I was trying to keep things simple, which is why I didn't include that
bit.  I have found a way to get this to work using LPeg:

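local lpeg = require "lpeg"

-- One unit of the command: a %-escape or a literal printable character.
local char = lpeg.P"%s" * lpeg.Carg(1) * lpeg.Carg(2)
           / function(s,f) s.redirect = false return f end -- file name used
           + lpeg.P"%t" * lpeg.Carg(3) / "%1"              -- content type
           + lpeg.P"%{" * lpeg.Carg(4) * lpeg.C(lpeg.R"az"^1) * lpeg.P"}"
           / function(tab,opt)
               return tab[opt] or ""                       -- unknown option
             end
           + lpeg.R" ~"
-- Assume the data is piped via stdin until a "%s" is seen.
local cmd  = lpeg.Carg(1) / function(s) s.redirect = true end
           * lpeg.Cs(char^1) * lpeg.Carg(1)
           / function(c,s) return c,s.redirect end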

print(cmd:match("foo -t %t %s",1,{},"/tmp/","application/x-foo",{}))
print(cmd:match("bar -t %t %{opt} %{beta}",   1,{},"/tmp/",
        "application/x-bar",{ opt='alpha' }))

  The first extra parameter is a table used for storing stateful information
(in this case, just a flag); the second parameter is the file, the third is
the type and the final one is a table of options (the parsing of the name is
simplified for this example; the actual pattern includes both upper and
lower case letters, digits, and "-_.").  I can live with the stateful table
(I'm kind of 'meh' about it actually).
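
  For comparison, a plain-Lua version of the same substitution might look
roughly like the sketch below.  It is an illustration only, not code from
this thread: the function name expand() and its argument order are
hypothetical, chosen to mirror the file, type and options arguments described
above, and it uses the same simplified lower-case option names.

```lua
-- Plain-Lua sketch (no LPeg).  `expand` is a hypothetical helper, not part
-- of any library; it returns the expanded command and a redirect flag.
local function expand(template, file, ctype, options)
  local redirect = true   -- assume stdin unless a "%s" is seen
  -- Lua patterns have no alternation, so match "%s", "%t" or "%{" and
  -- grab any following letters plus an optional "}" in one go.
  local out = template:gsub("%%([st{])([a-z]*)(}?)",
    function(kind, name, close)
      if kind == "s" then
        redirect = false
        return file .. name .. close    -- hand back the extra text
      elseif kind == "t" then
        return ctype .. name .. close
      elseif close == "}" and name ~= "" then
        return options[name] or ""      -- "%{name}"
      end
      -- anything else (e.g. an unterminated "%{") is left untouched
    end)
  return out, redirect
end

print(expand("foo -t %t %s", "/tmp/", "application/x-foo", {}))
print(expand("bar -t %t %{opt} %{beta}", "/tmp/",
        "application/x-bar", { opt = 'alpha' }))
```

  It stays a single pass, but the pattern has to over-match and then hand
text back, which is the sort of thing the LPeg version avoids.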

  Personally, I prefer LPeg as I find it easier to read than the Lua
patterns.  For instance, I found it easy to add support for the '%{opt}'
substitution and it's still one pass over the data.  Also, I'm already using
LPeg for parsing of URLs, parsing the mailcap file itself, gopher index
files, the mimetype value, sanitizing text strings [1] and converting HTML
entities to UTF-8 [2], so it already sees extensive use.

  But hey, I'm interested in seeing an alternative ...

  -spc (Not sure I'll use it though ... )

[1]	Basically, removing control codes and escape sequences from text
	files.  Dumping raw text from unknown sources to a terminal is
	downright dangerous!

[2]	I have encountered gopher documents with HTML entities, since quite
	a bit of gopher content is mirrored from the web.  The LPeg code for
	that is quite short:

	local char = lpeg.P"&#" * lpeg.C(lpeg.R"09"^1)             * lpeg.P";" / utf8.char
	           + lpeg.P"&"  * lpeg.C(lpeg.R("az","AZ","09")^1) * lpeg.P";" / ENTITIES
	           + lpeg.P(1)
	return lpeg.Cs(char^0)

	The ENTITIES table, however, is a bit longer ...
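
	As a rough idea of its shape, a few entries might look like the
	sketch below.  The selection of entities here is only illustrative
	(the real table is not shown in this message), and utf8.char
	assumes Lua 5.3 or later.

```lua
-- Hypothetical excerpt: entity name (without "&" and ";") -> UTF-8 text.
local ENTITIES =
{
  amp  = "&",
  lt   = "<",
  gt   = ">",
  quot = '"',
  nbsp = utf8.char(0xA0),   -- U+00A0 NO-BREAK SPACE
  copy = utf8.char(0xA9),   -- U+00A9 COPYRIGHT SIGN
}

print(ENTITIES.amp, ENTITIES.copy)
```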