Take a look at lua-gumbo. Simple code to extract tables looks like this:

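local gumbo = require('gumbo')

-- Path of the HTML file, taken from the first command-line argument
local infile = assert(arg[1], 'expected an HTML file path as the first argument')
-- parseFile returns nil plus an error message on failure
local doc = assert(gumbo.parseFile(infile))
local tables = doc:getElementsByTagName('table')
for i = 1, #tables do
  -- outerHTML is the element's serialized markup, including the <table> tag
  print(tables[i].outerHTML)
end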
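
If you want the cell text rather than the raw markup, something along these lines should work. Treat it as a rough sketch: table_rows is a helper name made up for this example, the HTML string is invented sample input, and it assumes the tables are not nested and that each row uses either td or th cells, not a mix.

```lua
local gumbo = require('gumbo')

-- Hypothetical helper: return the text of each cell, grouped by row,
-- for a single table element. Assumes no nested tables.
local function table_rows(tbl)
  local rows = {}
  local trs = tbl:getElementsByTagName('tr')
  for i = 1, #trs do
    -- Prefer data cells; fall back to header cells for header rows
    local cells = trs[i]:getElementsByTagName('td')
    if #cells == 0 then
      cells = trs[i]:getElementsByTagName('th')
    end
    local row = {}
    for j = 1, #cells do
      row[#row + 1] = cells[j].textContent
    end
    rows[#rows + 1] = row
  end
  return rows
end

-- Invented sample input, parsed from a string instead of a file
local html = '<table><tr><th>Name</th><th>Size</th></tr>' ..
             '<tr><td>Lua</td><td>small</td></tr></table>'
local doc = assert(gumbo.parse(html))
local tables = doc:getElementsByTagName('table')

-- Print one tab-separated line per table row
for _, row in ipairs(table_rows(tables[1])) do
  print(table.concat(row, '\t'))
end
```

From there it is easy to write the rows out as CSV or filter them, which is usually what "extract parts of a page" ends up meaning.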

On Wed, Nov 18, 2015 at 9:55 PM, Nereus <codecomplete@free.fr> wrote:
Hello

I'd like to write a script to extract parts of an HTML page. Since Lua is so
small, it looks like a good match to run on an appliance.

A bit of research shows that it's not a good idea to use a regex engine, and
people recommend using an XML parser.

Is there a good tool I could use in Lua to parse HTML?

Thank you.



--
View this message in context: http://lua.2524044.n2.nabble.com/Good-solution-to-parse-HTML-tp7670415.html
Sent from the Lua-l mailing list archive at Nabble.com.