- Subject: Re: Any LPEG tutorial for laymen ?
- From: Sean Conner <sean@...>
- Date: Wed, 25 Sep 2013 01:28:58 -0400
It was thus said that the Great David Crayford once stated:
> On 24/09/2013 8:51 PM, Luiz Henrique de Figueiredo wrote:
> >>Take the following output from a netstat command.
> >>
> >>Client Name: SMTP Client Id: 000000B7
> >[...]
> >>I would love to learn how to write LPeg parser to yank the key->values
> >>from that multi-line report easily.
> >You don't need LPeg for this task. Try
> > for k,v in T:gmatch("(%u[%w ]-):%s*(.-)%s") do print(k,v) end
> >where T contains the netstat output.
>
> Thanks. This is how dumbstruck I am WRT pattern matching. I want to
> parse the following piece of netstat output
>
> SKRBKDC 00000099 UDP
> Local Socket: 172.17.69.30..464
> Foreign Socket: *..*
>
> The top line is the user, connection id and state. All I want to do is
> capture three whitespace separated words.
>
> In REXX I would do this:
>
> parse var line userid connid state
>
> What is the most succinct way of doing something similar in Lua?
Using LPeg:
lpeg = require "lpeg" -- load up the module
-- this defines whitespace. It's just a space (ASCII 32).
-- alternatively, you can define it as:
--
-- SP = lpeg.S" \t"
--
-- Which defines whitespace as a set of characters (ASCII 32
-- and ASCII 9).
SP = lpeg.P" "
-- This defines a word: one or more characters (lpeg.P(1) matches any single
-- character) that are NOT a space (- SP). The "^1" is LPeg's repetition
-- operator and here it means "one or more". "lpeg.C()" is the capture
-- function, and this is what "captures" (or returns) the text we are
-- interested in.
word = lpeg.C( (lpeg.P(1) - SP)^1 )
-- And our line, which is three space-separated words. In order to account
-- for multiple spaces, we use the repetition operator on the whitespace. The
-- first bit, "SP^0", means "0 or more whitespace characters at the start of
-- the line." The "*" here can be read as "and", so translated: "optional
-- white space and a word and some space and a word and some space and a
-- word." Anything after the third word is ignored, since the pattern does
-- not have to match all the way to the end of the text.
line = SP^0 * word * SP^1 * word * SP^1 * word
-- That's it for the parsing. This function just takes a line of text, and
-- splits it into three separate words. Right now, we just print them one
-- to a line, but the code could return all three or do whatever.
function parse(text)
  local w1,w2,w3 = line:match(text)
  print(w1)
  print(w2)
  print(w3)
  print()
end
-- And some tests ...
parse "SKRBKDC 00000099 UDP"
parse " Local Socket: 172.17.69.30..464"
parse " Foreign Socket: *..*"
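
For comparison, here is a rough sketch of the same three-word split without LPeg, using Lua's built-in string patterns (along the lines of the gmatch suggestion quoted above). The helper name parse3 is made up for illustration and is not part of any library. Note one difference from the LPeg version: %s matches any whitespace character (tabs and newlines as well as spaces), whereas SP above is only ASCII 32.

```lua
-- Hypothetical helper (not from the original post): return the first three
-- whitespace-separated words of a line, or nil if there are fewer than three.
--   ^      anchor at the start of the text
--   %s*    skip optional leading whitespace
--   (%S+)  capture a run of one or more non-whitespace characters
--   %s+    skip the whitespace between words
local function parse3(text)
  return text:match("^%s*(%S+)%s+(%S+)%s+(%S+)")
end

-- Roughly what "parse var line userid connid state" does in REXX.
local userid, connid, state = parse3("SKRBKDC 00000099 UDP")
print(userid, connid, state)
```

Because the helper returns the captures instead of printing them, the caller can assign them straight to three variables, which is about as close to the REXX one-liner as plain Lua gets.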
-spc