- Subject: Re: parser : which library ?
- From: Daurnimator <quae@...>
- Date: Tue, 31 May 2011 14:27:47 +1000
On 31 May 2011 14:06, Miles Bader <miles@gnu.org> wrote:
> Tony Finch <dot@dotat.at> writes:
>> LPEG is brilliant for small parsing tasks. Its main weakness is that it
>> doesn't provide much help for error reporting and recovery. If the
>> context of the code ensures that you don't have syntax errors,
>> e.g. machine-generated strings or other tools to check for errors, then
>> LPEG can get you a long way.
>
> I use LPEG to parse "human input" stuff, and haven't found it to be
> _that_ hard to offer basic error reporting, using periodic
> synchronization tokens in the grammar, and a small set of convenience
> functions (mapping string positions to line numbers, etc)...
>
> It would be nice if LPEG provided a little more support, of course, but
> even as it is, I think it's completely practical to use LPEG for
> traditional parsing tasks that need error handling.
>
> -Miles
>
> --
> Joy, n. An emotion variously excited, but in its highest degree arising from
> the contemplation of grief in another.
>
>
To get line numbers with LPeg I just use a back capture to keep track of
the line number, and a pattern for line endings:
local lpeg = require "lpeg"
local P , R , S , V = lpeg.P , lpeg.R , lpeg.S , lpeg.V
local C , Carg , Cb , Cc , Cg , Ct , Cmt = lpeg.C , lpeg.Carg ,
    lpeg.Cb , lpeg.Cc , lpeg.Cg , lpeg.Ct , lpeg.Cmt
local locale = lpeg.locale ( )
local eof = P(-1)
local newline = P"\r"^-1 * "\n"
-- replace the "linenum" group with its previous value (read back via Cb) plus one
local incrementline = Cg ( Cb"linenum" / function ( a ) return a + 1 end , "linenum" )
local setup = Cg ( Cc ( 1 ) , "linenum" ) -- initiate line number
local comment = S"#" * (P(1)-newline)^0
-- a valid line bumps the counter; otherwise, unless we are at end of input,
-- raise an error carrying the current line number
local line = ( comment + P"hooah" ) * newline * incrementline
    + Cmt ( Cb"linenum" , function ( str , i , linenum )
        if #str >= i then error ( "bad line " .. linenum ) end
    end )
local patt = setup * line^1 * eof
print( patt:match[[
#this is a comment
hooah
hooah
hooah
hooah
hooah
This line will error with line number.
#not parsed
]] )
This is a very easy way to parse stuff, and the error messages can be made clear.
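
For comparison, the kind of convenience function Miles mentions (mapping a
string position to a line number) can be sketched in plain Lua, with no
back captures involved. This is only a rough sketch under assumptions: the
name line_of is hypothetical and comes from neither poster's code, positions
are taken to be 1-based byte offsets such as the ones LPeg hands to a Cmt
function, and only "\n" is counted as a line break (so "\r\n" still counts
once).

```lua
-- Hypothetical helper (not from the thread): map a 1-based byte position
-- in `str` to a line number and a column within that line.
local function line_of(str, pos)
  local line, last_nl = 1, 0
  -- Walk over every "\n" that occurs strictly before `pos`.
  local nl = str:find("\n", 1, true)
  while nl and nl < pos do
    line = line + 1
    last_nl = nl
    nl = str:find("\n", nl + 1, true)
  end
  -- Column is the offset from the last newline seen before `pos`.
  return line, pos - last_nl
end

local input = "#this is a comment\nhooah\nbad line here\n"
local pos = input:find("bad", 1, true)
print(line_of(input, pos)) -- line 3, column 1
```

The trade-off is that this rescans the input from the start on every call,
which is fine when it only runs once to format an error message, whereas the
back capture approach carries the count along during the match.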
Daurn.