It was thus said that the Great joy mondal once stated:
> Hi Sean Conner,
> 
> Thanks for the quick reply !
> 
> You have answered both my questions
> 
> ( I need to try and write the first example since I am still unsure about
> the LPEG code for capturing indentation - left curly and right curly are
> simple enough but I am thrown off regarding indentation ).

  Easy enough:

	indent = lpeg.P"\n" * lpeg.P" "^0

Or, if your grammar ends on a newline, then skip the initial lpeg.P"\n".  To
save the current level, you could do something like (untested):

	indent = lpeg.P"\n" * (lpeg.C(lpeg.P" "^0) * lpeg.Carg(1))
	       / function(indent,info)
	           -- ------------------
	           -- if the length of indent is larger than the current
	           -- indent level, we have a new indent level
	           -- -----------------

	           if #indent > #info.indentlevels[#info.indentlevels] then
	             table.insert(info.indentlevels,indent)
	             return "{" -- simulate an opening bracket or
	                        -- whatever you use to indicate new indent

	           -- a shorter indent closes the current level
	           elseif #indent < #info.indentlevels[#info.indentlevels] then
	             table.remove(info.indentlevels)
	             return "}"

	           else
	             return " " -- just return something neutral
	           end
	         end



  This treats indent levels as either an opening brace, closing brace or
space (change to suit your needs).  But you do need to call the top level
parsing rule like this:

	ast = parser:match(text,1,{ indentlevels = { "" } })
	
Parameters to lpeg.match past the initial position argument are available via
lpeg.Carg(), and I'm using that here to keep track of some additional
information during the parse (you could skip this and keep this info in
globals, but I avoid globals as much as possible).  Here, the indentlevels
array is just a stack of seen indents.  If we get an indent that is longer
than the current one, we have a new level, and if it's shorter, we've ended
a level, and if it matches, we're still in the current level.
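
  To see the pieces working together, here is a minimal sketch that wires
the indent rule into a toy grammar.  The names on_indent, word and info, and
the one-word-per-line input, are assumptions made up for the illustration;
only the indent rule and the indentlevels stack come from the text above.

```lua
local lpeg = require "lpeg"

-- Compare the captured indent against the top of the stack in info
-- (the table passed as the first extra argument to match).
local function on_indent(indent, info)
  local levels = info.indentlevels
  if #indent > #levels[#levels] then
    table.insert(levels, indent)
    return "{"
  elseif #indent < #levels[#levels] then
    table.remove(levels)
    return "}"
  else
    return " "
  end
end

local indent = lpeg.P"\n" * (lpeg.C(lpeg.P" "^0) * lpeg.Carg(1)) / on_indent

-- Hypothetical toy grammar: one word per line, collected into a table.
local word   = lpeg.C((1 - lpeg.S" \n")^1)
local parser = lpeg.Ct(word * (indent * word)^0)

local info   = { indentlevels = { "" } }
local tokens = parser:match("a\n  b\n  c\nd", 1, info)

print(table.concat(tokens, ","))
```

Note that, as written, a single dedent that drops several levels at once
pops only one entry from the stack; handling that case would need a loop and
a way to return more than one closing token.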

  If you want to handle tabs as spaces, you can do it, but it can get
complicated.
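
  One possible starting point, sketched here under the assumption that a tab
advances to the next multiple of 8 columns (the helper name indent_width and
the tab stop are not from the original post), is to compare column widths
instead of raw string lengths:

```lua
-- Hypothetical helper: width in columns of a run of spaces and tabs,
-- assuming tab stops every `tabstop` columns (default 8).
local function indent_width(ws, tabstop)
  tabstop = tabstop or 8
  local col = 0
  for ch in ws:gmatch(".") do
    if ch == "\t" then
      col = col + tabstop - (col % tabstop)  -- jump to the next tab stop
    else
      col = col + 1
    end
  end
  return col
end

print(indent_width("  \t"), indent_width("\t  "))
```

The indent pattern would then capture lpeg.S" \t"^0 rather than only spaces,
and the stack would hold widths instead of strings.  Files that mix tabs and
spaces inconsistently are where this gets complicated.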

> For the first issue, yes I need to create a hierarchical tree instead of a
> flat output, normally examples of lexer output online show a stream of
> tokens, but LPEG creates an AST directly.

  Oh, LPeg can create a stream of tokens---I've had to do that type of stuff
in certain circumstances.  It's not hard, but you do have to track a bit
more information:

	local parser = lpeg.C( --[[ LPeg code ]] ) * lpeg.Cp()
	
	local text = " ... code to parse here ... "
	local pos  = 1
	local info = { --[[ additional information used for parsing ]] }
	
	while pos <= #text do
	  local token,newpos = parser:match(text,pos,info)
	  if not token then
	    error "Error parsing"
	  end
	  -- process token
	  pos = newpos
	end
	
Basically, the parser bit will return the next logical token and the
position to resume parsing the text for the next token.
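
  As a concrete instance of that loop, here is a small sketch with the
placeholder filled in.  The token rules (number, name, symbol) are
assumptions chosen for the illustration, not part of any particular grammar.

```lua
local lpeg = require "lpeg"

-- Hypothetical token rules, just enough for a tiny expression language.
local space  = lpeg.S" \t\n"^0
local number = lpeg.R"09"^1
local name   = (lpeg.R("az","AZ") + lpeg.P"_")
             * (lpeg.R("az","AZ","09") + lpeg.P"_")^0
local symbol = lpeg.S"+-*/=()"

-- Capture one token, skip surrounding whitespace, and return the
-- position at which to resume.
local parser = space * lpeg.C(number + name + symbol) * space * lpeg.Cp()

local text   = "x = 42 + y1"
local pos    = 1
local tokens = {}

while pos <= #text do
  local token, newpos = parser:match(text, pos)
  if not token then
    error("Error parsing at position " .. pos)
  end
  tokens[#tokens + 1] = token  -- process token
  pos = newpos
end

print(table.concat(tokens, "|"))
```

Because every token rule here consumes at least one character, newpos is
always past pos and the loop terminates; a rule that can match the empty
string would need a guard against that.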

  -spc