On 10 Sept. 2014, at 04:43, Steven Degutis <sbdegutis@gmail.com> wrote:

>> For syntax highlighting, you don't need the parser, just the lexer, right?
> 
> I was under the impression that the parser just gives me more
> information about the file, so that I can color things more
> specifically, e.g. it may recognize an identifier as being a local
> variable or a global, so I can color them differently.
> 
> But perhaps this is not the case. I have not heavily looked into the
> source code of Lua's parser or lexer yet.

If you need more information than the lexer provides, you may be interested in LuaSyntaxer, available at https://bitbucket.org/jean_luc/luasyntaxer

Extract from the README:

> Lua Syntaxer adds syntax analyzing capabilities to Lua 5.2 at the C API level.
> 
> Syntax analysis is callback-based. It is performed in a lua_State by calling a single function, lua_parser(). This function takes a notifyfunction callback parameter, which is called each time the parser has discovered significant syntax information in the analyzed Lua source code chunk.
> The internal parser is closely based on Lua's own syntax parser, lparser.c. This provides a good level of confidence that the Lua syntax structure reported by Lua Syntaxer will be identical to the interpretation of this program by the Lua bytecode compiler.
> 
> Lua Syntaxer does not build the AST for the analyzed code chunk. It is intended to be a low-level utility on top of which programmers can build an AST with the appropriate structure matching their own needs. As such, Lua Syntaxer can be used for implementing a Lua syntax-aware text editor, a static code analysis tool …

If you want to start from the official lexer, you will need to make at least the following changes (you can see the corresponding code in the llex.h and llex.m files in LuaSyntaxer):
- add a comment token, TK_COMMENT, to the defined tokens, and have the function llex() report comments as tokens instead of just skipping them;
- associate a character range with each returned token, which is handier for syntax highlighting than just a line number. :-)
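
To make the two changes above concrete, here is a minimal sketch of a standalone toy lexer that returns comments as tokens and attaches a character range to every token. It is not the code from Lua's llex.c or from LuaSyntaxer: apart from TK_COMMENT, all names (Token, Lexer, lexer_init, next_token, the other token kinds) are hypothetical, and it only handles short `--` comments, names, decimal integers and single-character symbols.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token kinds; only TK_COMMENT is a name taken from the text. */
typedef enum { TK_EOS, TK_COMMENT, TK_NAME, TK_NUMBER, TK_OTHER } TokenKind;

/* Each token carries a half-open character range [start, end) in the source. */
typedef struct {
    TokenKind kind;
    size_t start;   /* offset of the first character of the token */
    size_t end;     /* offset one past the last character */
} Token;

typedef struct {
    const char *src;
    size_t len;
    size_t pos;     /* current read offset */
} Lexer;

void lexer_init(Lexer *lx, const char *src) {
    assert(lx != NULL && src != NULL);
    lx->src = src;
    lx->len = strlen(src);
    lx->pos = 0;
}

Token next_token(Lexer *lx) {
    const char *s = lx->src;
    Token t;

    /* Whitespace is still skipped; comments no longer are. */
    while (lx->pos < lx->len && isspace((unsigned char)s[lx->pos]))
        lx->pos++;

    t.kind = TK_EOS;
    t.start = lx->pos;
    t.end = lx->pos;
    if (lx->pos >= lx->len)
        return t;

    unsigned char c = (unsigned char)s[lx->pos];
    if (c == '-' && lx->pos + 1 < lx->len && s[lx->pos + 1] == '-') {
        /* Short comment: runs up to (not including) the end of line.
           Long comments (--[[ ... ]]) are deliberately left out. */
        t.kind = TK_COMMENT;
        while (lx->pos < lx->len && s[lx->pos] != '\n')
            lx->pos++;
    } else if (isalpha(c) || c == '_') {
        t.kind = TK_NAME;
        while (lx->pos < lx->len &&
               (isalnum((unsigned char)s[lx->pos]) || s[lx->pos] == '_'))
            lx->pos++;
    } else if (isdigit(c)) {
        t.kind = TK_NUMBER;
        while (lx->pos < lx->len && isdigit((unsigned char)s[lx->pos]))
            lx->pos++;
    } else {
        /* Any other character becomes a one-character token. */
        t.kind = TK_OTHER;
        lx->pos++;
    }
    t.end = lx->pos;
    return t;
}
```

A highlighter would call next_token() in a loop until it gets TK_EOS and color the range [start, end) of each token according to its kind. In the real lexer the same idea would mean recording the offsets around the existing token-reading code rather than rewriting it.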

Jean-Luc