Hello Lorenzo!

You said no LPeg, but there is the "re" module, where you can write grammars in a kind of EBNF. I would recommend https://github.com/edubart/lpegrex ; the author has written a parser for Lua https://github.com/edubart/lpegrex/blob/main/parsers/lua.lua and one for C11 https://github.com/edubart/lpegrex/blob/main/parsers/c11.lua that really parses real C projects like the sqlite3 amalgamation.

For manipulating C structures, there is a more or less pure LuaJIT implementation of LPeg: https://github.com/sacek/LPegLJ .

And also one in pure Lua: https://github.com/pygy/LuLPeg .

Then there are tools to analyze/visualize grammars, https://www.bottlecaps.de/rr/ui and https://www.bottlecaps.de/convert/ , and their parser generator https://www.bottlecaps.de/rex/ , available for several languages.

For easy iterative testing of grammars there are several online playgrounds:

- cpp-peglib https://yhirose.github.io/cpp-peglib/

- peggy https://peggyjs.org/online.html

- pest https://pest.rs/

I'm also looking at this lightweight project, which looks promising: https://github.com/ChrisHixon/chpeg/ .

Then there are these: https://en.wikipedia.org/wiki/Comparison_of_parser_generators

Cheers!

On 12/6/22 14:10, Lorenzo Donati wrote:
Hi List!

I often need to convert expressions written in different languages between each other (typically C/C++ <-> Lua <-> LaTeX). In a few cases I also need to automatically generate random expressions of given complexity.

I'm currently a high-school teacher and most of my "expression conversion problems" relate to automating the generation of tests and the calculation of their solutions.

I usually employ ad-hoc solutions implemented in Lua, using Lua's gsub function complemented with specific code to handle all the messy details of the conversion.

I lately realized that I keep reinventing the same wheels over and over, with small variations each time. I counted at least two dozen such "wheels", which skyrockets to several hundred slightly different "wheels" (in ~10 years) if I count every time I copied/pasted/updated a Lua script used to generate a new test!

Therefore I'd like to try to replace all those ad-hoc solutions with some kind of configurable parser. The idea is to build a simple tree representation (AST?) of an expression that can then be easily manipulated by changing the tokens (and their meaning) and then converted back into an "equivalent" expression (with possible modifications) in another language.
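A minimal sketch of that idea in pure Lua 5.3, to make the discussion concrete. Everything here is an assumption made for illustration, not part of any library mentioned in this thread: the node constructors (num, var, unop, binop), the two "dialect" tables and the emit function are hypothetical names. The tree is a plain Lua table, and each target language is just a table of format strings.

```lua
-- Hypothetical node constructors: an AST node is a plain Lua table.
local function num(v)          return { kind = "num", value = v } end
local function var(name)       return { kind = "var", name = name } end
local function unop(op, x)     return { kind = "unop", op = op, operand = x } end
local function binop(op, l, r) return { kind = "binop", op = op, left = l, right = r } end

-- One illustrative "dialect" table per target language.
local C = {
  num   = function(v) return string.format("0x%X", v) end,
  unop  = { ["not"] = "~%s" },
  binop = { ["and"] = "%s & %s", ["or"] = "%s | %s", shl = "%s << %s" },
  group = "(%s)",
}
local LaTeX = {
  num   = function(v) return tostring(v) end,
  unop  = { ["not"] = "\\overline{%s}" },
  binop = { ["and"] = "%s \\cdot %s", ["or"] = "%s + %s", shl = "%s \\ll %s" },
  group = "\\left(%s\\right)",
}

-- Turn a tree back into text using dialect `d`.
-- To stay simple, every nested binary operation is parenthesized.
local function emit(node, d, nested)
  if node.kind == "num" then return d.num(node.value) end
  if node.kind == "var" then return node.name end
  if node.kind == "unop" then
    return string.format(d.unop[node.op], emit(node.operand, d, true))
  end
  local s = string.format(d.binop[node.op],
                          emit(node.left, d, true), emit(node.right, d, true))
  if nested then s = string.format(d.group, s) end
  return s
end

-- The same tree, written out in two notations.
local tree = binop("or", binop("shl", num(1), num(3)), unop("not", var("A")))
print(emit(tree, C))      --> (0x1 << 0x3) | ~A
print(emit(tree, LaTeX))  --> \left(1 \ll 3\right) + \overline{A}
```

Swapping the dialect table is the only thing that changes between target languages; a real tool would also need precedence-aware parenthesization instead of the blanket rule used here.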

Unfortunately I have no formal education in parsing techniques (I've a master-level degree in TLC-engineering, I'm not a computer scientist, although I've ~40 years of experience in programming using different languages) and I'm also very short on time to dedicate to this problem.

Hence I'd like to ask the Lua community for help. Do you know any Lua parser library/code/snippet that meets the following requirements?

- MIT License or similar "no hassle" license. I do not plan to create something commercial with it, but I will probably distribute it in some form, e.g. by giving some SW tools to students and/or maybe colleagues. Therefore I don't want to be bogged down by copyleft or similar licensing hassles.

- Pure Lua, with not too contrived/long code, because I'd like to be able to patch it into my Lua code with possible modifications/adjustments, so I would probably need to understand at least part of the code.

- It's not required to be efficient. Just to give you an idea, if it could parse about 100 expressions (one-liners, less than ~80 char) in less than 1 second, I'll be fine. And if it could do 1000 expr/sec, it'll be enough for any usage I could possibly devise.

- The parsed syntax should handle the following:

   - It has an embedded lexical analyzer, so that I could define the
     token syntax;
   - grouping parentheses;
   - unary prefix operators;
   - binary operators;
   - function calls with multiple arguments and single return value;
   - the tokens representing the operators should be configurable;
   - the tokens representing the grouping parentheses should be
     configurable;
   - precedence and associativity should be configurable, at least
     to some degree;
   - operand syntax should be configurable (e.g. so that it can
     be programmed to recognize notations such as 2f03h,
     0xFF00'0000'FA37, 0b0000'0000'1011, etc.)
   - it comprises an expression generator that could convert the
     expression representation into the expression again, also
     using a different grammar. Or the representation format is
     easy to convert to an expression again.
   - The representation should be easy to create "by hand/by code",
     i.e. without needing to parse an actual expression.
     I need this to generate random expressions of given complexity
     (for example, a random C expression with 3 levels of parentheses
     using 5 random bitwise binary operators and 16-bit operands).
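
The last requirement (building the representation by code to get random expressions) can be sketched as follows. This is only an illustration under stated assumptions: OPS, BITS, random_expr and to_c are hypothetical names, the tree layout is made up for this example, and the bit shift operator requires Lua 5.3 or later.

```lua
-- Hypothetical settings: operators to draw from and the operand width.
local OPS  = { "&", "|", "^", "<<", ">>" }
local BITS = 16

-- Build a random tree with nesting depth exactly `depth`.
-- `rng(lo, hi)` must return an integer in [lo, hi], like math.random.
local function random_expr(depth, rng)
  if depth == 0 then
    return { kind = "num", value = rng(0, (1 << BITS) - 1) }
  end
  return {
    kind  = "binop",
    op    = OPS[rng(1, #OPS)],
    left  = random_expr(depth - 1, rng),             -- guarantees the depth
    right = random_expr(rng(0, depth - 1), rng),     -- may be shallower
  }
end

-- Write the tree as a C-like expression; nested operations get parentheses.
local function to_c(node, nested)
  if node.kind == "num" then return string.format("0x%04X", node.value) end
  local s = to_c(node.left, true) .. " " .. node.op .. " " .. to_c(node.right, true)
  return nested and "(" .. s .. ")" or s
end

math.randomseed(42)
print(to_c(random_expr(3, math.random)))
```

Passing the random source as a parameter keeps the generator easy to test and makes it possible to reproduce a given test sheet from a stored seed.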

   Bonus points if it could also handle the following:

   - unary postfix operators;
   - C array / C struct member access / Lua table access syntax;
   - C cast syntax (I know this is hard, sigh!)
   - simple identifiers representing constants or variables;
   - string literals with configurable delimiters;
   - well-documented :-)

   Some examples of expressions that should be handled:

   C-like:       (1u << 3) | (0xFFFF'0000 & 0b111h) + myfunc(a + 1)
   LaTeX like:   \NOT A + (B | C) \cdot \frac{1}{2}

   If parsing LaTeX syntax is too hard (with that pesky macro syntax)
   I could do without it, as long as the library is able to
   *generate* a LaTeX expression from a (modified) representation
   of a Lua or C expression.
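
For the parsing side of the list above, one technique that needs no formal grammar is precedence climbing driven by a configuration table. The following is a rough pure Lua 5.3 sketch, not a finished tool: the table layout (c_like), the function names (tokenize, parse, show) and the chosen operator set are all assumptions made for this example, and function calls, postfix operators and casts are left out.

```lua
-- Hypothetical configuration: everything language-specific lives here.
local c_like = {
  binary = {  -- higher prec binds tighter
    ["|"]  = { prec = 1, assoc = "left" },
    ["&"]  = { prec = 2, assoc = "left" },
    ["<<"] = { prec = 3, assoc = "left" },
    ["+"]  = { prec = 4, assoc = "left" },
    ["*"]  = { prec = 5, assoc = "left" },
  },
  unary = { ["~"] = true, ["!"] = true },
  open = "(", close = ")",
  -- Lua patterns tried in order at the current position.
  operand = { "^0x[%x']+", "^0b[01']+", "^%d+[uU]?", "^[%a_][%w_]*" },
}

local function tokenize(src, cfg)
  -- Collect operator and parenthesis spellings, longest first.
  local symbols = { cfg.open, cfg.close }
  for op in pairs(cfg.binary) do symbols[#symbols + 1] = op end
  for op in pairs(cfg.unary) do symbols[#symbols + 1] = op end
  table.sort(symbols, function(a, b) return #a > #b end)
  local tokens, pos = {}, 1
  while true do
    pos = src:find("%S", pos)            -- skip whitespace
    if not pos then break end
    local text, kind = nil, "operand"
    for _, pat in ipairs(cfg.operand) do
      text = src:match(pat, pos)
      if text then break end
    end
    if not text then
      kind = "symbol"
      for _, sym in ipairs(symbols) do
        if src:sub(pos, pos + #sym - 1) == sym then text = sym; break end
      end
    end
    assert(text, "unexpected character at position " .. pos)
    tokens[#tokens + 1] = { kind = kind, text = text }
    pos = pos + #text
  end
  return tokens
end

local function parse(src, cfg)
  local tokens, i = tokenize(src, cfg), 1
  local function next_token() i = i + 1; return tokens[i - 1] end
  local parse_expr

  local function parse_primary()
    local tok = assert(next_token(), "unexpected end of input")
    if tok.kind == "operand" then
      return { kind = "operand", text = tok.text }
    elseif tok.text == cfg.open then
      local inner = parse_expr(1)
      local closing = next_token()
      assert(closing and closing.text == cfg.close, "missing closing parenthesis")
      return inner
    elseif cfg.unary[tok.text] then
      return { kind = "unop", op = tok.text, operand = parse_primary() }
    end
    error("unexpected token " .. tok.text)
  end

  -- Precedence climbing: only consume operators with prec >= min_prec.
  function parse_expr(min_prec)
    local left = parse_primary()
    while true do
      local tok = tokens[i]
      local info = tok and tok.kind == "symbol" and cfg.binary[tok.text]
      if not info or info.prec < min_prec then break end
      next_token()
      local next_min = info.assoc == "left" and info.prec + 1 or info.prec
      left = { kind = "binop", op = tok.text, left = left, right = parse_expr(next_min) }
    end
    return left
  end

  local tree = parse_expr(1)
  assert(i > #tokens, "trailing input")
  return tree
end

-- Debug printer: shows the tree in prefix form.
local function show(n)
  if n.kind == "operand" then return n.text end
  if n.kind == "unop" then return "(" .. n.op .. " " .. show(n.operand) .. ")" end
  return "(" .. n.op .. " " .. show(n.left) .. " " .. show(n.right) .. ")"
end

print(show(parse("(1u << 3) | 0xFFFF'0000 & ~mask + 1", c_like)))
--> (| (<< 1u 3) (& 0xFFFF'0000 (+ (~ mask) 1)))
```

Changing operator spellings, precedence, associativity, parentheses or operand notation only means editing the configuration table; the parsing code itself stays the same.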


- Simple to configure: I /can/ read a grammar in EBNF notation, but I surely cannot design one to meet my requirements, so no LPEG solutions (I don't know how to use LPEG, BTW) or other formal-grammar approach.

- Usable in Lua 5.3.


Thank you all for any help, suggestion or pointer!

Cheers!

-- Lorenzo


P.S.: the next 2-3 weeks will be very busy for me, so sorry in advance if I don't answer promptly to any reply to this message.

However I preferred to post it now, so that when I have more time to answer I will hopefully have a bigger picture of the available options.