Hi List!
I often need to convert expressions written in different languages
between each other (typically C/C++ <-> Lua <-> LaTeX).
In a few cases I also need to automatically generate random
expressions of given complexity.
I'm currently a high-school teacher and most of my "expression
conversion problems" relate to automating the generation of tests and
the calculation of their solutions.
I usually employ ad-hoc solutions implemented in Lua, using Lua's
string.gsub function complemented with specific code to handle all the
messy details of the conversion.
I recently realized that I keep reinventing the same wheels over and
over, with small variations each time. I counted at least two dozen
such "wheels", which skyrockets to several hundred slightly different
"wheels" (in ~10 years) if I count every time I copied/pasted/updated
a Lua script used to generate a new test!
Therefore I'd like to try and replace all those ad-hoc solutions
with some kind of configurable parser. The idea is to build a simple
tree representation (AST?) of an expression that can then be easily
manipulated by changing the tokens (and their meaning) and then
converted back into an "equivalent" expression (with possible
modifications) in another language.
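To make the idea concrete, here is a minimal sketch of such a tree and of how it could be re-emitted in two target notations, plus a random tree built directly "by code". Everything in it is an assumption made up for illustration, not taken from any existing library: the node layout, the constructor names (num, var, unop, binop), the abstract operator names (bor, band, shl, add) and the two target tables. The emitter fully parenthesizes every binary operation so that it does not need to know the precedence rules of the target language.

```lua
-- Hypothetical node constructors: plain Lua tables, easy to build by hand.
local function num(v)          return { tag = "num", value = v } end
local function var(name)       return { tag = "var", name = name } end
local function unop(op, a)     return { tag = "unop", op = op, arg = a } end
local function binop(op, l, r) return { tag = "binop", op = op, lhs = l, rhs = r } end

-- One table per target notation, mapping abstract operator names to tokens.
-- (Both tables are illustrative assumptions.)
local C = {
  binop = { bor = "|", band = "&", shl = "<<", add = "+" },
  unop  = { bnot = "~" },
  open = "(", close = ")",
  num = function(v) return string.format("0x%X", v) end,
}
local LaTeX = {
  binop = { bor = "\\lor", band = "\\land", shl = "\\ll", add = "+" },
  unop  = { bnot = "\\lnot " },
  open = "\\left(", close = "\\right)",
  num = function(v) return tostring(v) end,
}

-- Convert a tree back into text using the tokens of target table t.
-- Every binary operation is wrapped in grouping tokens.
local function emit(node, t)
  if node.tag == "num" then
    return t.num(node.value)
  elseif node.tag == "var" then
    return node.name
  elseif node.tag == "unop" then
    return t.unop[node.op] .. emit(node.arg, t)
  elseif node.tag == "binop" then
    return t.open .. emit(node.lhs, t) .. " " .. t.binop[node.op] .. " "
        .. emit(node.rhs, t) .. t.close
  end
  error("unknown node tag: " .. tostring(node.tag))
end

-- Random tree with `depth` levels of nested binary operators and
-- 16-bit integer leaves, built without parsing anything.
local function random_tree(depth, ops)
  if depth == 0 then return num(math.random(0, 0xFFFF)) end
  local op = ops[math.random(#ops)]
  return binop(op, random_tree(depth - 1, ops), random_tree(depth - 1, ops))
end

-- Usage: one hand-built tree, two output notations.
local tree = binop("bor", binop("shl", num(1), num(3)), unop("bnot", var("a")))
print(emit(tree, C))      --> ((0x1 << 0x3) | ~a)
print(emit(tree, LaTeX))  --> \left(\left(1 \ll 3\right) \lor \lnot a\right)
print(emit(random_tree(2, { "bor", "band", "shl" }), C))  -- output varies
```

Keeping the operator names abstract in the tree and the concrete tokens in a per-language table is what would make "same tree, different language" cheap in this sketch.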
Unfortunately I have no formal education in parsing techniques (I have a
master-level degree in TLC engineering, I'm not a computer scientist,
although I have ~40 years of experience programming in different
languages) and I'm also very short on time to dedicate to this problem.
Hence I'd like to ask the Lua community for help. Do you know any Lua
parser library/code/snippet that meets the following requirements?
- MIT License or similar "no hassle" license. I do not plan to create
something commercial with it, but I will probably distribute it in
some form, e.g. by giving some SW tools to students and/or maybe
colleagues. Therefore I don't want to be bogged down by copyleft or
similar licensing hassles.
- Pure Lua, with code that is not too contrived or long, because I'd
like to be able to patch it into my Lua code, possibly with
modifications/adjustments, so I would probably need to understand at
least part of the code.
- It's not required to be efficient. Just to give you an idea, if it
could parse about 100 expressions (one-liners, less than ~80 char) in
less than 1 second, I'll be fine. And if it could do 1000 expr/sec,
it'll be enough for any usage I could possibly devise.
- The parsed syntax should handle the following:
    - It has an embedded lexical analyzer, so that I can define the
      token syntax;
    - grouping parentheses;
    - unary prefix operators;
    - binary operators;
    - function calls with multiple arguments and a single return value;
    - the tokens representing the operators should be configurable;
    - the tokens representing the grouping parentheses should be
      configurable;
    - precedence and associativity should be configurable, at least
      to some degree;
    - operand syntax should be configurable (e.g. so that it can
      be programmed to recognize notations such as 2f03h,
      0xFF00'0000'FA37, 0b0000'0000'1011, etc.);
    - it includes an expression generator that can convert the
      expression representation back into an expression, possibly
      using a different grammar. Or the representation format is
      easy to convert back into an expression.
    - The representation should be easy to create "by hand/by code",
      i.e. without needing to parse an actual expression.
      I need this to generate random expressions of given complexity
      (for example, a random C expression with 3 levels of parentheses
      using 5 random bitwise binary operators and 16-bit operands).
  Bonus points if it could also handle the following:
    - unary postfix operators;
    - C array / C struct member access / Lua table access syntax;
    - C cast syntax (I know this is hard, sigh!)
    - simple identifiers representing constants or variables;
    - string literals with configurable delimiters;
    - well-documented :-)
  Some examples of expressions that should be handled:
    C-like: (1u << 3) | (0xFFFF'0000 & 0b111h) + myfunc(a + 1)
    LaTeX-like: \NOT A + (B | C) \cdot \frac{1}{2}
  If parsing LaTeX syntax is too hard (with that pesky macro syntax)
  I could do without it, as long as the library is able to
  *generate* a LaTeX expression from a (modified) representation
  of a Lua or C expression.
- Simple to configure: I /can/ read a grammar in EBNF notation, but I
surely cannot design one to meet my requirements, so no LPEG solutions
(I don't know how to use LPEG, BTW) or other formal-grammar approaches.
- Usable in Lua 5.3.
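To illustrate the kind of configurability listed above (operand patterns, operator tokens, precedence, associativity, grouping tokens), here is a small hand-written precedence-climbing parser driven by a plain Lua table instead of a formal grammar. It is only a sketch under assumptions, not an existing library: the layout of `cfg`, the node fields and the names `tokenize` and `parse` are made up for illustration. Postfix operators, casts, indexing and string literals are not handled, and any operand directly followed by the opening token is treated as a function call.

```lua
-- Hypothetical configuration table: everything language-specific lives here.
local cfg = {
  -- Lua patterns for operands, tried in order, anchored at the current position.
  operands = { "^0x[%x']+", "^0b[01']+", "^%d+[uU]?", "^[%a_][%w_]*" },
  -- Binary operators: higher prec binds tighter (relative order as in C).
  binary = {
    ["|"]  = { prec = 1, assoc = "left" },
    ["&"]  = { prec = 2, assoc = "left" },
    ["<<"] = { prec = 3, assoc = "left" },
    ["+"]  = { prec = 4, assoc = "left" },
  },
  unary = { ["~"] = true, ["!"] = true },    -- prefix operators
  open = "(", close = ")", sep = ",",
}

-- Split src into operand and symbol tokens according to cfg.
local function tokenize(src, cfg)
  local symbols = { cfg.open, cfg.close, cfg.sep }
  for op in pairs(cfg.binary) do symbols[#symbols + 1] = op end
  for op in pairs(cfg.unary) do symbols[#symbols + 1] = op end
  table.sort(symbols, function(a, b) return #a > #b end)  -- longest match first
  local tokens, pos = {}, 1
  while true do
    pos = src:find("%S", pos)                -- skip whitespace
    if not pos then break end
    local tok
    for _, pat in ipairs(cfg.operands) do
      local text = src:match(pat, pos)
      if text then
        tok = { kind = "operand", text = text }
        break
      end
    end
    if not tok then
      for _, sym in ipairs(symbols) do
        if src:sub(pos, pos + #sym - 1) == sym then
          tok = { kind = "symbol", text = sym }
          break
        end
      end
    end
    if not tok then error("unexpected character at position " .. pos) end
    tokens[#tokens + 1] = tok
    pos = pos + #tok.text
  end
  return tokens
end

-- Precedence climbing: returns a tree of plain tables.
local function parse(src, cfg)
  local tokens, i = tokenize(src, cfg), 1
  local function next_is(text)
    local t = tokens[i]
    return t ~= nil and t.kind == "symbol" and t.text == text
  end
  local function expect(text)
    if not next_is(text) then error("expected '" .. text .. "'") end
    i = i + 1
  end
  local parse_expr                           -- forward declaration

  local function parse_primary()
    local t = tokens[i]
    if not t then error("unexpected end of input") end
    i = i + 1
    if t.kind == "operand" then
      if not next_is(cfg.open) then return { tag = "operand", text = t.text } end
      i = i + 1                              -- operand followed by open: a call
      local args = {}
      if not next_is(cfg.close) then
        args[1] = parse_expr(0)
        while next_is(cfg.sep) do
          i = i + 1
          args[#args + 1] = parse_expr(0)
        end
      end
      expect(cfg.close)
      return { tag = "call", name = t.text, args = args }
    elseif t.text == cfg.open then           -- grouping
      local e = parse_expr(0)
      expect(cfg.close)
      return e
    elseif cfg.unary[t.text] then            -- prefix operator binds tightest
      return { tag = "unop", op = t.text, arg = parse_primary() }
    end
    error("unexpected token '" .. t.text .. "'")
  end

  function parse_expr(min_prec)
    local lhs = parse_primary()
    while true do
      local t = tokens[i]
      local info = t and t.kind == "symbol" and cfg.binary[t.text]
      if not info or info.prec < min_prec then break end
      i = i + 1
      local next_min = info.assoc == "left" and info.prec + 1 or info.prec
      lhs = { tag = "binop", op = t.text, lhs = lhs, rhs = parse_expr(next_min) }
    end
    return lhs
  end

  local tree = parse_expr(0)
  if tokens[i] then error("unexpected token '" .. tokens[i].text .. "'") end
  return tree
end

local tree = parse("(1u << 3) | 0xFFFF'0000 & 0b111 + myfunc(a + 1, ~b)", cfg)
print(tree.op, tree.lhs.op, tree.rhs.op)  -- prints |, << and & (tab-separated)
```

Changing the notation would then mean editing `cfg` only: different patterns in `operands`, different keys in `binary` and `unary`, or different `open`/`close` tokens.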
Thank you all for any help, suggestion or pointer!
Cheers!
-- Lorenzo
P.S.: the next 2-3 weeks will be very busy for me, so sorry in advance
if I don't reply promptly to any answer to this message.
However I preferred to post it now, so that when I have more time
to answer I will hopefully have a bigger picture of the available
options.