Hello,

A while ago I made a C++ port of the Lua pattern matching functions, called 'lex'. [0]
This library provides a nice C++17 API for the match and gsub functions, plus a way to iterate over the matches of a pattern, similar to gmatch.

Recently I've been trying to improve the performance by avoiding redundant parsing of the input strings and patterns.

The following has been changed:
- Match a character/bracket class in a single pass.
  Previously this was done in two steps: first to get the character count of the class, and then to match a character from the input string against the character/bracket class.
- Instead of iterating char-by-char when searching for a match, advance multiple characters when possible.
  This may have been an issue in the initial port and may not be related to the original Lua implementation.
- Validate the pattern before usage and remove pattern error checks in the parser functions.
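
To make the first and last points more concrete, here is a minimal sketch of the idea. It is not the actual code from lex; the names is_valid_set, match_set, match_escape and set_result are hypothetical, and only the %a and %d classes are handled. The bracket class is validated once up front, after which the matcher walks the class in a single pass without any error checks and returns both the match result and the position just past the closing ']'.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string_view>

// Hypothetical helper: only "%a" and "%d" are treated as classes here;
// any other character after '%' is matched literally.
inline bool match_escape(unsigned char c, unsigned char cls) {
    switch (cls) {
        case 'a': return std::isalpha(c) != 0;
        case 'd': return std::isdigit(c) != 0;
        default:  return c == cls;
    }
}

// Up-front validation: true when a well-formed bracket class starts at pos.
// All bounds checks live here, so the matcher below can omit them.
inline bool is_valid_set(std::string_view p, std::size_t pos) {
    if (pos >= p.size() || p[pos] != '[') return false;
    ++pos;
    if (pos < p.size() && p[pos] == '^') ++pos;
    bool first = true;  // a ']' directly after '[' or '[^' is a literal
    for (;;) {
        if (pos >= p.size()) return false;        // ran out before ']'
        if (!first && p[pos] == ']') return true; // found the closing ']'
        first = false;
        if (p[pos] == '%') {
            pos += 2;                             // escape plus escaped char
        } else if (pos + 2 < p.size() && p[pos + 1] == '-' && p[pos + 2] != ']') {
            pos += 3;                             // range such as a-z
        } else {
            pos += 1;                             // single literal
        }
    }
}

struct set_result {
    bool matched;      // did the character match the class?
    std::size_t next;  // index just past the closing ']'
};

// Single pass: tests ch against the class and finds the end of the class
// in the same walk. Assumes is_valid_set(p, pos) returned true.
inline set_result match_set(std::string_view p, std::size_t pos, char ch) {
    const unsigned char c = static_cast<unsigned char>(ch);
    ++pos;  // skip '['
    bool negate = false;
    if (p[pos] == '^') { negate = true; ++pos; }
    bool found = false;
    bool first = true;
    while (first || p[pos] != ']') {
        first = false;
        const unsigned char lo = static_cast<unsigned char>(p[pos]);
        if (lo == '%') {
            if (match_escape(c, static_cast<unsigned char>(p[pos + 1]))) found = true;
            pos += 2;
        } else if (p[pos + 1] == '-' && p[pos + 2] != ']') {
            const unsigned char hi = static_cast<unsigned char>(p[pos + 2]);
            if (lo <= c && c <= hi) found = true;
            pos += 3;
        } else {
            if (lo == c) found = true;
            pos += 1;
        }
    }
    return { found != negate, pos + 1 };
}
```

With this sketch, a caller would run is_valid_set once per pattern and then call match_set for each input character. The walk does not stop at the first hit, because the end position of the class is needed either way; that is the trade-off for not scanning the class twice.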

With these changes I think that I've reached the limit of the current design of the parser.

The changes give a significant overall performance improvement on my Linux system when I build the library with GCC.
However, when building the library with the MSVC compiler, the benchmark results were very odd, ranging from almost twice as fast to twice as slow and everything in between.
Unfortunately, I haven't found the cause of these odd results.

Because of the odd benchmark results, I don't want to merge the experimental branch [1] into the main branch.
However, I would like others to know about it, so they have the opportunity to test whether they benefit from the changes and to use it in their projects.

Enjoy!

-- Jasper

[0] https://github.com/PG1003/lex
[1] https://github.com/PG1003/lex/tree/experimental