lua-l archive



Interesting discussion. To my mind this shows why it's NOT always a good idea to make everything pure BNF. Sometimes plain old English does it better. I think the compromise in the Lua ref guide is pretty sensible.

--Tim

On Jun 1, 2013, at 9:19 AM, Gavin Kistner <phrogz@me.com> wrote:

> On May 29, 2013, at 9:32 AM, Roberto Ierusalimschy <roberto@inf.puc-rio.br> wrote:
>>> * There is no definition of the formal syntax of a String in the EBNF,
>>> though there is in the prose of section 3.1. I suppose that this regex
>>> should match, though I've not tested extensively. Any refinements are
>>> welcome:
>>> /(?:"(?:[^\\"]|\\[abfnrtvz\\"']|\\\n|\\\d{1,3}|\\x[\da-fA-F]{2})*")|(?:'(?:[^\\']|\\[abfnrtvz\\"']|\\\n|\\\d{1,3}|\\x[\da-fA-F]{2})*')|(?:--\[(=*)\[\.+?\]\1\])/m
>> 
>> I am afraid I cannot read this very well. But I think the '[^\\"]'
>> in the beginning should be at least '[^\\"\n\r]' (or do classes, by
>> default, exclude these characters?).
> 
> Ah, right you are, thank you.
> 
>> Also, the last part should not include '--' in the beginning.
> 
> Oops! Copy/paste from the long comment. Thanks.
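
The two corrections above (excluding raw line breaks from the quoted forms and dropping the leading '--' from the long-bracket form) can be folded back into the pattern. What follows is only a sketch in Python, not the regex Gavin ended up using. The names ESCAPE, DOUBLE, SINGLE, LONG and LUA_STRING are made up for illustration. It also assumes that the `\.+?` in the original long-bracket branch was meant to match any characters, so `[\s\S]*?` is swapped in, and it does not model the fact that `\z` skips following whitespace, including line breaks.

```python
import re

# Escapes allowed inside a quoted string, following the thread's regex:
# single-letter escapes, backslash + real line break, \ddd, and \xXX.
ESCAPE = r"""\\(?:[abfnrtvz\\"']|\r?\n|\d{1,3}|x[0-9a-fA-F]{2})"""

# Quoted forms: any character except backslash, the closing quote,
# or a raw line break (the fix suggested in the thread), or an escape.
DOUBLE = r'"(?:[^\\"\r\n]|' + ESCAPE + r')*"'
SINGLE = r"'(?:[^\\'\r\n]|" + ESCAPE + r")*'"

# Long-bracket form without the leading '--'; group 1 captures the run of
# '=' so the closing bracket must use the same level.
LONG = r"\[(=*)\[[\s\S]*?\]\1\]"

LUA_STRING = re.compile("|".join([DOUBLE, SINGLE, LONG]))

if __name__ == "__main__":
    for sample in ['"a\\"b"', '"a\nb"', "[==[ x ]] y ]==]", "--[[x]]"]:
        print(repr(sample), bool(LUA_STRING.fullmatch(sample)))
```

Keeping the three alternatives in separate variables makes it easier to see which branch a correction applies to.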
> 
>>> * There is no definition of the formal syntax of a Number. Based on experimentation, it looks like this might be a valid regex for matching a Lua number:
>>> /-?\d*\.?\d+(e[-+]?\d+)?/i
>>> Anyone see anything wrong with that?
>> 
>> - A '-' is not part of a number (for the lexer). Otherwise, x-3 would
>> be read as 'x' followed by '-3'.
>> 
>> - The definition of a numeral in Lua follows C (which, in retrospect,
>> may not have been a very good idea). So, things like "3." are correct,
>> too.
> 
> Interesting and helpful.
> 
>> - Lua accepts hexadecimal numerals. (In 5.2, that includes floating-point
>> hexas too.)
> 
> Ah, yes, I had those covered in a separate section.
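
Roberto's three points about numerals (no leading '-', C-style forms such as "3.", and hexadecimal including hex floats in 5.2) could be combined into one pattern roughly as follows. This is a hedged Python sketch, with DECIMAL, HEX and LUA_NUMBER as illustrative names; the real lexer may accept or reject inputs that this pattern does not, so treat it as an approximation suitable for highlighting rather than a definition.

```python
import re

# Decimal numeral: "3", "3.", "3.14", ".5", each with an optional exponent.
# No leading sign: '-' is a separate operator token for the lexer.
DECIMAL = r"(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?"

# Hexadecimal numeral, including the 5.2 floating-point form with an
# optional fraction and an optional binary exponent ('p').
HEX = r"0[xX](?:[0-9a-fA-F]+\.?[0-9a-fA-F]*|\.[0-9a-fA-F]+)(?:[pP][-+]?\d+)?"

# HEX comes first so that "0x10" is not matched as the decimal "0".
LUA_NUMBER = re.compile(HEX + "|" + DECIMAL)

if __name__ == "__main__":
    for sample in ["3.", ".5", "1e10", "0xff", "0xA.8p-1", "-3"]:
        print(repr(sample), bool(LUA_NUMBER.fullmatch(sample)))
    # In "x-3" only the "3" is a numeral; the '-' stays an operator.
    print(LUA_NUMBER.findall("x-3"))
```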
> 
> 
> FWIW, in the end I abandoned my recursive descent parser because the grammar contains both direct and indirect left recursion, and removing it would have required rewriting much of the grammar. (The more I rewrite the grammar, the less chance there is of it being correct.) I ended up with one hella big regex for simple-but-effective syntax highlighting :)
> 
> https://github.com/Phrogz/coderay/blob/master/lib/coderay/scanners/lua.rb#L31
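
For readers wondering what the left-recursion rewrite looks like: in the reference manual's grammar, `var`, `prefixexp` and `functioncall` refer to each other on their left edge, so a naive recursive descent parser would loop forever. The usual workaround is to parse one primary and then loop over suffixes. The Python sketch below shows that idea on a deliberately tiny subset (a name followed by `.Name` or an empty call `()`); the functions `tokenize` and `parse_prefixexp` and the tuple-based tree are hypothetical and are not taken from Gavin's code.

```python
import re

NAME = r"[A-Za-z_]\w*"

def tokenize(src):
    # Names plus the few punctuation tokens this sketch understands.
    return re.findall(NAME + r"|[().]", src)

def parse_prefixexp(tokens):
    """Parse  Name { '.' Name | '(' ')' }  with a loop instead of left recursion."""
    if not tokens or not re.fullmatch(NAME, tokens[0]):
        raise SyntaxError("expected a name")
    node = ("name", tokens[0])
    pos = 1
    while pos < len(tokens):
        nxt = tokens[pos + 1] if pos + 1 < len(tokens) else None
        if tokens[pos] == "." and nxt and re.fullmatch(NAME, nxt):
            node = ("index", node, nxt)   # wrap what was parsed so far
            pos += 2
        elif tokens[pos] == "(" and nxt == ")":
            node = ("call", node)         # call with empty arguments only
            pos += 2
        else:
            raise SyntaxError("unexpected token: " + tokens[pos])
    return node

if __name__ == "__main__":
    print(parse_prefixexp(tokenize("a.b.c()")))
```

Each suffix wraps the node built so far, which gives the same left-nested tree that the left-recursive rules describe.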