On Fri, Jun 7, 2019 at 12:35 AM Lorenzo Donati wrote:
The more I think about it, the more I find the syntax with "@" more
readable and more easily "expandable": parametrized annotations anyone?
Like for example:
local @const myTable = @table(64) {} -- preallocates 64 elements in the array part
Please note that the Lua lexer allows whitespace between any two lexemes.
For example, the following is allowed in Lua syntax:
::
labelname
::
x = math
.
pi
goto labelname
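A quick way to check this is to hand such a chunk to the compiler. The following is only a sketch: the names `compile` and `src` are mine, and it assumes a stock interpreter (`loadstring` on Lua 5.1, `load` on 5.2 and later).

```lua
-- Compile a chunk whose tokens are spread over several lines.
local compile = loadstring or load  -- loadstring on 5.1, load on 5.2+

local src = [[
local x = math
.
pi
return x
]]

local chunk = assert(compile(src))  -- assert raises if there was a syntax error
local spread_result = chunk()
print(spread_result == math.pi)
```

The lexer discards the newlines, so the parser sees the same token stream as for `math.pi` written on one line.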
My question is:
if an attribute name and an opening parenthesis are distinct lexemes
(and there could be a space or a newline in between),
then what is the purely syntactic way to determine where an attribute ends?
Your example:
local myTable = @table(64) {}
Variant1:
"@table(64)" is the attribute
"{}" is the expression
Variant2:
"@table" is the attribute
"(64){}" is the expression: you are calling the number 64 and passing an empty table as the argument :-)
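Variant2 is not far-fetched: in today's grammar "(64) {}" is already a well-formed call expression (a parenthesized prefix expression followed by a table-constructor argument). A minimal sketch to confirm it, with illustrative variable names of my own:

```lua
local compile = loadstring or load  -- loadstring on 5.1, load on 5.2+

-- "(64) {}" parses as a call: prefix expression "(64)", argument "{}".
local chunk, syntax_err = compile("return (64) {}")
local compiles = chunk ~= nil       -- no syntax error is reported

-- Running it fails only at run time, because a number is not callable.
local ok, runtime_err = pcall(chunk)
print(compiles, ok, runtime_err)
```

Since the parser accepts it and the failure shows up only at run time, a grammar that also accepts "@table(64) {}" has to decide by syntax alone which reading wins.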
You are repeating the same remark I made when I replied to Lorenzo Donati about why it is ambiguous (and which I developed further later).
I also summarized it, but I can repeat my earlier analysis:
Any annotation in Lua can ONLY FOLLOW another token that:
- marks the start of a simple statement (like "local", "return", or even ";" for the empty statement), or
- marks the start of a composite syntactic unit (like "(", "[", "{", or "do"), or
- marks the end of a composite syntactic unit (like ")", "]", "}", or "end").
So it cannot occur in the middle of an expression (except possibly immediately after "(", "[", "{" or ")", "]", "}", etc. but this depends on the permitted choice for the first token of annotations).
The first token used by the annotation
- MUST NOT be a valid unary operator (like "-", "#" or "not"),
- MUST NOT be a number constant or string constant,
- but it MAY be ANY other existing token.
That first token MAY then ALSO unambiguously be:
- a binary operator (like "*", "..", "//", "or", "and", "<", "=", etc.) or ","
- or another reserved keyword used in compound statements ("end", "if", "then", "else", "do", "while", "repeat", "until", "for", "return", "break", "local", "function", etc.)
- or a currently unused token like "@"
- or even possibly ";" (but I would not allow it, as it would be error-prone with source code that is possibly partially commented out)
provided that:
- the annotation is ENTIRELY surrounded by "(...)" or "[...]" or "{...}" which would be required (except after some tokens like "local" which mark an explicit start of a new statement that can be annotated)
- or the first token is a currently undefined one like "@" (in which case the surrounding delimiters are not always needed)
So we have a wide range of choices for defining them unambiguously and generalizing them!
I still think that choosing "@" for the first token is the best choice. But this does not invalidate the choice of "*" or "<", provided the surrounding rule is used.
As for the proposed syntax, "<annotation>" is the worst choice if we need to surround it with additional parentheses to avoid shift-reduce conflicts in some places, resulting in the horrible "(<annotation>)"!
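To illustrate why "<" needs the surrounding rule: inside an expression, "<" and ">" are already comparison operators, so a bare "<name>" between two operands is accepted today as a chained comparison. This is only a sketch of that one case, with made-up names, not a full account of the conflicts:

```lua
local compile = loadstring or load  -- loadstring on 5.1, load on 5.2+

-- "a <b> c" is read as "(a < b) > c", not as "a", annotation "b", "c".
local chunk = compile("local a, b, c = 1, 2, 3; return a <b> c")
local parses_as_comparison = chunk ~= nil

-- Evaluating it compares a boolean with a number, which fails at run time.
local ok = pcall(chunk)
print(parses_as_comparison, ok)
```

With "@" as the first token there is no existing reading to collide with, which is why it does not need the extra parentheses.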