In addition, given the wide choice available for the first significant token of an annotation, we can say that:

- if the first token is "*" or "@", the annotation is reserved for the standard Lua specification, including future versions (so whatever follows that token must obey that specification);

- all other usable tokens are for extensions that a conforming parser can entirely and easily ignore when it does not recognize them; they should not invalidate the interpretation of the rest of the syntax. Lua may, however, add a constraint for them, requiring these annotations to use the "surrounding rule" (with parentheses).



On Sat, Jun 8, 2019 at 08:32, Philippe Verdy <verdy_p@wanadoo.fr> wrote:


On Sat, Jun 8, 2019 at 07:56, Egor Skriptunoff <egor.skriptunoff@gmail.com> wrote:
On Fri, Jun 7, 2019 at 12:35 AM Lorenzo Donati wrote:

The more I think about it, the more I find the syntax with "@" more
readable and more easily "expandable": parametrized annotations anyone?
Like for example:

local @const myTable = @table(64) {} -- preallocates 64 elements in the array part




 
Please note that the Lua lexer allows whitespace between any two lexemes.
For example, the following is allowed in Lua syntax:
   ::
      labelname
   ::
   x = math
   .
   pi
   goto labelname
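
To check the claim above without taking it on faith, here is a minimal sketch that only compiles the snippet, assuming Lua 5.2 or later (where `load` accepts a string and `goto` exists). The names `src`, `chunk` and `err` are introduced here for illustration. The chunk is deliberately not executed, because the backward `goto` would loop forever.

```lua
-- The snippet from above, kept with its unusual line breaks.
local src = [[
   ::
      labelname
   ::
   x = math
   .
   pi
   goto labelname
]]

-- load() only compiles the source; it returns a function on success,
-- or nil plus an error message on a syntax error.
local chunk, err = load(src)
print(type(chunk), err)
```

If the lexer rejected whitespace inside `::labelname::` or `math.pi`, `chunk` would be nil and `err` would hold the syntax error.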

My question is:
if an attribute name and an opening parenthesis are distinct lexemes
(and there could be a space or a newline in between),
then what is the purely syntactic way to determine where an attribute ends?

Your example:
   local myTable = @table(64) {}

Variant1:
   "@table(64)" is the attribute
   "{}" is the expression

Variant2:
   "@table" is the attribute
   "(64){}" is the expression: you are invoking the number 64 and passing an empty table as the argument :-)
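
Variant 2 is not far-fetched: in today's Lua, "(64){}" is already a syntactically valid call expression. The following sketch, assuming Lua 5.2 or later for `load` on a string, shows that it compiles and only fails when run. The variable names are illustrative.

```lua
-- Without any attribute, "(64){}" parses as: call the value 64
-- with a table constructor as its single argument.
local chunk, compile_err = load("local myTable = (64){}")
print(type(chunk), compile_err)   -- compilation succeeds

-- The mistake only shows up at run time, since a number is not callable.
local ok, run_err = pcall(chunk)
print(ok, run_err)
```

So a parser that sees "@table" followed by "(64){}" cannot rely on a syntax error to reject the second reading.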

You are repeating the same remark I made when I replied to Lorenzo Donati about why this is ambiguous (which I developed further later).

I also summarized it, but I can repeat my earlier analysis:

Any annotation in Lua can ONLY FOLLOW another token that:
- marks the start of a simple statement (like "local", "return", or even ";" for the empty statement), or
- marks the start of a composite syntactic unit (like "(", "[", "{", or "do"), or
- marks the end of a composite syntactic unit (like ")", "]", "}", or "end").

So it cannot occur in the middle of an expression (except possibly immediately after "(", "[", "{" or ")", "]", "}", etc., but this depends on the permitted choice for the first token of annotations).

The first token used by the annotation
- MUST NOT be a valid unary operator (like "-", "not", "#" or "~"),
- MUST NOT be a number constant or string constant,
- but it MAY be ANY other existing token.

That first token MAY then ALSO unambiguously be:
- a binary operator (like "*", "..", "//", "or", "and", "<", "=", etc.) or ","
- or another reserved keyword used in compound statements ("end", "if", "then", "else", "do", "while", "repeat", "until", "for", "return", "break", "local", "function", etc.)
- or a currently unused token like "@"
- or even possibly ";" (but I would not allow it, as it would be error-prone with source code that is partially commented out)

provided that:
- the annotation is ENTIRELY surrounded by "(...)" or "[...]" or "{...}" which would be required (except after some tokens like "local" which mark an explicit start of a new statement that can be annotated)
- or the first token is a currently undefined one like "@" (where the previous surrounding is not always needed)
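
The rules above can be sketched as a small predicate. This is only an illustration under stated assumptions: `can_start_annotation`, `is_constant` and the two lookup tables are hypothetical names, tokens are passed as plain strings, the unary operators are those of Lua 5.3, and the exception for tokens like "local" is not modeled.

```lua
-- Hypothetical helpers; not part of any Lua implementation.
local UNARY_OPERATORS = { ["-"] = true, ["not"] = true, ["#"] = true, ["~"] = true }
local UNUSED_TOKENS = { ["@"] = true }

-- True for number constants and for short or long string constants.
local function is_constant(token)
  if tonumber(token) ~= nil then return true end
  return token:match("^[\"']") ~= nil or token:match("^%[=*%[") ~= nil
end

-- token:      first token of the candidate annotation
-- surrounded: true when the annotation is entirely enclosed in
--             "(...)", "[...]" or "{...}"
local function can_start_annotation(token, surrounded)
  if UNARY_OPERATORS[token] or is_constant(token) then
    return false              -- would be confused with an expression
  end
  if UNUSED_TOKENS[token] then
    return true               -- no surrounding needed for "@"
  end
  return surrounded == true   -- any other token needs the surrounding rule
end

print(can_start_annotation("@", false))
print(can_start_annotation("*", false))
print(can_start_annotation("*", true))
print(can_start_annotation("-", true))
```

In this sketch "@" is accepted on its own, "*" is accepted only when surrounded, and "-" is never accepted because it could start an expression.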

So we have a large choice for defining them unambiguously and generalizing them!

I still think that "@" is the best choice for the first token. This does not rule out "*" or "<", provided the surrounding rule is used.
But the proposed "<annotation>" syntax is the worst choice if we need to surround it with additional parentheses to avoid shift-reduce conflicts in some places, resulting in the horrible "(<annotation>)"!