


Note: this is exactly the same problem as in the syntax of Fortran (where it was even more critical): Fortran had keywords for its statements and some operators, and also allowed multi-letter identifiers, but whitespace between them was optional (to save precious space on punch cards). The consequence was a very slow parser that was very complex to write and maintain (even if the compiler produced a very fast program). Later dialects of Fortran disallowed optional whitespace between keywords and identifiers, requiring whitespace wherever its absence would create an ambiguity.

Lua's syntax is designed to be parsed very easily and efficiently with minimal code, so having to roll back from a previous "shift" attempt and retry with a "reduce" was ruled out by design. This greatly simplifies the handling of syntax errors in the input, and the parser needs simpler data structures because it never has to keep the possibility of rolling back.
As a consequence, we have a ";" in Lua, and we must use it every time there is a possible ambiguity.

Likewise, Lua does not support "f(x=0)" because of currying: it chose to sacrifice the possibility of performing assignments in the middle of expressions.
C/C++/Java do not have currying; they still need ";" between statements, but not always at the end of one (when the statement is terminated by the "}" that closes a block), so they can have assignments in the middle of expressions (and assignments can also be statements by themselves, as in Lua).
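
As a small check of the point about assignments inside expressions, the sketch below asks the Lua compiler to parse a call whose argument is an assignment. It assumes a stock Lua interpreter; the `compile` alias is my own and simply picks `loadstring` on 5.1 or `load` on later versions, and `f` and `x` are placeholder names.

```lua
-- Assumption: stock Lua; `compile` is an illustrative alias, not a Lua API name.
local compile = loadstring or load

-- An assignment in argument position is rejected by the parser.
local chunk, err = compile("f(x = 0)")
print(chunk)  -- nil, because the source does not compile
print(err)    -- the parser's syntax error message

-- The same intent written as two statements compiles fine,
-- even without a ';' between them.
local ok_chunk = compile("x = 0 f(x)")
print(ok_chunk ~= nil)
```

The chunks are only compiled, never run, so `f` does not have to exist.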

Lua could have chosen a keyword (like "void") to start a statement that performs a function call without assigning its return value(s), which would also have resolved the ambiguity of curried function calls. But it preferred to use ";" when needed (following common practice, and because it is simply easier to type and read).

Lua also allows the parentheses around a function's argument to be omitted (when the argument is a single string literal or table constructor). But with currying it would have been even more problematic to disambiguate things without rolling back if there were no ";" or "void" keyword before the next statement. Lua is still ambiguous about the parentheses surrounding function arguments: it chooses to favor the "shift" action, and so parses the first opening parenthesis after any leading subexpression as marking a function call, not as parentheses surrounding an expression. The disambiguation is then needed when expressions appear in lists of expressions, for table constructors or for function arguments; Lua solved that problem by requiring a "," between expressions that are members of a table constructor or that are given as function arguments.
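
To make the two points above concrete, here is a minimal sketch in plain Lua. The helper names `kind` and `g` are made up for the illustration; the behaviour shown is that of the standard parser.

```lua
-- `kind` and `g` are hypothetical helpers used only for this illustration.
local function kind(arg) return type(arg) end

-- Parentheses may be omitted for a single string literal or table constructor.
local k1 = kind"text"
local k2 = kind{1, 2, 3}
print(k1, k2)

local function g(v) return "called with " .. tostring(v) end

-- With a "," the parenthesized expression is a separate list member.
local t1 = { g, (1) }
-- Without it, the "(" is shifted into the preceding expression as a call.
local t2 = { g (1) }
print(#t1, #t2)
print(t2[1])
```

So `t1` holds the function and the number, while `t2` holds only the result of calling `g(1)`.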




On Wed, 5 Jun 2019 at 21:47, Philippe Verdy <verdy_p@wanadoo.fr> wrote:


On Wed, 5 Jun 2019 at 16:42, Dibyendu Majumdar <mobile@majumdar.org.uk> wrote:
On Wed, 5 Jun 2019 at 15:35, Matthew Wild <mwild1@gmail.com> wrote:
>
>   local resource x = foo()
>
> to be interpreted as
>   local <toclose> x = foo()
> or
>   local resource; x = foo()
>

The former as the latter is impossible.
So if you say:

local resource = 0

Then resource is a variable.

But if you say:
local resource x = 0
Then x is a variable, and resource is a qualifier.
No, because it is also equivalent to
  local resource
  x = 0
which declares a new variable named "resource" (without an initialiser, so initialized to nil) and then performs an assignment to the variable named "x" in the current scope.
Writing it on one line as
  local resource x = 0
does not change things.
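
This can be checked against a released interpreter. The sketch below assumes stock Lua (the `compile` alias is mine and picks `loadstring` on 5.1 or `load` on later versions) and compiles the one-line form, then returns both names so their values can be inspected. Note that running it assigns the global `x`.

```lua
-- Assumption: stock Lua; `compile` is an illustrative alias, not a Lua API name.
local compile = loadstring or load

local chunk = assert(compile([[
  local resource x = 0
  return resource, x
]]))

-- `resource` is a fresh local left at nil; `x` is a separate assignment.
local resource_value, x_value = chunk()
print(resource_value, x_value)
```

The chunk compiles, `resource` comes back as nil, and `x` comes back as 0, which is the two-statement reading.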

Let's remember that ";" statement separators are already optional in Lua; they are suggested only in some cases where disambiguation is needed (caused by the already permitted currying of function calls):
  print f; (x)(y)[0] = 1
  print f(x); (y)[0] = 1
where you have to wonder what is the meaning of:
   print f(x)(y)[0] = 1
It's true that Lua statements either start with a keyword or are assignments or function calls; currying without a required ";" complicates the separation when there are multiple assignments or function calls. So the ";" was finally introduced (the Lua designers most probably wanted to avoid needing ";" everywhere, but introducing the curried syntax required adding ";" to separate ambiguous statements).
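
A runnable variant of that situation, as a sketch only: `f` here is a made-up function that returns another function, and a counter records how many calls actually happen with and without the ";".

```lua
-- `f`, `calls`, `x` and `y` are hypothetical names for this illustration.
local calls = 0
local function f(x)
  calls = calls + 1
  return function(y)
    calls = calls + 1
    return {}          -- a throwaway table that the assignment can index
  end
end
local x, y = 1, {}

-- With ";": f(x) is one statement, (y)[0] = 1 is another.
f(x); (y)[0] = 1
local calls_split, y0_split = calls, y[0]
print(calls_split, y0_split)

-- Without ";": a single statement, f(x)(y)[0] = 1.
calls, y = 0, {}
f(x) (y)[0] = 1
local calls_joined, y0_joined = calls, y[0]
print(calls_joined, y0_joined)
```

In the first form only `f` is called and `y[0]` is set; in the second form the returned function is called as well and the assignment lands in the throwaway table, so `y[0]` stays nil.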

Ambiguous statements are not permitted in Lua syntax, or they have a mandatory associativity. In an LR parser, the parser should normally choose the "shift" action instead of "reduce", to favor longer statements (the parser can still roll back if needed and retry with a "reduce" action). If it chooses the "reduce" action, it favors shorter statements, but once the reduce has been performed the parser can no longer roll back to use a "shift" instead, because "reduce" actions cannot easily be rolled back. Rolling back from a "shift" is trivial: it just means pushing the excess unprocessed tokens back into a cache of the token input stream.

Adding the ";" (which remains optional) allows the parser to "reduce" immediately, without trying a "shift" and processing a (possibly infinite) stream, which it would do by default. No rollback is ever needed, and the currying of function calls becomes possible and safe.
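
The following toy model is not Lua's real grammar or parser, only a sketch of the idea under simplifying assumptions: a token is a name, a parenthesized group such as "(x)", or ";". The hypothetical `split_statements` function always shifts a group onto the current statement when it can, reduces immediately on ";", and never looks back.

```lua
-- Toy model with made-up names; it only mimics "prefer shift, reduce on ';'".
local function split_statements(tokens)
  local statements, current = {}, nil

  -- "Reduce": close the statement being built, if any.
  local function finish()
    if current then
      statements[#statements + 1] = table.concat(current, " ")
      current = nil
    end
  end

  for _, tok in ipairs(tokens) do
    if tok == ";" then
      finish()                      -- explicit separator: reduce right away
    elseif tok:sub(1, 1) == "(" and current then
      current[#current + 1] = tok   -- "shift": the group extends the call chain
    else
      finish()                      -- anything else starts a new statement
      current = { tok }
    end
  end
  finish()
  return statements
end

local joined = split_statements({ "f", "(x)", "(y)" })
local split = split_statements({ "f", "(x)", ";", "(y)" })
print(#joined, joined[1])
print(#split, split[1], split[2])
```

Each token is examined exactly once, so no rollback buffer is needed: without ";" the groups are absorbed into one statement, and with ";" the second group starts a statement of its own.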