Is there a good reason why the rule is block: { stat* } rather than block: `Block{ stat* }?
Several actually.
First, readability: one of the main design goals was to keep the AST as human-readable as possible, and the main limiting factor proves to be verbosity. Whenever we can spare a node or a tag without hurting readability, we do.
In a canonical AST, blocks only appear in fixed locations: as a chunk's root, and in predefined positions in loops, functions and conditionals (`Forin, `Fornum, `Function, `If). You know simply from a node's position that it must be a block; you never have to test it with "if x.tag=='Block' then ... end".
For instance, the AST for "function(x, y) if y then return y else return x end end" is:
`Function{ { `Id "x", `Id "y" },
           { `If{ `Id "y",
                  { `Return{ `Id "y" } },
                  { `Return{ `Id "x" } } } } }
This is more readable, IMO, than:
`Function{ { `Id "x", `Id "y" },
           `Block{ `If{ `Id "y",
                        `Block{ `Return{ `Id "y" } },
                        `Block{ `Return{ `Id "x" } } } } }
(Actually, if you add a `Block tag to your blocks, most tools won't even notice, since they always know that a node is a block without looking at its tag.)
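To make the "known by position" point concrete, here is a minimal sketch in plain Lua. It assumes the usual plain-table spelling of the AST, where `Id "x" stands for { tag = "Id", "x" }. The helpers Id and body_of are hypothetical names introduced only for this illustration.

```lua
-- Plain-Lua spelling of the AST: `Id "x" is written { tag = "Id", "x" }.
-- Id is a hypothetical convenience constructor, used only in this sketch.
local function Id(name) return { tag = "Id", name } end

-- AST of: function(x, y) if y then return y else return x end end
local ast = {
  tag = "Function",
  { Id "x", Id "y" },                      -- [1]: parameter list
  {                                        -- [2]: body, an untagged block
    { tag = "If",
      Id "y",                              -- condition
      { { tag = "Return", Id "y" } },      -- "then" branch, untagged block
      { { tag = "Return", Id "x" } } },    -- "else" branch, untagged block
  },
}

-- Hypothetical helper: the body is recognized by its position in the
-- `Function node, so no tag test on the block itself is needed.
local function body_of(func)
  assert(func.tag == "Function", "expected a `Function node")
  return func[2]
end

local body = body_of(ast)
local then_block = body[1][2]  -- second child of `If is also a block
```

Here body has no tag at all, yet the code never needs one: only the enclosing `Function and `If nodes are tagged.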
As a counter-example, there is a `Pair{ } tag around key/value pairs in literal tables, which could have been avoided (since it's illegal to have blocks as table elements). { "a", b="c" } is encoded as:
`Table{ `String "a", `Pair{ `String "b", `String "c" } }
That's because:
- code often needs to test whether a literal table element AST is a hash pair or a list value; to do so, "if x.tag=='Pair' then ... end" is a nice and straightforward idiom;
- when human beings read an AST, they tend to skip braces and rely on indentation, as with most Lisp-ish languages. Tag-less pairs were too easy to misinterpret as two consecutive values, leading readers to mistake the AST of { "a", b="c" } for that of { "a", "b", "c" }.
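The tag test from the first point can be sketched as follows, again with plain Lua tables standing in for the backquote notation. The names String and split_elements are hypothetical and exist only for this example.

```lua
-- Hypothetical convenience constructor: `String "a" as a plain table.
local function String(s) return { tag = "String", s } end

-- AST of: { "a", b = "c" }
local tbl = {
  tag = "Table",
  String "a",
  { tag = "Pair", String "b", String "c" },
}

-- Hypothetical helper: separate list values from hash pairs using the
-- "x.tag == 'Pair'" idiom.
local function split_elements(t)
  local values, hash_pairs = {}, {}
  for _, elem in ipairs(t) do
    if elem.tag == "Pair" then
      hash_pairs[#hash_pairs + 1] = { key = elem[1], value = elem[2] }
    else
      values[#values + 1] = elem
    end
  end
  return values, hash_pairs
end

local values, hash_pairs = split_elements(tbl)
```

Without the `Pair tag, the same loop would have to guess whether an untagged two-element table is a pair, which is exactly the ambiguity the tag removes.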
Moreover, tag-less blocks are not the same as `Do{ } blocks: they don't delimit a scope (although the surrounding loop/function/if might create one). For the compiler, `Do{ stat1, stat2, stat3, stat4 } is semantically equivalent to `Do{ stat1, { stat2, stat3 }, stat4 }, and the latter can be generated from `Do{ stat1, stat4 } with table.insert(x, 2, {stat2, stat3}). This proves very practical for AST manipulations, such as splicing arguments in quasi-quotes. Since Lua truncates multiple values to a single one everywhere except in the last position of a table constructor, we would otherwise lack an equivalent of Lisp's unquote-splicing operator ",@".
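A small sketch of the splicing trick, in plain Lua. The statement nodes are arbitrary placeholders, and stat, two and flatten are hypothetical helpers written for this illustration. In particular, flatten is only meant to show why the two forms are equivalent, not how the compiler actually processes nested blocks.

```lua
-- Hypothetical placeholder statements: `Call{ `Id "f1" } and so on.
local function stat(name) return { tag = "Call", { tag = "Id", name } } end
local stat1, stat2, stat3, stat4 = stat "f1", stat "f2", stat "f3", stat "f4"

-- Start from `Do{ stat1, stat4 } and splice a tag-less block in the middle.
local x = { tag = "Do", stat1, stat4 }
table.insert(x, 2, { stat2, stat3 })
-- x is now `Do{ stat1, { stat2, stat3 }, stat4 }

-- Why plain multiple values don't work: in a non-final position of a
-- table constructor, Lua keeps only the first returned value.
local function two() return stat2, stat3 end
local truncated = { stat1, two(), stat4 }  -- stat3 is silently dropped

-- Hypothetical helper: an untagged table is a nested block, so its
-- statements can be inlined to recover the flat statement sequence.
local function flatten(block, out)
  out = out or {}
  for _, node in ipairs(block) do
    if node.tag == nil then
      flatten(node, out)
    else
      out[#out + 1] = node
    end
  end
  return out
end

local flat = flatten(x)  -- stat1, stat2, stat3, stat4
```

Because the spliced block carries no tag and creates no scope, inserting it leaves the meaning of the enclosing `Do unchanged.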