- Subject: Re: Could Lua itself become UTF8-aware?
- From: nobody <nobody+lua-list@...>
- Date: Fri, 5 May 2017 03:39:08 +0200
On 2017-05-01 21:41, Andrew Starks wrote:
> There is one single language understood by every culture that is
> doing scientific work: math.
Using math symbols in code requires Unicode. (Sure, you can write their
names, but then you're back at the problem of different languages.)
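As a quick illustration (a sketch against stock Lua 5.3; the exact error text
varies by version), such symbols are currently only accepted inside strings and
comments, never as identifiers or operators:

    -- Math symbols cannot serve as identifiers or operators in stock Lua:
    print(load("local α = 1"))      --> nil   plus an "unexpected symbol" error
    print(load("local alpha = 1"))  --> function: 0x...   (compiles fine)
    -- As string content they are just bytes:
    print(#"αβγ")                   --> 6   (bytes, not characters)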
> What problem does it solve? Is support for UTF-8 useful for
> automated script processing or some sort of DSL application?
Unicode is extremely useful in combination with custom mixfix[1]
operators/notations. Without that, not so much. Lua does not even
permit defining custom operators. (Which is fine, it just makes
Unicode support much less useful.)
For DSL-purposes (and I'd count math as a DSL), Lua's syntax is already
extremely flexible. Liberally sprinkling everything with `__call`, you
can write `x 'op'` or `x 'op' (y)`, where 'op' can be any string
(including Unicode). (The mandatory parentheses are pretty annoying but
not absolutely terrible.) And if you want warts-free custom syntax,
there's also LPEG and/or ltokenp.
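A minimal sketch of that `x 'op' (y)` trick (the `wrap` helper and the two
operators are made up for illustration):

    local wrap  -- forward declaration so the closure below can capture it
    local mt = {
      -- `x 'op'` calls x with the string 'op' and returns a closure;
      -- the trailing (y) then supplies the right-hand operand.
      __call = function(self, op)
        return function(rhs)
          if op == '⊕' then return wrap(self.value + rhs.value) end
          if op == '⊗' then return wrap(self.value * rhs.value) end
          error("unknown operator: " .. op)
        end
      end,
    }
    wrap = function(v) return setmetatable({ value = v }, mt) end

    local x, y, z = wrap(2), wrap(3), wrap(4)
    print((x '⊕' (y)).value)          --> 5
    print((x '⊗' (y '⊕' (z))).value)  --> 14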
So while I know from experience that Unicode support plus mixfix
definitions can be absolutely awesome, Lua has neither custom operators
nor custom mixfix notations, and they're not compatible with what Lua is
/ how Lua works.[3][4] So adding Unicode support would add _some_
flexibility/convenience, but not very much. Given the complexity, it's
probably not worth it.
-- nobody
[1]: In Agda you can say
    if_then_else_ : {A : Set} → Bool → A → A → A
    if_then_else_ true  e₁ _  = e₁
    if_then_else_ false _  e₂ = e₂
to define the usual `if <cond> then <expr1> else <expr2>` notation
(which isn't built-in, because booleans aren't built-in[2]). Just with
plain ASCII, this already allows defining more mathy notation; add
Unicode and you can literally type most of the usual notation (at least
of some sub-fields – logic, PLT, …) into your editor and have runnable code.
[2]: Well, technically they _are_ built-in, just not in the usual way.
You usually go
    data Bool : Set where
      false true : Bool
    {-# BUILTIN BOOL Bool #-}
    {-# BUILTIN TRUE true #-}
    {-# BUILTIN FALSE false #-}
i.e. you define them yourself, and for some basic types you (optionally)
also tell the compiler that what you have here is equivalent to some
built-in concept so it can treat them specially. (This is particularly
important for natural numbers, where storing one bignum is vastly more
efficient than storing (succ (… (succ zero)…)) as a linked list of
(essentially dummy) nodes.)
[3]: How would you load custom operators defined in a different file?
(That's incompatible with the current "return chunk as a function" model
– the current file is parsed before `require` runs and pulls in other
code & definitions.) And permitting arbitrary operators while parsing, only to
complain when they turn out to be undefined at run time, would be a terrible
choice. (Many, many typos would no longer be detectable at load time but would
only surface at run time.)
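To illustrate the ordering (a sketch; the module name and the ⊗ operator are
hypothetical): the whole chunk is compiled before any of it runs, so a
`require` inside it can never influence how its own operators are parsed:

    -- load() compiles the entire chunk up front; the (hypothetical)
    -- require "myops" never gets a chance to run, let alone to teach
    -- the parser about ⊗:
    local chunk, err = load([[
      local ops = require "myops"
      return 1 ⊗ 2
    ]])
    print(chunk, err)   --> nil   plus an "unexpected symbol" error at the ⊗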
[4]: A weak form of "custom operators" might be feasible: Haskell
permits putting arbitrary functions in back-ticks to use them as a
binary operator, e.g.
    mod x y == x `mod` y
and having syntactic sugar x `f` y --> f( x, y ) would not be much work.
Because of the clear boundary, the identifier rules could be relaxed inside the
back-ticks, so you could have x `⊗` (y `⊗` z) or something like that.
But that's not much better than x '⊗' (y '⊗' (z)) at the syntax level
(although it would get rid of a lot of `__call`s).
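Spelled out (the `tensor` stand-in is invented; the back-tick line is the
hypothetical sugar and appears only in a comment):

    -- Hypothetical sugar (not valid Lua today):   x `tensor` (y `tensor` z)
    -- would simply desugar to plain function calls:
    local function tensor(a, b) return a * b end   -- stand-in for ⊗
    local x, y, z = 2, 3, 4
    print(tensor(x, tensor(y, z)))   --> 24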