Hello!
Just to clarify, the libraries I'm talking about are the ones you need in order to know what each Unicode character actually is. UTF-8 only encodes code points, which are just numbers, so you cannot simply declare that any sequence of code points not currently excluded is an identifier: you would let in many languages' whitespace and punctuation. One example is the Mongolian vowel separator, which can cause problems and has bitten languages in the past[1]. So you need to follow the proposal[2] mentioned in Philippe's reply, and that requires a library that can work with Unicode character data. The raw data file[3] for the Unicode manipulation library Julia uses is over half the size of stock Lua 5.3.1 on my PC!
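
To make the problem concrete, here is a rough sketch in Python (chosen only because its standard library ships the Unicode character tables). The function naive_is_ident_char and the SAMPLES list are made up for illustration. The naive rule stands for "anything non-ASCII is allowed", and Python's built-in str.isidentifier stands in for a property-based rule; it is not necessarily the exact rule in the proposal[2].

```python
import unicodedata


def naive_is_ident_char(ch):
    """Hypothetical naive rule: ASCII letters, digits and '_',
    plus every code point outside the ASCII range."""
    if ord(ch) >= 0x80:
        return True
    return ch == "_" or ch.isalnum()


# A few non-ASCII code points, written as escapes so they stay visible.
SAMPLES = [
    "\u00e9",  # a letter, fine in an identifier
    "\u00a0",  # NO-BREAK SPACE
    "\u3000",  # IDEOGRAPHIC SPACE
    "\u2014",  # EM DASH
    "\u180e",  # MONGOLIAN VOWEL SEPARATOR
]


def report():
    """Return one row per sample: code point, name, general category,
    the naive verdict, and whether Python accepts it after a letter."""
    rows = []
    for ch in SAMPLES:
        rows.append((
            "U+%04X" % ord(ch),
            unicodedata.name(ch, "<unnamed>"),
            unicodedata.category(ch),      # needs the character database
            naive_is_ident_char(ch),       # needs no data at all
            ("a" + ch).isidentifier(),     # property-based check
        ))
    return rows


if __name__ == "__main__":
    for row in report():
        print(row)
```

The naive rule accepts every sample, spaces and dash included, because it never looks at what the code point is. The category column and the last column can only be computed with the character data on hand, which is the library dependency I mean.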