- Subject: Re: Curiosity of decimal numbers followed by concatenation
- From: Gé Weijers <ge@...>
- Date: Sat, 06 Feb 2010 04:39:27 -0800
Peter,
I'd have to agree with you: this is strange behavior (arguably even a bug).
The lexer takes a shortcut and keeps scanning past every '.' after the
first one. It should stop at the second '.'.
There's more: the lexer will read all of this as a single number, and of
course fail to convert it:
1.2.3.4.5e+say_cheese
The lexer's routine 'read_numeral' matches the regular expression
[0-9.]+([Ee][+-]?)?[a-zA-Z0-9_]*
The routine is called when the input either begins with a digit or '.'
followed by a digit.
The relevant code (Lua 5.1.4):
> static void read_numeral (LexState *ls, SemInfo *seminfo) {
>   lua_assert(isdigit(ls->current));
>   do {
>     save_and_next(ls);
>   } while (isdigit(ls->current) || ls->current == '.');
>   if (check_next(ls, "Ee"))  /* `E'? */
>     check_next(ls, "+-");  /* optional exponent sign */
>   while (isalnum(ls->current) || ls->current == '_')
>     save_and_next(ls);
>   save(ls, '\0');
>   buffreplace(ls, '.', ls->decpoint);  /* follow locale for decimal point */
>   if (!luaO_str2d(luaZ_buffer(ls->buff), &seminfo->r))  /* format error? */
>     trydecpoint(ls, seminfo);  /* try to update decimal point separator */
> }
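
To make the effect concrete, here is a minimal standalone sketch that only counts how many characters a read_numeral-style scan would swallow. It is not Lua code: the names scan_like_lua514, scan_single_point and scan_tail are made up for illustration, and the second function is just one possible stricter rule (stop at the second '.'), not a proposed patch.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: consume an optional exponent marker with optional
   sign, then any run of alphanumerics or '_', as read_numeral does after
   its digit/'.' loop. Returns the new index. */
size_t scan_tail(const char *s, size_t i) {
  if (s[i] == 'E' || s[i] == 'e') {
    i++;
    if (s[i] == '+' || s[i] == '-')
      i++;
  }
  while (isalnum((unsigned char)s[i]) || s[i] == '_')
    i++;
  return i;
}

/* Mimics the scanning loop of read_numeral in Lua 5.1.4: any mix of digits
   and '.' is accepted. Returns the number of characters consumed.
   As in the original, the input must start with a digit. */
size_t scan_like_lua514(const char *s) {
  size_t i = 0;
  assert(isdigit((unsigned char)s[0]));
  do {
    i++;
  } while (isdigit((unsigned char)s[i]) || s[i] == '.');
  return scan_tail(s, i);
}

/* Assumed stricter variant: accept at most one '.', so the scan stops at
   the second decimal point. */
size_t scan_single_point(const char *s) {
  size_t i = 0;
  int seen_point = 0;
  assert(isdigit((unsigned char)s[0]));
  for (;;) {
    if (isdigit((unsigned char)s[i])) {
      i++;
    } else if (s[i] == '.' && !seen_point) {
      seen_point = 1;
      i++;
    } else {
      break;
    }
  }
  return scan_tail(s, i);
}
```

With the permissive scan, the whole of 1.2.3.4.5e+say_cheese is consumed as one token, and 1..."" loses all three dots to the number. With the single-point variant, 1..."" yields the token 1. and leaves .."" for the concatenation operator, which is what Peter expected. Note that 1.."" would still not parse as a concatenation under this rule, since only a lone '.' remains after the number; handling that case would need lookahead.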
On Thu, 2010-02-04 at 23:53 +0000, Peter Cawley wrote:
> Hello all,
>
> I was recently writing some syntax highlighting code for Lua, and
> while trying to duplicate the behaviour of the standard Lua parser, I
> noticed a curious behaviour. The first two of the following examples
> are completely normal. The third is unexpected to me - according to the
> reference manual on numbers, "A numerical constant can be written with
> an optional decimal part and an optional decimal exponent", so I would
> expect it to be parsed like the first example, as the parser should
> stop trying to match a number at the second decimal point. The fourth
> example is included for completeness; it could in theory be parsed
> like the second example, or as 1. followed by . and ""
>
> > =1. ..""
> 1
> > =1 ..""
> 1
> > =1...""
> stdin:1: malformed number near '1...'
> > =1..""
> stdin:1: malformed number near '1..'
>
> Are there any reasons why the third example is parsed like it currently is?