- Subject: Re: parser hacking: stringification
- From: Sven Olsen <sven2718@...>
- Date: Fri, 16 Nov 2012 13:45:45 -0800
Hrm, so I'm starting to doubt that a patch file is the best way of sharing this hack. My hunch is that most people who might want it are probably already using hacked versions of lparser.c, so a machine-readable diff isn't going to be ideal :)
But the code is easy enough to talk through, so here's a quick implementation-level description.
The entry point for the table shorthand is in lparser.c : field(), where you can hook into the table constructor by adding a switch case:
static void field (LexState *ls, struct ConsControl *cc) {
  /* field -> listfield | recfield */
  switch(ls->t.token) {
    case TK_CONCAT: {
      luaX_next(ls);  /* skip the '..' */
      table_shorthand(ls,cc);
      break;
    }
    /* ... the stock cases of field() continue here ... */
Implementing the shorthand is then just a matter of writing table_shorthand(), a modified version of recfield() that gets both the key and the value from the expression following '..'. Such a modification is fairly easy, once you realize that the lexer stores semantic info for the most recent string literal, name, or numeric constant inside ls->t.seminfo.
In the case that the most recent seminfo is a numeric constant, we probably want to throw a syntax error. It's unclear if {..77} should be interpreted as {[77]=77}, or {["77"]=77}, and, neither interpretation seems likely to be that useful in practice.
But, if the most recent stored seminfo is a string, it's probably a sensible choice for our key.
/* a quick shorthand hack that transforms {..f} to {f=f}. */
static void table_shorthand (LexState *ls, struct ConsControl *cc) {
  expdesc key, val;
  FuncState *fs = ls->fs;
  int reg;
  int rkkey;
  checklimit(fs, cc->nh, MAX_INT, "items in a constructor");
  cc->nh++;
  expr(ls, &val);
  reg = ls->fs->freereg;
  if(ls->t.seminfo.ts) {
    /* the last name or string seen by the lexer becomes the key */
    codestring(ls, &key, ls->t.seminfo.ts);
  }
  else {
    luaX_syntaxerror(ls, ".. shorthand used with a non-string expression.");
  }
  rkkey = luaK_exp2RK(fs, &key);
  luaK_codeABC(fs, OP_SETTABLE, cc->t->u.info, rkkey, luaK_exp2RK(fs, &val));
  fs->freereg = reg;  /* free registers */
}
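To see the seminfo behaviour in isolation, here is a toy sketch in plain C. It is not Lua's lexer: ToyLex, ToyToken, toy_next() and toy_name_before() are hypothetical names invented for this illustration, and the toy only understands names and single-character punctuation. The point it tries to show is the one the hack relies on: punctuation tokens leave the stored name text alone, so once the closing '}' is the current token, the last name of the expression is still available.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-ins only: none of these names exist in Lua's source. */
enum { T_EOS, T_NAME, T_PUNCT };

typedef struct {
  int kind;      /* kind of the current token */
  char ch;       /* the character itself when kind == T_PUNCT, else 0 */
  char sem[32];  /* text of the most recent name; punctuation leaves it alone */
} ToyToken;

typedef struct {
  const char *p; /* read cursor into the source text */
  ToyToken t;    /* current token, playing the role of ls->t */
} ToyLex;

static int is_name_char(char c) {
  return isalnum((unsigned char)c) || c == '_';
}

static void toy_init(ToyLex *ls, const char *src) {
  assert(ls != NULL && src != NULL);
  ls->p = src;
  ls->t.kind = T_EOS;
  ls->t.ch = 0;
  ls->t.sem[0] = '\0';
}

/* Advance one token. Only a name overwrites t.sem. */
static void toy_next(ToyLex *ls) {
  while (*ls->p == ' ') ls->p++;
  ls->t.ch = 0;
  if (*ls->p == '\0') {
    ls->t.kind = T_EOS;
  } else if (isalpha((unsigned char)*ls->p) || *ls->p == '_') {
    const char *start = ls->p;
    size_t n;
    while (is_name_char(*ls->p)) ls->p++;
    n = (size_t)(ls->p - start);
    if (n > sizeof ls->t.sem - 1) n = sizeof ls->t.sem - 1;  /* truncate */
    memcpy(ls->t.sem, start, n);
    ls->t.sem[n] = '\0';
    ls->t.kind = T_NAME;
  } else {
    ls->t.kind = T_PUNCT;
    ls->t.ch = *ls->p++;
  }
}

/* Consume tokens up to and including the first `stop` character (or the end
   of input), then return the name text that is still stored in the token. */
static const char *toy_name_before(ToyLex *ls, char stop) {
  do {
    toy_next(ls);
  } while (ls->t.kind != T_EOS &&
           !(ls->t.kind == T_PUNCT && ls->t.ch == stop));
  return ls->t.sem;
}
```

Feeding the toy the text "..point.x }" and asking for the name before '}' yields "x", which mirrors how the real hack would presumably turn {..point.x} into {x=point.x}.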
Implementing the function shorthand is a little trickier. Rather than calling expr() from explist(), we will call dup_expr(), an expr() wrapper that will check for and handle the .. shorthand. However, the implementation of dup_expr() is necessarily ugly, because we won't be ready to insert the seminfo string into the call stack until after we're finished parsing the expression. As I understand Lua's bytecode generation, the best way to do this is by reserving a spot before we start the expression parse, and then using a kludge to convince luaK_exp2nextreg to write the string into the empty slot.
/* another parser hack. this one turns foo(..bar) into foo("bar",bar). */
static void dup_expr(LexState *ls, expdesc *v, int *np) {
  int dup = testnext(ls,TK_CONCAT);
  FuncState *fs = ls->fs;
  int reg;
  if(dup) {
    /* reserve a slot for the string before parsing the expression */
    reg=fs->freereg;
    luaK_reserveregs(fs, 1);
  }
  expr(ls,v);
  if(dup) {
    if(ls->t.seminfo.ts) {
      int old_free = fs->freereg;
      expdesc varname;
      codestring(ls, &varname, ls->t.seminfo.ts);
      /* trick luaK_exp2nextreg into writing to the */
      /* previously reserved register. i believe this is safe.. */
      fs->freereg=reg;
      luaK_exp2nextreg(fs, &varname);
      fs->freereg=old_free;
      (*np)++;  /* the string counts as an extra argument */
    }
    else luaX_syntaxerror(ls, "stringification shorthand used on a non-string expression.");
  }
}
static int explist (LexState *ls, expdesc *v) {
  /* explist -> expr { `,' expr } */
  int n = 1;  /* at least one expression */
  dup_expr(ls, v,&n);
  while (testnext(ls, ',')) {
    luaK_exp2nextreg(ls->fs, v);
    dup_expr(ls, v,&n);
    n++;
  }
  return n;
}
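The reserve-then-back-fill trick in dup_expr() can be sketched on its own with a toy register file. Everything below is hypothetical (ToyFunc, toy_reserve(), toy_store_next() and toy_dup_arg() are not Lua functions), and it simplifies one thing: the toy stores the value into a register immediately, whereas in the real parser expr() only describes the expression and explist() or its caller discharges it later. The freereg save and restore is the part being illustrated.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAXREGS 8

/* Toy stand-in for the register bookkeeping in FuncState; not Lua's type. */
typedef struct {
  const char *regs[TOY_MAXREGS];  /* what each register currently holds */
  int freereg;                    /* index of the first free register */
} ToyFunc;

/* Claim n registers from the top, loosely like luaK_reserveregs. */
static void toy_reserve(ToyFunc *fs, int n) {
  assert(fs->freereg + n <= TOY_MAXREGS);
  fs->freereg += n;
}

/* Put a value into the next free register and return that register's index.
   This plays the role of luaK_exp2nextreg in the toy. */
static int toy_store_next(ToyFunc *fs, const char *value) {
  int r = fs->freereg;
  toy_reserve(fs, 1);
  fs->regs[r] = value;
  return r;
}

/* Emit the argument pair (name, value) when the name only becomes known
   after the value has been handled. */
static void toy_dup_arg(ToyFunc *fs, const char *name, const char *value) {
  int slot = fs->freereg;     /* remember where the name must end up */
  int old_free;
  toy_reserve(fs, 1);         /* leave a hole for the name */
  toy_store_next(fs, value);  /* stands in for parsing the expression */
  old_free = fs->freereg;
  fs->freereg = slot;         /* aim the allocator at the hole */
  toy_store_next(fs, name);   /* back-fill the reserved register */
  fs->freereg = old_free;     /* restore the real top of the stack */
}
```

With a function value already in register 0, calling toy_dup_arg() leaves the name in register 1 and the value in register 2, the same layout a call like foo("bar",bar) would need.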
I should note that my own parser is based on the 5.2 source -- but, I suspect that both hacks would also work with 5.1. If anyone tries it, let me know whether you have any success :)
-Sven