Saluton!

I pushed to the relexicalisation branch[1] a Makefile that should handle the i18n options on every platform whose make supports the "!=" syntax.

Now I want to focus on dynamic (un)loading of reserved keywords aliases.

I have already changed the code to load the tokens dynamically with malloc rather than from a static const struct.

To go step by step, I would first like to be able to load i18n_eo.lua, so that each variable assigned in that file becomes a token alias of its value. Later this could be changed so that the aliases are stored in a single returned table. As a first approach I may simply scanf each line, which would already be a good next step. But using the facilities provided by the Lua libraries would probably make more sense.
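As a rough sketch of the line-by-line scanf approach, assuming every line has the form name = 'symbol' as in the attached i18n_eo.lua. The function name parse_alias_line and the buffer sizes are assumptions for illustration, not code from the branch:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of Lua or of the lua-i18n branch.
 * Parses one line of the form   name = 'symbol'
 * Returns 1 and fills name/symbol on success, 0 on any other input. */
static int parse_alias_line(const char *line, char name[32], char symbol[8])
{
    char quote;
    /* %31[A-Za-z_0-9] reads the alias, " = '" skips the separator,
     * %7[^'] reads up to the closing quote, %c reads that quote. */
    if (sscanf(line, " %31[A-Za-z_0-9] = '%7[^']%c", name, symbol, &quote) != 3)
        return 0;
    return quote == '\'';
}
```

Calling it on each line read with fgets would give the alias name and the symbol it stands for; mapping the symbol string to a token value would still have to be done separately. Lines that are not simple assignments are rejected.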

So I'm looking for a provided function that can take Lua code returning a table of strings and make that table usable from the C code. I need to read more of the Lua documentation to get a better sense of what is possible. So far, my attempts to call a Lua script from i18n_tokenproxy.c just lead to a segmentation fault at the first evaluation.

In fact, I have a problem even with the basic example[2] on Fedora 24 with gcc. Using the example as provided and compiling it with `cc -o test test.c -I/usr/local/include -L/usr/local/lib -llua -lm -ldl`, `./test` returns:

Couldn't load file: script.lua5: syntax error near 'i'

If I remove the loop and keep only the io.write instructions, it seems to work.

Looking further ahead, I think the loaded translations might be stored in some G.__i18n table. There might be a sub-table for each locale, so G.__i18n["eo"]["="] might be assigned ["samas", "egalas"], for example. That would let users (un)load these tables, and possibly weight their precedence. That might be useful even with just the default English tokens and one other language that has some colliding tokens. Of course, all the performance issues should be taken into account in how the feature is implemented… So if you have ideas on how it should look in terms of API and/or implementation details, please provide feedback.
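To make the per-locale tables and weighted precedence a bit more concrete, here is a minimal sketch on the C side. Everything in it is an assumption for illustration: the type names alias_entry and locale_table, the weight and loaded fields, and the rule that the highest weight wins on a collision.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical types, not taken from the lua-i18n branch. */
typedef struct {
    const char *name;    /* alias, e.g. "samas" */
    const char *symbol;  /* default token it stands for, e.g. "==" */
} alias_entry;

typedef struct {
    const char *code;            /* locale code, e.g. "eo" */
    int weight;                  /* higher weight wins on collision */
    int loaded;                  /* 0 once the locale is unloaded */
    const alias_entry *entries;
    size_t count;
} locale_table;

/* Returns the symbol for word from the loaded locale with the highest
 * weight, or NULL when no loaded locale defines that alias. */
static const char *resolve_alias(const locale_table *locales, size_t n,
                                 const char *word)
{
    const char *best = NULL;
    int best_weight = 0;
    size_t i, j;
    for (i = 0; i < n; i++) {
        if (!locales[i].loaded)
            continue;
        for (j = 0; j < locales[i].count; j++) {
            if (strcmp(locales[i].entries[j].name, word) != 0)
                continue;
            if (best == NULL || locales[i].weight > best_weight) {
                best = locales[i].entries[j].symbol;
                best_weight = locales[i].weight;
            }
        }
    }
    return best;
}
```

Unloading a locale is then just clearing its loaded flag. The linear scan is only there to keep the sketch short; a hash table keyed on the alias would be the obvious answer to the performance concern.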


[1] https://github.com/psychoslave/lua-i18n/tree/relexicalisation

[2] http://lua-users.org/wiki/SimpleLuaApiExample


Le 23/11/2016 à 17:29, mathieu stumpf guntz a écrit :

Ok, maybe the attached file is a reasonably portable solution, provided the environment has a shell with the test command and the $(shell […]) syntax is portable. I don't know yet whether these conditions hold in most places where a make implementation can be installed. Feedback is welcome.


Le 23/11/2016 à 14:53, mathieu stumpf guntz a écrit :

Ok, so I began to work on the Makefile part to isolate changes introduced in Lua-i18n. The idea is to have several Makefile variables, so that each i18n facility can be enabled or disabled at compile time. So if you disable all of them, you end up with the same result that you get with the official Lua.

Ideally, you should just have to leave each internationalization cflag uncommented when you want the feature, and comment out the others. Then make should compile everything seamlessly, with the Makefile adding dependencies (or not) depending on whether each cflag is defined.

The easy portable part is defining C macros in the Makefile to make the changed code conditional. The more difficult part is making some of the dependencies conditional in a portable fashion. Indeed, most make implementations out there include some conditional control structure, but it seems that there is currently no standard, straightforward, portable solution. At least, I didn't find one.

But maybe you have some thoughts to share on the matter. :)

Related reading:

http://gallium.inria.fr/blog/portable-conditionals-in-makefiles/

https://docs.google.com/document/d/1oUR7iMnaNzkeT3TTOS-Gwul6_V3TE8caIDAd1FwPyNc/edit#

https://www.gnu.org/software/make/manual/html_node/Conditional-Syntax.html

https://www.mkssoftware.com/docs/man5/makefile.5.asp


Le 22/11/2016 à 08:26, mathieu stumpf guntz a écrit :


Le 22/11/2016 à 06:25, Dirk Laurie a écrit :
But (finally returning to the topic of the thread) the fact that APL
and Lua for all practical purposes inhabit different lexical worlds
makes it possible to write a single interpreter that mixes APL and
Lua at user input level. I'd hate to give that up. So I guess that I don't
want internationalization in standard Lua. Call it Lua-i18n as the
OP said, but please keep that i18n in the name at all levels. Don't
call it Lua.

For the part which enables the use of lexemes other than the default reserved ones, the idea is to provide that only through a loadable, scope-contained feature.

For the internationalization of error messages, it should not introduce any change in how the language is interpreted. It might have an impact on the executable size, but there should be a way to make it an optional compile-time feature so that you get zero extra weight. You might even save some bytes with a "null" locale which gets rid of the messages, or at least uses empty strings. Feedback is welcome if you have suggestions on this topic.
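A minimal sketch of what such a compile-time "null" locale could look like. The macro names LUA_I18N_NULL_LOCALE and I18N_MSG and the function msg_syntax_error are hypothetical, chosen only for this illustration:

```c
#include <string.h>

/* Hypothetical macros, not part of Lua. Building with
 * -DLUA_I18N_NULL_LOCALE replaces every message with an empty string,
 * so the message texts never reach the compiler or the binary. */
#ifdef LUA_I18N_NULL_LOCALE
#define I18N_MSG(text) ""
#else
#define I18N_MSG(text) text
#endif

/* Example of a message going through the macro. */
static const char *msg_syntax_error(void)
{
    return I18N_MSG("syntax error");
}
```

Without the define, the code behaves exactly as if the macro were not there, which matches the goal of getting the same result as the official Lua when every i18n option is disabled.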

For the Unicode support, I don't yet have enough knowledge to forecast what the consequences would be. But once again, this might be implemented as an optional feature at compile time. You might even think of supporting more character encodings if you are nostalgic for the good old ISO 8859 mess, or if you have some piece of software that you need, can't evolve, and that relies on a non-ASCII character encoding.

Which point would still sound problematic to you? Can you please provide some details, possibly with a concrete code example of how it would be problematic?




nee = '~'
disauxe = '^'
superas = '>'
malinfraas = '>'
suras = '>='
almenauxas = '>='
malsubas = '>='
egalas = '=='
samas = '=='
malsamas = '~='
neegalas = '~='
infraas = '<'
malsuperas = '<'
subas = '<='
malsuras = '<='
malalmenauxas= '<='
kaje = '&'
auxe = '|'
sobsxove = '>>'
sorsxove = '<<'
plus = '+'
mal = '-'
kontraux = '-'
minus = '-'
disige = '/'
divide = '/'
ozle = '/'
onige = '//'
parte = '//'
pece = '//'
kvociente = '//'
module = '%'
kongrue = '%'
alt = '^'
potencige = '^'
io.write("This is coming from lua.\n")
/*
* proxy.c
* lexer proxy for Lua parser -- allows aliases
* Luiz Henrique de Figueiredo <lhf@tecgraf.puc-rio.br>
* Sun Nov 13 09:24:13 BRST 2016
* This code is hereby placed in the public domain.
* Add <<#include "proxy.c">> just before the definition of luaX_next in llex.c
*/

#define TK_ADD		'+'
#define TK_BAND		'&'
#define TK_BNOT		'~'
#define TK_BOR		'|'
#define TK_BXOR		'^'
#define TK_DIV		'/'
#define TK_GT		'>'
#define TK_LT		'<'
#define TK_MINUS	'-'
#define TK_MOD		'%'
#define TK_POW		'^'
#define TK_SUB		'-'

/**
TODO:
 - how to make the lexem type scope confined in this file? adding static will make compilation fail…
 - manage unload of lexicon
 * */
typedef struct {
    char *name;
    int token;
} lexem;

#include <lua.h>
#include <lauxlib.h>
#include <lualib.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>

/**
 * length is set to the number of elements in the returned array of lexem structs, because in C the size of a
 * malloc'ed array cannot be recovered from the pointer alone. Another way to manage that would be to use a
 * conventional value for the last element.
 */
static lexem* load_lexicon(int *length)
{

    /*
     // This part alone if uncommented will just coredump the lua interpreter as soon as it tries to evaluate something
    int status, result, i;
    double sum;
    lua_State *L;
    L = luaL_newstate();

    luaL_openlibs(L);

    status = luaL_loadfile(L, "i18n_eo.lua");
    if (status) {
        fprintf(stderr, "Couldn't load file: %s\n", lua_tostring(L, -1));
        exit(1);
    }

    lua_close(L);
     */

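    const int aliases_number = 35;
    lexem * aliases = malloc(aliases_number * sizeof *aliases);
    /* malloc may fail; stop here rather than write through a NULL pointer */
    if (aliases == NULL) {
        fprintf(stderr, "Couldn't allocate the alias table\n");
        exit(1);
    }
    *length = aliases_number;
    /*
     * rather than this, keywords should be loaded from a lua source file
     */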
    aliases[ 0].name = "nee";	        aliases[ 0].token = TK_BNOT;
    aliases[ 1].name = "disauxe";	    aliases[ 1].token = TK_BXOR;
    aliases[ 2].name = "superas";	    aliases[ 2].token = TK_GT;
    aliases[ 3].name = "malinfraas";    aliases[ 3].token = TK_GT;
    aliases[ 4].name = "suras";	        aliases[ 4].token = TK_GE;
    aliases[ 5].name = "almenauxas";    aliases[ 5].token = TK_GE;
    aliases[ 6].name = "malsubas";	    aliases[ 6].token = TK_GE;
    aliases[ 7].name = "egalas";	    aliases[ 7].token = TK_EQ;
    aliases[ 8].name = "samas";   	    aliases[ 8].token = TK_EQ;
    aliases[ 9].name = "malsamas";	    aliases[ 9].token = TK_NE;
    aliases[10].name = "neegalas";	    aliases[10].token = TK_NE;
    aliases[11].name = "infraas";	    aliases[11].token = TK_LT;
    aliases[12].name = "malsuperas";    aliases[12].token = TK_LT;
    aliases[13].name = "subas";	        aliases[13].token = TK_LE;
    aliases[14].name = "malsuras";	    aliases[14].token = TK_LE;
    aliases[15].name = "malalmenauxas";	aliases[15].token = TK_LE;
    aliases[16].name = "kaje";	        aliases[16].token = TK_BAND;
    aliases[17].name = "auxe";	        aliases[17].token = TK_BOR;
    aliases[18].name = "sobsxove";	    aliases[18].token = TK_SHR;
    aliases[19].name = "sorsxove";	    aliases[19].token = TK_SHL;
    aliases[20].name = "plus";	        aliases[20].token = TK_ADD;
    aliases[21].name = "mal";    	    aliases[21].token = TK_MINUS;
    aliases[22].name = "kontraux";	    aliases[22].token = TK_MINUS;
    aliases[23].name = "minus";	        aliases[23].token = TK_SUB;
    aliases[24].name = "disige";	    aliases[24].token = TK_DIV;
    aliases[25].name = "divide";	    aliases[25].token = TK_DIV;
    aliases[26].name = "ozle";	        aliases[26].token = TK_DIV;
    aliases[27].name = "onige";	        aliases[27].token = TK_IDIV;
    aliases[28].name = "parte";	        aliases[28].token = TK_IDIV;
    aliases[29].name = "pece";	        aliases[29].token = TK_IDIV;
    aliases[30].name = "kvociente";	    aliases[30].token = TK_IDIV;
    aliases[31].name = "module";	    aliases[31].token = TK_MOD;
    aliases[32].name = "kongrue";	    aliases[32].token = TK_MOD;
    aliases[33].name = "alt";	        aliases[33].token = TK_POW;
    aliases[34].name = "potencige";	    aliases[34].token = TK_POW;

    return aliases;
}

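static int nexttoken(LexState *ls, SemInfo *seminfo)
{
	/* Load the lexicon once and reuse it: calling load_lexicon for every
	 * token allocated a new table each time and never freed it. */
	static lexem *aliases = NULL;
	static int n = 0;
	int t;
	if (aliases == NULL)
		aliases = load_lexicon(&n);
	t = llex(ls,seminfo);
	/* "sia" is an alias for the identifier "self" */
	if (t==TK_NAME && strcmp(getstr(seminfo->ts),"sia")==0) {
		seminfo->ts = luaS_new(ls->L,"self");
		return t;
	}
	/* any other name matching an alias is returned as the aliased token */
	if (t==TK_NAME) {
		int i;
		for (i=0; i<n; i++) {
			if (strcmp(getstr(seminfo->ts),aliases[i].name)==0)
				return aliases[i].token;
		}
	}
	return t;
}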

#define llex nexttoken
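
The comment above load_lexicon mentions using a conventional value for the last element instead of passing the length around. A minimal sketch of that alternative, assuming a NULL name as the terminator; the names alias, demo_aliases and find_alias are hypothetical, and the three entries are only a sample of the table above:

```c
#include <stddef.h>
#include <string.h>

/* Mirrors the lexem struct under a hypothetical name. */
typedef struct {
    const char *name;
    int token;
} alias;

/* The entry whose name is NULL marks the end of the table. */
static const alias demo_aliases[] = {
    { "plus", '+' },
    { "mal",  '-' },
    { "alt",  '^' },
    { NULL,   0   }
};

/* Returns the token aliased by word, or -1 when word is not an alias. */
static int find_alias(const alias *table, const char *word)
{
    for (; table->name != NULL; table++)
        if (strcmp(table->name, word) == 0)
            return table->token;
    return -1;
}
```

With this layout the caller only needs the pointer to the table, at the cost of one extra entry and of having to scan to the end to learn its size.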