lua-users home
lua-l archive

On 02/03/2014 03:30 PM, Javier Guerra Giraldez wrote:
this is called a tokenizer or lexer (although sometimes a lexer has a little more knowledge about the specific language, and produces a syntax tree instead of a token stream). check http://en.wikipedia.org/wiki/Lexical_analysis for some discussions on this domain.

Thanks Javier,
but I do not care about the meaning of the tokens (is that the right word?).
My task is really, really much more trivial!
I only need to collect the words, numbers, control characters, etc.
in the correct order.
I will make a musical interpretation of them, so their meaning is not
so important (for now!).
My attempt is something like this:

for line in testo:lines() do
    -- "%S+" matches each run of non-space characters
    for L in string.gmatch(line, "%S+") do
        -- ... use L here ...
    end
end

and this works if a word is surrounded by spaces, which is not
always the case.
For example, I get "(c)" and I would like to split it further
into its 3 parts, i.e. "(", "c", ")".
Or I have a number followed by a comma, "2008,", and I want
"2008", ",".
This should work for every possible combination, and I am not able
to generalize it.
I hope I am not being boring and that my wish is clear :)
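
To make the wish concrete, here is a minimal sketch of the splitting described above. It is only one possible approach, not a definitive one. The function name split_tokens is hypothetical, and it assumes that whitespace can be dropped, that a word is a run of letters (%a), that a number is a run of digits (%d), and that every other character becomes a piece of its own. It relies on the fact that "^" anchors a pattern at the init position passed to string.match.

```lua
-- Hypothetical helper: split one line into words, numbers and single symbols.
local function split_tokens(line)
    local tokens = {}
    local pos = 1
    while pos <= #line do
        local piece = line:match("^%s+", pos)
        if piece then
            -- assumption: whitespace only separates pieces, so skip it
            pos = pos + #piece
        else
            piece = line:match("^%a+", pos)   -- a run of letters
                 or line:match("^%d+", pos)   -- a run of digits
                 or line:sub(pos, pos)        -- any other single character
            tokens[#tokens + 1] = piece
            pos = pos + #piece
        end
    end
    return tokens
end

print(table.concat(split_tokens("(c) 2008,"), " | "))  --> ( | c | ) | 2008 | ,
```

Because %a works on single bytes and depends on the locale, accented letters stored as UTF-8 would probably be split into separate bytes, so this would need adjusting for such text.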

ciao,
francesco.