Forked over from Re: Project lead nominations for standard libraries?

On 1/1/11 4:28 PM, KHMan wrote:
On 1/1/2011 11:03 PM, Henning Diedrich wrote:
[snip]
On 1/1/11 3:10 PM, steve donovan wrote:
On Fri, Dec 31, 2010 at 11:58 PM, Henning Diedrich wrote:
So I plead with you, Roberto, for the soul of Lua, as Chris called out, fix # !
But it ain't broken, so why fix it? It works precisely as specified.
[snip]
It's a problem when people pretend that a raw Lua table is the
particular data structure they have in mind

Lua gets pretty close to pretending that one structure, the table, is
all you ever need, but then backs out of it without warning with #,
doesn't it?

An implementation. Easiest way to answer critics.



Ok, I implemented a 'true count' operator (%) in the core to demonstrate what I mean. It returns the actual number of elements in a table, or an approximate count of the characters in a UTF-8 string. Mixing in the string case may be a stupid idea; please disregard it for now.

I chose % but it can be anything, of course, just by changing the character at line 850 of src/lparser.c.

It should maybe be a function call rather than an operator. I would have liked ° as a postfix operator, but that is outside the ASCII range, I guess.

Download: git clone git@github.com:Eonblast/Lua.git lua52cnt
Diff to Lua 5.2.2.0 alpha: https://gist.github.com/766137
Nicely formatted: https://github.com/Eonblast/Lua/compare/Lua_5.2.2.0_alpha_original...master
(ignore test/* files)

It processes all of these samples as intuitively expected: https://gist.github.com/766149
It also approximates the number of UTF-8 characters when applied to strings.

More: https://github.com/Eonblast/Lua/blob/master/README



Use:

/Use the % operator to retrieve, in O(1) time, the actual number of non-nil elements in a table./

t = {'foo', nil, 'bar'}
print(%t)
2

Or to get an estimate of how many printable characters a UTF-8 string has, in O(n) time:

t = "Dûnhar∂"
print(%t)
7

Compare this with #t. Again: instead of '%' one might want to have a nice and clean function.
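
For readers wondering how a UTF-8 character count can be approximated in O(n), here is a minimal C sketch. It is an assumption about one common technique (counting every byte that is not a continuation byte of the form 10xxxxxx), not necessarily what the patch does; the name utf8_count is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not taken from the patch.
   Counts UTF-8 code points in the first `len` bytes of `s` by skipping
   continuation bytes (bit pattern 10xxxxxx). It does not validate the
   input, so for malformed strings the result is only an approximation. */
static size_t utf8_count(const char *s, size_t len) {
    size_t n = 0;
    assert(s != NULL);
    for (size_t i = 0; i < len; i++) {
        /* Lead bytes and plain ASCII bytes start a new character. */
        if (((unsigned char)s[i] & 0xC0) != 0x80)
            n++;
    }
    return n;
}
```

Applied to the 10 bytes of "Dûnhar∂" (û is C3 BB, ∂ is E2 88 82), this returns 7, while a plain byte length would be 10. Combining characters would still be counted separately, which is one reason to call the result an estimate.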



Basic directions:

git clone git@github.com:Eonblast/Lua.git lua52cnt
cd lua52cnt
sudo make <your os> install
src/lua test/len.lua
src/lua test/strlen.lua



Implementation:

Each table gets a count field. The count is adjusted on every table update, so reading it takes 'no time'. I would be interested to hear of crashes or performance penalties.

The necessary 'perimeter of defense' turned out to lie in the setobj macros in lobject.h. There is only one place where the count is decremented and three where it must be incremented. The macros themselves are used in only a handful of places. Pointers to table-residing TValues do not seem to be handed out beyond the functions that manage the tables, e.g. not to the API or generally into the wild. No pointer seemed to leave the local scope of the function where it was retrieved from the table, or created, except in a chosen few functions. Those 7, e.g. newkey(), luaH_set*() and luaH_get*(), are scattered around a bit, but they were simply all 'included in the perimeter', i.e. ignored with regard to counting. This way a pretty clear picture emerged.

In effect, it seems quite possible to achieve completeness by checking every place in the source where a table value is actually updated; I found about a dozen instances. Most of those use the same macro, setobj(), which was split up along predefined lines (e.g. setobj2t) and extended, basically by:

      if(ttisnil(oT) && !ttisnil(oV))
              tbl->count++;
      else if(!ttisnil(oT) && ttisnil(oV))
              tbl->count--;

https://github.com/Eonblast/Lua/blob/master/src/lobject.h, line 209+
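
The counting rule in the snippet above can be tried outside the Lua core with a toy model. The following is only a sketch under stated assumptions: ToyValue, ToyTable and the toy_* functions are hypothetical stand-ins, not the real TValue and Table structs or the actual macros from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_SIZE 8  /* hypothetical fixed capacity, for illustration only */

/* Toy stand-ins for a Lua value and a Lua table; not the real structs. */
typedef struct { bool is_nil; int value; } ToyValue;
typedef struct { ToyValue slot[TOY_SIZE]; size_t count; } ToyTable;

static ToyValue toy_nil(void)  { ToyValue v = { true, 0 };  return v; }
static ToyValue toy_int(int x) { ToyValue v = { false, x }; return v; }

/* Start with every slot nil and a count of zero. */
static void toy_init(ToyTable *t) {
    for (size_t i = 0; i < TOY_SIZE; i++)
        t->slot[i] = toy_nil();
    t->count = 0;
}

/* Store v in slot i, adjusting the count with the same rule as above:
   nil -> non-nil increments, non-nil -> nil decrements, and any other
   transition leaves the count unchanged. */
static void toy_set(ToyTable *t, size_t i, ToyValue v) {
    assert(i < TOY_SIZE);
    ToyValue *old = &t->slot[i];
    if (old->is_nil && !v.is_nil)
        t->count++;
    else if (!old->is_nil && v.is_nil)
        t->count--;
    *old = v;
}

/* Reading the count is O(1): no traversal of the slots is needed. */
static size_t toy_count(const ToyTable *t) { return t->count; }
```

Mirroring t = {'foo', nil, 'bar'}: after setting slots 0 and 2 to non-nil values and slot 1 to nil, toy_count() returns 2. Overwriting slot 2 with another non-nil value keeps it at 2, and setting slot 0 to nil brings it down to 1.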



Performance:

Executing the check above is the main performance penalty incurred. There are no function calls in there; it is all macros for pointer access and casts. I have not measured it yet.



Order & Chaos:

The functions and constants for the COUNT (%) operator (TM_, OP_, OPR_) are implemented completely analogously to the LEN (#) operator. This can be seen nicely in the more pleasant github diffs (see link above).

Code has been touched all over, but in a transparent and pinpoint way. Some 15 files were modified, mostly by only a few characters though. Still, it is a pity that it is such a pervasive change, not that easy or safe to patch. Maybe making it a function would reduce that pervasiveness.



**Is there a 5.2 alpha test suite?**

Cheers,
Henning

--
*Henning Diedrich*





Eonblast Corporation
hdiedrich@eonblast.com
+1.404.418.5002 w
www.eonblast.com