lua-users home
lua-l archive

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]


One big problem I have writing lua scripts is the inability to create sorted tables.  In lua, all tables are hash tables, which is great for random access lookups, but terrible for storing ordered information.  In fact, the only way you can store ordered information is with a table using integer indexes.
 
I propose a simple addition of two functions, and a modification to the table system in lua, to support sorted tables.
 
Proposal:
 
Other scripting languages and most modern languages support a form of ordered data structure in which insertion and deletion of ordered elements is quite fast, and the elements in the structure can be traversed in sorted order.  Lua presently has no such data structure.
 
As lua's syntax is intended to be extremely simple, it's runtime small, and it's learning curve short, the addition of an extra type with extra syntax, symantics, and access rules is not appropriate.  Instead, I propose a simple modification to the present table type to enable it to behave as if it was an ordered list, even though it is still internally a hashtable or array.
 
The sorted table proposal consists of the addition of two functions:
 
sorttable(t[, sort)
 
and
 
issorted(t)
 
And the addition of a pointer to an optional 'red/black' tree in the table's Value union representation in the C code to store the order of objects added to a table in addition to the hash table.
 
Rationale:
 
I'd like to be able to store sorted elements and items in lua.  I can't do that now.  I don't want to change the syntax or the simplicity of the language, but it's something I've run up against numerous times.
 
Details:
 
The function...
 
sorttable(t[, sort)
 
would cause the table t to be sorted.  This would cause the C representation of the table to add a red/black tree in addition to the hash table to the table to order the elements of the table, and place existing elements in the table in the red black tree.
 
Indexing operations would ignore the red black tree, and simply use the hash table to recover values.
 
i.e.
 
t["fred"] = "second"
t["abe"] = "first"
 
would use the hash value of the string "index" to get the value of index just as the present lua implementation would.  This would mean that sorted tables would take more time to 'add' elements (though dramatically less time than a manual sort of a normal lua array), but take no more time to index than unsorted tables.
 
However, the function foreach() would iterate through the internal 'red/black' tree representation of the table, returning the elements of the tree in sorted order.
 
i.e.
 
sorttable(t)
foreach(t,print)
 
Would reliably print the following:
 
"abe"  "first"
"fred" "second"
 
To disable sorting, and assign the red black tree to the garbage collector, the function sorttable could be called with a 0 value to turn off sorting.
 
sorttable(t, 0)
foreach(t,print)
 
Might return:
 
"fred" "second"
"abe" "first"
 
The function issorted() would return non nil if the table was sorted.
 
i.e.
 
issorted(t)
 
Would return 1 if the table t was sorted, or nil if not.
 
--------------------------------------------------------------------------------------------
 
Memory Usage Alternatives:
 
If memory usage of the simple implementation was a problem, there are two alternatives.
 
1. A possible implementation alternative would permit sorted tables whos indexes were never accessed for a 'read' operation to only retain the red/black tree implementation, and not the hash table implementation.   A hash table would be dynamically created the first time an elements was read from a table.
 
2. Another possible implementation alternative would be to determine the representations of the table via the sorttable() function, or possibly use a different function.. perhaps setsorting().
 
i.e.
 
setsorting(t, mode)
 
Where mode could be "sorted", "hashed", or "both".
 
The function getsorting() would return the sort mode.
 
i.e.
 
getsorting(t)
 
Would return "hashed" for normal lua tables, "sorted" for red/black based tables, and "both" for tables with both representations.