On Mon, Jul 18, 2011 at 6:46 PM, J. A. "Biep" Durieux <rrrs@biep.org> wrote:
> P.S.: Being the pedantic person that I am, I often write down the thoughts I have regarding
> the software I use.  I attach my Lua thoughts for your enjoyment..

This is worth a discussion thread in its own right, so I'm forking it.
- Need for documentation:
  - Operator precedence: function call has the highest: a+b "c" = a+(b("c")), even if a+b yields a function.
  - Limitations - I realise these are probably changeable by recompilation, but what are the standard, min and max values?
    - Recursion stack size?  Is that the number of recursive calls, or do calls with many args take more space?
    - Max string, table, function, userdata size?  Max size of table constructor?
    - Max number of function args, return values, unpackable values?
    - Max number of parallel threads?  Open files?  Max usable size of file?
  - Relative cost of operations, to allow intelligent choice.  Amortised cost, for garbage collection counts.
    - Between a closure and a table store+read for storage of information.
    - Between recomputation and memoizing to access information.
    - Between a numeric index table read and a hash index table read?
    - Insertion/deletion vs. other operations.
      - After computing a value, is it worthwhile saving a stack insertion and do pointer bookkeeping instead?
      - Does the cost of table.insert and table.remove depend on the size of the table?
    - table.sort: is sorting an (almost-)sorted table expensive?  More or less expensive than the average case?
  - Likewise for space.
    - What is the overhead of having N small tables instead of one big one (e.g. for a matrix)?
    - What is the overhead of a function?  Of a closure?
  - a = {x=p, y=q} is supposed to be equivalent to a={}; a.x=p; a.y=q
    - but it isn't, nor to a={}; a.x, a.y = p, q
    - To see that, consider a = {[f()]=g(), y=1} and a={}; a[f()]=g(); a.y=1 .
    - Now f and/or g may set a metatable with __newindex on a.
    - In the first case f and g cannot access the a being defined - they act on a previous value of a, if any.
    - In the second case, f and g act on the newly defined a.
  - The function "next" and the one produced by "pairs" perform "rawget".  They don't use __index (unfortunately).
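
The point about table constructors can be checked with a small experiment. This is only a sketch: f, g and the `seen` log are illustrative stand-ins, and `a` is declared as a local beforehand so that a "previous value" exists.

```lua
-- Both helpers record which value of `a` they can see when they run.
local a = "previous value"
local seen = {}
local function f() seen[#seen + 1] = a; return "k" end
local function g() seen[#seen + 1] = a; return "v" end

-- Case 1: table constructor. `a` is assigned only after the constructor
-- has been fully evaluated, so f and g still see the previous value.
a = {[f()] = g(), y = 1}
local during_constructor = {seen[1], seen[2]}

-- Case 2: step-by-step assignment. The new table is already bound to `a`
-- by the time f and g are called.
seen = {}
a = {}
a[f()] = g()
a.y = 1
local during_assignment = {seen[1], seen[2]}
```

Both entries of during_constructor hold the old string, while both entries of during_assignment are the new table itself, so a metatable set from inside f or g lands on different objects in the two cases.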

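The remark that "next" and "pairs" bypass __index can be illustrated as follows. This is a minimal sketch; the table names are made up for the example, and no __pairs metamethod is involved.

```lua
-- `t` inherits the field `inherited` from `base` through __index.
local base = {inherited = 1}
local t = setmetatable({own = 2}, {__index = base})

-- Ordinary indexing consults __index, so the inherited field is visible.
local via_index = t.inherited

-- Traversal only visits keys physically stored in `t`.
local keys = {}
for k in pairs(t) do keys[#keys + 1] = k end
```

Here via_index is 1, yet keys contains only "own": the inherited field never shows up during traversal.
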
- Libraries.
  - Reification, which turns functions, the stack, etc. into a table with pointers to the innards.
    - Functions such as debug.traceback could be based on reify (and written in Lua, and user-changeable).
    - The complete workspace can also be reified.  This state can be reinstalled later on.
      - Writing out this reified workspace creates a state dump, a save.
      - (Loading a save and) reinstalling a reified workspace means restoring the program state.
      - This is useful for games, but for other long-running programs too.
      - This way, saving is automatically preceded by an aggressive garbage collection.
        - This minimises the save file and compacts the memory on reload.
    - The debug library comes close to providing the stuff needed to make a restartable dump.  Missing is the following.
      - "debug.setinfo(..)" - there is no way to restore the state.
      - A more fine-grained function code pointer.  "debug.getinfo('l')" is too coarse, obviously.
  - Fold, which does intelligent folding of operators over many arguments (can be written in Lua).
    - For numerical operations, it tries to avoid overflow and loss of precision by judiciously ordering its args.
      - x=_MAXINT-1; map(+, {x, x, -x, -1}) --> _MAXINT-2 -- Illegal code, but you see what I mean..
    - For concatenation, it minimizes the amount of garbage produced (i.e. it would use table.concat).
    - For user-defined functions, it "degenerates" into a standard fold.
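
One possible reading of the Fold idea, written in plain Lua. The name `fold`, its signature and the use of the strings "+" and ".." to select an operator are assumptions made for this sketch, and the ordering strategy for addition (alternating between positive and negative values to keep the running total near zero) is just one simple heuristic, not necessarily what the proposal had in mind.

```lua
-- Hypothetical fold(op, list): op is "+", ".." or a binary function.
local function fold(op, list)
  if op == ".." then
    -- Concatenation: table.concat builds the result without intermediate
    -- strings (it accepts strings and numbers only).
    return table.concat(list)
  elseif op == "+" then
    -- Addition: split by sign, then always add a value whose sign opposes
    -- the running total, largest magnitude first.
    local pos, neg = {}, {}
    for _, v in ipairs(list) do
      if v >= 0 then pos[#pos + 1] = v else neg[#neg + 1] = v end
    end
    table.sort(pos)                                   -- largest last
    table.sort(neg, function(a, b) return a > b end)  -- most negative last
    local total = 0
    while #pos > 0 or #neg > 0 do
      local from
      if #neg == 0 then from = pos
      elseif #pos == 0 then from = neg
      elseif total >= 0 then from = neg
      else from = pos end
      total = total + table.remove(from)  -- pops the last element
    end
    return total
  else
    -- Any other function: plain left fold.
    local acc = list[1]
    for i = 2, #list do acc = op(acc, list[i]) end
    return acc
  end
end

local big = 2^53
local naive = big + 1 - big             -- the 1 is lost to rounding
local careful = fold("+", {big, 1, -big})
```

With double-precision numbers the naive left-to-right sum loses the 1, while the reordered sum cancels the two large values first and keeps it.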

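As a small illustration of the reification idea, the call stack can already be turned into a plain table with the existing debug library. The function name `reify_stack` and the field names of the result are hypothetical; only debug.getinfo is a real library call. This covers inspection only; restoring state, as noted above, has no counterpart in the library.

```lua
-- Returns an array of tables, one per active stack frame, starting at the
-- caller of reify_stack (or `start` levels above it).
local function reify_stack(start)
  local frames = {}
  -- Level 1 is reify_stack itself, so the caller is at level 2.
  local level = (start or 1) + 1
  while true do
    local info = debug.getinfo(level, "Sln")
    if not info then break end
    frames[#frames + 1] = {
      source = info.short_src,   -- chunk name
      line   = info.currentline, -- line being executed, -1 if unknown
      name   = info.name,        -- may be nil when no name can be found
      what   = info.what,        -- "Lua", "C" or "main"
    }
    level = level + 1
  end
  return frames
end
```

A traceback printer written in Lua could then format this table however the user likes, for instance by looping over the frames and concatenating source and line.
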
- Language change proposals.
  - Please let functions such as table.insert return the table - this allows for concise functional programming.
    - The current situation is the cause of countless annoying little programming errors.
    - "return table.insert(t,v)" is legal - that's what makes it worse.
  - Please let the sort function be stable.
    - It allows one to sort e.g. by minor and major key.
    - If it is impossible to make sort stable without extra cost, at least allow a flag which enforces stability.
  - Allow "[x] = y", which assigns in the current environment table.
    - This makes Lua more uniform, for in table constructors, "a=b" is already a shorthand for "['a'] = b".
    - It makes Lua more expressive, more powerful.
      - "local [3]=x" is a neat, concise way of setting an array value in the current environment.
      - "[3]=x" ought to set the nearest enclosing [3] that has been defined.
        - A metatable can ensure that in a certain environment all numerical indexes are defined.
  - Remove statements from the language; let everything be expressions.
    - The language becomes simpler, both in grammar and conceptually.
      - The anomaly of function calls as statements is removed.
      - "return" in tail position becomes the identity operation, and may slowly be deprecated.
        - the comma would become a first-class value concatenator (stack pusher).
        - "return" in other positions becomes a break (see below).
      - Typing print(..) around all expressions in the interpreter becomes unnecessary.
        - That alone would already be worth it!
    - Old programs continue to work.
    - More expressivity is possible
      - Most obviously "return if x then a else b end".
      - A for loop might return the final value(s) of its (local) loop variable(s).
        - index = for i = 1, #a do if a[i]<0 then break end end -- will find the index of the first negative value in a.
        - "index, val = for i, v in ipairs(a) do" will find both the index and the value.
      - Alternatively, "break" might take an explist the way "return" does.
        - That is more flexible and works the same way for all loops: for, while or repeat-until.
        - It would provide another kind of throw and catch - one not for errors, but for continuations.
    - "Statements" may still return zero values.
    - loadstring and its ilk become more useful - no need to guess whether "return " needs to be prefixed.
  - More metatable functionality.
    - Allow __type in metatables, which then is returned by the type function (I know, trivial to write oneself..).
    - In line with __index which can contain a pointer to another table, __add might contain a number, etc.
      - In general: the value in the metatable, if not a function, will be used instead of the table itself.
      - So if I want equivalence classes on tables or userdata, all I need to do is set some value to __eq.
        - (Assigning them the same metatable with a non-function value for __eq would already do it.)
      - If I want to order my userdata, I store the rank in __lt.
      - This makes the various metatable events more uniform (and some code may be shared in accessing them).
      - It allows me to annotate numbers, strings and threads: the table behaves like the value, but accepts fields.
    - Default metatables would be great: a table used for every value without explicit metatable.
      - And that includes environments in closures.
      - This would provide most "system hooks" in a Lua-worthy way.
        - The __index and __newindex in the default table metatable hook into variable access and assignment.
        - The __call in the default function metatable hooks into function calls and returns.
          - Of course, "rawcall" would be needed, in line with "rawget" and friends.
      - It would allow things such as proxies: values used before their definition.
        - Very useful for static OO initialisation: "John = man{wife=Mary}; Mary = woman{husband=John}".
        - "__index" would be programmed to return a proxy object, which would keep track of where it was assigned.
        - Assignment to a location that already held a proxy would replace all the proxy locations with the actual value.
        - After initialisation, a "run()" command might clean up by removing the proxy-generating code.
        - Currently this can almost be made to work: it fails for proxies captured in closures and fresh tables.
    - Add "__init": "__init(table, how)" is executed when a table is created.
      - (Only useful in default metatables or with "table.clone".)
      - The second argument states how the table was created: by closure, through "{..}", through "table.create(how)"..
    - Give more freedom to events.
      - Particularly __len doesn't have much leeway: all strings and tables return their primitive length.
        - One cannot even have it point to another table of which it is to return the length.
      - Having __index and __newindex have a say when the key exists already would be very nice too.
      - I realise this might lead to loss of speed, and is therefore not allowed.  Is that correct?
  - Replace dispatch strings by tables, i.e. functions by libraries.
    - Reasons:
      - It makes the language more uniform.  Currently there are two kinds of indexing/dispatching.
      - No more need for code inspecting the argument string.
      - The programmer can add, remove or change individual functions in a seamless way.
        - (They can now, too, but it is harder.)
    - Examples:
      - The function "collectgarbage".  Let's have cg.count(), etc.
      - Hook setting would be nicer with "debug.sethook.call(..)", etc.  (But a __call event would be even better!)
        - One may set the same or different hook functions for the various hooks.
        - One may remove some hook functions while leaving others active.
      - Likewise in the file metatable, but here a method "parse" would be better.
        - file:parse(string) reads in file trying to match with string, which is a string pattern.
        - By the way, string patterns could be more like format patterns - uniformity again.
        - Currently there are THREE pattern languages: for string formatting, string matching and file reading!
  - Make package.path into a table, an array of paths.
    - That way ";" loses its special meaning - which is always a good thing.
      - One could even free "?" by having an optional entry "pattern='?'" setting the wildcard character.
    - It becomes easier to insert or remove paths, or to reorder them.
    - It is way more Lua-like.
    - It is faster, for no string-parsing is needed to get the individual patterns.
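
Until something like that exists, the array view can be approximated in user code. The helpers `split_path` and `join_path` are hypothetical names for this sketch, and the sample path string is made up; empty entries between semicolons are simply dropped.

```lua
-- Split a ";"-separated search path into an array of templates.
local function split_path(path)
  local entries = {}
  for entry in string.gmatch(path, "[^;]+") do
    entries[#entries + 1] = entry
  end
  return entries
end

-- Join an array of templates back into the string form Lua expects.
local function join_path(entries)
  return table.concat(entries, ";")
end

-- Example: put a project-local directory in front of the search order.
local paths = split_path("./?.lua;/usr/local/share/lua/5.1/?.lua")
table.insert(paths, 1, "./lib/?.lua")
local rebuilt = join_path(paths)
```

Assigning the result with package.path = rebuilt would make it take effect; the sketch leaves the real package.path untouched.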