lua-users home
lua-l archive

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]


On 2018-01-29 03:16, Sean Conner wrote:
It was thus said that the Great nobody once stated:
NaN being unusable as a table key is one of the bigger kinks

Just in case you are unaware (or others are unaware), NaN is a special value defined by IEEE-754 (or rather, values) that have a unique property:


I referred to that in the footnote down below...

[2] [...] have special case logic for the float paths in the table code, possibly substituting a single "canonical" NaN (recall that there's billions of them... =/) for use as a key. Last time I patched this [...] I was able to hide most of the code on error paths / behind checks that already existed.

...and because you say...

So it would be difficult to make NaN to be a table [key].

...and it wasn't actually that hard, here's the patch with my notes from
back then.  The code is still rather ugly / proof-of-concept / just make
it work somehow-style and can almost certainly be improved, I just don't
have the time to do that right now.

I think I had some tests somewhere and ran them, the code passed some
Lua test suite back then (mod obvious/expected breakage?), and I
definitely used it (until 5.3.4 it was always-on in my default Lua
interpreter and I never noticed any problems, now I'd have to find the
time to update this...) - but I don't have a deep understanding of the
table code so I'm not sure this is correct and so it's probably better
to treat this as untested experimental code and check for yourself if it
does what you expect it to do.

It applies to something like 5.3.1--5.3.3 IIRC, and I think (but didn't
check right now) that it was 5.3.4 that changed some table code.

To what's below, let me add this:

 *  It's ~40 lines, which is _a lot_ for such a pesky little detail.
    (Floats are ugly...)  But if you actually have to deal with NaN,
    those 40 lines are a lot less than what you need to make things
    work in Lua somehow.  (And if you actually (have to) do the checks,
    doing them in C is faster too.)

 *  Code doesn't do the canonical NaN thing yet, so if you put in -NaN
    and then ask for NaN, you'll get the -NaN value. (Sign is irrelevant
    as for +0.0 / -0.0, but for 0 it's hidden by the cast to int while
    for NaN you can see the wrong sign.  Always Using a canonical NaN
    would make it behave as for -0.0 --> 0.)

-- nobody

(Patch is public domain or fallback CC-0, yadda yadda.)

-----

Quite a bit of code needed (+36,-2,~1), all but one check off the main
path.

`searchnan` handles the NaN-finding logic, used by `findnan`/`getnan`.

The three cases to handle are:

 * Traversal (next, pairs), done by `findindex`.  Previously, NaN would
   have hit the `luaG_runerror("invalid key to 'next'")` branch.
   As this should be pretty rare, we check for NaN in here and possibly
   defer to `findnan`.  It just calls `searchnan` and extracts the
   offset from the resulting `Node*` (or `luaG_runerror`s if there's no
   NaN in the table.)

 * Finding (possibly for subsequent update of) an entry, done by
   `luaH_get`.  Previously, `getgeneric` would have done the (futile)
   search for NaN, never finding it.
   This needs an extra branch in `luaH_get` in the `LUA_TNUMFLT` case,
   which could be costly.  If the key is NaN, we branch to `getnan`,
   which returns the node's key or `luaO_nilobject`.

 * Adding a new node, done by `luaH_newkey`.  Previously, this did an
   explicit check for NaN and would have errored out.
   As the rest of the code now handles NaN, this check is no longer
   needed, for a slight speedup of all insertions of float keys.

Assuming code rarely uses `next` with nonexistent keys, the impact of #1
should be negligible.  #3 is a slight speedup for insertion, #2 an equal
slowdown for lookups.  Depending on the ratio of insertions to lookups,
this could come out anywhere between neutral and significant slowdown.

`searchnan` can probably still be improved.  It currently returns `NULL`
if no NaN-node exists, which is a nonstandard pattern in the code base.
It currently takes the key as argument only to find the main position.
As the key cannot be used for comparison, this could be removed if the
main position of NaN can be gotten in other ways.  (The default
`l_hashfloat` always assigns 0, but it can be re`#define`d so that's not
generally a safe assumption.)

---
 src/ltable.c | 40 +++++++++++++++++++++++++++++++++++++---
 1 file changed, 37 insertions(+), 3 deletions(-)

diff --git a/src/ltable.c b/src/ltable.c
index 7e15b71..9e5deba 100644
--- a/src/ltable.c
+++ b/src/ltable.c
@@ -152,6 +152,31 @@ static unsigned int arrayindex (const TValue *key) {
   return 0;  /* 'key' did not match some condition */
 }

+static const Node *searchnan (const Table *t, const TValue *key) {
+  Node *n;
+  int nx;
+  lua_assert(ttisfloat(key) && luai_numisnan(fltvalue(key)));
+  n = mainposition(t, key);
+  for (;;) {
+    if (ttisfloat(gkey(n)) && luai_numisnan(fltvalue(gkey(n))))
+      return n;
+    else {
+      nx = gnext(n);
+      if (nx == 0)
+        return NULL;
+      n += nx;
+    }
+  }
+}
+
+static unsigned int findnan(lua_State *L, Table *t, StkId key) {
+  unsigned int i;
+  const Node *n = searchnan(t, key);
+  if (n == NULL)
+    luaG_runerror(L, "invalid key to 'next'");
+  i = cast_int(n - gnode(t, 0));
+  return (i+1) + t->sizearray;
+}

 /*
 ** returns the index of a 'key' for table traversals. First goes all
@@ -177,8 +202,11 @@ static unsigned int findindex (lua_State *L, Table
*t, StkId key) {
         return (i + 1) + t->sizearray;
       }
       nx = gnext(n);
-      if (nx == 0)
+      if (nx == 0) {
+        if (ttisfloat(key) && luai_numisnan(fltvalue(key))) /* it's NaN! */
+          return findnan(L, t, key);
         luaG_runerror(L, "invalid key to 'next'");  /* key not found */
+      }
       else n += nx;
     }
   }
@@ -449,8 +477,6 @@ TValue *luaH_newkey (lua_State *L, Table *t, const
TValue *key) {
       setivalue(&aux, k);
       key = &aux;  /* insert it as an integer */
     }
-    else if (luai_numisnan(fltvalue(key)))
-      luaG_runerror(L, "table index is NaN");
   }
   mp = mainposition(t, key);
   if (!ttisnil(gval(mp)) || isdummy(mp)) {  /* main position is taken? */
@@ -533,6 +559,12 @@ const TValue *luaH_getshortstr (Table *t, TString
*key) {
   }
 }

+static const TValue *getnan (Table *t, const TValue *key) {
+  const Node *n = searchnan(t, key);
+  if (n == NULL)
+    return luaO_nilobject;
+  return gval(n);
+}

 /*
 ** "Generic" get version. (Not that generic: not valid for integers,
@@ -576,6 +608,8 @@ const TValue *luaH_get (Table *t, const TValue *key) {
       lua_Integer k;
       if (luaV_tointeger(key, &k, 0)) /* index is int? */
         return luaH_getint(t, k);  /* use specialized version */
+      else if (luai_numisnan(fltvalue(key)))
+        return getnan(t, key);
       /* else... */
     }  /* FALLTHROUGH */
     default: