On 8/26/2012 12:21 AM, Paul K wrote:
Hi Sergey,

The improved version does still suffer from the bug caused by appending
(99-#z): "012b" > "12a", whereas it should be "012b" > "12a".
In case "#t == 1" you shouldn't append #r, because "0.11" < "0.2".
Thank you for pointing this out (I think you meant "012b" < "12a" in
the first comparison). Here is a better solution that fixes both of
these issues (although it's also not perfect in dealing with leading
zeroes):

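function alphanumsort(o)
   -- Rewrite each number so that plain string comparison orders it numerically.
   local function padnum(d)
     local dec, n = string.match(d, "(%.?)0*(.+)")
     -- Fraction (leading "."): fixed-width float.
     -- Integer: zero-padded digit count, then the digits without leading zeros.
     return #dec > 0 and ("%.12f"):format(d)
       or ("%s%03d%s"):format(dec, #n, n)
   end
   table.sort(o, function(a,b)
     return tostring(a):gsub("%.?%d+",padnum)..("%3d"):format(#b)
          < tostring(b):gsub("%.?%d+",padnum)..("%3d"):format(#a) end)
   return o
end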

This also sorts decimal numbers:

Abc 0.100a
Abc 0.1b
Abc 0.11
Abc 0.2

I updated the page to capture various solutions we discussed (from
simplest to the best so far) as well as their results:
http://notebook.kulchenko.com/algorithms/alphanumeric-natural-sorting-for-humans-in-lua.

"." < " "   -- you'll face the ("." > "/") bug instead
I didn't quite get this comment...

I doubt the real numbers case has a use though :)
That I agree with ;).

Paul.

On Sat, Aug 25, 2012 at 6:46 PM, GrayFace <sergroj@mail.ru> wrote:
On 26.08.2012 8:09, Paul K wrote:
Agree; I just sent an improved version that shouldn't suffer from this
problem.

I see we used the same idea of first comparing the length of numbers and
then using lexicographical comparison of numbers.
The improved version does still suffer from the bug caused by appending
(99-#z): "012b" > "12a", whereas it should be "012b" > "12a".
In case "#t == 1" you shouldn't append #r, _because "0.11" < "0.2"_.
You made me also think about the real numbers case. My attempt at
implementing it has these bugs so far:
"." < " "   -- you'll face the ("." > "/") bug instead
"0.100a" > "0.1b"
I doubt the real numbers case has a use though :)


--
Best regards,
Sergey Rozhenko                 mailto:sergroj@mail.ru



Emphasis added. There is one case that for some users will actually be more frequent than the real number case: sorting section numbers in or from legal documents (at least in the US). In this case, 0.11 > 0.2, because the part after the decimal point is a whole number with its leading zeros omitted: 0.2 sorts against 0.11 as if it were 0.02, not 0.20 as in the real number case. It is also necessary to contend with multiple decimal points, such as 6.2.1 or even 9.3.3.5, so help me. I intend to solve this one myself, but it will be a while before I can write the code. Meanwhile, anyone who wants to try it is quite welcome to. I can see no way to auto-detect this case that will reliably distinguish it from the real number case, but an optional parameter indicating which way to sort numbers with decimal points would be good enough.

Alternatively, a separate function for the "legal sort" case could be provided, thus keeping the main natural sort function simpler. (This is probably best.)
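
A minimal sketch of what such a separate function might look like, reusing the length-prefix idea from the thread but treating every run of digits as an independent whole number. The names legalkey and legalsort are hypothetical and do not come from the thread, and how text mixed with section numbers should sort is an assumption here (it simply falls back to plain string comparison):

```lua
-- Hypothetical helper (not from the thread): build a sort key in which
-- every run of digits is an independent whole number, so "0.2" < "0.11".
local function legalkey(s)
  -- Replace each digit run with <3-digit length><digits without leading
  -- zeros>; plain string comparison then orders the runs numerically.
  local key = tostring(s):gsub("%d+", function(d)
    local n = d:match("^0*(%d+)$")
    return ("%03d%s"):format(#n, n)
  end)
  return key
end

-- Hypothetical "legal sort": sorts the table in place and returns it.
local function legalsort(t)
  table.sort(t, function(a, b) return legalkey(a) < legalkey(b) end)
  return t
end

local sections = { "9.3.3.5", "0.11", "6.10", "6.2.1", "0.2", "6.2", "6.9" }
print(table.concat(legalsort(sections), " "))
-- prints: 0.2 0.11 6.2 6.2.1 6.9 6.10 9.3.3.5
```

Because the dot is left in the key as an ordinary character, any number of components (6.2.1, 9.3.3.5) is handled without special cases, and a shorter section number sorts before its own subsections.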

Sincerely