On 14.05.2011 22:39, Dirk Laurie wrote:
> On Sat, May 14, 2011 at 07:26:18PM +0200, Michael Rose wrote:
>> jump chr = str:byte(i) in 0,255 do
>>   "a".."z" do print("lower") break end
>>   "A".."Z" do print("higher") break end
>>   32,"_-:" do print("delimiter") break end
>>   0 .. 31, 127 .. 255 do print("unprintable") break end
>>   do print("other") end
>> end
> 1. "a".."z" and 0 .. 31 already have meanings in Lua.

That's true, but so what?
The context after 'in' and the area before each entry clause, where the
entry indices are specified, accept only literal values.
Otherwise the jump table could not be created at compile time and
the whole construct would have to be compiled to if-elseif chains
(which can and must be written with standard Lua 5.2 constructs,
not with the jump table construct).
Operators can therefore be used freely in this context, so why not use
".."? Of course, if there is a wish for another operator token,
feel free to make suggestions. It doesn't change anything essential.

> 2. 0,255 and 0 .. 255 both seem to mean the range 0 to 255.

The ranges after 'in' are needed to correctly set up the jump table
as a table of jump opcodes before the coding of the entry clauses.
The compiler makes a min/max calculation of these entries.
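
As a rough illustration of that setup step in stock Lua (an ordinary
table, not the patch's actual opcode table), the sketch below pre-fills
every slot between the bounds with the default and then overwrites the
slots named by each clause. The helper name build_table and the clause
format are assumptions made for this example only.

```lua
-- Hypothetical helper: emulate a jump table with an ordinary Lua table.
-- lo, hi  : bounds given after 'in' (inclusive)
-- clauses : list of { ranges = { {from, to}, ... }, target = value }
-- default : value for every index in lo..hi not claimed by a clause
local function build_table(lo, hi, clauses, default)
  local t = {}
  for i = lo, hi do t[i] = default end        -- set up all slots first
  for _, clause in ipairs(clauses) do
    for _, r in ipairs(clause.ranges) do
      -- this sketch simply rejects entries outside lo..hi
      assert(r[1] >= lo and r[2] <= hi, "entry index outside range")
      for i = r[1], r[2] do t[i] = clause.target end
    end
  end
  return t
end

local B = string.byte
local class = build_table(0, 255, {
  { ranges = { {B"a", B"z"} },        target = "lower" },
  { ranges = { {B"A", B"Z"} },        target = "higher" },
  { ranges = { {32, 32}, {B"_", B"_"}, {B"-", B"-"}, {B":", B":"} },
                                      target = "delimiter" },
  { ranges = { {0, 31}, {127, 255} }, target = "unprintable" },
}, "other")

print(class[B"q"], class[B"Q"], class[B":"], class[7], class[B"5"])
```

Once the table exists, classifying a byte is a single index operation,
regardless of how many clauses were specified.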

> 3. Comma already has a meaning in Lua.

The comma already has several meanings in Lua:
* In table constructors it separates the key=value parts from each other.
* In assignments it forms the list of variables and the list of
  expressions, which are then matched up to perform the assignments.
* In 'for var = start,end [,step] do' it separates the three expressions
  that specify the respective values.
* In function definitions it separates the formal parameters, in call
  expressions the actual arguments.

So why not add another meaning, as suggested: separating index ranges
from each other in the special context of entry index specifications?

> 4. "in" expects an iterator function in Lua.

This is not true. "for var in iter do" expects an iterator function.
But "jump expr in range do" could use the keyword differently without
any harm. As above, the keyword could be changed, but I tried to avoid
introducing more new keywords than necessary, because any added keyword
could break existing code. But if it turns out that a new keyword should
be used in the end, why not? I don't mind. For me this is only a minor
point in the discussion.

> 5. Lua already has a way of writing ranges in a pattern.
> 6. This syntax is at least as verbose as doing it with if,then,elseif.

I think a specification like

0 .. 10, 15, 19 .. 24 do ... end

is not as verbose as

elseif (0 <= i and i <= 10 or i == 15 or 19 <= i and i <= 24) then ...

and the former has one indirect jump through the jump table for all
coded clauses, but the latter single clause has five relations and
five conditional branches to be passed sequentially.
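
To make the comparison concrete in stock Lua: the chain form re-evaluates
its relations on every test, while a table expanded once answers each
test with one index. This is only an approximation of the idea, since an
ordinary table lookup is not the patch's jump opcode; the names
in_set_chain and in_set are made up for this sketch.

```lua
-- Chain form: up to five comparisons per test.
local function in_set_chain(i)
  return (0 <= i and i <= 10) or i == 15 or (19 <= i and i <= 24)
end

-- Table form: the set 0..10, 15, 19..24 is expanded once,
-- afterwards each test is a single index operation.
local in_set = {}
for i = 0, 10 do in_set[i] = true end
in_set[15] = true
for i = 19, 24 do in_set[i] = true end

-- Both forms agree on every index in a surrounding window.
for i = -5, 30 do
  assert(in_set_chain(i) == (in_set[i] == true))
end
```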

> If all that "jump" is going to achieve is to make it easier to respond 
> to different values of a character, I'll stick to Lua 5.2.

In fact I first coded the jump table for integer indices only and then
afterwards thought that it would be a good idea to allow the compact
coding of character ASCII values as well, because they also occur often
enough to be of value.

> match = string.match
> if match(chr,"[a-z]") then print("lower")
>   elseif match(chr,"[A-Z]") then print("upper")
>   elseif match(chr,"[\32_%-:]") then print("delimiter")
>   elseif match(chr,"[\0-\31\127-\255]") then print("unprintable")
>   else print("other")
> end

See the attachment. If speed matters, this code is almost four times
slower than the corresponding jump table, and from my point of view it
is no less verbose.
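
For readers without the patch, the counting task from the attachment can
be approximated in stock Lua by classifying all 256 byte values once and
then indexing a table per character. This is a sketch of the lookup idea
only, not the patched construct, and it makes no claim about timings;
the names kind and count and the sample string are assumptions.

```lua
-- Classify every byte value once, up front.
local kind = {}
for b = 0, 255 do
  local c = string.char(b)
  if     c:match("[aeiouAEIOU]") then kind[b] = "aeiou"
  elseif c:match("[a-zA-Z]")     then kind[b] = "letter"
  elseif c:match("[0-9]")        then kind[b] = "digit"
  else                                kind[b] = "other" end
end

-- Hypothetical helper: count the character classes in string s.
local function count(s)
  local n = { aeiou = 0, letter = 0, digit = 0, other = 0 }
  for k = 1, #s do
    local c = kind[s:byte(k)]     -- one table index per character
    n[c] = n[c] + 1
  end
  return n
end

local n = count("Lua 5.2 alpha")
print(n.aeiou, n.letter, n.digit, n.other)
```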

> Dirk


-- File: tst_match.lua
-- Benchmark between string.match and jumptable.

-- Output:
--   vocal's               = 	176468
--   letter's (no vocal's) = 	280699
--   digits's              = 	23069
--   other's               = 	489479
--   --------------------------------
--   all chr's             = 	969715

-- CPU-Timing for count constructs:
-- time tst_match          (without jump tables):
--   CPU-seconds needed: 0.83
-- time tst_match jump     (with jump tables):
--   CPU-seconds needed: 0.22

local f ="")
local g = f:read("*a")

local aeiou,digit,letter,other = 0,0,0,0

local clock = os.clock()

if ... ~= "jump" then
  -- Implementation as suggested by Dirk Laurie:
  -- Note that counting vowels also as letters would complicate
  -- the logic here, while in the jump table case just one 'and' keyword
  -- has to be added to achieve that.
  local match = string.match
  for k=1,#g do
    local chr = g:sub(k,k)
    if     match(chr,"[aeiouAEIOU]") then aeiou  = aeiou  + 1
    elseif match(chr,"[a-zA-Z]")     then letter = letter + 1
    elseif match(chr,"[0-9]")        then digit  = digit  + 1
    else                                  other  = other  + 1 end
  end
else
  -- Implementation with jump tables (lua-5.2.0-alpha-jumptable-v2.patch):
  for k=1,#g do
    jump g:byte(k) in "0z" do
      "aeiouAEIOU"      do aeiou  = aeiou  + 1 end    -- and
      "a".."z","A".."Z" do letter = letter + 1 end
      "0".."9"          do digit  = digit  + 1 end
      do other = other + 1 end
    end
  end
end

clock = os.clock() - clock

print("\nCPU-seconds needed:",clock)
print("vocal's               = ",aeiou)
print("letter's (no vocal's) = ",letter)
print("digits's              = ",digit)
print("other's               = ",other)
print("all chr's             = ",#g)