lua-users home
lua-l archive

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]


Mike Pall wrote:
Since a PEG implementation leads naturally to a bytecode
interpreter, you might want to check the literature on optimizing
interpreters first. A good starting point is [Ert02] at:
  http://www.complang.tuwien.ac.at/papers/

Thanks for the link.

I have current around 25 opcodes [...]

Umm ... IMHO it's much simpler to start with a few basic opcodes.

Well, Roberto did that for me that already. :-)

In the next step don't guess,

I don't really guess, I have a quite precise idea of what will be the code behind (thanks to Roberto) and what the new opcode will bring.
I don't do that at random... :-P

but do some profiling on realistic
examples. Only then try combining opcodes (superinstructions).

Well, I cheat as I have specialized operators leading to specialized opcodes. I am not yet at the stage of optimizing the generated opcode, not even at detecting infinite loops or such! :-( Eg. I have @'c' which is sugar for (!'c' .) * 'c', but if the user types the second expression, he will get a non-optimized bytecode... Detecting such constructs and folding them to the specialized opcode will come later... if ever! That's not only to help the bytecode compiler, I also find this more readable (personal taste, of course).

And don't forget that efficient dispatch is as important as
opcode selection.

Indeed.

[ Umm, in case you ever want to write a JIT compiler for it: it's
much simpler if you have only very simple opcodes. Folding in
some tests is ok, but grouping arbitrary opcodes will make things
more difficult later on. ]

Not a chance... Well, sorry if somebody else ever want to do it.

others: my OP_CHARS can handle one or two chars, at the cost of comparison of a flag to 0.

That doesn't look cheap. The main goal when designing for modern
CPUs should be to reduce the number of branches (in particular
unpredictable branches) and cache misses to a minimum. Anything
else will pale in comparison. An unpredictable branch has a
penalty of 10-30 cycles, about as much as a division. A cache
miss may take anything from 10 to 1000s of cycles.

OK, will take this in account, thanks for the advices!
I confess I still reason in 8bit assembly language, I don't master modern CPUs, although I learnt a lot from your posts...

--
Philippe Lhoste
--  (near) Paris -- France
--  http://Phi.Lho.free.fr
--  --  --  --  --  --  --  --  --  --  --  --  --  --