Thankfully, I've recently removed the last major stumbling block
(better trace linking) and the benchmark results demonstrate that
going for a trace compiler was a sound design decision after all.
But I have to say it was an expensive decision: I considerably
underestimated the amount of research and trial and error
needed to convert a research toy into a production compiler.
There are some important implementation details which the few
papers about trace compilers completely fail to mention ... :-|