lua-users home
lua-l archive


Anyway, because we had low-overhead breakpoints that could run code, instead of investing further in specialized instrumentation I used those breakpoints to log and collect information. It was higher overhead and more manual, but also much more flexible, so it was the better answer.

So the answer I picked _was_ to do more of it in Lua, and OP_HALT was what made that viable, whereas the debug.hook-based solutions imposed too much overhead to get accurate results. It is actually still dtrace-like, but built on our regular code/logging breakpoints instead of a new type of tracing breakpoint.
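For contrast, here is a minimal sketch (my own illustration, not code from this thread) of the kind of debug.sethook-based tracing that the OP_HALT approach replaced.  Because the hook fires on every call and return in the VM, the cost lands on all code uniformly, not just on the functions of interest:

```lua
-- Hook-based call/return tracing.  This is the high-overhead baseline:
-- the hook fires on *every* call and return, so even code you don't
-- care about pays the logging cost.
local log = {}

local function tracer(event)
  -- Level 2 is the function that triggered the event.
  local info = debug.getinfo(2, "nS")
  log[#log + 1] = string.format("%s %s (%s:%d)",
    event, info.name or "?", info.short_src, info.linedefined)
end

debug.sethook(tracer, "cr")  -- "c" = call events, "r" = return events

local function add(a, b) return a + b end
add(1, 2)

debug.sethook()  -- remove the hook
```

A breakpoint-based scheme only pays at the bytecode locations you patched, which is what made it cheap enough to leave running while gathering accurate timings.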

Interesting.  Though I'll admit, I'm having a hard time reading between the lines here to figure out just what you did to implement this "manual instrumentation" method.  It sounds roughly like a sample-gathering technique in which you peppered OP_HALTs into the bytecode for the specific parts of the code base you wanted more information on?
"Trace points" were starting to prove very handy as a quick/dynamic (and very low overhead) way to measure time/paths taken in certain segments of code that were executing. Combined with our low overhead sampling profiler's more holistic view of which code was running when pausing periodically, it could save a lot of time quantifying the costs/distributions of different code paths of particular interest. The two approaches (top down and bottom up) were proving complementary.

In my own (far less sophisticated) experiments with instrumentation, I've quickly come to a similar high-level conclusion.  What one seems to need are multiple, complementary tools rather than one single technique.  

The technique I started with was to gather runtimes for a complete call stack.  That proved dramatically expensive; even after modding the VM with an eye towards minimizing data collection overhead, collecting my call stack data increased the runtime of the function I was profiling from 2.5 seconds to 6 seconds (somewhat better than the 13x cost nobody reported, but nonetheless very expensive).  The high cost of complete call trace instrumentation makes the data I'm collecting suspect -- as true bottlenecks (particularly C-function calls) won't suffer nearly as much runtime inflation as other areas of the code.  However, even so, those traces did give me a rough sense of the "lay of the land", and I could then follow up on that with hand-targeted instrumentation.  And of course, when I'm just instrumenting a handful of key function calls, instead of hooking into each and every OP_CALL, it's much easier to get undistorted runtime information.  So both techniques have their drawbacks, but they're also synergistic.  
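That hand-targeted approach can be as simple as wrapping the handful of key functions by hand rather than hooking every OP_CALL.  A sketch (the names `wrap` and `timings` are my own, not from this thread):

```lua
-- Hand-targeted instrumentation: wrap only the functions of interest,
-- accumulating a call count and total wall time per wrapped function.
-- Everything else runs at full speed, so timings stay undistorted.
local unpack = table.unpack or unpack  -- Lua 5.1 compatibility
local timings = {}

local function wrap(name, fn)
  timings[name] = { calls = 0, total = 0 }
  return function(...)
    local t = timings[name]
    t.calls = t.calls + 1
    local start = os.clock()
    local results = { fn(...) }
    t.total = t.total + (os.clock() - start)
    return unpack(results)
  end
end

local function slow_sum(n)
  local s = 0
  for i = 1, n do s = s + i end
  return s
end

slow_sum = wrap("slow_sum", slow_sum)
slow_sum(100000)
print(timings.slow_sum.calls, timings.slow_sum.total)
```

The wrapper itself adds a little per-call overhead, but since it only applies to a few call sites, the distortion is negligible compared with instrumenting every call in the VM.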

It seems to me like there's a lot to be said for having a couple different tools in one's instrumentation toolbox.

Thanks again for your replies, Dan; it's been fascinating to learn a bit about how these issues played out in the context of a bigger project.