> > ** the sampling profiler does not use the line or instruction hook
> > because the overhead of running that way severely distorts the
> > results, which defeats the purpose of using a time profiler
> > *** our breakpoints don't use the line/count hooks either for the
> > same reason: way too slow in a largish codebase, so they have
> > no overhead until they are hit
>
> Are you able to say how these profiling/breakpoint mechanisms do work?
>
> A way to do this faster than hook-based approaches sounds really useful.

Sure. I actually shared the details about breakpoints a few years ago:

http://lua-users.org/lists/lua-l/2010-09/msg00585.html

There were a couple of bugs in that patch that have since been fixed, but
it's mostly been very solid, and I'd never be able to go back to
line-hook-based breakpoints at this point.

> Are these tools built into the C level, or are they Lua code that just makes
> use of the currentpc/basepc you mentioned?

Both the breakpoints mechanism and the hook I added for our sampling
profiler were C modifications to Lua. The profiler does use the currentpc/basepc
fields to allow profiling stripped Lua and symbolicating it later, but otherwise
the changes are unrelated.
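
To make the "symbolicate later" idea concrete, here is a minimal sketch under stated assumptions, not the actual patch. It assumes each sample stores only an instruction offset (currentpc minus basepc) plus some stable function identifier, and that a per-instruction line table from an unstripped build of the same code is available offline. All names here (RawSample, FuncDebugInfo, take_sample, symbolicate, func_id) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record written at sample time. Nothing here needs debug info. */
typedef struct {
    unsigned func_id;    /* assumed stable identifier for the function */
    size_t   pc_offset;  /* currentpc - basepc, i.e. the instruction index */
} RawSample;

/* Hypothetical table produced offline from an unstripped build. */
typedef struct {
    unsigned    func_id;
    const char *source;          /* source file name */
    const int  *lineinfo;        /* lineinfo[i] = source line of instruction i */
    size_t      n_instructions;
} FuncDebugInfo;

/* Capture a sample from the two pointers; works on stripped bytecode. */
RawSample take_sample(unsigned func_id, const unsigned *basepc,
                      const unsigned *currentpc) {
    assert(currentpc >= basepc);
    RawSample s = { func_id, (size_t)(currentpc - basepc) };
    return s;
}

/* Map a raw sample back to a source line. Returns -1 if it cannot be resolved. */
int symbolicate(const RawSample *s, const FuncDebugInfo *table, size_t n_funcs,
                const char **source_out) {
    for (size_t i = 0; i < n_funcs; i++) {
        if (table[i].func_id == s->func_id &&
            s->pc_offset < table[i].n_instructions) {
            if (source_out) *source_out = table[i].source;
            return table[i].lineinfo[s->pc_offset];
        }
    }
    return -1;
}
```

The point of the split is that the sampling side only does pointer arithmetic, so it stays cheap and does not depend on line information being present in the running code.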

The change to support our sampling profiler was to add a counter to the
global part of the Lua state which can be atomically incremented from
another thread (as long as the state is still alive). When that counter is
non-zero, the VM invokes a hook function which records a traceback and the
value of the counter, then clears the counter. This means the main overhead
is just a slightly more complex condition at the top of luaV_execute to see
if the hook function needs to be invoked (so zero if debugging is off entirely*
and low even when debugging is enabled).
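
The counter mechanism described above can be sketched roughly as follows. This is a toy dispatch loop, not Lua's luaV_execute, and every name in it (ToyState, toy_trigger, toy_execute, record_sample) is made up for illustration. It assumes C11 atomics are available; the real patch may use a different primitive.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

typedef struct ToyState ToyState;
typedef void (*SampleHook)(ToyState *S, unsigned pending, size_t pc);

/* Hypothetical stand-in for the global part of the Lua state. */
struct ToyState {
    atomic_uint sample_trigger;  /* bumped by another thread, drained by the VM */
    SampleHook  hook;            /* a real profiler would record a traceback */
    unsigned    samples_taken;   /* demo bookkeeping only */
    unsigned    ticks_seen;
    size_t      last_pc;
};

/* Safe to call from any thread while the state is still alive. */
void toy_trigger(ToyState *S) {
    atomic_fetch_add_explicit(&S->sample_trigger, 1, memory_order_relaxed);
}

/* Demo hook: remembers how many triggers were pending and where we were. */
static void record_sample(ToyState *S, unsigned pending, size_t pc) {
    S->samples_taken += 1;
    S->ticks_seen += pending;
    S->last_pc = pc;
}

void toy_init(ToyState *S) {
    assert(S != NULL);
    atomic_init(&S->sample_trigger, 0);
    S->hook = record_sample;
    S->samples_taken = 0;
    S->ticks_seen = 0;
    S->last_pc = 0;
}

/* Toy interpreter loop: one cheap load per opcode when nothing is pending. */
void toy_execute(ToyState *S, size_t first_pc, size_t n_opcodes) {
    for (size_t pc = first_pc; pc < first_pc + n_opcodes; pc++) {
        if (atomic_load_explicit(&S->sample_trigger, memory_order_relaxed) != 0) {
            /* Read and clear in one step so no trigger is lost or counted twice. */
            unsigned pending = atomic_exchange_explicit(
                &S->sample_trigger, 0, memory_order_relaxed);
            if (pending != 0 && S->hook != NULL) S->hook(S, pending, pc);
        }
        /* ... decode and execute the opcode at pc here ... */
    }
}
```

Passing the pending count to the hook means that if several triggers arrive before the interpreter gets to run an opcode, the single resulting sample can still be weighted accordingly.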

This essentially charges the elapsed time to the first Lua opcode that runs
after a trigger is sent. We often trigger at relatively low frequencies (10-40 Hz,
depending on what you're trying to detect), and while it isn't perfect, it's been
enormously helpful to have a profiler that doesn't distort execution performance.

It also means we can get near-instant execution pausing without needing the
count or line hook active (both of which slow down execution severely). Our
pause function triggers the state, and execution halts on the next opcode.

Hopefully that helps.

DT