Low Overhead Breakpoints (no hooks)

lua-l archive

Subject: Low Overhead Breakpoints (no hooks)
From: Dan Tull <dtull@...>
Date: Fri, 17 Sep 2010 09:38:45 -0700

I recently made modifications to Lua for use in the Lua code editing
and debugging environment used by the Lightroom team that I thought
might be of interest to this list.

We are using a modified Lua 5.1.2 at this point. If there is some
interest in the details, I could probably port the changes into a
patch against a stock 5.1.4 version of Lua.

The full description is below. Comments and questions welcome.

Dan Tull
Adobe Systems (Lightroom)


== Problem ==
Despite having a reasonable debugger for examining state when
exceptions were thrown (or a special function called "halt" was
called), we didn't have real "breakpoints" in our IDE.

The line hook method of implementing breakpoints imposes too much
overhead, and having to either leave the hook in place or install it
in each lua_State was troublesome.

Even the call/return hook has more overhead than we wanted, or
I'd have followed this helpful post's suggestions:
http://lua-users.org/lists/lua-l/2007-03/msg00016.html

== Solution ==
After spending some time analyzing the Lua VM implementation I chose
this approach and it worked quite well:

- Introduce a new opcode called HALT to implement breakpoints.
  It is added after all existing opcodes, so their numbering is
  unchanged and bytecode compatibility is preserved.

- Add top level APIs for setting/clearing halts at a particular
  instruction offset in a function.  The instruction previously at
  that offset is copied aside into a structure describing the halt.

- Add an array of these halt descriptors that hangs off of the Proto
  structure that holds the function's bytecode. An operand of the
  HALT instruction holds the array index of the halt's descriptor
  so it can be quickly looked up.

- In the main VM execution loop, HALT becomes a new case in the
  switch. After its callback completes, it rewrites the instruction
  and uses a goto to jump back to the top and resume execution of the
  "real" instruction.

- The halt's callback mechanism is modeled off of the other hooks
  (line, count, call/return) in order to keep it from screwing up the
  state necessary for the VM's proper resumption of execution.
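
To make the steps above concrete, here is a minimal sketch on a toy
stack VM rather than on the real Lua sources. Everything in it is an
assumption made for illustration: the three toy opcodes, the 6-bit
opcode / 26-bit operand encoding, the Halt and Proto layouts, and the
sethalt/clearhalt/execute signatures. It also takes one plausible
reading of "rewrites the instruction": the HALT case replaces the local
copy of the current instruction with the saved one and jumps back to
the dispatch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint32_t Instruction;

/* Toy encoding (assumed): opcode in the low 6 bits, one operand in the
   remaining 26 bits. OP_HALT is last, so the other opcodes keep their values. */
enum { OP_PUSH, OP_ADD, OP_RET, OP_HALT };
#define MK(op, arg) ((Instruction)(op) | ((Instruction)(arg) << 6))
#define OPCODE(i)   ((int)((i) & 0x3Fu))
#define ARG(i)      ((int)((i) >> 6))

typedef void (*HaltCallback)(int offset, void *ud);

typedef struct Halt {
  Instruction orig;       /* instruction displaced by the HALT */
  int offset;             /* position of the HALT in code[] */
  HaltCallback callback;  /* called each time the halt is hit */
  void *ud;
} Halt;

typedef struct Proto {
  Instruction *code;
  int sizecode;
  Halt *halts;            /* halt descriptors, indexed by the HALT operand */
  int sizehalts;
} Proto;

/* Put a HALT at `offset`; returns the descriptor index, or -1 on failure. */
int sethalt(Proto *p, int offset, HaltCallback cb, void *ud) {
  assert(offset >= 0 && offset < p->sizecode);
  if (OPCODE(p->code[offset]) == OP_HALT)
    return ARG(p->code[offset]);                 /* already set */
  Halt *nh = (Halt *)realloc(p->halts,
                             (size_t)(p->sizehalts + 1) * sizeof *nh);
  if (nh == NULL) return -1;
  p->halts = nh;
  int idx = p->sizehalts++;
  nh[idx].orig = p->code[offset];                /* copy the real one aside */
  nh[idx].offset = offset;
  nh[idx].callback = cb;
  nh[idx].ud = ud;
  p->code[offset] = MK(OP_HALT, idx);
  return idx;
}

/* Put the real instruction back. Descriptor slots are not reused here. */
void clearhalt(Proto *p, int offset) {
  assert(offset >= 0 && offset < p->sizecode);
  if (OPCODE(p->code[offset]) == OP_HALT)
    p->code[offset] = p->halts[ARG(p->code[offset])].orig;
}

/* Example callback: counts hits through the user-data pointer. */
void count_hit(int offset, void *ud) { (void)offset; ++*(int *)ud; }

int execute(const Proto *p) {
  int stack[16];
  int top = 0;
  const Instruction *pc = p->code;
  for (;;) {
    Instruction i = *pc++;
  reentry:
    switch (OPCODE(i)) {
      case OP_PUSH: stack[top++] = ARG(i); break;
      case OP_ADD:  top--; stack[top - 1] += stack[top]; break;
      case OP_RET:  return stack[top - 1];
      case OP_HALT: {
        const Halt *h = &p->halts[ARG(i)];
        if (h->callback) h->callback(h->offset, h->ud);
        i = h->orig;      /* swap in the real instruction ... */
        goto reentry;     /* ... and dispatch it without refetching */
      }
      default: return -1; /* unknown opcode */
    }
  }
}
```

With no halts set, the only cost in the loop is one extra case in the
switch, which is where the "essentially full speed" property comes from.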

== Advantages ==
- The VM runs at essentially full speed except when a halt is actually
  hit. This is a bit better than even optimized methods of using the
  line and call/return hooks.

- Strictly speaking, because the low level API speaks in terms of
  instruction offsets, it is actually possible to use this even with
  chunks that have been compiled and stripped of their line to
  instruction mapping information.

- No need to maintain call/return and line hooks in all the Lua
  states (more complicated even if it were as fast).

== Disadvantages ==
- You have to be able to locate a reference to the function in which
  you want to set the breakpoint. We actually made another mod
  to make this faster and easier but that's a separate change.

- It does expose bytecode level details to the immediate caller, but
  it was easy enough to add the mapping of line numbers to instruction
  offsets so this abstraction "leak" remains tightly isolated.
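
As a rough illustration of the line-to-offset mapping, the sketch below
works on a plain array with one source line per instruction, which is
the shape of Proto's lineinfo in stock Lua 5.1. The function name and
signature are made up for this example and are not taken from the
patch. It reports the start of every run of instructions attributed to
a line, on the assumption that a line can map to more than one
non-contiguous run.

```c
#include <assert.h>
#include <stddef.h>

/* Collects the offset that starts each run of instructions attributed
   to `line`. Writes at most max_out offsets into out[] and returns the
   total number of runs found (0 if none, or if lineinfo was stripped). */
int offsets_for_line(const int *lineinfo, int sizelineinfo, int line,
                     int *out, int max_out) {
  int count = 0;
  if (lineinfo == NULL) return 0;
  for (int pc = 0; pc < sizelineinfo; pc++) {
    int starts_run = lineinfo[pc] == line &&
                     (pc == 0 || lineinfo[pc - 1] != line);
    if (starts_run) {
      if (count < max_out) out[count] = pc;
      count++;
    }
  }
  return count;
}
```

A debugger front end could then call the halt-setting API once per
returned offset, so the breakpoint fires whichever run is reached first.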

== Gotchas == 
- Since there's no mapping of halt opcodes into the bytecode format,
  logic had to be added to luaU_dump to strip out the halts and put
  the "real" instructions back into the stream during serialization.

- A few instructions use the instructions that follow them as a sort of extended operand:
  * OP_CLOSURE's MOVE/GETUPVAL instructions
  * the JMP instruction following the instructions used for conditionals
  * OP_SETLIST uses the next instruction as an integer when C is 0

  These are not executed by the main VM switch statement and are thus never
  "hit", strictly speaking.  To avoid complicating their execution
  behavior, the logic for sethalt avoids altering these
  pseudo-instructions and places the halt instead on the main
  instruction for these sequences.

- There are a few other stray cases where assertions are made about
  the instructions in the bytecode stream. These had to be modified to
  look up and compare the "real" instruction or to tolerate HALT as
  appropriate. Note that I did not need to modify the bytecode
  validation logic because halt instructions are never serialized out.
  From what I understand, the validation logic got yanked in 5.2 anyway.
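
The first two gotchas can be sketched on a toy instruction set. This is
not the code from the patch: the opcode list, the encoding, and the
helper names (real_at, trailing_slots, halt_target, dump_code) are all
assumptions for illustration. In this toy model the operand of
OP_CLOSURE stands in for the upvalue count and the operand of
OP_SETLIST stands in for C, and only OP_EQ represents the conditionals.
The target lookup walks forward from offset 0 because a trailing slot
(such as the raw integer after SETLIST) cannot be decoded reliably on
its own.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t Instruction;
enum { OP_MOVE, OP_GETUPVAL, OP_EQ, OP_JMP, OP_SETLIST, OP_CLOSURE,
       OP_RETURN, OP_HALT };
#define MK(op, arg) ((Instruction)(op) | ((Instruction)(arg) << 6))
#define OPCODE(i)   ((int)((i) & 0x3Fu))
#define ARG(i)      ((int)((i) >> 6))

typedef struct Halt { Instruction orig; int offset; } Halt;

/* The "real" instruction at a main-instruction position, seeing
   through a HALT if one has been placed there. */
Instruction real_at(const Instruction *code, const Halt *halts, int pc) {
  Instruction i = code[pc];
  return OPCODE(i) == OP_HALT ? halts[ARG(i)].orig : i;
}

/* How many following slots act as extended operands of `i`. */
int trailing_slots(Instruction i) {
  switch (OPCODE(i)) {
    case OP_CLOSURE: return ARG(i);               /* one MOVE/GETUPVAL per upvalue */
    case OP_EQ:      return 1;                    /* the JMP that follows */
    case OP_SETLIST: return ARG(i) == 0 ? 1 : 0;  /* raw integer when C is 0 */
    default:         return 0;
  }
}

/* Maps a requested offset to the main instruction that owns it, so a
   halt never lands on a pseudo-instruction. Returns -1 if out of range. */
int halt_target(const Instruction *code, int n, const Halt *halts, int offset) {
  if (offset < 0) return -1;
  int pc = 0;
  while (pc < n) {
    int extra = trailing_slots(real_at(code, halts, pc));
    if (offset <= pc + extra) return pc;
    pc += 1 + extra;
  }
  return -1;
}

/* Serialization view: copy the code, then put every displaced
   instruction back. The live code[] keeps its HALTs. */
void dump_code(const Instruction *code, int n, const Halt *halts, int nhalts,
               Instruction *out) {
  for (int pc = 0; pc < n; pc++) out[pc] = code[pc];
  for (int k = 0; k < nhalts; k++) out[halts[k].offset] = halts[k].orig;
}
```

Restoring by iterating the descriptors, rather than by scanning the code
for HALT opcodes, sidesteps the case where a raw operand slot happens to
look like a HALT.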
