Thought I'd try something new and post to the mailing list looking for
existing solutions instead of writing one myself :)
I have two areas of interest here.
First, I am finding it really helpful to be able to profile at trace granularity. Traces seem to me like the natural unit of code optimization: try to have the right collection of them and make sure each one is internally sensible. Here is the patch from Mike that I am using for this:
Second, I am in the middle of hacking together a low-level interface to the x86 PMU (Performance Monitoring Unit), using dynasm to access the RDPMC instruction. This makes it possible to track fine-grained CPU performance events over arbitrary bits of Lua code.
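As a rough illustration of the idea (a sketch only, not the actual interface), measuring a bit of Lua code comes down to reading a counter before and after it runs and taking the difference. Everything named here is hypothetical: read_counter stands in for a dynasm-generated function that executes RDPMC, measure is a made-up helper, and the 48-bit counter width is an assumption that should be checked via CPUID on real hardware. A fake counter is used so the sketch runs without PMU access.

```lua
-- Hypothetical sketch: count events across a call to a Lua function.
-- read_counter is a placeholder for a dynasm-generated function that
-- executes RDPMC and returns the raw counter value (not a real API).
local COUNTER_WIDTH = 2^48  -- assumed counter width; verify with CPUID

local function measure(read_counter, f, ...)
   local before = read_counter()
   f(...)
   local after = read_counter()
   -- Modular subtraction keeps the delta correct if the counter
   -- wrapped around once between the two reads.
   return (after - before) % COUNTER_WIDTH
end

-- Stand-in counter for demonstration: advances by 10 on every read.
local fake_value = 0
local function fake_counter()
   fake_value = fake_value + 10
   return fake_value
end

local delta = measure(fake_counter, function() end)
print(delta)
```

Passing the counter reader in as an argument keeps the measuring logic independent of how the counter is actually read, so the same helper could wrap different events.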
I am racing to finish this before our second child is born... any day now :)
Output looks like this:
selftest: pmu
328 counters found for CPU model GenuineIntel-6-3F
I would quite like to add this to LuaJIT directly. The obstacle is that I depend on Cosmin's extended dynasm, which can be used from Lua code. I am considering bringing that into our LuaJIT branch, perhaps as a submodule, to replace the built-in dynasm.
This is me working my way up to our real application from the humble starting point: