lua-users home
lua-l archive

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]


Hi all,

I have some new very interesting results. I've implemented a pure Lua
adaptive integration routine for numerical integration, the QAG
routine.

I've made an adaptation of the same routine implemented in the GSL
library which is based itself of the quadpack implementation. The
translation in Lua was quite straightforward. I've used also the
template engine but in this case it was not really essential.

For information, I've made also an adaptation of the non-adaptive
Gauss-Kronrod integration routine but for this latter I don't have any
benchmark.

Here the result of the benchmark for the QAG adaptive routine:

C code / GSL (-O2 -mfpmath=sse -march=native) : 0m0.891s
LuaJIT2 (git HEAD): 0m0.438

The results obtained with LuaJIT2 are also completely accurate.

So, once again LuaJIT2 confirms à factor 2x or even slightly more over
a conventional optimised C implementation. For the other side on my
Linux box the results are a little bit different (same benchmark as
before but with 8 times more points):

C code / GSL (-O2 -mfpmath=sse -march=native) : 0m1.281s
LuaJIT2 (git HEAD): 0m1.310

So on Linux the C optimized code is performing slightly better. May be
it is because the gsl library have better optimization or may be
because the GCC with ubuntu is better ?

In any case I consider the results completely satisfying because in
the worst case LuaJIT2 is performing almost at well as C optimized
code. It is also true that for the QAG algorithm I was not able to do
any significant optimization over the C implementations like I've done
with the ODE RKF45 algorithm.

Note also that in this case the routine was performing well since the
beginning. This is probably because the routine is simpler then the
ODE one in array form. I've also respected Mike's guidelines to help
the JIT compiler to produce optimal code.

I'm very happy of this result as this confirm yet again that LuaJIT2
is a viable and performant alternative for numerical computations.

I include the files in attachement for the curious, they work with a
plain LuaJIT2. The implementation files for the QAG algorithm are
qag.lua.in, err.lua.in and gauss-kronrod-x-wgs.lua.in . Then I include
also an updated version of the template module and the benchmark I've
used in Lua and C.

I will take this opportunity to talk a little bit about GSL shell, if
someone is interested. The latest fixes of LuaJIT2 are already
included in GSL shell. With GSL Shell you have also the realine
support plus some other goodies like a graphical systems, for example
;)
There is also another major improvements in GSL Shell because now it
is 100% compatible with Lua 5.1. I've managed to get rid of the LNUM
patch so now it can works both with standard Lua or with LuaJIT2. I
don't rely anymore in the axact behaviour of the GC for the weak
tables so I don't need any special patch or workaround. I've also
enabled a more clean modular architecture, now you can choose
LUA_BUILD in the makeconfig file and every functions will go in a
specific namespace without polluting the global namespace. I have a
namespace for GSL function, one for complex number and another one for
graphical functions. If you choose the LUA_BUILD option the 'short
function syntax' is not enabled so that Lua purists are happy :-)
For the complex number I've a userdata implementation based on LHF's
implementation (heavily modified). Complex number specific function
goes is a specific namespace to not pollute the global namespace.

In addition I've made some important improvements is the arithmetic of
matrix and complex numbers. Now the '*' operator perform the matrix
multiplication, instead of the elemet-wise multiplication. I think
that this latter behaviour is much more useful. The matrices are now
automatically promoted to complex when needed and the matrix functions
handle transparently both real and complex matrix. So now you can also
mix freely real and complex numbers with real and complex matrix and
GSL shell will do 'the right thing (R)' for you :-)

I will not probably release a new version of GSL Shell very soon
because there is a lot of work in progress with LuaJIT2 but if you
checkout the 'luajit2' branch it is already there and fully
functional. In the 'rk4' branch I have also all the numerical
algorithm I'm implementing right now.

Francesco
--
-- A Lua preprocessor for template code specialization.
-- Adapted by Steve Donovan, based on original code of Rici Lake.
--

local M = {}

-------------------------------------------------------------------------------
local function preprocess(chunk, name, defs)

   local function parseDollarParen(pieces, chunk, s, e)
      local append, format = table.insert, string.format
      local s = 1
      for term, executed, e in chunk:gmatch("()$(%b())()") do
	 append(pieces,
		format("%q..(%s or '')..", chunk:sub(s, term - 1), executed))
	 s = e
      end
      append(pieces, format("%q", chunk:sub(s)))
   end

   local function parseHashLines(chunk)
      local append = table.insert
      local pieces, s, args = chunk:find("^\n*#ARGS%s*(%b())[ \t]*\n")
      if not args or find(args, "^%(%s*%)$") then
	 pieces, s = {"return function(_put) ", n = 1}, s or 1
      else
	 pieces = {"return function(_put, ", args:sub(2), n = 2}
      end
      while true do
	 local ss, e, lua = chunk:find("^#+([^\n]*\n?)", s)
	 if not e then
	    ss, e, lua = chunk:find("\n#+([^\n]*\n?)", s)
	    append(pieces, "_put(")
	    parseDollarParen(pieces, chunk:sub(s, ss))
	    append(pieces, ")")
	    if not e then break end
	 end
	 append(pieces, lua)
	 s = e + 1
      end
      append(pieces, " end")
      return table.concat(pieces)
   end

   local ppenv

   if defs._self then
      ppenv = defs._self
   else
      ppenv = {string= string, table= table, template= M}
      for k, v in pairs(defs) do ppenv[k] = v end
      ppenv._self = ppenv
      local include = function(filename)
			 return M.process(filename, ppenv)
		      end
      setfenv(include, ppenv)
      ppenv.include = include
   end

   local code = parseHashLines(chunk)
   local fcode = loadstring(code, name)
   if fcode then
      setfenv(fcode, ppenv)
      return fcode()
   end
end

local function read_file(filename)
   local f = io.open(filename)
   if not f then 
      error(string.format('error opening template file %s', filename))
   end
   local content = f:read('*a')
   f:close()
   return content
end

local function process(filename, defs)
   local template = read_file(filename)
   local codegen = preprocess(template, 'template_gen', defs)
   local code = {}
   local add = function(s) code[#code+1] = s end
   codegen(add)
   return table.concat(code)
end

local function require(filename)
   local f = loadstring(process(filename .. '.lua.in', {}), filename)
   if not f then error(string.format('error loading module %s', filename)) end
   return f()
end

local function load(filename, defs)
   local f = loadstring(process(filename, defs), filename)
   if not f then error(string.format('error loading module %s', filename)) end
   return f()
end

M.process = process
M.require = require
M.load    = load

return M
local template = require 'template'
local q = template.load('num/qag.lua.in', {limit=1000, order=21})

local sin, cos, pi = math.sin, math.cos, math.pi
local epsabs, epsrel = 1e-6, 0.01

function bessel_gen(n)
   local xs
   local fint = function(t) return cos(n*t - xs*sin(t)) end
   return function(x)
	     xs = x
	     return q(fint, 0, pi, epsabs, epsrel)
	  end
end

local J12 = bessel_gen(12)

-- p = graph.fxplot(J12, 0, 30*pi)

local xold, xsmp = -100, 10
for k = 1, 4096*8 do
   local x = (k-1) * 30 * math.pi / (4096*8)
   local y = J12(x);
   if x - xold > xsmp then
      print(x, y)
      xold = x
   end
end
#include <stdio.h>
#include <math.h>
#include <gsl/gsl_integration.h>

struct bessel_param {
  double x;
  int n;
};

double f (double t, void * params) {
  struct bessel_param *p = (struct bessel_param *) params;
  double x = p->x;
  int n = p->n;
  return cos(n * t - x * sin(t));
}

int
main (void)
{
  gsl_integration_workspace * ws = gsl_integration_workspace_alloc (1000);

  struct bessel_param param = {0.0, 12};
  double result, error;
  double xold = -100, xsmp = 10;
  int k;

  gsl_function F;
  F.function = &f;
  F.params = &param;

  for (k = 0; k < 4096 * 8; k++)
    {
      param.x = (k * 30 * M_PI) / (4096 * 8);

      gsl_integration_qag (&F, 0, M_PI, 1e-6, 1e-4, 1000, GSL_INTEG_GAUSS21, ws, &result, &error);

      if (param.x - xold > xsmp)
	{
	  xold = param.x;
	  printf ("%.18f %.18f\n", param.x, result);
	}
    }

  gsl_integration_workspace_free (ws);

  return 0;
}

Attachment: qag.lua.in
Description: Binary data

Attachment: err.lua.in
Description: Binary data

Attachment: gauss-kronrod-x-wgs.lua.in
Description: Binary data