lua-l archive


It was thus said that the Great Tim Hill once stated:
> > On Nov 16, 2020, at 5:26 AM, Luiz Henrique de Figueiredo <> wrote:
> > 
> >> You know the first three rules of optimisation, right?  They are, in order:
> >> 
> >> 1. Measure
> >> 2. Measure
> >> 3. Measure
> > 
> > Also:
> > 1. Don't.
> > 2. Don't... yet.
> > 3. Profile before optimizing
> > 
> > See for instance
> >
> I’ll be honest .. I REALLY dislike this “premature optimization” aphorism.
> It (and rules like it) are often quoted as if they are laws of physics,
> when in fact they are at best vague generalizations.
> First, lua is already a *massively* optimized language, which is one of
> its attractions. The leanness of the language, VM, and libraries is a
> testament to that. So saying you should not optimize Lua seems dubious
> (I’m not defending the OPs changes, I’m talking in general here).
> Second, the idea that optimization is something that can be “bolted on
> later” when something is found to be sub-performant applies only to
> certain very limited cases, mostly related to local implementation
> details. It certainly does NOT apply to, for example, architectural
> design, when “optimization” can often mean “start over”.  Should we use a
> bubble sort, and only later switch to (say) quicksort when we have
> carefully “proved” that it is too slow? Or write a video Codec in Perl,
> and then act surprised when it cannot process video data in real time?

  The advice isn't saying "be stupid when writing code."  By all means, use
C when writing a codec, or use the language- or standard-library-supplied
sort routines (which should be "good enough" for most cases).

  The advice to profile the code ("measure, measure, measure") exists
because the bottlenecks in performance may not be where we think they are. 
I wrote a Lua-based program at work.  It was put into production five years
ago.  The code was written in a straightforward way with no thought to
optimization---correctness was the overriding goal at the time [1].

  It was only last year that performance became a bit of a concern---not
enough to panic over, but enough that maybe we should look into it.  I did,
and I was completely surprised at the results:

  I did *NOT* expect the following bit of code to be the hotspot:

	local n = tonumber(capture,16)

  It turned out I didn't need to convert the number, and that hot spot went
away [2].  Had I tried to "optimize" the code without measuring, I would
have wasted my time.  Had I tried optimizing the code five years ago, I
would again probably have wasted my time, as we went five years before it
even hit our radar.

  Right now, it's the LPEG code that is the hot spot, which isn't
surprising.  It's not a real concern yet, but now I have the time to
investigate some other approaches.

> What SHOULD be explained is that the choice of when to optimize (if ever)
> should be based on the *cost* of that optimization effort and the
> *probability* that it will be needed. 

  If the users of the software are complaining about performance, then you
optimize by first profiling the program, identifying the hot spots, and
working out how to speed that code up.  The best speedups come from
algorithmic changes (a simple example is replacing a linear search with a
binary search, or Quicksort with Mergesort because the input data creates a
pathological case for Quicksort).  Less so with so-called
micro-optimizations, like replacing strcmp() with strncmp() or a divide
with a reciprocal multiply.
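
  To make the linear-vs-binary point concrete, a sketch (sorted input
assumed; this is textbook code, not from any program mentioned here):

	-- Both return the index of x in the sorted array t, or nil.
	-- linear_search inspects O(n) elements; binary_search O(log n).
	local function linear_search(t, x)
	  for i = 1, #t do
	    if t[i] == x then return i end
	  end
	  return nil
	end

	local function binary_search(t, x)
	  local lo, hi = 1, #t
	  while lo <= hi do
	    local mid = (lo + hi) // 2   -- integer division (Lua 5.3+)
	    if t[mid] == x then
	      return mid
	    elseif t[mid] < x then
	      lo = mid + 1
	    else
	      hi = mid - 1
	    end
	  end
	  return nil
	end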

> Changing fundamental architectural
> design is massively expensive, as is re-writing a complex software system
> in a new language. That is why these decisions *must* be optimized
> up-front .. and quoting “we can optimize that later” is, in these cases,
> nonsense.

  Again, not always.

  Another example (one I have yet to write about): I wrote an HTML parser
in LPEG, but it bloated the memory footprint of the programs that use it to
eight times the original size.  I found a PEG compiler for C, and using the
existing LPEG code as a base, I was able to convert the HTML parser to C. 
Not only did it decrease the memory usage (to maybe double the original
size, not the eight times of the Lua version) but it ran in about 1/10 the
time as well.

  It wasn't a drop-in replacement, but it wasn't a drastic change either:

	dom = html:match(doc)

became

	dom = html(doc)

  But again, before I went to the trouble, I double-checked to make sure it
was the LPEG HTML parser that was causing the bloat and not something else
(it was).  You don't have to rewrite the *entire* thing in a different
language if you can replace just the offending bit.

> The very phrase “premature optimization” in fact gives a lie to itself.
> What do we mean by “premature”? If you mean “when you find the program
> does not perform as expected” then (as I have explained) you are going to
> be out of luck in most cases.

  Not in my experience.  The only time I failed to optimize a program
properly (dropping the run time from a year to a day) was because I didn't
fully understand floating-point arithmetic.  In my defense, it was in
college, and the foibles of floating point just weren't a thing taught in
the Comp Sci department at the time (they *were* taught in the Math
department though).  It was only 25 years later that I realized my mistake.
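
  The classic example of the sort of foible in question:

	-- 0.1 and 0.2 have no exact binary representation, so their
	-- sum is not exactly 0.3.
	print(0.1 + 0.2 == 0.3)                   --> false
	print(string.format("%.17g", 0.1 + 0.2))  --> 0.30000000000000004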


[1]	The actual goal as I was writing it was a "proof-of-concept" using
	Lua and LPEG to do the heavy lifting and get a feel for the problem
	I was trying to solve, and then maybe do a reimplementation in C for
	speed, as it did involve processing SIP messages for a major
	cellular carrier in the US.  My manager at the time put the Lua+LPEG
	version into production, and I didn't find out until a few months
	later.
	Go figure.

[2]	There were other hot spots after that, but that one was the worst
	offender, and completely surprising to me.