That was answered in a following message. Yes, this is a limitation, and for now the only workaround is to use a boolean flag variable to track the conditions, then retest that variable in subsequent `if`s to perform the common actions:
> local do_call = false
> if op == OP_TAILCALL then
>   ...
>   do_call = true
> elseif op == OP_CALL then
>   do_call = true
> end
> if do_call then
>   ...
> end
There's no such problem in C/C++, even if you jump into the scope of a local variable (which may then be uninitialized: C/C++ preallocates all local variables at function entry without initializing them, though some security-hardened C/C++ compilers zero them all before actual initialization). Within the same function, arbitrary gotos may leave a variable at its previous value, skipping the statement where it is declared and would have been initialized. This is sometimes used, however, to create optimized parsers, scanners, and finite-state automata, but programmers then have to follow the logic of variable states more scrupulously.
Tracking variable scopes inside a function that allows breaks, arbitrary gotos, or exception catching is tricky for compilers, especially in C++ with accessors and implicit constructors. In Lua this is simpler to track, but the syntax has its limitations.
The general solution using explicit boolean flag variables, however, is guaranteed to work. It has a small performance penalty, which is not a problem for programs running on modern PCs, but may be a problem for small devices with limited memory (small stacks), few registers, and limited code size, running at low frequency: adding extra variables adds a cost.
However, even the smallest devices today have CPUs that run near the gigahertz range and embed a reasonable local memory cache on chip. Compilers (which should do their work on a decent machine) can still find tricks to track all variable usages and value domains. Unfortunately, the Lua language has a dynamic type system (at the public programming API), so compilers have to generate an alternate typing system. This is already done in JavaScript: the dynamic datatypes are converted using interface signatures built by the compiler, and at run time these interfaces become static classes that can be optimized according to each value domain, so that the compiler can determine when it really needs to assign variables with complete values, or can just use the few bits needed for correct execution. Local flag variables will then have little or no cost, and may even be left completely unallocated when their only use is to determine not a data flow but a code flow with conditional jumps (unconditional jumps will be dropped as much as possible by a compiler that then attempts to implement the "fallthrough" strategy itself).
----
Compilers (as well as CPUs, in the internal schedulers that translate ISA instructions into micro-ops across multiple pipelined threads) make a lot of effort to determine data dependencies. What was costly in old generations of CPUs matters less today, except where it can impact security, notably branch prediction and speculative execution, due to possible timing-based attacks. This has seriously complicated the design of new processors, as it becomes much harder for them to track the data flow, and some new instructions like "barriers" are being added and must be used by compilers to avoid these issues, at some performance cost. Data flows and code flows are no longer considered fully independent, and they are extremely complex to track in modern CPUs that have many parallel flows, schedulers, dynamic register stacks, multiple layers of caches, and complex management of internal clocks and internal buses with variable frequency rates, plus recovery procedures for random conditions. This will become even harder in the future when processors start using very fast but probabilistic algorithms instead of imperative instructions, as we have now reached the physical limits where exact binary processing is possible. Tomorrow's processors will work with possible "errors" and will include much statistical and predictive behavior with delayed corrections, and the ability to continue working even if some parts of the chip are wearing out or overheating: data will constantly and dynamically be reflowed. Step-by-step execution driven by clocks will no longer be the only solution, and single data paths will be harder to isolate from each other and will have to manage mutual side effects.
For now, there is not a single programming language (not even Lua, which is still based on the Turing machine model) that is really ready and sufficiently theorized to work with future processors. And binary processing using bits is not near the end of the road for faster processing: we'll start using probabilistic qubits, and we already have processing (for networking) that uses more complex coding (see for example DSL and 5G codecs, as well as audio/video encoding, or "machine learning" algorithms). For newer architectures, imperative binary processing with exact clocks, barriers, isolated buses, and imperative error or exception handling will be insufficient, and software will need a language that allows backtracking and feedback for correction until a stability condition is reached. Microbenchmarking will then become nonsense, when the goal will just be to get more global performance. Algorithms will need to be self-adaptive (it won't matter then if internal microstates are wrong when the expected response is more global).
Some interesting results are however being observed today with "machine learning" and "big data", as well as in 3D video acceleration for gaming and in large-scale simulations (fluid dynamics, meteorology, behavior analysis, fundamental physics...). The algorithms they use are no longer based on discrete variables; they use large matrices of statistics that are "computed" with massive parallelism, allowing "errors" or instability at the internal micro-levels, and strict data locality is simply removed, which also enables interesting features like scalability, resilience, and recovery from temporary or permanent "defects". Even the "defects" contain interesting data that can be used. Newer algorithms will use better strategies than basic arithmetic; they will work with other numeric models, with things like Fourier transforms, stochastic equations, statistical sampling and estimators, data feedback, error correction, and resonance attenuation for maintaining stability and reliability. These algorithms will work not just on bits but on fractions of bits, and we'll even see non-numeric solutions using analog processing for even better precision. Binary clocks will also be eliminated in many parts (they are the major cause of heat dissipation at frequency rates above 1 GHz and cause faster wearing of components in chips).