Note: math.floor() is not required to return a native binary integer. It may return a variant type that uses integers for some ranges and doubles outside them. So even in this case, type inference would not help eliminate the necessary tests on the effective type of the variant... The compiler also has to know the value range.
But your existing function fromjulianday(jd) does not check the value range of d, so it's not possible to infer a value range for math.floor()'s result and reduce the code to one using only native integers.
What your code can do, however, is favor the integer case by making it the predicted path, so that most branches are avoided in that case. The CPU will itself make its own branch prediction anyway, and possibly adapt using its branch-prediction cache.
What is much more useful is to know how instructions are scheduled and pipelined: increase the number of distinct instructions that can handle data in parallel, according to the number of native registers you have, and avoid scheduling two costly execution units (like FPUs) in parallel.
Instruction scheduling requires a correct understanding of how CPU pipelines work, and of how the central pipeline stage (i.e. execution in an ALU/FPU/VPU port, or a memory access) can avoid competing for the same cycles: this also mitigates the timing attacks based on dynamic branch prediction and speculative execution.
A good optimizer can also know equivalent patterns of instructions that avoid branch prediction. For example, it's perfectly possible to implement the function abs(x) without using any conditional branch, by computing the two alternative results in parallel and combining them with a binary OR: this uses two parallel pipelines and a final "rendez-vous", where some other independent instructions are scheduled just before computing the binary OR. It is difficult to find such other instructions in an isolated abs(x) function, but not if the function is inlined within more complex expressions. Detecting such patterns requires a good database of equivalent binary expressions (e.g. optimizing multiplications into shifts, optimizing shifts into additions, and avoiding patterns like "store Register[n] to [x]; read Register[m] from [x]" by replacing them with "store Register[n] to [x]; move Register[n] to Register[m]", where the two instructions can run in parallel...). Such rewriting requires analyzing the datapaths to see which virtual intermediate results are reused later and which are not, so that you can reduce the number of registers needed. This will then largely reduce the footprint on the data caches (from the stack) and on the actual native registers.
Note that the compiler must still be semantically correct (notably in handling exceptions/errors and pcall() correctly).
As well, a compiler can auto-detect common subexpressions like "(a+b)*(c-(a+b))". But this is not easy at all if some elements are actually function calls, as in "Rand()+Rand()": not only are the two calls to Rand() supposed to return different numbers, but the function must also actually be called twice (the number of calls may be significant if the function uses variables kept across calls in its closure).