lua-l archive


On Mar 23, 2015, at 8:21 PM, Pierre-Yves Gérardy <pygy79@gmail.com> wrote:

> Sibling calls are a subset of tail calls that can be implemented on
> top of the C calling convention under the following conditions:
> 
> * Caller and callee have the same calling convention. It can be either
> c or fastcc.
> * The call is a tail call - in tail position (ret immediately follows
> call and ret uses value of call or is void).
> * Caller and callee have matching return type or the callee result is not used.
> * If any of the callee arguments are being passed in stack, they must
> be available in caller’s own incoming argument stack and the frame
> offsets must be the same.

For the sake of the archives, let me see if I understand one specific case: 

LLVM can(?) optimize a call in tail position when the callee has the same (non-vararg) signature as the caller.
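
To make that case concrete, here is a minimal sketch in C. The names is_even, is_odd, and check_parity are made up for illustration. Both functions share one non-vararg signature, and each call sits in tail position with its result returned unchanged, so it looks like a candidate for the sibling-call treatment. Whether a given compiler actually emits a jump depends on the optimization level and target; the code only shows the shape.

```c
#include <assert.h>

/* Hypothetical pair of functions with identical, non-vararg signatures. */
static int is_odd(unsigned n);

static int is_even(unsigned n)
{
    if (n == 0)
        return 1;
    /* Tail position: the callee's result is returned unchanged. */
    return is_odd(n - 1);
}

static int is_odd(unsigned n)
{
    if (n == 0)
        return 0;
    /* Same signature as the caller, so the argument layout matches. */
    return is_even(n - 1);
}

/* Small self-check; inputs are kept small so it is also safe when the
   calls are compiled as ordinary (stack-consuming) calls. */
static void check_parity(void)
{
    assert(is_even(10));
    assert(!is_odd(10));
    assert(is_odd(7));
}
```

If the calls do become jumps, the stack stays flat no matter how large n is. Without the optimization, depth grows linearly with n.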

> If I understand properly, the last point is quite restrictive.


I am uncertain whether the documentation requires stack-passed arguments to be unmodified, or only that the argument slots match up. If the arguments must be unmodified, it is quite restrictive indeed--only useful for shims and forwarding.
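
A sketch of the two readings, in C, with made-up names (sum7, forward7, bump7, check_forwarding). It assumes seven integer arguments, which is enough that at least one is passed on the stack on non-Windows amd64, and all of them are on i386 with the plain C convention. forward7 is the shim/forwarding case where every argument passes through untouched. bump7 keeps the same slots but changes the last argument before the call, which is the case I can't settle from the documentation.

```c
#include <assert.h>

/* Hypothetical callee with seven integer arguments. */
static long sum7(long a, long b, long c, long d, long e, long f, long g)
{
    return a + b + c + d + e + f + g;
}

/* Pure forwarding shim: same signature, every argument unmodified. */
static long forward7(long a, long b, long c, long d, long e, long f, long g)
{
    return sum7(a, b, c, d, e, f, g);
}

/* Same signature and same slots, but the last argument is modified
   before the tail call. */
static long bump7(long a, long b, long c, long d, long e, long f, long g)
{
    return sum7(a, b, c, d, e, f, g + 1);
}

/* Small self-check of the arithmetic only; it says nothing about which
   calls the compiler turns into jumps. */
static void check_forwarding(void)
{
    assert(forward7(1, 2, 3, 4, 5, 6, 7) == 28);
    assert(bump7(1, 2, 3, 4, 5, 6, 7) == 29);
}
```

Comparing the generated assembly for forward7 and bump7 (for example with the compiler's -S output) would be one way to see how a particular compiler and target treat the second case.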

Architectures other than i386 seem less likely to run into that stack restriction. Traditional (o32) MIPS, ARM, and Windows amd64 have 4 integer argument registers; non-Windows amd64 has 6; and n32/n64/EABI MIPS have 8. (Passing and returning structs by value may cause trouble; some ABIs pass pointers instead. :-/ ABIs with hardware floating point allocate floating-point arguments from a separate register pool, but one soft-float double eats two 32-bit registers.)
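
As rough arithmetic, the number of stack-passed arguments is whatever is left after the argument registers are used up. The sketch below is in C with made-up names (the REGS_* constants and stack_passed). It assumes plain word-sized integer arguments only, so it ignores the struct, floating-point, and soft-float cases mentioned above, and it takes i386 with the plain C convention as having no argument registers.

```c
#include <assert.h>

/* Integer argument register counts; a simplification, integer args only. */
enum {
    REGS_I386_CDECL = 0,  /* assumption: everything goes on the stack */
    REGS_O32_MIPS   = 4,
    REGS_ARM        = 4,
    REGS_WIN_AMD64  = 4,
    REGS_SYSV_AMD64 = 6,  /* non-Windows amd64 */
    REGS_N64_MIPS   = 8
};

/* Number of word-sized integer arguments that spill to the stack. */
static int stack_passed(int nargs, int nregs)
{
    assert(nargs >= 0 && nregs >= 0);
    return nargs > nregs ? nargs - nregs : 0;
}
```

For a three-argument function, stack_passed(3, REGS_I386_CDECL) is 3 while stack_passed(3, REGS_ARM) is 0, which is why the stack restriction bites so much sooner on i386.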

> BTW, when I say "you can't do that" regarding "you still clean up the
> stack frame, push on the return values, and jump", I meant that you
> can't do it manually with LLVM. It does it for you if you ask nicely
> :-).

Oh, I'm sorry--I misunderstood what you were saying. Yes, that makes sense.

Jay

> 
> 
> —Pierre-Yves
> 
> 
> On Sun, Mar 22, 2015 at 7:34 PM, Jay Carlson <nop@nop.com> wrote:
>> Well, clang 3.4.1 on Ubuntu trusty targeting amd64 seems to generate the tail call/sibling call/whatever-they-call-it.
>> 
>> My guess is that the Linux amd64 ABI is always "fastcall" from the point of view of that documentation.
>> 
>> Jay
>> 
>> On Mar 21, 2015, at 6:03 PM, Pierre-Yves Gérardy <pygy79@gmail.com> wrote:
>> 
>>> The trouble is that you can't do that with LLVM.
>>> 
>>> It does support TCO, though, at the expense of easy C interop (you
>>> have to use incompatible calling conventions which breaks the ABI). As
>>> long as you don't want to mix C and Lua/Ravi code, FFI-style, you
>>> should be fine.
>>> 
>>> http://llvm.org/docs/CodeGenerator.html#tail-call-optimization
>>> http://llvm.org/docs/LangRef.html#calling-conventions
>>> —Pierre-Yves
>>> 
>>> 
>>> On Sat, Mar 21, 2015 at 4:23 PM, Coda Highland <chighland@gmail.com> wrote:
>>>> On Fri, Mar 20, 2015 at 3:25 PM, Dibyendu Majumdar
>>>> <mobile@majumdar.org.uk> wrote:
>>>>> On 20 March 2015 at 22:16, Doug Currie <doug.currie@gmail.com> wrote:
>>>>>>> I haven't yet figured out how to properly implement OP_TAILCALL in a
>>>>>>> JITed function.
>>>>>> 
>>>>>> 
>>>>>> The usual approach is to emit code to pop the stack frame and convert the
>>>>>> tail call to a jump.
>>>>>> 
>>>>> 
>>>>> In the recursive case, yes, but Lua also uses tail calls for
>>>>> non-recursive scenarios. For these a different function may be called
>>>>> so it is not possible to handle this in a JITed function - without
>>>>> replacing the function being executed as well.
>>>>> 
>>>>> Regards
>>>>> 
>>>> 
>>>> The technique still applies to the non-recursive case -- you still
>>>> clean up the stack frame, push on the return values, and jump.
>>>> 
>>>> /s/ Adam
>>>> 
>>> 
>> 
>> 
>