- Subject: Re: OP_TAILCALL versus OP_CALL question
- From: Jay Carlson <nop@...>
- Date: Tue, 24 Mar 2015 23:01:25 -0400
On Mar 23, 2015, at 8:21 PM, Pierre-Yves Gérardy <pygy79@gmail.com> wrote:
> Sibling calls are a subset of tail calls that can be implemented on
> top of the C calling convention under the following conditions:
>
> * Caller and callee have the same calling convention. It can be either
> c or fastcc.
> * The call is a tail call - in tail position (ret immediately follows
> call and ret uses value of call or is void).
> * Caller and callee have matching return type or the callee result is not used.
> * If any of the callee arguments are being passed in stack, they must
> be available in caller’s own incoming argument stack and the frame
> offsets must be the same.
For the sake of the archives, let me see if I understand one specific case:
LLVM can(?) optimize a tail-position call when the callee has the same (non-vararg) signature as the caller, since the stack argument slots then line up.
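A minimal sketch of that case, for the archives. The functions `is_even` and `is_odd` are hypothetical names of mine, not anything from the LLVM documentation. Each calls the other in tail position with an identical signature, so an optimizing build may turn each call into a jump. Nothing in the C standard guarantees this, and an unoptimized build will typically emit ordinary calls.

```c
#include <stdbool.h>

/* Hypothetical pair of mutually recursive functions with identical,
   non-vararg signatures. Each call below is in tail position: the
   caller returns the callee's result and does nothing else. */
bool is_odd(unsigned n);

bool is_even(unsigned n)
{
    if (n == 0)
        return true;
    return is_odd(n - 1);   /* tail call to a different function */
}

bool is_odd(unsigned n)
{
    if (n == 0)
        return false;
    return is_even(n - 1);  /* tail call back to the first function */
}
```

Comparing the assembly from `clang -O2 -S` against `clang -O0 -S` is one way to check whether a given toolchain emits a `jmp` instead of a `call` here.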
> If I understand properly, the last point is quite restrictive.
I am uncertain whether the documentation requires stack-passed arguments to be unmodified, or just that the argument slots match up. If the arguments must be unmodified, it is quite restrictive indeed--only useful for shims and forwarding.
Architectures other than i386 should hit that stack restriction less often, since they pass the first several arguments in registers. Traditional (o32) MIPS, ARM, and Windows amd64 have 4 integer parameter registers, non-Windows amd64 has 6, and n32/n64/eabi MIPS have 8. (Passing and returning structs by value may cause trouble; some ABIs pass pointers instead. :-/ ABIs with hardware floating point allocate floating-point arguments from a separate register pool, but under soft float one double eats two 32-bit registers.)
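To make the two readings of the stack-argument restriction concrete, here is a sketch that assumes non-Windows amd64, where the seventh and eighth integer arguments are passed on the stack. All three function names are hypothetical. The shim forwards every argument untouched, which qualifies under either reading. The second caller changes the stack-passed arguments, and I do not know whether the documentation allows that.

```c
/* Hypothetical callee. On non-Windows amd64 the first six integer
   arguments travel in registers, so g and h are passed on the stack. */
long sum8(long a, long b, long c, long d, long e, long f, long g, long h)
{
    return a + b + c + d + e + f + g + h;
}

/* Forwarding shim: same signature, every argument passed through
   untouched, so g and h already sit in the slots the callee expects. */
long sum8_shim(long a, long b, long c, long d, long e, long f, long g, long h)
{
    return sum8(a, b, c, d, e, f, g, h);
}

/* Same signature and the same slot layout, but the stack-passed
   arguments are modified before the call. Whether this still counts
   as a sibling call is the open question. */
long sum8_swapped(long a, long b, long c, long d, long e, long f, long g, long h)
{
    return sum8(a, b, c, d, e, f, h, g + 1);
}
```

On i386 with the default C calling convention every argument is on the stack, so even a two-argument function raises the same question.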
> BTW, when I say "you can't do that" regarding "you still clean up the
> stack frame, push on the return values, and jump", I meant that you
> can't do it manually with LLVM. It does it for you if you ask nicely
> :-).
Oh, I'm sorry--I misunderstood what you were saying. Yes, that makes sense.
Jay
>
>
> —Pierre-Yves
>
>
> On Sun, Mar 22, 2015 at 7:34 PM, Jay Carlson <nop@nop.com> wrote:
>> Well, clang 3.4.1 on Ubuntu trusty targeting amd64 seems to generate the tail call/sibling call/whatever-they-call-it.
>>
>> My guess is that the Linux amd64 ABI is always "fastcall" from the point of view of that documentation.
>>
>> Jay
>>
>> On Mar 21, 2015, at 6:03 PM, Pierre-Yves Gérardy <pygy79@gmail.com> wrote:
>>
>>> The trouble is that you can't do that with LLVM.
>>>
>>> It does support TCO, though, at the expense of easy C interop (you
>>> have to use incompatible calling conventions which breaks the ABI). As
>>> long as you don't want to mix C and Lua/Ravi code, FFI-style, you
>>> should be fine.
>>>
>>> http://llvm.org/docs/CodeGenerator.html#tail-call-optimization
>>> http://llvm.org/docs/LangRef.html#calling-conventions
>>> —Pierre-Yves
>>>
>>>
>>> On Sat, Mar 21, 2015 at 4:23 PM, Coda Highland <chighland@gmail.com> wrote:
>>>> On Fri, Mar 20, 2015 at 3:25 PM, Dibyendu Majumdar
>>>> <mobile@majumdar.org.uk> wrote:
>>>>> On 20 March 2015 at 22:16, Doug Currie <doug.currie@gmail.com> wrote:
>>>>>>> I haven't yet figured out how to properly implement OP_TAILCALL in a
>>>>>>> JITed function.
>>>>>>
>>>>>>
>>>>>> The usual approach is to emit code to pop the stack frame and convert the
>>>>>> tail call to a jump.
>>>>>>
>>>>>
>>>>> In the recursive case, yes, but Lua also uses tail calls for
>>>>> non-recursive scenarios. For these a different function may be called
>>>>> so it is not possible to handle this in a JITed function - without
>>>>> replacing the function being executed as well.
>>>>>
>>>>> Regards
>>>>>
>>>>
>>>> The technique still applies to the non-recursive case -- you still
>>>> clean up the stack frame, push on the return values, and jump.
>>>>
>>>> /s/ Adam
>>>>
>>>
>>
>>
>