Nynaeve
April 22nd, 2008, 22:00
The __fastcall calling convention is the last major C-supported Win32 (x86) calling convention that I have not covered yet. (There is still __thiscall, which I’ll discuss later.)
http://www.nynaeve.net/?p=63
__fastcall is, as you might guess from the name, a calling convention that is designed for speed. In this spirit, it borrows from many RISC calling conventions in that it tries to be register-based instead of stack-based. Unfortunately, for all but the smallest or simplest functions, __fastcall typically does not end up being a particularly big win performance-wise on x86, primarily due to the (comparatively) extremely limited register set that x86 sports.
This calling convention has a great deal in common with the x64 calling convention ("http://www.nynaeve.net/?p=10") that Win64 uses. In fact, aside from the x64-specific parts of the x64 calling convention, you can think of the x64 calling convention as a logical extension of __fastcall that is designed to take advantage of the expanded register set available with x64 processors.
What this boils down to is that __fastcall will try to pass the first two pointer-sized (or smaller) arguments in the ecx and edx registers. Any additional arguments are passed on the stack as per __stdcall ("http://www.nynaeve.net/?p=53").
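To make the register/stack split concrete, here is a minimal sketch in portable C that models where a callee would find each argument on entry, and how many bytes it would have to pop on return. The names (ArgLocation, fastcall_arg_location, fastcall_retn_bytes) are made up for illustration, and the model assumes that every argument is 4 bytes or smaller; it is not a description of how any particular compiler is implemented.

```c
#include <assert.h>

/* Hypothetical model: where a __fastcall callee finds an argument on entry. */
typedef enum { LOC_ECX, LOC_EDX, LOC_STACK } ArgKind;

typedef struct {
    ArgKind kind;
    int esp_offset; /* only meaningful for LOC_STACK: argument is at [esp+offset] */
} ArgLocation;

/* index is zero-based, counting arguments left to right.
 * Assumption: every argument is 4 bytes or smaller. */
ArgLocation fastcall_arg_location(int index)
{
    ArgLocation loc = { LOC_STACK, 0 };

    assert(index >= 0);
    if (index == 0) {
        loc.kind = LOC_ECX;
    } else if (index == 1) {
        loc.kind = LOC_EDX;
    } else {
        /* [esp+0] holds the return address. Stack arguments are pushed
         * right to left, so the third argument sits at [esp+4], the
         * fourth at [esp+8], and so on. */
        loc.esp_offset = 4 * (index - 1);
    }
    return loc;
}

/* Bytes of stack arguments the callee must remove when it returns. */
int fastcall_retn_bytes(int arg_count)
{
    assert(arg_count >= 0);
    return arg_count > 2 ? (arg_count - 2) * 4 : 0;
}
```

For a three-argument function, this model places the arguments in ecx, edx, and [esp+4], and has the callee remove 4 bytes of arguments when it returns.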
In practice, the key things to look out for with a __fastcall function are thus:
The callee assumes a meaningful value in the ecx (or ecx and edx) registers. This is a tell-tale sign of __fastcall (although you may sometimes see __thiscall make use of ecx for the this pointer).
No arguments are cleaned off the stack by the caller. Only __cdecl functions rely on the caller to clean the stack.
The callee ends in a retn (args-2)*4 instruction, where args is the number of arguments. In general, this is the pattern that you will see with __fastcall functions that use the stack. For __fastcall functions where no stack parameters are used, the function typically ends in a ret instruction with no stack displacement argument.
The callee is a short function with very few arguments. These are the most likely cases where a smart programmer will use __fastcall, as otherwise __fastcall does not tend to buy you very much over __stdcall.
The function interfaces directly with assembler. Having access to ecx and edx can be a handy shortcut for a C function that is being called by something that is obviously written in assembler.
Taking these into account, let’s take a look at the same sample function and function call that we have been dealing with in our earlier examples, this time in __fastcall.
The function that we are going to call is declared as so:
Code:
__declspec(noinline)
int __fastcall FastcallFunction1(int a, int b, int c)
{
    return (a + b) * c;
}
This is consistent with our previous examples, save that it is declared __fastcall.
The function call that we shall make is as so:
Code:
FastcallFunction1(1, 2, 3);
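With this code, we can expect the function call to look something like so in assembler:
Code:
push 3                 ; push 'c' onto the stack
push 2                 ; place a constant 2 on the stack
xor ecx, ecx           ; move 0 into 'a' (ecx)
pop edx                ; pop 2 off the stack and into 'b' (edx)
inc ecx                ; set 'a' (ecx) to 1 (0+1)
call FastcallFunction1 ; make the call (a=1, b=2, c=3)
This is actually a bit different than we might expect. Here, the compiler has been a bit clever and used some basic optimizations for setting up constants in registers. These optimizations are extremely common, and you should get used to reading them as simple constant assignments to registers, given how frequently they show up. In a future series, I’ll go into more detail on common compiler optimizations like these, but that’s a tale for a different time.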
Continuing with __fastcall, here’s what the implementation of FastcallFunction1 looks like in assembler:
Code:
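FastcallFunction1 proc near
c= dword ptr 4
lea eax, [ecx+edx]  ; eax = a + b
imul eax, [esp+4]   ; eax = (eax * c)
retn 4              ; return eax, popping 'c' off the stack
FastcallFunction1 endp
As you can see, in this particular instance, __fastcall turns out to be a big saver as far as instructions executed (and thus size and, to a lesser degree, speed) in the callee. This kind of benefit is usually restricted to extremely simple functions, however.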
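If you want to convince yourself of what the call site and the callee listed above do together, the following is a rough sketch that replays both instruction sequences on a toy machine model written in plain C. The Machine type, its helpers, and the placeholder return address are all invented for this illustration; only the instruction sequences in the comments come from the listings above.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model (hypothetical): just the registers and stack this example uses. */
typedef struct {
    uint32_t eax, ecx, edx;
    uint32_t stack[8]; /* each slot is one 4-byte stack entry */
    int sp;            /* index of the top of the stack; 8 means empty */
} Machine;

static void push(Machine *m, uint32_t value)
{
    assert(m->sp > 0);
    m->stack[--m->sp] = value;
}

static uint32_t pop(Machine *m)
{
    assert(m->sp < 8);
    return m->stack[m->sp++];
}

/* The callee: on entry stack[sp] is the return address and
 * stack[sp + 1] is 'c', that is, [esp+4]. */
static void fastcall_function1(Machine *m)
{
    m->eax = m->ecx + m->edx;      /* lea eax, [ecx+edx] */
    m->eax *= m->stack[m->sp + 1]; /* imul eax, [esp+4]  */
    (void)pop(m);                  /* retn 4: pop the return address ...   */
    assert(m->sp < 8);
    m->sp += 1;                    /* ... then drop 4 bytes of arguments   */
}

/* The call site: FastcallFunction1(1, 2, 3). Returns the value left in eax. */
uint32_t simulate_call(Machine *m)
{
    m->eax = m->ecx = m->edx = 0;
    m->sp = 8;

    push(m, 3);           /* push 3                                   */
    push(m, 2);           /* push 2                                   */
    m->ecx ^= m->ecx;     /* xor ecx, ecx                             */
    m->edx = pop(m);      /* pop edx                                  */
    m->ecx += 1;          /* inc ecx                                  */
    push(m, 0xDEADBEEFu); /* call: push a placeholder return address  */
    fastcall_function1(m);
    return m->eax;
}
```

In this model the call leaves 9 in eax, with ecx holding 1, edx holding 2, and the stack back where it started, which matches the callee (not the caller) removing the single stack argument.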
The main things, then, to consider if you are trying to identify if a function is __fastcall or not are thus:
Usage of the ecx (or ecx and edx) registers in the function without loading them with explicit values beforehand. This typically indicates that they are being used as argument registers, as with __fastcall.
The caller does not clean any arguments off the stack (no add esp instruction after the call). With __fastcall, the callee always cleans the arguments (if any).
A ret instruction (with no stack displacement argument) terminating the function, if there are two or fewer arguments that are pointer-sized or smaller. In this case, __fastcall has no stack arguments.
A retn (args-2)*4 instruction terminating the function, if there are three or more arguments to the function. In this case, there are stack arguments that must be cleaned off the stack via the retn instruction.
That’s all for __fastcall. More on other calling conventions next time…