Visual Basic importing win32api functions


Silver
January 5th, 2006, 07:15
Hi guys,

I'm interested in a specific aspect of how VB works. To call win32api functions from VB you must declare them:

Code:
Declare Auto Function MBox Lib "user32.dll" Alias "MessageBox" ( _
ByVal hWnd As Integer, _
ByVal txt As String, _
ByVal caption As String, _
ByVal Typ As Integer) _
As Integer


(from MSDN). The real API call looks like this:

Code:
int MessageBox(HWND hWnd, LPCTSTR lpText, LPCTSTR lpCaption, UINT uType);


Fine. VB only comes with a small set of built-in types (String, Integer etc). I'd like to understand how VB translates the built-in types into win32api C-compatible variables.

For example, the function above is easy. We could have a lookup table that says "HWND is a dword, which is pretty much int-compatible so we'll map HWND:Integer. LPCTSTR is a const string, so we'll map String:LPCTSTR" etc.

That's fine, but what happens with the more complex calls, the ones that take more exotic types and modify them? Consider this:

Code:
int GetWindowText(HWND hWnd, LPTSTR lpString, int nMaxCount);


lpString receives the window text. So at VB level the user does this:

Code:
Declare Auto Function GetWindowText Lib "user32.dll" (ByVal hWnd As Integer, ByRef lpString As String, ByVal nMaxCount As Integer) As Integer
GetWindowText(wnd, mystring$, 200)


Somewhere within the VB interpreter there must be logic that understands lpString receives a value rather than provides it, and thus lpString must be created on the heap before the call is dispatched...

But from the function declaration in VB there's absolutely no indication of this. There's nothing to clearly say "this LPTSTR receives a value", as opposed to all the other functions that have an LPTSTR param and provide a value. There's also nothing that links the last int (max chars to return) to the sizeof the LPTSTR. Hopefully my question makes sense...

VB must do this programmatically rather than through a predefined lookup table of win32 functions, because VB can do the same for any DLL you give it.

Does anyone know how VB does this?

babar0ga
January 5th, 2006, 13:43
Hello

As you know, when an API deals with strings (LPCTSTR lpText) it really receives a pointer to the string, so
when VB sees String in an API declaration it knows what to do...
It just passes a pointer to it. Nothing special about it.

Consider this code
Code:

Option Explicit

Private Declare Function GetWindowTextA Lib "user32.dll" ( _
    ByVal hwnd As Long, _
    ByVal lpString As String, _
    ByVal cch As Long) As Long

Private Declare Function GetWindowTextX Lib "user32.dll" Alias "GetWindowTextA" ( _
    ByVal hwnd As Long, _
    ByVal lpString As Long, _
    ByVal cch As Long) As Long

Private Sub Form_Load()
    Dim strTextU As String
    Dim strTextA As String
    Dim lret As Long

    strTextU = String(50, "A")
    Debug.Print LenB(strTextU)

    'so, VB will convert it from unicode to ansi and give its ptr to api
    lret = GetWindowTextA(Me.hwnd, strTextU, 50)
    'after that it will convert back to unicode and present it to us
    If lret Then
        Debug.Print Left$(strTextU, lret)
    End If

    strTextU = String(50, "A")
    'here we convert it to ansi
    strTextA = StrConv(strTextU, vbFromUnicode)
    Debug.Print LenB(strTextA)

    Dim lP As Long
    'now we are going to get a ptr to string
    lP = StrPtr(strTextA)
    lret = GetWindowTextX(Me.hwnd, lP, 50)
    If lret Then
        Debug.Print Left$(StrConv(strTextA, vbUnicode), lret)
    End If
End Sub

It's the same thing written in two different ways.
The first time VB does the job, and the second time we do it ourselves by passing a Long (the pointer)...
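
For readers more at home in C++, here is a rough sketch of the round trip described in the comments above: narrow the caller's wide string into a temporary, hand the callee a raw pointer, then widen the result back. Everything in it is an assumption made for illustration: fake_get_text_a is a hypothetical stand-in for an ANSI API such as GetWindowTextA, call_with_string is an invented helper, and the conversion only handles 7-bit ASCII. It is not a description of what the VB runtime actually does internally.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for an ANSI API such as GetWindowTextA: writes at most
// max_count - 1 characters plus a terminating NUL and returns the count written.
int fake_get_text_a(char* buffer, int max_count) {
    static const char title[] = "Form1";
    if (max_count <= 0) return 0;
    int n = std::min(static_cast<int>(std::strlen(title)), max_count - 1);
    std::memcpy(buffer, title, static_cast<std::size_t>(n));
    buffer[n] = '\0';
    return n;
}

// Sketch of marshalling a "ByVal ... As String" argument (invented helper).
int call_with_string(std::wstring& vb_string, int max_count) {
    // 1. Unicode -> ANSI into a temporary (naive: assumes 7-bit ASCII only).
    std::vector<char> temp(vb_string.size() + 1, '\0');
    for (std::size_t i = 0; i < vb_string.size(); ++i)
        temp[i] = static_cast<char>(vb_string[i]);

    // 2. Pass the raw pointer. The clamp is a safety net added for this
    //    sketch so the callee can never write past the temporary.
    int limit = std::min(max_count, static_cast<int>(temp.size()));
    int written = fake_get_text_a(temp.data(), limit);

    // 3. ANSI -> Unicode, copied back over the caller's string. The length of
    //    the caller's string does not change, only its contents.
    for (std::size_t i = 0; i < vb_string.size(); ++i)
        vb_string[i] = static_cast<wchar_t>(static_cast<unsigned char>(temp[i]));
    return written;
}
```

The caller still has to take the first `written` characters afterwards, just as the VB code uses Left$ with the return value.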

Hope I helped a bit.


Regards

Admiral
January 5th, 2006, 15:02
You're right, Silver. The VB IDE doesn't know that the String parameter is an OUT value. In fact, this caused me a major headache some time last year.

Interfacing VB with non ActiveX DLLs has always been a pain. Since structures are becoming more and more pointer-oriented, getting VB's limited understanding of pointers to work has become increasingly difficult, and the data-type 'casting' has become incredibly contrived.
For example there are three ways to pass a string to an API function.

LPCSTR seems to be compatible with String
BSTR, LPSTR and char[] and char* match up (in my experience) with 'ByVal String', 'StrPtr()' and 'ObjPtr()' (but probably not in that order).
To be honest, I usually find myself trial-and-erroring with the help of OllyDbg to find the correct match.
But even though we can usually get 'IN String's to work okay, 'OUT String' has the potential to cause the most horribly inconsistent bugs (with the VB IDE 'crash-to-desktop'ing), making for long and painstaking debugging sessions.
I read somewhere that the VB String data type is synonymous with the now (as far as I know) antiquated BSTR type. That is, the first word holds the length of the string and the rest of the structure is a char array. Hence passing the address of a String could potentially throw you into an 'off-by-two' situation.

I found out, after what must have been at least eight hours of doing it the hard way that passing 'ByVal String' (which was probably the first thing I tried) was in fact the correct way to get a returned string from whichever User32 function I was calling, but that the string had to be preallocated to prevent a nasty page-fault.

So the long-and-short of it is that VB treats all Strings in external calls as constant arguments. If you are expecting your function to return you a string, it is your responsibility to ensure that the String's Byte-array is big enough to store the data before you make the call. The safest way to do this is to fill it with nulls:

Code:

Declare Function GetWindowText Lib "user32.dll" Alias "GetWindowTextA" (ByVal hWnd As Long, ByVal lpString As String, ByVal nMaxCount As Long) As Long

Dim mystring As String
Dim wnd As Long

...

mystring = String(200, vbNullChar)
Call GetWindowText(wnd, mystring, 200)


(Edit: Better keep my variable names consistent )
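
The same "caller allocates, callee fills" contract can be sketched in C++. This is only an illustration under assumptions: fake_api_get_title is a hypothetical out-parameter function in the style of GetWindowTextA, and get_title is an invented wrapper. The point is that the buffer exists at full size before the call, and is cut down to the reported length afterwards.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical out-parameter API: writes at most max_count - 1 characters
// plus a terminating NUL into 'out' and returns the number of characters.
int fake_api_get_title(char* out, int max_count) {
    static const char title[] = "Untitled - Notepad";
    if (max_count <= 0) return 0;
    int n = std::min(static_cast<int>(std::strlen(title)), max_count - 1);
    std::memcpy(out, title, static_cast<std::size_t>(n));
    out[n] = '\0';
    return n;
}

// Invented wrapper: preallocate, call, then trim to what was written.
std::string get_title(int capacity) {
    // Fill with NULs up front, like filling the VB string before the call.
    std::string buffer(static_cast<std::size_t>(std::max(capacity, 0)), '\0');
    int written = fake_api_get_title(&buffer[0], capacity);
    // Keep only the characters the callee reported.
    buffer.resize(static_cast<std::size_t>(written));
    return buffer;
}
```

If the capacity passed to the callee were larger than the buffer really is, the callee would write past the end, which is the page-fault scenario described above.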

Regards
Admiral

nikolatesla20
January 6th, 2006, 00:21
Admiral, you are correct. Passing a string ByVal is the correct way to do it. VB replaces the ByVal with the address of the string. Internally VB uses BSTRs, so when it passes the address, it's smart enough to pass the pointer to the data inside the BSTR. You should never ever try to use VarPtr or ObjPtr (not sure about StrPtr) on a VB string unless you want to crash (and there's no need to use StrPtr: if you pass ByVal, VB does that for you). However, just as you mentioned, if you are receiving data back (like a string by reference) the string has to be pre-allocated, such as:

Dim MyString as string * 200

As an example. One crappy side effect of this is that the string will now always have 200 chars even if they are not all used, so it wreaks havoc with VB's string routines. The best thing to do is assign the value to a normal string right after the API call.

Dim MyString as string * 256
Dim CopyString as string

Call ApiFunc(byval MyString)

CopyString = MyString
CopyString = Trim(CopyString)

Also, BSTR is not antiquated. COM still uses it everywhere. Another nice thing about BSTRs is that they can be used to store any binary data, since they carry their length in a prefix (4 bytes, stored just before the character data) instead of relying on a null terminator, so it is possible to store structures in a BSTR array.
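
To make the layout concrete, here is a small portable C++ sketch of a BSTR-like block. The documented COM layout is a 4-byte length prefix holding the byte count, then the 16-bit characters, then a 2-byte terminator, with the string pointer aimed at the characters rather than at the prefix. FakeBstr and make_fake_bstr are invented names for this illustration. Real code would use SysAllocStringLen and SysStringByteLen instead.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// BSTR-like block: [4-byte byte length][16-bit characters][2-byte NUL].
struct FakeBstr {
    std::vector<unsigned char> storage;

    // What would be handed to an API: a pointer to the characters,
    // which sits 4 bytes past the start of the allocation.
    const unsigned char* data() const { return storage.data() + 4; }

    // The length is read from the prefix, not by scanning for a NUL.
    std::uint32_t byte_length() const {
        std::uint32_t n = 0;
        std::memcpy(&n, storage.data(), 4);
        return n;
    }
};

// Invented helper: builds the block from 'count' 16-bit characters.
FakeBstr make_fake_bstr(const char16_t* chars, std::uint32_t count) {
    FakeBstr b;
    const std::uint32_t bytes = count * 2;
    b.storage.assign(4u + bytes + 2u, 0);      // prefix + payload + terminator
    std::memcpy(b.storage.data(), &bytes, 4);  // write the length prefix
    if (bytes != 0)
        std::memcpy(b.storage.data() + 4, chars, bytes);
    return b;
}
```

Because the length comes from the prefix, an embedded zero character in the payload does not cut the string short, which is what makes storing binary data possible.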

-nt20

Silver
January 6th, 2006, 09:29
Thanks guys.

Quote:
So the long-and-short of it is that VB treats all Strings in external calls as constant arguments. If you are expecting your function to return you a string, it is your responsibility to ensure that the String's Byte-array is big enough to store the data before you make the call.


Ok, I can understand that. Allocate a huge char[] on the heap then use that. But this leads me to the other part of the problem - the difference between in and out.

Even with a BSTR, the difference between a function that does this:

Code:
void Blah1(char*& szIn)
{
szIn = new char[100];
}


and this:

Code:
void Blah2(char*& szIn)
{
char* szSomething = new char[strlen(szIn)+1];
strncpy(szSomething, szIn, strlen(szIn)+1);
delete[] szSomething;
}


is invisible to the caller. We all know the 1st one screws with szIn because we're humans with eyes and brains, and we can read MSDN. Even though the function prototypes are the same, we can see the very important difference.

Automated code to handle this *cannot* see the difference. It can't analyze the code inside the function, all it can see is the prototype. As far as any code that tries to interact with these functions is concerned, they're identical.
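
A small C++ sketch of that point, with invented names throughout: reseats_pointer and only_reads share one prototype, and the only generic check available to the caller, comparing the pointer before and after, can run only once the call has already happened.

```cpp
#include <cassert>
#include <cstring>

// Same prototype, different behaviour (both functions are made up here).
void reseats_pointer(char*& s) {
    s = new char[100];   // the caller's pointer now refers to new storage
    s[0] = '\0';
}

void only_reads(char*& s) {
    (void)std::strlen(s);   // looks at the string, leaves the pointer alone
}

// The prototype tells a generic caller nothing, so all it can do is compare
// the pointer value before and after the call.
bool pointer_was_replaced(void (*fn)(char*&), char*& arg) {
    char* before = arg;
    fn(arg);
    return arg != before;
}
```

Detecting the change afterwards does not say who owns the new storage or how to free it, so it is a diagnostic at best, not a marshalling strategy.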

Assume a higher-level (VB equivalent language) that tries to call this function:

Code:
dim mystr as string * 200
call Blah1(byval mystr)
call Blah2(byval mystr)


The interpreter looks at the call, looks at the prototype, shrugs its shoulders and promptly crashes.

Incidentally I'm not actually working with VB here but the principle is identical. If I know how VB does it and I can replicate the behaviour that will solve my problem.

Admiral
January 6th, 2006, 13:21
Every point you've made has been true, but I'm failing to see the problem.

Would your interpreter be appeased if your prototypes looked like:

void Blah1(char*& szIn);
void Blah2(const char*& szIn); ?

If I were responsible for designing a heuristic to determine how to treat the char*&, I'd 'assume the worst'. That is, assume the location of the returned string will be different from that of the passed string unless the const keyword is used.

On the other hand, VB 'assumes the best', which is why you have to fill the string first. The reason for this difference lies in the way that VB allocates its Strings. There are no pointers to the heap and as far as the programmer is concerned, the start of the string data is always in the same place.

Perhaps if you could give some more specific details of what your interpreter is failing to do I could be more helpful.

Regards
Admiral

nikolatesla20
January 6th, 2006, 18:41
Quote:
[Originally Posted by Silver]
Somewhere within the VB interpreter there must be logic that understands lpString receives a value rather than provides it, and thus lpString must be created on the heap before the call is dispatched...

But from the function declaration in VB there's absolutely no indication of this. There's nothing to clearly say "this LPTSTR receives a value", as opposed to ....?




Nope, the VB interpreter doesn't do squat. The programmer has to do it, since API declares are "open territory". That VB does not know what to do until you read MSDN and do the right thing is clearly evident from the fact that it will crash if you pass the string the incorrect way. By declaring ByVal As String, you tell VB to pass the pointer to the string, period. If it's an OUT parameter and you didn't pre-size it, then you'll just crash, which indicates to you, the programmer, that you need to read the documentation and make sure to allocate some space yourself. VB does not do it automatically whatsoever, so the interpreter does not handle it either.

-nt20

Silver
January 7th, 2006, 09:17
Quote:
Nope, the VB interpreter doesn't do squat. The programmer has to do it


Aha! So it's buyer-beware, the caller is responsible for explaining exactly what's about to happen to the interpreter. That makes sense, thanks. Not being a VB user myself I assumed it all worked automagically.

Admiral:

Quote:
That is, assume the location of the returned string will be different from that of the passed string unless the const keyword is used.


That's lining up for some extremely nasty problems though. Depending on how the passed string is declared (stack based or heap based), you'd either cause an exception when the stack based one is messed with or a mem leak with the orphan heap based one.

Here's what I'm trying to do. Assuming an arbitrary language with 4 core language types (bool, string, integer, float), call an arbitrary win32api function using the language types as parameters. This is exactly what VB does.

So if my interpreter understands some equivalent of ByVal and ByRef I should be able to solve it. When a param is passed ByVal, I do a simple conversion (language "string" to api "const char*"). When a param is passed ByRef and it's a string, I can check the existing size of the param and create a new char* on the heap, then use that for the api call. If the existing size is 0, I can assume the api call will allocate the char* on the heap for me.
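
A minimal C++ sketch of that plan, under stated assumptions: native_length and native_fill are hypothetical native functions, call_by_val and call_by_ref are invented helpers, and the "size 0 means the callee allocates" case is left out because nothing in the prototype says how such memory would be freed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical native functions, one per direction.
int native_length(const char* in) {              // IN: only reads the string
    return static_cast<int>(std::strlen(in));
}

int native_fill(char* out, int max_count) {      // OUT: fills the caller's buffer
    static const char text[] = "hello world";
    if (max_count <= 0) return 0;
    int n = std::min(static_cast<int>(std::strlen(text)), max_count - 1);
    std::memcpy(out, text, static_cast<std::size_t>(n));
    out[n] = '\0';
    return n;
}

// ByVal: convert the language string to a const char* and pass it through.
int call_by_val(int (*fn)(const char*), const std::string& value) {
    return fn(value.c_str());
}

// ByRef: size a scratch buffer from the string's current length, let the
// callee fill it, then copy the result back into the language string.
int call_by_ref(int (*fn)(char*, int), std::string& value) {
    std::vector<char> scratch(value.size() + 1, '\0');
    int written = fn(scratch.data(), static_cast<int>(scratch.size()));
    value.assign(scratch.data(), static_cast<std::size_t>(written));
    return written;
}
```

As in VB, the script author still decides the capacity by pre-sizing the string before the call. The interpreter only copies in and out.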

Thanks!

Silver
January 13th, 2006, 06:34
Okay, so here's a followup question. Does anyone have any thoughts about an elegant way to call an unknown function with an unknown number of parameters (C++)?

Accepting an unknown number of params in a function you code is easy enough with "..." and va_arg, but the problem is how I reflect this to an imported call.

I've got arbitrary function F that takes parameters A & B, imported from DLL D - how do I actually call it?

goggles99
January 13th, 2006, 22:12
Quote:
[Originally Posted by Silver]Okay, so here's a followup question. Does anyone have any thoughts about an elegant way to call an unknown function with an unknown number of parameters (C++)?

Accepting an unknown number of params in a function you code is easy enough with "..." and va_arg, but the problem is how I reflect this to an imported call.

I've got arbitrary function F that takes parameters A & B, imported from DLL D - how do I actually call it?


An unknown function: does that mean that the data types (direct or by pointer) and the number of args are not known?

Hmm that sounds pretty tough. I think that maybe some kind of intense analyzer would do the trick (in depth analysis like IDA does).

Maybe you could brute force some very simple functions, especially if a success or failure was returned from the function (I guess you wouldn't know that though, since the function is unknown). By brute force, I mean just trying several different parameters/data types/quantities of parameters inside an SEH handler until the expected result is returned.

What purpose would it serve to call an unknown function anyway? You don't even know what you are going to give and get back. I don't get it.

LLXX
January 14th, 2006, 04:35
Quote:
[Originally Posted by Silver]Okay, so here's a followup question. Does anyone have any thoughts about an elegant way to call an unknown function with an unknown number of parameters (C++)?

Accepting an unknown number of params in a function you code is easy enough with "..." and va_arg, but the problem is how I reflect this to an imported call.
Do you mean the function can take a variable number of parameters (as you seem to indicate with references to "..." and va_arg), or that it has a fixed number, but that number is not known?

In the first case, examining how it uses the stack would be essential to finding out how it can count how many parameters it received. In a standard example, printf(), the format specifier string performs this function.

In the second case, just find the highest [esp+xx] or [ebp+xx] reference, which will be the first parameter that was pushed.

Quote:
[Originally Posted by Silver]I've got arbitrary function F that takes parameters A & B, imported from DLL D - how do I actually call it?
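Assuming the C calling convention (cdecl, where the caller cleans up the stack; a stdcall function such as a win32api export cleans up itself, so the final add would be dropped)...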
Code:

push B
push A
call [F_import]
add esp, 8
Regarding data types, on IA32 all parameters are 4 bytes, i.e. dwords, or multiples thereof, so there shouldn't be a problem with that.

Silver
January 14th, 2006, 08:41
Quote:
Regarding data types, on IA32 all parameters are 4 bytes, i.e. dwords, or multiples thereof, so there shouldn't be a problem with that.


That turned out to be the solution: align all the arguments on 32-bit boundaries, then pass them to the GetProcAddress function pointer. It needed some creative casting, but it works with no asm required.
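
A rough, portable sketch of what that casting could look like. This is not the code from the thread. Slot, the Fn typedefs, call_with_slots and the two sample targets are all invented for illustration. In a real Win32 build the target pointer would come from GetProcAddress and the typedefs would need the matching calling convention (for example __stdcall), and wider values such as doubles would take more than one slot.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

using Slot = std::uintptr_t;   // one machine-word argument slot

using Fn0 = Slot (*)();
using Fn1 = Slot (*)(Slot);
using Fn2 = Slot (*)(Slot, Slot);
using Fn3 = Slot (*)(Slot, Slot, Slot);

// Pick the function-pointer type from the argument count, then call.
// Every argument has already been widened to a Slot by the caller.
Slot call_with_slots(void (*target)(), const std::vector<Slot>& args) {
    switch (args.size()) {
        case 0: return reinterpret_cast<Fn0>(target)();
        case 1: return reinterpret_cast<Fn1>(target)(args[0]);
        case 2: return reinterpret_cast<Fn2>(target)(args[0], args[1]);
        case 3: return reinterpret_cast<Fn3>(target)(args[0], args[1], args[2]);
        default: throw std::invalid_argument("too many arguments for this sketch");
    }
}

// Sample targets standing in for imported functions. They really do have
// the slot signature, so the casts above stay well defined in this demo.
Slot add_two(Slot a, Slot b) { return a + b; }

Slot string_length(Slot p) {
    return std::strlen(reinterpret_cast<const char*>(p));
}
```

Pointers go into a slot with reinterpret_cast and integers are simply widened. Calling a function whose true signature differs from the chosen typedef is where the "creative" part starts, and it is only as safe as the calling convention on the target platform allows.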

Quote:
What purpose would it serve to call an unknown function anyway? You don't even know what you are going to give and get back. I don't get it.


Goggles, consider what happens when a VB exe imports and calls a win32api function. The coder of the VB program knows what parameters are being passed and received, but the coder of VB who wrote the VB-win32api interface didn't. All he knew was that people would call an arbitrary function via a function pointer, which may or may not have N arguments requiring T types and returning an R type.

Thanks guys, this one is solved