OllyDbg and Sysenter


nikolatesla20
February 8th, 2006, 18:21
Has anyone researched or written a plugin for Olly to step over SYSENTER instructions? Olly works fine for Int2E calls on older Win2K, but it chokes on SYSENTER. It seems we may just need to query the MSRs (by writing a small driver).

The following article has a good explanation:

http://www.codeguru.com/Cpp/W-P/system/devicedriverdevelopment/article.php/c8223/


The kernel entry EIP for sysenter is stored in an MSR (IA32_SYSENTER_EIP, 0x176). I wrote a driver once long ago that could read MSR registers.

I was thinking about playing around with this more, unless someone knows about a solution existing already.

Oh, and some CPUs also support MSRs for branch recording: they keep track of the last jump instruction. I think this could be handy for some protections which jump to the OEP and have stolen bytes. Instead of tracing, you could put a BPM or such on the OEP area and then get the last branch from the MSR. (I believe the driver I wrote a long time ago actually used this MSR.)

The only part to watch out for is to hide the driver from detection.

Maybe Kayaker will have some words to share about this as well.

-nt20

blabberer
February 9th, 2006, 11:10
chokes as in you can't find the return address and break there after a sysenter call?
is that what you mean by choking?

for example, with the win2k int2e mechanism one could set a break on NtContinue() -> pcontext.eip, i.e. [[esp+4]+0xb8], for stopping on returns

if yes: before sysenter is executed, esp (which points at the return address) is saved in the edx register

you can use it to break on return from the kernel

if you notice, the syscall stub would be like this
mov eax,the service id
mov edx,stub address in the KUSER_SHARED_DATA page, mostly 0x7ffe0300 as far as I remember
(don't have xp so can't do a detailed post)
call edx <--- this will push the return address, that is, the next instruction

inside the call edx you would see it saving the esp value into edx before doing sysenter

mostly 0x7ffe0300 will always be this instruction
mov edx,esp
with sysenter right after it at 0x7ffe0302 and the ret at 0x7ffe0304

you can set your break on that ret address
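
A minimal sketch of that idea in Python, using a toy memory snapshot rather than any real debugger API: when sysenter executes, edx holds the user-mode esp, and the dword stored there is the return address pushed by `call edx`. The helper names, the stack address and the return address below are all made up for illustration.

```python
import struct


def read_dword(memory, base, address):
    """Read a little-endian 32-bit value from a flat memory snapshot.

    `memory` is a bytes-like dump and `base` is the virtual address of its
    first byte; both stand in for whatever a debugger plugin would provide.
    """
    return struct.unpack_from("<I", memory, address - base)[0]


def return_address_at_sysenter(memory, base, edx):
    """edx == user esp at sysenter, so [edx] is what `call edx` pushed."""
    return read_dword(memory, base, edx)


# Toy stack snapshot: 16 bytes starting at a hypothetical address.
stack_base = 0x0012FF00
stack = bytearray(16)
struct.pack_into("<I", stack, 4, 0x7C90EB8D)  # pretend return address
edx = stack_base + 4                          # value copied by `mov edx,esp`

ret = return_address_at_sysenter(stack, stack_base, edx)
print(hex(ret))
```

A plugin could read edx when the sysenter instruction is reached and place a one-shot breakpoint on the address it finds this way.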

Kayaker
February 9th, 2006, 23:36
Interesting idea nikolatesla. I'm not quite sure what you mean by Olly choking either, though. I made up a little test app for tracing, simply a call to MessageBeep, which uses an underlying Int2E/SYSENTER call. In either 2K or XP Olly seems to be able to step over the syscall OK, though if you F8 over the Int2E it doesn't necessarily return *directly* to the RET immediately after. For MessageBeep at least, Olly returns to the calling function User32!_NtUserCallOneParam; why it seems to miss a RET I'm not sure. I suppose an EXCEPTION_DEBUG_EVENT is generated and the handler gets that ret address from the exception record.

One interesting thing I did notice is that Olly sets a BP immediately after the MessageBeep call in user code. This effect is totally hidden unless you use Softice as well to set a BP deeper in the system code and trace back. I'd like to check if Olly does this with *all* API calls. I don't know if other debuggers do this, but it almost seems like a protection mechanism by Olly so an API call never "gets away" from it. By having a BP already set after the call, the debugger ensures control will get back to it no matter what the API might do.

For example if you set a Softice BP on MessageBeep, or deeper on the actual syscall it uses, win32k!UserSoundSentryWorker, you can single step trace back to the user code. When doing this I noticed Olly had set an Int3 0xCC opcode at 00401128 immediately after the initial API call.

00401121 6A00 PUSH 0 // MB_OK
00401123 E838010000 CALL <JMP.&user32.MessageBeep>
00401128 E98A000000 JMP TestDlg.004011B7 // Olly sets a BP here

Normally you wouldn't notice this without the ring0 debugger since Olly would have already handled the BP transparently.



As for the LastBranch MSR, that's quite an interesting idea. As I remember, Softice for Win9x output that info as part of its BPM command; I don't think it does anymore, at least not directly. There's an internal Sice function that IceExt makes use of for its !LASTBRANCH command. This function was identified back in Icedump as pRecordLastBranchInfo, and using the byte pattern given in IceExt we can find and analyse it. The proper MSR numbers can be identified from string references in Sice. It's all based on setting the LBR bit of the IA32_DEBUGCTL MSR. See sec 14.5.1 of Intel vol3.

Code:

.text:00017279 MSR_RecordLastBranchInfo proc near ; CODE XREF: .text:0001340Bp
.text:00017279 cmp cs:MSR_LBR_BitEnabled, 0
.text:00017281 jnz short @RecordLastBranchInfo
.text:00017283 retn
.text:00017284 ; ---------------------------------------------------------------------------
.text:00017284
.text:00017284 @RecordLastBranchInfo: ; CODE XREF: MSR_RecordLastBranchInfo+8j
.text:00017284 pusha
.text:00017285 mov ebp, esp
.text:00017287 push ds
.text:00017288 db 66h
.text:00017288 mov ds, cs:wNTICE_SS
.text:00017290 mov ecx, 1DBh ; LastBranchFromIP
.text:00017290 ; "Last Branch From EIP"
.text:00017295 call c_RDMSR
.text:0001729A mov dMSR_LAST_BRANCH_0, eax
.text:0001729F mov ecx, 1DCh ; LastBranchToIP
.text:0001729F ; "Last Branch To EIP"
.text:000172A4 call c_RDMSR
.text:000172A9 mov dMSR_LAST_BRANCH_1, eax
.text:000172AE mov ecx, 1D9h ; DebugCtlMSR
.text:000172AE ; "Debug Control"
.text:000172B3 call c_RDMSR
.text:000172B8 or eax, 1 ; Set LBR bit of IA32_DEBUGCTL MSR
.text:000172BB call c_WRMSR
.text:000172C0 pop ds
.text:000172C1 popa
.text:000172C2 retn
.text:000172C2 MSR_RecordLastBranchInfo endp


14.5.1. IA32_DEBUGCTL MSR (Pentium 4 Processors)
The IA32_DEBUGCTL MSR enables and disables the various last branch recording mechanisms
described in the previous section. This register can be written to using the WRMSR
instruction, when operating at privilege level 0 or when in real-address mode. A protected-mode
operating system procedure is required to provide user access to this register. Figure 14-2 shows
the flags in the IA32_DEBUGCTL MSR. The functions of these flags are as follows:

LBR (last branch/interrupt/exception) flag (bit 0)
When set, the processor records a running trace of the most recent
branches, interrupts, and/or exceptions taken by the processor (prior to
a debug exception being generated) in the last branch record (LBR) stack.
Each branch, interrupt, or exception is recorded as a 64-bit branch record
(see Section 14.5.2., “LBR Stack (Pentium 4 Processors)”). The processor
clears this flag whenever a debug exception is generated (for example,
when an instruction or data breakpoint or a single-step trap occurs).
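
To make the read-then-re-arm order of the Sice routine easier to follow, here is a rough Python model of it. It is only a sketch: `FakeCpu` is a made-up stand-in for the three MSRs (it is not a hardware interface), the branch addresses are hypothetical, and only the MSR numbers 1D9h/1DBh/1DCh and the "LBR is cleared on a debug exception" rule come from the listing and the Intel text above.

```python
MSR_DEBUGCTL = 0x1D9          # "Debug Control"
MSR_LASTBRANCH_FROM = 0x1DB   # "Last Branch From EIP"
MSR_LASTBRANCH_TO = 0x1DC     # "Last Branch To EIP"
LBR_BIT = 0x1                 # bit 0 of the debug control MSR


class FakeCpu:
    """Toy model of the three MSRs; hypothetical, for illustration only."""

    def __init__(self):
        self.msr = {MSR_DEBUGCTL: 0, MSR_LASTBRANCH_FROM: 0, MSR_LASTBRANCH_TO: 0}

    def rdmsr(self, index):
        return self.msr[index]

    def wrmsr(self, index, value):
        self.msr[index] = value

    def take_branch(self, source, target):
        # Branches are only recorded while the LBR flag is set.
        if self.msr[MSR_DEBUGCTL] & LBR_BIT:
            self.msr[MSR_LASTBRANCH_FROM] = source
            self.msr[MSR_LASTBRANCH_TO] = target

    def debug_exception(self):
        # The processor clears LBR when a debug exception is generated.
        self.msr[MSR_DEBUGCTL] &= ~LBR_BIT


def record_last_branch(cpu):
    """Same order as the listing: read From and To, then set LBR again."""
    source = cpu.rdmsr(MSR_LASTBRANCH_FROM)
    target = cpu.rdmsr(MSR_LASTBRANCH_TO)
    cpu.wrmsr(MSR_DEBUGCTL, cpu.rdmsr(MSR_DEBUGCTL) | LBR_BIT)
    return source, target


cpu = FakeCpu()
cpu.wrmsr(MSR_DEBUGCTL, LBR_BIT)          # arm recording
cpu.take_branch(0x00C01234, 0x00401000)   # hypothetical jump to an OEP
cpu.debug_exception()                     # e.g. a BPM on the OEP fires
cpu.take_branch(0x80501000, 0x80502000)   # not recorded: LBR was cleared
last_from, last_to = record_last_branch(cpu)
print(hex(last_from), hex(last_to))
```

Because the flag is cleared by the debug exception, the recorded pair still describes the branch that led to the breakpoint, and the routine has to set LBR again before resuming.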


I don't know how something like this could best be implemented in Olly, but as you say the *mechanism* is there anyway.

Cheers,
Kayaker

ZaiRoN
February 10th, 2006, 08:24
Quote:
I'd like to check if Olly does this with *all* API calls.
As far as I have seen, Olly puts a breakpoint on the return address of every call (API or not) that is stepped in 'step over' mode (using F8). To find the code, look at exported functions like _Setbreakpointext and _Tempbreakpoint. From a quick glance, it seems address 4350AE is where Olly decides it all:
4350AE: cmp [LOCAL.2], 1
where:
LOCAL.2 == 0 if step into (F7)
LOCAL.2 == 1 if step over (F8)

At the moment I can't see the reason for a new *hidden* breakpoint...

blabberer
February 10th, 2006, 10:42
as to a temporary break: olly always sets a temporary break after all CALLS
doesn't matter if it is an api or not, if you use f8

to test it you can do something like

401034 call sub_401084
40103# <may be 5 bytes or two bytes>

let the sub look like this
401084 nop
401085 nop
401086 nop
401087 nop
401088 ret

set a breakpoint on 401086 (simple f2 breakpoint)
use right click -> new origin here on 0x401034
and step over it using f8
ollydbg will break inside that sub due to the breakpoint (at 0x401086)
now you can read memory and you will see 0x40103# has a temp breakpoint set
you can also confirm by pressing f9 (run) once you have broken on 401086

olly will neatly transfer control to 0x40103# and will break there as if you had just stepped over the earlier call
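
A toy Python sketch of the step-over decision described in this thread. This is not Olly's actual code: `call_length` and `plan_step` are hypothetical names, and only two CALL encodings are recognised (E8 rel32 and FF D0..D7 for `call r32`), which is enough to cover the "5 bytes or two bytes" cases in the test above.

```python
def call_length(code):
    """Return the length of a CALL at the start of `code`, or None.

    Only two encodings are handled in this sketch:
      E8 rel32     -> 5 bytes (direct near call)
      FF D0..D7    -> 2 bytes (call eax .. call edi)
    """
    if len(code) >= 5 and code[0] == 0xE8:
        return 5
    if len(code) >= 2 and code[0] == 0xFF and 0xD0 <= code[1] <= 0xD7:
        return 2
    return None


def plan_step(code, address, step_over):
    """Decide between a single step and a temporary breakpoint.

    Step over (f8) on a CALL places a one-shot breakpoint on the next
    instruction; everything else falls back to a plain single step.
    """
    length = call_length(code)
    if step_over and length is not None:
        return ("temp_breakpoint", address + length)
    return ("single_step", None)


# call sub_401084 located at 0x401034: rel32 = 0x401084 - 0x401039 = 0x4B
direct_call = bytes.fromhex("E84B000000")
print(plan_step(direct_call, 0x401034, step_over=True))
print(plan_step(direct_call, 0x401034, step_over=False))
```

With a breakpoint already planted after the call, a user breakpoint hit inside the sub does not lose the step: running on from there stops at the temporary breakpoint, which matches the f9 behaviour in the test.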

also, if one can't set a break on 0x7ffe0304 because the page protection
can't be modified

one can always memory search for the sequence in ntdll.dll
for
mov eax,const
mov edx,const
call edx

and write a simple macro to set a break on all returns (using the commandline plugin's macro execution)
or a more generic search like
mov r32,const
mov r32,const
call r32

or a matching sequence search like
mov r32,const
mov ra,const
call ra

the difference between the last two generic sequences being
the first one would also highlight
code sequences like
mov eax,const
mov edx,const <-- reg
call ecx <--- reg is different

whereas the ra form would highlight only those sequences
where the sequence is
mov edx,const
call edx <-- both uses of ra must be the same register
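
The two searches can be sketched as a plain byte scan in Python, assuming the usual 32-bit encodings (`mov r32,imm32` is B8+r followed by four bytes, `call r32` is FF D0+r). `find_syscall_stubs`, the sample bytes, the service id and the base address are all made up for illustration; this is not the commandline plugin's macro syntax.

```python
def find_syscall_stubs(code, base=0, require_same_reg=True):
    """Find `mov r32,imm32 / mov r32,imm32 / call r32` in a byte buffer.

    With require_same_reg=True the register loaded by the second mov must
    be the one that is called (the stricter search); with False any
    register combination matches (the generic r32 search).
    Returns the virtual addresses of the first mov of each hit.
    """
    hits = []
    for i in range(len(code) - 11):
        window = code[i:i + 12]
        if not (0xB8 <= window[0] <= 0xBF and 0xB8 <= window[5] <= 0xBF):
            continue
        if window[10] != 0xFF or not (0xD0 <= window[11] <= 0xD7):
            continue
        mov_reg = window[5] - 0xB8
        call_reg = window[11] - 0xD0
        if require_same_reg and mov_reg != call_reg:
            continue
        hits.append(base + i)
    return hits


# nop; mov eax,25h; mov edx,7FFE0300h; call edx; ret 2Ch;
# then a decoy: mov eax,1; mov edx,2; call ecx
sample = bytes.fromhex(
    "90 B825000000 BA0003FE7F FFD2 C22C00 B801000000 BA02000000 FFD1"
)
base = 0x7C90D000  # hypothetical load address of the scanned block

strict_hits = find_syscall_stubs(sample, base, require_same_reg=True)
generic_hits = find_syscall_stubs(sample, base, require_same_reg=False)
print([hex(a) for a in strict_hits], [hex(a) for a in generic_hits])
```

Adding 12 to each hit gives the address of the instruction after `call edx`, which is where the break on return would go.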

an int2e that is stepped over will return to the eip pointed to by the PCONTEXT.EIP passed to NtContinue()
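
As a small illustration of the [[esp+4]+0xb8] lookup, here is a Python sketch over a toy memory map. 0xB8 is the offset of Eip in the x86 CONTEXT structure; the function name and every address in the example are hypothetical.

```python
CONTEXT_EIP_OFFSET = 0xB8  # offset of Eip inside the x86 CONTEXT structure


def eip_from_ntcontinue_args(read_dword, esp):
    """Resolve [[esp+4]+0xB8] at the entry of NtContinue.

    At entry [esp] is the return address and [esp+4] is the first
    argument, the PCONTEXT. `read_dword` is any callable that returns
    the 32-bit value stored at an address.
    """
    context_ptr = read_dword(esp + 4)
    return read_dword(context_ptr + CONTEXT_EIP_OFFSET)


# Toy memory: address -> dword, with made-up addresses.
esp = 0x0012FC00
context_ptr = 0x0012FD00
memory = {
    esp + 4: context_ptr,
    context_ptr + CONTEXT_EIP_OFFSET: 0x00401128,
}

resume_eip = eip_from_ntcontinue_args(memory.__getitem__, esp)
print(hex(resume_eip))
```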

nikolatesla20
February 10th, 2006, 11:30
Ok guys, I guess I have to do some more testing. I just noticed in ActiveMark-protected targets that on some exceptions that go through SYSENTER, if I tried to F8 into them Olly would just do nothing. The program would continue "running" but would never actually run, and Olly just sat there saying "running" while nothing happened. So it was like the whole thing was lost in hyperspace. That's why I was thinking Olly probably cannot handle SYSENTER very well. But perhaps there is another way, as blabberer said.

Long ago I wrote a driver to read MSR regs. What I did for Asprotect programs was debug them and wait until the 12th exception, then overwrite the entire code section with int3's and let it continue. That would cause an int3 exception, which my debugger would catch, and then it would query the driver for LastBranchFromIP so I could get where the call to OEP came from. From there you could use SI and get the stolen bytes. (It worked that way at one time, until Alexy deleted the stolen code, hehe.)
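
The "overwrite the code section with int3's" step can be sketched in Python on a plain byte buffer. This is only an illustration of the idea, not the original debugger: `fill_with_int3` and `restore_bytes` are hypothetical helpers, and the buffer stands in for the target's code section.

```python
INT3 = 0xCC  # one-byte breakpoint opcode


def fill_with_int3(image, start, size):
    """Overwrite image[start:start+size] with int3 and return the old bytes."""
    if start < 0 or size < 0 or start + size > len(image):
        raise ValueError("range lies outside the image")
    original = bytes(image[start:start + size])
    image[start:start + size] = bytes([INT3]) * size
    return original


def restore_bytes(image, start, original):
    """Put the saved bytes back once the int3 exception has been caught."""
    image[start:start + len(original)] = original


# Toy "code section": push ebp; mov ebp,esp; nop; ret
section = bytearray.fromhex("55 8BEC 90 C3")
saved = fill_with_int3(section, 0, len(section))
filled = bytes(section)          # what the target sees while trapped
restore_bytes(section, 0, saved)
print(filled.hex(), section.hex())
```

Keeping the saved copy matters because the section has to be restored before execution can continue at the OEP after the last-branch value has been read.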

here was the code in the driver actually..

Code:

////////////////////////////////////////////////////////////////////////
// GetMSRDevice::GETMSR_READMSR_Handler
//
// Routine Description:
// Handler for IO Control Code GETMSR_READMSR
//
// Parameters:
// I - IRP containing IOCTL request
//
// Return Value:
// NTSTATUS - Status code indicating success or failure
//
// Comments:
// This routine implements the GETMSR_READMSR function.
// This routine runs at passive level.
//

NTSTATUS GetMSRDevice::GETMSR_READMSR_Handler(KIrp I)
{
    NTSTATUS status = STATUS_SUCCESS;

    t << "Entering GetMSRDevice::GETMSR_READMSR_Handler, " << I << EOL;
    // TODO: Verify that the input parameters are correct
    //       If not, return STATUS_INVALID_PARAMETER

    // TODO: Handle the GETMSR_READMSR request, or
    //       defer the processing of the IRP (i.e. by queuing) and set
    //       status to STATUS_PENDING.

    // TODO: Assuming that the request was handled here. Set I.Information
    //       to indicate how much data to copy back to the user.
    unsigned long Buffer_Data = 0;

    // Get the buffer data sent in (if any)
    memcpy(&Buffer_Data, I.IoctlBuffer(), sizeof(Buffer_Data));

    //t << "Buffer_Data is " << Buffer_Data << EOL;

    //t << "Output buffer size is " << I.IoctlOutputBufferSize() << EOL;

    // Read the MSR

    // DEBUGCTLMSR is 0x1D9
    // LASTBRANCHFROMIP is 0x1DB
    // LASTBRANCHTOIP is 0x1DC
    // LASTINTFROMIP is 0x1DD
    // LASTINTTOIP is 0x1DE

    // How to read an MSR register:
    // Load ECX with the register's address (above)
    // execute the rdmsr command
    // Values returned in EDX : EAX - EDX is high dword, EAX is low dword.

    _asm
    {
        int 3                   // debug break left over from testing
        push eax
        push ecx
        mov ecx, 0x1DB          // MSR index is hard-coded: LASTBRANCHFROMIP
        rdmsr
        mov Buffer_Data, eax    // only the low dword (EAX) is kept
        pop ecx
        pop eax
    }

    // Copy the value of the MSR back out to caller
    memcpy(I.IoctlBuffer(), &Buffer_Data, sizeof(Buffer_Data));

    I.Information() = sizeof(Buffer_Data);

    return status;
}



Oops, I noticed I have an int3 in the driver code. I think that was for testing purposes only at the time, so don't do that!
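
The handler above copies only EAX back to the caller, which is enough for a 32-bit branch address. If both halves of an MSR were returned, the caller could join them as in this Python sketch; the two helper names and the sample values are hypothetical.

```python
def combine_msr(edx, eax):
    """Join the rdmsr result: EDX is the high dword, EAX the low dword."""
    return ((edx & 0xFFFFFFFF) << 32) | (eax & 0xFFFFFFFF)


def split_msr(value):
    """Split a 64-bit MSR value into the (EDX, EAX) pair wrmsr expects."""
    return (value >> 32) & 0xFFFFFFFF, value & 0xFFFFFFFF


full = combine_msr(0x00000001, 0x00401000)
print(hex(full), [hex(part) for part in split_msr(full)])
```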


Although on my older Win2K systems SI would report LastBranch and such if you used a BPM, I noticed on my XP system it would not do so. Perhaps, as you said Kayaker, the last-branch recording needs to be enabled on the processor first (both are Pentium IIIs).



-nt20

Kayaker
February 11th, 2006, 00:39
Quote:
olly always sets a temporary break on all CALLS ... if you use f8


In retrospect that was a silly observation on my part. The logical retort is - of course a debugger sets a BP if you step over a call else how would it be able to continue tracing you silly bugger? It was an error in my testing, while I could *see* the embedded 0CCh in the user code using Softice, I didn't see one when stepping over a call in say user32.dll. I wondered why there was a difference, but it was just because I didn't do it right, the effect is apparent now.

What it does indicate though is that one can use Softice effectively with Olly. You can step over a call in ring3, but if you're interested in any kernel mode stuff it does you can still set a Softice BP. Once you leave Sice with F5, control will return automatically to the call you stepped over back in Olly.
Oh no, not a marriage!?


That was sort of neat nico. When you overwrote the entire code section with Int3's, this caused an exception when code execution went from high mem Asp to the OEP in 400000 range? And this was the trigger and stoppage in execution that allowed you to use LastBranch. Cool, that shows that LastBranch has some nifty uses. Reading the Intel docs it seems there are some really interesting possibilities using this feature of the MSR's, particularly the function of Branch Trace Messages (BTMs) and Branch Trace Store (BTS).

This is worth some further cogitation.
..envisions a cow standing in a field slowly chewing its cud over the problem..

Kayaker