Ring3 Circus
April 13th, 2008, 22:22
Things have been quiet over here since installing Life 2.0, so to start warming things up again I present a simple trick to counter a frustrating problem. This isn’t particularly clever, but it didn’t occur to me the first few times ’round, so maybe it will save some newbies a little time.
How many times have you attempted a run-trace in Olly with a break-condition set, only to find that the break never occurs and you have to start over? It happens to me all the time, and it’s usually my fault. But sometimes the cause isn’t a poorly thought-out condition, but the presence of a system call during the trace. Being a lowly ring3 process, OllyDbg doesn’t have permission to follow the CPU’s execution of your program into kernel space, so when it encounters a SYSCALL, SYSENTER, or CALL/JMP FAR out of the debuggee’s address-space, no choice remains other than to let the process run away and hope it comes back. As we’ve established, this doesn’t always happen.
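To make the list of offending instructions concrete, here is a rough Python sketch that checks whether a byte string starts with one of them. The table and the `classify_transition` helper are my own illustration, not anything Olly exposes; it only covers the plain x86 encodings (SYSCALL = 0F 05, SYSENTER = 0F 34, direct CALL FAR = 9A, direct JMP FAR = EA) and deliberately ignores instruction prefixes and the indirect far forms.

```python
# Leading opcode bytes of instructions that leave ring3 or the current
# code segment. Hypothetical helper table for illustration only.
KERNEL_TRANSITIONS = {
    b"\x0f\x05": "SYSCALL",
    b"\x0f\x34": "SYSENTER",
    b"\x9a": "CALL FAR",  # direct far call (ptr16:32)
    b"\xea": "JMP FAR",   # direct far jump (ptr16:32)
}


def classify_transition(code: bytes):
    """Return the mnemonic if `code` begins with a known transition opcode, else None."""
    for opcode, name in KERNEL_TRANSITIONS.items():
        if code.startswith(opcode):
            return name
    return None


if __name__ == "__main__":
    print(classify_transition(b"\x0f\x34"))      # SYSENTER
    print(classify_transition(b"\x90\x90"))      # None (two NOPs)
```

Scanning the last few instructions of a failed run trace for one of these is usually enough to confirm that a kernel transition is what killed it.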
Now, the most commonly encountered system calls are those in ntdll, but these pose no problem to us. Set ‘Always trace over system DLLs’ in Olly’s ‘Trace’ options and you’ll never need to trace over such an instruction directly. (If this still isn’t working for you, be sure to ‘Mark [ntdll] as system DLL’ in the ‘Executable modules’ dialog.) But if you play with some nastier targets, such kernel transitions will occur in modules you need the trace data for.
Olly intrinsically supports three modes of tracing (although it doesn’t give you much control over which it uses): hardware breakpoints, software breakpoints and execution trapping (using the EFlags ‘trap bit’). Unfortunately, given the way Windows implements its kernel transitions, we can’t use any of these to our advantage. Arguments are passed to these system calls via the general-purpose registers, rather than by the stack-frame system we’re all used to, and moreover the return address isn’t always that of the instruction following the system call. Hence without complete knowledge of these ‘fastcall’ specifications, there’s no way to be sure where user-mode execution will resume, and so we’re at a loss when it comes to placing the next breakpoint. For this reason, there is no one-size-fits-all solution that I’m aware of, but it is possible to clean up this little mess on a per-case basis with only minimal effort.
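For anyone who hasn’t met the ‘trap bit’ before: it is the trap flag (TF), bit 8 of EFLAGS, and while it is set the CPU raises a single-step exception after each instruction, which is what lets a ring3 debugger step at all. The Python below is only a bit-twiddling sketch of that flag; the function names are made up for illustration and have nothing to do with Olly’s internals.

```python
TRAP_FLAG = 1 << 8  # TF is bit 8 of EFLAGS, i.e. mask 0x100


def set_trap_flag(eflags: int) -> int:
    """Return EFLAGS with TF set, so the next instruction single-steps."""
    return eflags | TRAP_FLAG


def clear_trap_flag(eflags: int) -> int:
    """Return EFLAGS with TF cleared, keeping the value within 32 bits."""
    return eflags & ~TRAP_FLAG & 0xFFFFFFFF


def is_single_stepping(eflags: int) -> bool:
    """True if TF is set in the given EFLAGS value."""
    return bool(eflags & TRAP_FLAG)


if __name__ == "__main__":
    flags = 0x202                      # a typical user-mode value: IF plus reserved bit 1
    stepped = set_trap_flag(flags)
    print(hex(stepped))                # 0x302
    print(is_single_stepping(stepped)) # True
```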
Take a look at the run trace produced by your failed run. It’s probably a lot shorter than you expected, with its dying instructions indicating an upcoming transition to ring0. Scroll up a little and work out a location where execution is very likely to pass through shortly after ring3 execution resumes. This doesn’t need to be the actual return address, but ideally something that follows soon after. Usually, the saved return-address belonging to the function at the top of the call stack does a good job. Our goal is to trap execution at this point and resume the trace. Of course, we could do this manually by placing a breakpoint and keeping our fingers on ctrl-F11, but that’s the dumb way. The last piece of the puzzle is to make OllyDbg automate this for us: employ a conditional breakpoint, with an impossible condition (say, 1 == 2). Because the condition can never be true, Olly never actually pauses there, but hitting the breakpoint still hands control back to the debugger, which then carries on tracing from that point. Now our trace executes over the system call flawlessly without the need for us to lift a finger.
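If the mechanics are still fuzzy, the following toy model in Python may help. It is emphatically not how OllyDbg is implemented: the ‘program’ is just a list of mnemonics indexed by a fake address, execution is assumed to be strictly linear, and `run_trace` is a name I made up. It only shows why a breakpoint with a never-true condition lets the trace pick up again after the tracer loses control at a system call.

```python
def run_trace(program, breakpoints, stop_condition):
    """Toy run trace. `program` is a list of mnemonics indexed by address,
    `breakpoints` maps address -> condition callable, and `stop_condition`
    is the trace's break condition. Returns (logged_addresses, outcome)."""
    log = []
    pc = 0
    while pc < len(program):
        log.append(pc)
        if stop_condition(pc):
            return log, "break condition hit"
        if program[pc] == "sysenter":
            # The tracer cannot follow into ring0, so the process runs free
            # until some later breakpoint hands control back.
            resume = next(
                (a for a in range(pc + 1, len(program)) if a in breakpoints),
                None,
            )
            if resume is None:
                return log, "process ran away"
            if breakpoints[resume](resume):
                return log, "paused at breakpoint"
            # Condition was false: don't pause, just keep tracing from here.
            pc = resume
            continue
        pc += 1
    return log, "program ended"


if __name__ == "__main__":
    program = ["mov", "push", "sysenter", "add", "ret", "cmp", "jne"]
    wanted = lambda addr: addr == 5          # the break condition we care about
    print(run_trace(program, {}, wanted))
    print(run_trace(program, {4: lambda addr: 1 == 2}, wanted))
```

Without the breakpoint the trace dies at the `sysenter`; with the impossible-condition breakpoint at address 4 it skips the untraceable stretch, resumes, and reaches the break condition. Note that the instruction at address 3 never makes it into the log, which is why you want the breakpoint as soon after the real return address as you can manage.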
http://www.ring3circus.com/rce/tracing-over-system-calls-in-ollydbg/