OpenRCE_omega_red
November 24th, 2007, 18:50
During development of my unfinished crackme I encountered several interesting discrepancies in exception handling across various OS/VM configurations.
The first of them caused my 32-bit code to run fine on 32-bit XP but crash on 64-bit XP. The cause was a difference in the behaviour of (Rtl)RaiseException: on 32-bit XP it captured the full thread context, but on 64-bit XP it didn't - the debug registers were missing. That caused problems, because I wanted to play with the DRs in my SEH handler.

The second "trick" is the behaviour of the system exception dispatcher regarding DR6 and its status bits. Intel docs say that the CPU itself never clears these bits after a hardware breakpoint occurs. 32-bit Windows clears them though, but this also depends on whether the OS is running in a VM or not...
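To make it concrete which bits the dr6 test is about, here is a minimal C sketch of the DR6 status bits as laid out in the Intel manuals. The helper names (dr6_status_bits, dr6_bp_flagged, dr6_hit_index) are hypothetical and are not taken from the FASM examples; the sketch only decodes a value and does not read the register itself.

```c
#include <assert.h>
#include <stdint.h>

/* DR6 status bits, per the Intel SDM debug register layout. */
#define DR6_B0 (1u << 0)   /* breakpoint condition for DR0 detected */
#define DR6_B1 (1u << 1)   /* breakpoint condition for DR1 detected */
#define DR6_B2 (1u << 2)   /* breakpoint condition for DR2 detected */
#define DR6_B3 (1u << 3)   /* breakpoint condition for DR3 detected */
#define DR6_BD (1u << 13)  /* debug register access detected */
#define DR6_BS (1u << 14)  /* single step */
#define DR6_BT (1u << 15)  /* task switch */

#define DR6_STATUS_MASK \
    (DR6_B0 | DR6_B1 | DR6_B2 | DR6_B3 | DR6_BD | DR6_BS | DR6_BT)

/* Hypothetical helper: keep only the status bits, drop the reserved ones. */
uint32_t dr6_status_bits(uint32_t dr6)
{
    return dr6 & DR6_STATUS_MASK;
}

/* Hypothetical helper: nonzero if breakpoint `index` (0..3) is flagged. */
int dr6_bp_flagged(uint32_t dr6, unsigned index)
{
    assert(index < 4);
    return (int)((dr6 >> index) & 1u);
}

/* Hypothetical helper: lowest flagged breakpoint index, or -1 if none. */
int dr6_hit_index(uint32_t dr6)
{
    for (unsigned i = 0; i < 4; i++) {
        if (dr6_bp_flagged(dr6, i))
            return (int)i;
    }
    return -1;
}
```

In a handler, the input would be the DR6 value found in the context record. Under the assumption that the test compares DR6 across two consecutive exceptions, a system that clears the bits would give 0 from dr6_status_bits once the breakpoint exception has been handled, while one that preserves them would still report the old B bit.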
The third glitch is similar to the old "prefetch queue" tricks. I didn't have time to investigate it much - maybe its mechanism is completely different, but it surely looks familiar. Let's say we execute a UD2 instruction or cause an exception in some other way. Then, in the SEH handler, we set a hardware breakpoint on the instruction immediately following the one that caused the exception. What will happen after returning from the handler? It depends... On some systems the breakpoint will be triggered, on some not... Adding a single NOP between the exception trigger and the breakpoint target ensures that the breakpoint always hits.
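The handler-side bookkeeping for this test can be sketched in C as follows. This is only an illustration under stated assumptions: struct debug_regs is a hypothetical stand-in for the relevant fields of a thread context (not the Windows CONTEXT structure), the function names are made up, and the real test is written in FASM. What the sketch relies on is that UD2 encodes as two bytes (0F 0B), that a single-byte NOP is 90, and that an execution breakpoint in DR7 uses R/W = 00 and LEN = 00 with the Ln bit as local enable.

```c
#include <assert.h>
#include <stdint.h>

#define UD2_LENGTH 2u  /* UD2 encodes as 0F 0B */
#define NOP_LENGTH 1u  /* single-byte NOP, opcode 90 */

/* Hypothetical stand-in for the context fields the handler touches. */
struct debug_regs {
    uint32_t eip;    /* where execution resumes after the handler */
    uint32_t dr[4];  /* DR0..DR3: breakpoint addresses */
    uint32_t dr7;    /* debug control register */
};

/* Enable breakpoint `index` (0..3) as a local execution breakpoint. */
uint32_t dr7_enable_exec_bp(uint32_t dr7, unsigned index)
{
    assert(index < 4);
    /* R/W and LEN fields both 00: break on instruction execution. */
    dr7 &= ~(0xFu << (16 + 4 * index));
    /* Ln: local enable bit for this breakpoint. */
    dr7 |= 1u << (2 * index);
    return dr7;
}

/* Resume right after the faulting UD2 and arm DR0 on the target.
 * With pad_with_nop set, the target sits one NOP further on. */
void arm_bp_after_ud2(struct debug_regs *regs, uint32_t fault_eip,
                      int pad_with_nop)
{
    uint32_t resume = fault_eip + UD2_LENGTH;
    uint32_t target = resume + (pad_with_nop ? NOP_LENGTH : 0u);

    regs->eip = resume;
    regs->dr[0] = target;
    regs->dr7 = dr7_enable_exec_bp(regs->dr7, 0);
}
```

Without padding, the resume address and the breakpoint address are the same, which is the case that behaves differently between systems. With padding, one NOP executes before the breakpoint target is reached.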
The following table contains the results of my experiments in various environments. The number after the OS name indicates whether it's the 32-bit or 64-bit version; a number after the slash means that the OS is running in VMware Server on a 32-bit or 64-bit XP host.
2k 32 = 32bit 2k on real machine
xp 32/64 = 32bit XP inside VM hosted on 64bit XP
Most OSes were fully updated.
The dr6 column shows whether the OS clears the DR6 status bits after a hardware breakpoint hits.
The context column shows whether (Rtl)RaiseException captures the full CPU context or misses the debug registers.
The prefetch column shows the last test - whether the tricky breakpoint gets hit without the "padding" NOP or not.
os            dr6       context  prefetch
------------------------------------------
2k 32         clear     full     ?
2k 32/32      preserve  full     hit
2k 32/64      clear     full     hit
xp 32         clear     full     hit
xp 32/32      preserve  full     hit
xp 32/64      preserve  full     hit
xp 64         clear     partial  miss
2k3 32        clear     partial  ?
2k3 32/32     preserve  partial  miss
2k3 32/64     preserve  partial  miss
vista 32      clear     partial  ?
vista 32/64   preserve  full     hit
vista 64/64   clear     full?    hit
vista 64/64: In the "context capturing" test I got different results today than some time ago. I'm not sure what caused the difference - maybe the earlier test was run on a pre-release build, maybe some updates...
FASM code for all examples:
http://omeg.pl/code/exception_glitches.zip
https://www.openrce.org/blog/view/835/Dancing_with_exceptions