Bypassing Debug Registers Protection [Archive] - RCE Messageboard's Regroupment

Opcode

February 29th, 2004, 21:37

I wrote an article describing a new method to bypass debug register protections, in this particular case those of a rootkit detector named PatchFinder.
Maybe the method will be useful to someone.

The article:
http://packetstormsecurity.org/filedesc/bypassEPA.pdf.html

Regards,
Opcode

volodya

March 1st, 2004, 13:12

The idea is funny. Definitely worth thinking about

Opcode

March 1st, 2004, 13:53

Quote:

[Originally Posted by volodya]The idea is funny. Definitely worth thinking about

Why is the idea funny?
Can you teach us another method to bypass a doubleword memory location that is read/write protected by a debug register, with the GD flag in DR7 set?

I'd really appreciate your answer, volodya.

P.S.: Did you read the PatchFinder2 source code?

volodya

March 1st, 2004, 14:32

Irony is a mighty thing, my friend

The answer is: it depends on who is first on int1.
Anyway, the idea is interesting and I'm going to thoroughly investigate the sources of Joanna's tool, so thank you again

Opcode

March 1st, 2004, 14:51

Quote:

[Originally Posted by volodya]
The answer is: it depends on who is first on int1.

That's the whole point of my article.
It's obvious that the rootkit detector is the first on int1.

I expect to soon publish the first rootkit undetectable by PatchFinder.

Regards,
Opcode

mmk

March 1st, 2004, 20:45

Opcode, I read your article on circumventing the simple debug register protection. There are many ways to write to the IDT. You could write all 8 bytes to IDT[1] at once (use CMPXCHG8B) and then you'll get a debug breakpoint due to the write breakpoint @ IDT[1]. Of course, you were smart enough to write your handler to IDT[1] so you'll now get control. Now you have control of the debug registers.

Or you could set up your own IDT, install your own INT1 handler, generate a debug breakpoint, and your handler now has control of the debug registers. From here, restore the original IDTR and do whatever you wish with the debug regs.

The protection they use is lame. Obviously they don't know much about the Intel architecture.

Happy rootkitting

Opcode

March 1st, 2004, 22:37

Quote:

[Originally Posted by mmk]There are many ways to write to the IDT. You could write all 8 bytes to IDT[1] at once (use CMPXCHG8B) and then you'll get a debug breakpoint due to the write breakpoint @ IDT[1].

Hi, mmk.
Thanx for reading my article.

I've never used the CMPXCHG8B instruction, but correct me if I'm wrong:

1. The PatchFinder utility, once installed, protects all 8 bytes of IDT[1] with DR0 and DR1 (write protection).
2. It enables the GD bit in DR7. If any driver attempts to access any DRx register, PatchFinder will see it.

So, if I try to use CMPXCHG8B on IDT[1], I think it won't work because this will trigger DR0 and DR1! Do you agree?

Look at the following in the driver.c source code of PatchFinder:

Code:

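#ifdef DR_PROTECTION
	KdPrint (("pf: setting hardware protection...\n"));

	/* DR0/DR1: write breakpoints on the two dwords of the INT1 gate in the IDT */
	dbProtect (DB_DR0, (int)getIntGateAddr(NT_DEBUG_INT), DB_LEN_4B, DB_PROT_WRITE);
	dbProtect (DB_DR1, (int)getIntGateAddr(NT_DEBUG_INT)+4, DB_LEN_4B, DB_PROT_WRITE);
	/* DR2: read/write breakpoint on the start of the detector's own handler */
	dbProtect (DB_DR2, (int)NewDebugHandler1, DB_LEN_4B, DB_PROT_RW);
	/* DR7.GD: trap any access to the debug registers themselves */
	dbSetGeneralProtection ();
#endif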
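
To see what those calls amount to at the register level, here is a minimal C sketch of the DR7 value they would produce. It only does the bit arithmetic in user mode and never touches a real debug register. The helper names (dr7_enable_bp, patchfinder_like_dr7) are made up for illustration, and using the global-enable (Gn) bits rather than the local ones is an assumption; PatchFinder's dbProtect may do it differently.

```c
#include <assert.h>
#include <stdint.h>

/* R/Wn field encodings in DR7: which kind of access triggers the breakpoint. */
enum { RW_EXECUTE = 0, RW_WRITE = 1, RW_READ_WRITE = 3 };
/* LENn field encodings in DR7: size of the watched location. */
enum { LEN_1 = 0, LEN_2 = 1, LEN_4 = 3 };

#define DR7_GD (1u << 13)   /* general detect: fault on any MOV to/from DRx */

/* Return dr7 with breakpoint n (0..3) globally enabled for the given
   access type and length. Hypothetical helper, not PatchFinder's code. */
static uint32_t dr7_enable_bp(uint32_t dr7, unsigned n, unsigned rw, unsigned len)
{
    assert(n < 4 && rw < 4 && len < 4);
    unsigned shift = 16 + 4 * n;          /* R/Wn and LENn live in bits 16..31 */
    dr7 &= ~(0xFu << shift);              /* clear the old 4-bit field */
    dr7 |= (rw | (len << 2)) << shift;    /* R/Wn in the low 2 bits, LENn above */
    dr7 |= 1u << (2 * n + 1);             /* Gn: global enable (assumption) */
    return dr7;
}

/* DR7 equivalent to the four calls quoted above, under the same assumption. */
static uint32_t patchfinder_like_dr7(void)
{
    uint32_t dr7 = 0;
    dr7 = dr7_enable_bp(dr7, 0, RW_WRITE, LEN_4);       /* IDT[1], low dword  */
    dr7 = dr7_enable_bp(dr7, 1, RW_WRITE, LEN_4);       /* IDT[1], high dword */
    dr7 = dr7_enable_bp(dr7, 2, RW_READ_WRITE, LEN_4);  /* handler code       */
    return dr7 | DR7_GD;
}
```

Note that R/W0 and R/W1 are "write" (01b), which is the data-breakpoint case the rest of this thread turns on.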

volodya

March 1st, 2004, 22:48

Opcode, consider another idea. We are in a multi-process system! Each and every task (including your beloved drivers) gets its quanta of time from the OS scheduler. What prevents one from disconnecting the driver from the scheduler, eh?

Opcode

March 1st, 2004, 22:59

Quote:

[Originally Posted by volodya]Opcode, consider another idea. We are in a multi-process system! Each and every task (including your beloved drivers) gets its quanta of time from the OS scheduler. What prevents one from disconnecting the driver from the scheduler, eh?

Hi, volodya

If you read the book "Inside Windows 2000" you will see that the
scheduler is essentially controlled by 3 structures:

- KiDispatcherReadyListHead
- KiWaitInListHead
- KiWaitOutListHead

So, I can hook/change these structures and intercept any access to them.
But suppose that such a "disconnector" driver exists. What parameters
should this driver use to find the correct driver to disconnect from
the scheduler? Heuristics??? Binary pattern search??? Come on!!!

The objective of my research is to show that there is no safe way to protect
a system against rootkit technology, because the kernel structure of
Windows is poorly architected from a security point of view.
The IPD was a good idea, but it failed.

Regards,
Opcode

P.S.: I don't really like drivers, I like ring zero programming

sgdt

March 1st, 2004, 23:39

Ring 0 is definitely fun. Drivers can be fun too. Using the "standard" way of getting ring 0 (drivers) isn't as boring as you make it out to be, there is still plenty of fun to be had.

As you can probably guess from my user name (sgdt), I too enjoy other ways of getting to ring 0...

mmk

March 2nd, 2004, 00:14

Quote:

[Originally Posted by Opcode]1. The PatchFinder utility, once installed, protects all 8 bytes of IDT[1] with DR0 and DR1 (write protection).

The IDT is still writable (I assume PatchFinder doesn't write-protect it; it only sets two DWORD write BPs at IDT[1]). The debug breakpoint exception occurs only AFTER the write has occurred. Thus, if you write a new IDT[1] entry with the CMPXCHG8B instruction, it will be your debug handler that will be called and not whatever was there before the write.

Quote:

2. It enables the GD bit in DR7. If any driver attempts to access any DRx register, PatchFinder will see it.

You don't need to use the debug registers.

Quote:

So, if I try to use CMPXCHG8B on IDT[1], I think it won't work because this will trigger DR0 and DR1! Do you agree?

You will get a debug breakpoint, but since you wrote your handler to IDT[1], you will get control, and not PatchFinder. Remember that for data breakpoints, the debug breakpoint exception is generated AFTER the read or write to the address with the breakpoint. Execution breakpoints are different since they occur BEFORE an instruction is executed.

I checked the Intel manual and CMPXCHG8B might not be so good because you'd have to know the current value of IDT[1]. But you could use MMX instructions to write to it instead, e.g. MOVQ [esi+8],MM0 (esi=idt).

Another way to get control is to create your own temp IDT and generate a debug exception.

cli
lidt my_idtr
icebp ; DB 0F1h

...
my_idtr:
dw 0Fh
dd linear addr of temp idt

temp idt:
dd ?,?
dd x,y ; your INT1 descriptor here

Whenever a debug exception is generated, the processor clears the DR7.GD bit and sets some DR6 bits. Before returning, you would want to set DR7.GD again, clear some bits in DR6 that got set due to the debug exception, and restore the old IDTR.

There are other ways of detecting extra CPU cycles from rootkits. Here are a few:

* Use RDTSC to count processor clock cycles
* Use some performance monitoring MSRs to read processor-specific counters. Check the Intel/AMD manuals for processor specific MSRs. There are lots of them that can be used.

Expect future "rootkit detectors" to use these techniques. But remember that you can also stop these MSRs.

Kayaker

March 2nd, 2004, 01:56

Hi

There's a lot of interesting systems information coming out of this rootkit stuff, including the BlackHat papers and SANS conference publications by Hoglund, Rutkowska, etc. Getting a deeper understanding of it all is definitely on my to-do list.

Opcode's interesting article and Volodya's comment on multiprocessor systems allow me to ask a question I've been wondering about for a while. As I recall, each processor sets up its own IVT and when switching from real->protected mode for the first time makes a copy of it with LIDT as an IDT. I was playing with some old IDT hooking code on my new multiprocessor system when I noticed that the SIDT command would somewhat randomly retrieve either IDT table address, depending it seemed on which processor was currently active.

The second 'processor' is an Intel ICH5R South bridge controller chip that supposedly integrates various I/O functions associated with IDE, APIC, USB, PCI functions etc. I assume the system flip flops from this subsystem processor to the main 82865PE chip as required.

I noticed this behaviour because in my original code (written on an older single processor system), I used the SIDT command to retrieve the IDT in both my hooking AND unhooking code. i.e. I didn't save the "original" address of the IDT that I hooked for later unhooking, instead just blindly assumed SIDT would retrieve the same address. On my new 2 processor system this would sometimes fail and I would end up unhooking a different IDT table than the one I originally hooked. If I used the Softice IDT command at the time, it too would confirm that either one or the other IDT table addresses was active.

So my point/question then is, Opcode's code also uses the SIDT command (as does most generic IDT hooking code), but is it possible that it might retrieve the address of one of multiple IDT tables? Or does it matter and this is of no concern?

On my Win2K system the "main" IDT is at 80036400, which I believe is what most people have. The second IDT is an almost duplicate at 820A57E8. I say 'almost' because INT1 and INT3 have slightly different addresses when Softice is loaded, but they do lead to the same interrupt handler in Compuware's cpthook.sys driver.

If one were to do a byte search in memory for one of the interrupt addresses listed in the main IDT, one could probably find this second duplicate IDT (assuming you have a 2-processor system). As an example, here are the beginning of my 2 IDT's as listed by Sven Schreiber's w2k_spy:

Code:



Main processor IDT:
80036400..800364FF: 256 valid bytes

Address  | 0000 0002-0004 0006 : 0008 000A-000C 000E | 00 02 04 06 08 0A 0C 0E
---------|---------------------:---------------------|------------------------
80036400 | 6CF0 0008-8E00 8046 : 7729 0008-EE00 B3A8 | l= .. ?. ?F w) .. e. ¦¿
80036410 | 7747 0008-8E00 B3A8 : 7765 0008-EE00 B3A8 | wG .. ?. ¦¿ we .. e. ¦¿
80036420 | 7284 0008-EE00 8046 : 73C8 0008-8E00 8046 | r? .. e. ?F s+ .. ?. ?F
...

Subsystem processor IDT (so it seems):
820A57E8..820A58E7: 256 valid bytes

Address  | 0008 000A-000C 000E : 0010 0012-0014 0016 | 08 0A 0C 0E 10 12 14 16
---------|---------------------:---------------------|------------------------
820A57E8 | 6CF0 0008-8E00 8046 : 7738 0008-EE00 B3A8 | l= .. ?. ?F w8 .. e. ¦¿
820A57F8 | 7756 0008-8E00 B3A8 : 7774 0008-EE00 B3A8 | wV .. ?. ¦¿ wt .. e. ¦¿
820A5808 | 7284 0008-EE00 8046 : 73C8 0008-8E00 8046 | r? .. e. ?F s+ .. ?. ?F
...

Anyway, I just wanted to bring up this multiple IDT situation and wonder how it fits in to any rootkit scenario where IDT patching is involved.
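
For anyone comparing the two dumps by hand, the rows can be decoded with a few shifts. Below is a minimal C sketch, assuming the dump prints each 8-byte entry as four little-endian 16-bit words in memory order (offset low, selector, access word, offset high), which is the standard 32-bit gate layout. The struct and function names are invented for this example; the sample words are copied from the first row of each dump.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t offset;    /* linear address of the handler */
    uint16_t selector;  /* code segment selector */
    uint8_t  type;      /* gate type; 0xE = 32-bit interrupt gate */
    uint8_t  dpl;       /* privilege level required to INT n into it */
    int      present;   /* P bit */
} gate_info;

/* w[0..3]: the four 16-bit words of one IDT entry, in memory order. */
static gate_info decode_gate(const uint16_t w[4])
{
    gate_info g;
    assert(w != 0);
    g.offset   = (uint32_t)w[0] | ((uint32_t)w[3] << 16);
    g.selector = w[1];
    g.type     = (uint8_t)((w[2] >> 8) & 0x0F);
    g.dpl      = (uint8_t)((w[2] >> 13) & 0x3);
    g.present  = (w[2] >> 15) & 1;
    return g;
}

/* Words taken from the dumps above. */
static const uint16_t MAIN_INT0[4]   = { 0x6CF0, 0x0008, 0x8E00, 0x8046 };
static const uint16_t MAIN_INT1[4]   = { 0x7729, 0x0008, 0xEE00, 0xB3A8 };
static const uint16_t SECOND_INT1[4] = { 0x7738, 0x0008, 0xEE00, 0xB3A8 };
```

Decoded this way, INT0 points at 0x80466CF0 in both tables, while INT1 points at 0xB3A87729 in the first table and 0xB3A87738 in the second, both with DPL 3. That is the "slightly different addresses" difference mentioned above.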

Cheers,
Kayaker

mmk

March 2nd, 2004, 02:35

1. I hope you're tired Kayaker because your "Intel ICH5R SB" cannot execute x86 code, or any code for that matter.

2. If you have more than one logical (hyperthreading) or physical processor, you will have more than one IDT. Each processor has its own IDTR register and thus it must be initialized. It can be the same value between different processors but I guess MS has a reason for not doing that.

3. The OS switches tasks, but a rootkit is part of the OS and can easily disable interrupts. That's why my example code used "cli" before switching to the temp IDT; otherwise an interrupt could have occurred afterwards, leading to a CPU reset.

Opcode

March 2nd, 2004, 08:56

Hi, Kayaker and mmk.

Unfortunately, I don't have access to a multiprocessor system, so multiple IDTs are new to me.

I expect to have access to one asap.

mmk, are you sure about this?

Quote:

[Originally Posted by mmk]The debug breakpoint exception occurs only AFTER the write has occurred.

Is #DB (Debug Exception) a fault or a trap???

Code:

Faults
	A fault is an exception that can generally be corrected and that, once corrected,
allows the program to be restarted with no loss of continuity. When a fault is
reported, the processor restores the machine state to the state prior to the beginning
of execution of the faulting instruction. The return address (saved contents
of the CS and EIP registers) for the fault handler points to the faulting instruction,
rather than the instruction following the faulting instruction.

Traps
	A trap is an exception that is reported immediately following the execution of
the trapping instruction. Traps allow execution of a program or task to be
continued without loss of program continuity. The return address for the trap
handler points to the instruction to be executed after the trapping instruction.

Opcode

March 2nd, 2004, 09:03

Look at the Intel® Architecture Software Developer’s Manual
Volume 3:

Code:

15.3.1. Debug Exception (#DB)—Interrupt Vector 1

The debug-exception handler is usually a debugger program or is part of a larger software
system. The processor generates a debug exception for any of several conditions. The debugger
can check flags in the DR6 and DR7 registers to determine which condition caused the exception
and which other conditions might also apply. Table 15-2 shows the states of these flags
following the generation of each kind of breakpoint condition.

Instruction-breakpoint and general-detect conditions (see Section 15.3.1.3., “General-Detect
Exception Condition”) result in faults; other debug-exception conditions result in traps. The
debug exception may report either or both at one time. The following sections describe each
class of debug exception. See Chapter 5, “Interrupt 1 - Debug Exception (#DB)”, for additional
information about this exception.

mmk

March 2nd, 2004, 11:31

#DB is both a fault and a trap. It depends on what triggered it.

faults: IBPs and DR7.GD #DBs
traps: DBPs, EFLAGS.TF, and the rest

Just do what I described in my previous posts and you will control the debug regs.
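
The fault/trap split above can be written down as a small decision function. The following is a rough C sketch, not a complete handler: it looks at a DR6 snapshot and the R/Wn fields of DR7 and reports whether the reported conditions are fault-class, trap-class, or both. The function and enum names are invented here, and it simplifies by trusting every set Bn bit, although the CPU may set Bn for a breakpoint that is not enabled in DR7.

```c
#include <assert.h>
#include <stdint.h>

#define DR6_BD (1u << 13)  /* debug register access detected (DR7.GD) */
#define DR6_BS (1u << 14)  /* single step (EFLAGS.TF) */
#define DR6_BT (1u << 15)  /* task switch */

/* Bit flags, because one #DB may report both classes at once. */
enum { DB_NONE = 0, DB_FAULT = 1, DB_TRAP = 2 };

static unsigned classify_db(uint32_t dr6, uint32_t dr7)
{
    unsigned cls = DB_NONE;
    unsigned n;

    assert((dr6 & 0xFFFF0000u) == 0);   /* sketch only looks at the low word */

    if (dr6 & DR6_BD)
        cls |= DB_FAULT;                /* general detect: reported before the MOV DRx */
    if (dr6 & (DR6_BS | DR6_BT))
        cls |= DB_TRAP;                 /* single step and task switch */

    for (n = 0; n < 4; n++) {
        unsigned rw;
        if (!(dr6 & (1u << n)))
            continue;                   /* breakpoint n did not hit */
        rw = (dr7 >> (16 + 4 * n)) & 3u;
        /* R/Wn == 00b is an instruction breakpoint (fault);
           anything else is a data or I/O breakpoint (trap). */
        cls |= (rw == 0) ? DB_FAULT : DB_TRAP;
    }
    return cls;
}
```

With a write breakpoint in DR0 (R/W0 = 01b) and B0 set, this returns DB_TRAP: the write to IDT[1] has already completed by the time the handler is looked up.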

Opcode

March 2nd, 2004, 12:55

Quote:

[Originally Posted by mmk]
Just do what I described in my previous posts and you will control the debug regs.

I'm writing a driver to test what you said.

volodya

March 2nd, 2004, 14:39

I'm a little bit shocked

He told you the truth... It is a trap. But how come? Are the intel guys idiots? So, GD is nothing?

mmk

March 2nd, 2004, 16:56

DR7.GD=1 causes a #DB fault exception.

A data breakpoint is a trap. That's why you can write your IDT handler to IDT[1] and get control next time #DB is triggered. In the PatchFinder example, it would be triggered right after you write to IDT[1].

To Opcode, I'd recommend using your own temp IDT instead of using MMX registers to write to the IDT since you would otherwise change the FPU/MMX registers and other FPU/MMX state which the user process may be using.

Code:

; some init func
	mov	eax,offset int1_handler
	mov	my_idt+8,eax
	mov	my_idt+8+4,eax
	mov	word ptr my_idt+8+2,cs
	mov	word ptr my_idt+8+4,8E00h ; 32-bit int gate

; some anti-patchfinder rootkit code
	pushfd
	cli
	sub	esp,8
	sidt	[esp+2]
	lidt	new_idtr
	db	0F1h	; icebp => triggers #DB
	jmp	skip
int1_handler:
	lidt	[esp+3*4+2]	; 3*4 = pushed eflags,cs,eip
; you may have to restore old (pushed) EFLAGS.IF here
	call	do_whatever	; rootkit code 
	mov	eax,dr6
	and	eax,not (1 shl ???) ; clear bit that got set due to icebp instruction
	mov	dr6,eax
	mov	eax,dr7
	or	eax,1 shl 13	; set dr7.gd
	mov	dr7,eax		; only do this if dr7.gd was set by eg. patchfinder
	iretd
skip:
	add	esp,8
	popfd

...
new_idtr	label fword
	dw	8*2-1
	dd	offset my_idt
my_idt	dd	?,?
	dd	?,?

That code is untested so there may be a mistake or two.
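
In case the four stores in the init function look odd, this is the layout they build. A small C sketch, assuming a standard 32-bit interrupt gate (access word 8E00h, as in the code above); make_int_gate and idtr_limit are names invented for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* One IDT entry as two dwords, the way the init code writes it. */
typedef struct {
    uint32_t lo;  /* bits 0..15: handler offset low, bits 16..31: selector   */
    uint32_t hi;  /* bits 0..15: access word,   bits 16..31: offset high     */
} idt_entry;

/* Build a present, DPL 0, 32-bit interrupt gate (access word 0x8E00). */
static idt_entry make_int_gate(uint32_t handler, uint16_t cs)
{
    idt_entry e;
    e.lo = ((uint32_t)cs << 16) | (handler & 0xFFFFu);
    e.hi = (handler & 0xFFFF0000u) | 0x8E00u;
    return e;
}

/* IDTR limit for a table of n entries: size in bytes minus one. */
static uint16_t idtr_limit(unsigned n_entries)
{
    assert(n_entries >= 1 && n_entries <= 256);
    return (uint16_t)(8u * n_entries - 1u);
}
```

A two-entry table (INT0 unused, INT1 filled in) therefore needs a limit of 15, which is the `dw 8*2-1` in new_idtr.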

Opcode

March 2nd, 2004, 18:06

Thanx, mmk

I'll test your code.
I hope to finish the driver tomorrow.

It's really nice to find people to share knowledge with.

Kayaker

March 3rd, 2004, 08:11

Quote:

[Originally Posted by mmk]If you have more than one logical (hyperthreading) or physical processor, you will have more than one IDT.

Right you are. If I turn off Hyperthreading in the bios the second IDT disappears and the number of active processors reported on my system (say with ZwQuerySystemInformation / class SystemBasicInformation) drops from 2 to 1. Reading the Intel docs on hyperthreading was illuminating, one physical processor appears as two logical processors and each maintains a complete set of the architecture state including most of the registers.

I wasn't under the illusion the subprocessor was executing any user code, just that it was the 2nd processor reported (physical) and was in some way the cause of the extra IDT table. What bothered me with that though was that I had d/l the specs for the chip and didn't see an IDTR register. I'm glad that hyperthreading provides a "logical" explanation

Kayaker

Opcode

March 4th, 2004, 10:11

Yes, it works!!! Thanks, mmk.

Now I'll dedicate myself to finding another way to hide processes and
avoid Klister (www.rootkit.com) detection.

Regards,
Opcode

volodya

March 5th, 2004, 19:03

M-m-m, it has already been done. 90210/WASM.RU has written a utility that works directly with EPROCESS and avoids Klister's detection. Your previous article is a much, much more interesting thing.

Opcode

March 6th, 2004, 00:01

Quote:

[Originally Posted by volodya]M-m-m, it has already been done. FatMoon/WASM.RU has written a utility that works directly with EPROCESS and avoids Klister's detection. Your previous article is a much, much more interesting thing.

Hi, volodya.
Can you send me a copy of the text/article/code? I can't
currently access the wasm.ru site. It's always offline for me.

But I can't believe that manipulating EPROCESS alone is enough, because the list that the dispatcher uses is a linked list of ETHREAD structs.
Well, but if you say so, I believe you

I'm waiting for the article, thanks a lot!!!

evaluator

March 7th, 2004, 06:52

hi here.

little comment:

mmk wrote in source

"clear bit that got set due to icebp instruction"
F1h-Icebp should NOT modify DR6. OK (or not to OK)?

new IDT+LIDT is good method here.

mmk

March 7th, 2004, 07:17

Quote:

[Originally Posted by evaluator]mmk wrote in source

"clear bit that got set due to icebp instruction"
F1h-Icebp should NOT modify DR6. OK (or not to OK)?

Each time #DB is triggered the CPU should update DR6. I don't remember if ICEBP will set a flag in DR6, but if it does, I think it sets the same flag as EFLAGS.TF=1 #DBs do.

volodya

March 7th, 2004, 19:44

Quote:

Can you send me a copy of the text/article/code? I can't
currently access the wasm.ru site. It's always offline for me.

http://29a.host.sk/main.html -->

29a7\Binaries\90210\PHIDE.ZIP
29a7\Utilities\29A-7.026

You go get it

tgodd

March 8th, 2004, 01:57

So now propose what you are going to do in the event that the application clears the GD flag.
How are you to trap that?

Good luck....

TGODD

mmk

March 8th, 2004, 05:31

Quote:

[Originally Posted by tgodd]So now propose what you are going to do in the event that the application clears the GD flag.
How are you to trap that?

Good luck....

TGODD

Which app? Rootkit or anti-rootkit?

anti-rootkit: DR7.GD cleared => probably rootkit present.
rootkit: always restores DR7.GD to its original state

evaluator

March 8th, 2004, 06:16

mmk,

don't say: "I think.."

always test your ideas.
& Icebp works like 0CCh(e.g. INT-call).

mmk

March 8th, 2004, 07:39

Quote:

[Originally Posted by evaluator]mmk,

don't say: "I think.."

always test your ideas.
& Icebp works like 0CCh(e.g. INT-call).

don't say: "always test your ideas." unless you have "tested your ideas" yourself.

ICEBP, like any other #DB will update DR6. When I said "I think", it was because I remember it changing something but didn't remember what. I checked some old notes and DR6.B[3:0] is updated.

So you're wrong in saying that ICEBP is like INT3 (#BP). INT3 doesn't update DR6. ICEBP does.

tgodd

March 8th, 2004, 08:26

The Hardlock driver by Aladdin clears the GD flag.
I am not familiar with the rootkit; all of my code is
written by following the Intel docs.

The GD bit is not trapped, nor, as far as I can read
from the Intel docs, is there a way to trap if the GD
bit gets manipulated.

tgodd

evaluator

March 8th, 2004, 16:30

ok, i done some tests from Ring3, & seems F1 updates B0-B3 bits in DR6 register.
so F1 can't be treated as INT-call.
so i was wrong.

Thanks

Opcode

March 9th, 2004, 07:22

Hi, volodya

PHIDE was a nice try; connecting the PID and EPROCESS to another process was original. But look at:

www.rootkit.com - "Cheating klister?"

Regards,
Opcode

evaluator

March 9th, 2004, 16:38

i want write 1 note here

while set-up-ing new IDT,
there IS the way to Read/Write in memory WITHOUT triggering active DRx.

**
"That is.." - said Intlel

volodya

March 9th, 2004, 20:25

evaluator
there IS the way to Read/Write in memory WITHOUT triggering active DRx

Do you mean the original article? Via physical memory?

evaluator

March 10th, 2004, 03:07

yes, i mean via physical memory.
what is "original article"?

mmk

March 10th, 2004, 04:23

Original article = original post made by Opcode in this thread.

When paging is enabled, you can't use physical addresses; all addresses are linear addresses. What you and Opcode have done is map the same physical address to different linear addresses. This is nothing special at all, but of course it can be used to fool programs such as PatchFinder.

evaluator

March 10th, 2004, 04:52

heh, heck me, i did not read this article, because of its title.

here i want to write some notes for author:

1.
>we found the PHYSICAL address of 0x8003F408 to be 0x03F00
maybe 0x0003F000 ?(1 more zero)
>(value obtained experimentally).
why so? you simple can grab it from PAGE TABLE.
& after also simply put it in new place of PAGE TABLE.

2.
>invoke ExAllocatePool
easier is MmAllocateNonCachedMemory

3. Why you need to disable WP at all?

4. after updating PAGE TABLE, will good WBINVD, etc.

5.>Once time in Ring Zero, forever Ring Zero.
for this Intl created 4 rings..

Opcode

March 10th, 2004, 07:30

Quote:

[Originally Posted by evaluator]
1.
>we found the PHYSICAL address of 0x8003F408 to be 0x03F00
maybe 0x0003F000 ?(1 more zero)

Yes, one more zero (error in typing)

Quote:

[Originally Posted by evaluator]
>(value obtained experimentally).
why so? you simple can grab it from PAGE TABLE.
& after also simply put it in new place of PAGE TABLE.

"Value obtained experimentally" was just loose wording.
I got the value from the page table.
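
To make the page-table lookup concrete, here is a minimal C sketch of how a 32-bit linear address splits into table indices and how a PTE's frame address combines with the page offset. It assumes classic non-PAE paging with 4 KB pages; the function names and the PTE flag bits in the usage note are hypothetical, and the addresses are the ones quoted from the article.

```c
#include <assert.h>
#include <stdint.h>

/* Non-PAE, 4 KB pages: 10 bits PDE index, 10 bits PTE index, 12 bits offset. */
static uint32_t pde_index(uint32_t va)   { return va >> 22; }
static uint32_t pte_index(uint32_t va)   { return (va >> 12) & 0x3FFu; }
static uint32_t page_offset(uint32_t va) { return va & 0xFFFu; }

/* Physical address = frame base from the PTE + offset within the page.
   The low 12 bits of a PTE are flags and are masked off. */
static uint32_t translate(uint32_t pte, uint32_t va)
{
    assert(pte & 1u);   /* present bit must be set for the PTE to be usable */
    return (pte & 0xFFFFF000u) | page_offset(va);
}
```

For 0x8003F408 this gives PDE index 0x200, PTE index 0x3F and offset 0x408, so a PTE whose frame is 0x0003F000 yields physical 0x0003F408. Copying that same PTE into the slot for some other linear page makes both linear addresses resolve to the same physical bytes, and a debug register armed with the first linear address does not see accesses made through the second.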

Quote:

[Originally Posted by evaluator]
2.
>invoke ExAllocatePool
easier is MmAllocateNonCachedMemory

Yes, but is just a proof-of-concept, just to show how-to.
Thanx for the hint.

Quote:

[Originally Posted by evaluator]
3. Why you need to disable WP at all?

Some memory areas in XP are supervisor-level but read-only,
so, just as a precaution, I use this. (P.O.C.)
Read www.wiretapped.net/~fyre/sst.html

Quote:

[Originally Posted by evaluator]
4. after updating PAGE TABLE, will good WBINVD, etc.

Another good hint

Quote:

[Originally Posted by evaluator]
5.>Once time in Ring Zero, forever Ring Zero.
for this Intl created 4 rings..

On this point, I'm referring to the ring-0 war between programs:
the rootkit versus the detector. The other rings are useless in this
'war'. It's not possible to defeat rootkit attacks without ring-0.

I'm still waiting for Joanna's response. Once she replies, I'll make
a more polished tool.

Regards,
Opcode

mmk

March 10th, 2004, 08:08

invlpg [mem] or reloading of cr3 (mov eax,cr3 / mov cr3,eax) is the preferred and documented way of flushing the TLBs. wbinvd may work, but it's not documented to flush TLBs (if it does do that in some processors). wbinvd is used for the data caches. It Writes Back and INValiDates the data in the data caches.

evaluator

March 10th, 2004, 09:55

>Some memory areas of XP are SUPERVISOR level, but for READING-ONLY,
>then, just for precaution, I use this.

Nop, WP not helps you here. (re-read manual).
For this case only way is to check & set R/W in PTE.

mmk

March 10th, 2004, 10:05

Quote:

[Originally Posted by evaluator]Nop, WP not helps you here. (re-read manual).
For this case only way is to check & set R/W in PTE.

Read the Intel manual (or test it yourself). Check Table 4.2, where it says the supervisor (non-ring-3 code) can always write to any present page, no matter what the page tables say, as long as CR0.WP=0. If CR0.WP=1, the processor checks the page tables even for supervisor writes.
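
Expressed as code, the rule is just three checks. A minimal C sketch, with an invented function name, and simplified to a single paging-structure entry (on real hardware the R/W bits of the page directory entry and the page table entry are combined):

```c
#include <assert.h>
#include <stdint.h>

#define CR0_WP (1u << 16)  /* write protect */
#define PTE_P  (1u << 0)   /* present */
#define PTE_RW (1u << 1)   /* 1 = writable, 0 = read-only */

/* Would a ring 0 write to a page with this entry succeed? */
static int supervisor_write_allowed(uint32_t cr0, uint32_t pte)
{
    assert((cr0 & 1u) != 0);         /* sketch assumes protected mode (CR0.PE) */
    if (!(pte & PTE_P))
        return 0;                    /* not present: page fault regardless */
    if (!(cr0 & CR0_WP))
        return 1;                    /* WP=0: R/W bit is ignored for supervisor */
    return (pte & PTE_RW) != 0;      /* WP=1: R/W bit is honoured */
}
```

So clearing WP lets ring 0 write to a read-only page without touching the PTE, and setting R/W in the PTE does the same without touching CR0.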

evaluator

March 10th, 2004, 12:50

in chapter 2.5 done main description of WP flag.

Now i read in 4.11.3

>When the processor is in supervisor mode and the WP flag in register CR0 is clear
>all pages are both readable and writable.

this is an absolutely different BTW(?)-feature of the WP flag, not explained in the main description.

again, Thanks me for reading that + Sorry me from Intl.