Debugger detection methods... WHEN to call them? [Archive]

kunai

November 20th, 2010, 11:15

Hi. I've implemented various anti-re methods to detect debugger presence. I'm not trying to create an unbreakable app, I just want to see how it all works. So, having those methods, I can call them at the beginning of the program and they will do what they should... but if the app is not being debugged at that time and a debugger is attached later, the whole mechanism is broken.

So here's my question - when and how often should those methods be called? Should I have a separate thread for this? (somehow it does not sound reliable...)

Indy

November 21st, 2010, 10:25

o Ring transitions.
- Quirks of the functions that convert the trap frame. For example: using the Gs register instead of Ds (the debugger zeroes it, so tracing is not possible); differences between the context on return from a service and the current one (Ecx & Edx, EFlags etc., a good fit for morphing); zeroing of segment registers on ring switches (Iret, Retf: if the selector is zero, the processor resets the RPL field); detecting a task switch (Smsw); etc.
o Measuring the delays introduced by tracing and by the debug port (SEH, VEH, UEF), and the delay across ring transitions (Int 0x2A etc.).
o Direct detection of the debug port and its removal (ProcessDebugObjectHandle, NtRemoveProcessDebug etc.).
o Escaping from a trace (NtContinue, Int 0x2B: very effective, and impossible to defeat from U-mode).
o And many, many other ways.
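
As a rough illustration of the "direct detection of the debug port" item above, here is a minimal C sketch. It is not Indy's code: it queries the documented ProcessDebugPort class rather than ProcessDebugObjectHandle, resolves NtQueryInformationProcess from ntdll.dll at run time, and the name debug_port_present is made up for this example. On non-Windows builds it only reports that the query is unavailable.

```c
#ifdef _WIN32
#include <windows.h>
#include <winternl.h>

/* Signature of NtQueryInformationProcess as documented in winternl.h. */
typedef NTSTATUS (NTAPI *NtQueryInformationProcess_t)(
    HANDLE, PROCESSINFOCLASS, PVOID, ULONG, PULONG);
#endif

/* Hypothetical helper: returns 1 if a debug port is attached to the
 * current process, 0 if not, and -1 if the query is unavailable
 * (which includes every non-Windows build). */
int debug_port_present(void)
{
#ifdef _WIN32
    HMODULE ntdll = GetModuleHandleA("ntdll.dll");
    if (ntdll == NULL)
        return -1;

    NtQueryInformationProcess_t query =
        (NtQueryInformationProcess_t)GetProcAddress(ntdll,
                                                    "NtQueryInformationProcess");
    if (query == NULL)
        return -1;

    DWORD_PTR port = 0;
    NTSTATUS status = query(GetCurrentProcess(), ProcessDebugPort,
                            &port, sizeof(port), NULL);
    if (status < 0)          /* NTSTATUS failure codes are negative */
        return -1;
    return port != 0;        /* non-zero port: a debugger is attached */
#else
    return -1;
#endif
}
```

The result only describes the moment of the call, so a debugger attached afterwards goes unnoticed until the function is called again.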

kunai

November 21st, 2010, 15:44

Indy, thank you for your answer

I think, however, that you misunderstood my question. The second option is that your answer is too complex for me to understand

I have already implemented some anti-re methods, not so complex as those that you described, although for now I will stick to what I already have.

Let's say I have a GUI app. At the beginning of main() I do a bunch of anti-re checks. If they complete successfully, it means no debugger is attached. Then the app starts running some real code, like a secret algorithm. And NOW someone attaches a debugger, which obviously is not detected with this approach.

My ideas are to either put those checks in one of (often triggered) event handlers, or into separate thread. Which one is better?

Indy

November 23rd, 2010, 21:36

kunai
These are simple and effective methods. They can be divided into several types:
o Detecting the trace (TF).
o Detecting breakpoints.
o Detecting the debug port.
o Escaping the trace.
o Bypassing breakpoints.
o Disabling the debug port or hiding the thread from it.

To reset TF, the following methods can be used:
o The services NtSetContextThread, NtRaiseException and NtContinue let you load the flags from a context without a TF mask being applied to them. For example:

Code:

Context.TF -> 0

NtContinue(@Context)

o Using shadow callbacks. The kernel keeps a stack of shadow callbacks. A context is stored for each entry, and on return from the callback it is loaded into the processor. TF in that context is also restored from the previously saved value:

Code:

; +
; Save the context.
; o Eax: service ID of NtUserEnumDisplayMonitors.
; o Esi: return address.
; o Edi: stack pointer.
;
TsSave proc C
	xor ecx,ecx
	push esp	; Callback parameter: the stack pointer used to restore the stack.
	Call @f	; Pointer to the callback.
; Callback.
; typedef BOOL (CALLBACK* MONITORENUMPROC)(HMONITOR, HDC, LPRECT, LPARAM);
	mov esp,dword ptr [esp + 4*4]	; Restore the stack; the pointer to restore is passed as a parameter.
	xor eax,eax
	retn
@@:
	push ecx
	push ecx
	mov edx,esp
; BOOL
; NtUserEnumDisplayMonitors(
;     IN HDC             hdc,
;     IN LPCRECT         lprcClip,
;     IN MONITORENUMPROC lpfnEnum,
;     IN LPARAM          dwData)
	Int 2EH	; NtUserEnumDisplayMonitors
; A non-zero value is returned when the context is restored.
	.if !Eax	; The callback was not called because the call/nesting limit (0x2C) was exhausted.
	add esp,4*4	; Remove the service parameters.
	mov eax,STATUS_STACK_OVERFLOW
	retn
	.else
	mov esp,edi
	add eax,3*4	; Pointer to the stack as it was when TsLoad was called (for passing parameters).
	jmp esi
	.endif
TsSave endp

; +
; Restore the context.
;
TsLoad proc C	; stdcall
	xor eax,eax
	mov edx,3*4
	push eax
	push eax
	push esp
	mov ecx,esp
	Int 2BH
	add esp,3*4
	retn		; On error returns STATUS_NO_CALLBACK_ACTIVE.
TsLoad endp

or as macros:

Code:

QUERY_SERVICE_ID macro
	mov eax,dword ptr [_imp__EnumDisplayMonitors]
	mov eax,dword ptr [eax + 1]	; ID NtUserEnumDisplayMonitors
endm

; +
; Save the context.
; o Registers saved:
;   - EFlags
;   - Ebp
;   - Ebx
;   - Esp
; o NPX state is not saved.
;
SAVE_TASK_STATE macro Ip
Local Break
	mov esi,Ip
	mov edi,esp
	Call @f
	jmp Break
@@:
	pushad
	xor ecx,ecx
	push esp
	Call @f
	mov esp,dword ptr [esp + 4*4]
	popad
	retn
@@:
	push ecx
	push ecx
	QUERY_SERVICE_ID
	mov edx,esp
	Int 2EH
	xor eax,eax
	mov esp,edi
	jmp esi
Break:
endm

; Restore the context.
;
RESTORE_TASK_STATE macro
	xor eax,eax
	mov edx,3*4
	push eax
	push eax
	push eax
	mov ecx,esp
	Int 2BH
endm

This method cannot be defeated from U-mode. The stack of shadow callbacks lives in the kernel and is not accessible to user code, even for reading.

http://wasm.ru/forum/viewtopic.php?id=37300

o In the kernel, TF can be reset by loading a descriptor into the IDT and calling through the gate.

o Removing breakpoints. They come in several types (hardware Dr*, in memory, and on memory). Hardware breakpoints are detected and cleared the same way as TF. Memory breakpoints of the PAGE_GUARD type are defeated by changing the memory attributes or by loading the limits of the region into TEB.Tib (StackBase & StackLimit); the memory manager will then not raise the exception. Setting breakpoints in memory can also be prevented altogether. To do this, the code must be placed in a view of a non-file section that is mapped without write access. The same mechanism can be applied to an entire module by remapping all of its file sections. After this manipulation, writing to that memory is no longer possible.

o Disabling the debug port uses a system mechanism. The service NtQueryInformationProcess(ProcessDebugObjectHandle) returns a copy of the port handle, and the port is then detached with the service NtRemoveProcessDebug. This is also a way to detect the port. A thread can be excluded from debug processing with NtSetInformationThread(ThreadHideFromDebugger). When a debug port is present, exception processing slows down by a factor of roughly 200 to 2000.

o Using the Gs register. The scheduler does not change this register, but it is reset when the trap frame is converted into a context. Code that uses it cannot be traced without modifying the kernel: http://wasm.ru/forum/viewtopic.php?id=38465&p=1

o Detecting TF in the context is simple; there are many ways to do it and there is no point in describing them all.
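
To make the TF and Dr* items above a little more concrete, here is a small C sketch of the bit tests involved. It only inspects and edits a saved copy of the registers. Loading that copy back into the processor is what services such as NtSetContextThread or NtContinue do, and that part is not shown. The struct saved_state and the function names are hypothetical; the bit positions (TF is bit 8 of EFlags, the L0/G0..L3/G3 enable bits are the low byte of Dr7) come from the x86 architecture.

```c
#include <stdbool.h>
#include <stdint.h>

#define EFLAGS_TF      0x00000100u  /* trap flag: bit 8 of EFlags */
#define DR7_ENABLE_ALL 0x000000FFu  /* L0/G0..L3/G3 enable bits of Dr7 */

/* Hypothetical snapshot holding only the two fields of interest;
 * a real CONTEXT structure has many more members. */
struct saved_state {
    uint32_t eflags;
    uint32_t dr7;
};

/* True if single-stepping is armed in the saved flags. */
bool single_step_armed(const struct saved_state *s)
{
    return (s->eflags & EFLAGS_TF) != 0;
}

/* True if any of the four hardware breakpoints is enabled. */
bool hw_breakpoints_armed(const struct saved_state *s)
{
    return (s->dr7 & DR7_ENABLE_ALL) != 0;
}

/* Returns a copy with TF and all hardware breakpoint enables cleared;
 * every other bit is left untouched. */
struct saved_state scrub(struct saved_state s)
{
    s.eflags &= ~EFLAGS_TF;
    s.dr7 &= ~DR7_ENABLE_ALL;
    return s;
}
```

For example, a snapshot with eflags = 0x346 and dr7 = 0x401 reports both conditions as armed, and scrub() turns it into eflags = 0x246 and dr7 = 0x400.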

Iwarez

November 24th, 2010, 16:06

I think it's best not to notify the user instantly. You are better off killing your app a step at a time when you detect a debugger. The checks are often not very time-consuming. So throw one check in a timer, another one when the user clicks a button, another when the user selects a menu item, and mess with the results of the program if a check returns positive. If there is no message box or program termination, it's much harder to detect the anti-debugger code.
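
One way to read this advice is sketched below in C. It is only an illustration under assumptions: check_rotor, rotor_poll and rotor_filter are invented names, and check_never / check_always are placeholders standing in for whatever detection routines the program already has. Each call from a handler runs a single check and only records the result; the reaction is deferred to the point where the program produces a result.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef bool (*check_fn)(void);   /* returns true if a debugger was seen */

/* Placeholder checks so the sketch is self-contained; a real program
 * would plug in its own detection routines here. */
bool check_never(void)  { return false; }
bool check_always(void) { return true; }

typedef struct {
    const check_fn *checks;  /* table of detection routines */
    size_t count;            /* number of entries in the table */
    size_t next;             /* index of the check to run next */
    unsigned hits;           /* how many checks have fired so far */
} check_rotor;

void rotor_init(check_rotor *r, const check_fn *checks, size_t count)
{
    assert(r != NULL && checks != NULL && count > 0);
    r->checks = checks;
    r->count = count;
    r->next = 0;
    r->hits = 0;
}

/* Call from any handler (timer, button click, menu item).
 * Runs exactly one check per call and only records the outcome. */
void rotor_poll(check_rotor *r)
{
    if (r->checks[r->next]())
        r->hits++;
    r->next = (r->next + 1) % r->count;
}

/* Deferred reaction: pass results through unchanged while nothing has
 * been detected, and quietly corrupt them afterwards. */
int rotor_filter(const check_rotor *r, int value)
{
    return r->hits ? value ^ 0x5A5A : value;
}
```

A timer handler, a button handler and a menu handler would each call rotor_poll() on the same rotor, and the code that computes the program's output would route its values through rotor_filter(). Because detection and reaction are in different places, there is no message box or exit call sitting next to the check.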

Indy

November 25th, 2010, 15:04

Iwarez
For example, code that uses Gs cannot be traced at all, so no other checks are needed. You can proceed as follows: convert the code into a special (mutation-independent) format, build a graph for it, modify the graph by inserting a set of Gs prefixes, and compile the graph. In other words, use morphing. Such code is displayed normally in a disassembler (debugger), but tracing it is very problematic, since that would require patching the kernel. For example:

Code:

	push ds
	pop gs	; KGDT_R3_DATA | RPL_MASK
	mov eax,dword ptr gs:[Var]

Maximus

November 26th, 2010, 06:52

indy, may I ask you a question? The NtContinue trick is cool, but I am having some trouble with the R3 callback.

* we directly call a GDI r0 function that uses an R3 callback - ok (by the way, do we need to invoke it manually, or can we just use the 'legal' call?)
* Since the GDI needs to support multiple calls to the kernel, as callbacks can call the kernel, a sequence of stack frames is saved, including the processor state for each one.
* You start with a call to the R0 function, then we let all the code execute normally, then let's say we start tracing the code with a debugger; at some point in the traced code we execute the 'end of callback' code, so our previous register state (with TF=0) gets restored and we are freed from tracing, right?

The GS trick: I am a bit lost on it - I will try it for sure, but I would like to understand why...
hmm... the base of the GS selector is changed like FS for addressing kernel structures, so that while stepping you HAVE the kernel kicked in, so the base is restored with the kernel gs base...
however, what if you catch a random interrupt between the pop and the mov during normal execution??

Indy

November 26th, 2010, 10:32

Maximus

Quote:

The GS trick: I am a bit lost on it - I will try it for sure, but I would like to understand why...

The kernel works with its own native representation of the task state: the trap frame. When it is converted to the user-visible structure that describes the context, some adjustments are made. You can see them in the functions KeContextFromKframes() & KeContextToKframes(). The scheduler does not call these functions; moreover, all the segment registers are stored in the trap frame, so scheduling is not affected. The debugger does call these functions to get/change the context, and it cannot get/set the Gs register, since these functions reset it.

Quote:

we directly call a GDI r0 function that uses an R3 callback - ok (by the way, do we need to invoke it manually, or can we just use the 'legal' call?)

A large number of system services that trigger shadow callbacks can be used. On entry into the kernel the trap frame is saved; TF must be clear at that point. When the kernel is re-entered from the callback, the service completes and the saved trap frame is loaded into the processor. The mechanism fully supports recursive calls. See KeUserModeCallback(), KiCallbackReturn(), NtCallbackReturn() etc.

Indy

November 27th, 2010, 08:00

I built a simple sample. The code of MessageBox() is copied into a buffer that is a view of a non-file section, mapped R/E. In this case the memory protection cannot be changed, i.e. placing breakpoints in that memory will not work:
http://img827.imageshack.us/img827/2660/hdscr.png

Code:

	_imp__MessageBoxA proto :HANDLE, :PSTR, :PSTR, :ULONG

%NTERR macro
	.if Eax
	Int 3
	.endif
endm

.data

$Message	CHAR 16 DUP (?)

.code
	include Gcbe.inc

%ALLOC macro vBase, vSize, vProtect, cSize, Reg32
	mov vBase,NULL
	mov vSize,cSize
	invoke ZwAllocateVirtualMemory, NtCurrentProcess, addr vBase, 0, addr vSize, MEM_COMMIT, PAGE_READWRITE
	mov Reg32,vBase
	%NTERR
	add vBase,cSize - X86_PAGE_SIZE
	mov vSize,X86_PAGE_SIZE
	invoke ZwProtectVirtualMemory, NtCurrentProcess, addr vBase, addr vSize, PAGE_NOACCESS, addr vProtect
	%NTERR	
endm

$Title	CHAR "Ip's:",0

Ep proc
Local GpSize:ULONG
Local Snapshot:GP_SNAPSHOT
Local Protect:ULONG
Local CsBase:PVOID, CsSize:ULONG
Local BiBase:PVOID, BiSize:ULONG
Local ObjAttr:OBJECT_ATTRIBUTES
Local SectionSize:LARGE_INTEGER
Local SectionOffset:LARGE_INTEGER
Local ViewBase:PVOID, ViewSize:ULONG
Local SectionHandle:HANDLE
	%ALLOC Snapshot.GpBase, GpSize, Protect, 200H * X86_PAGE_SIZE, Ebx
	mov Snapshot.GpLimit,ebx
	mov Snapshot.GpBase,ebx
	lea ecx,Snapshot.GpLimit
	push eax
	push eax
	push eax
	push eax
	push eax
	push 4
	push GCBE_PARSE_SEPARATE
	push ecx
	push dword ptr [_imp__MessageBoxA]
	%GPCALL GP_PARSE
	%NTERR
	mov eax,Snapshot.GpLimit
	xor edx,edx
	sub eax,ebx
	mov ecx,ENTRY_HEADER_SIZE
	div ecx
	invoke udw2str, Eax, addr $Message
	%ALLOC CsBase, CsSize, Protect, 200H * X86_PAGE_SIZE, Esi
;	%ALLOC BiBase, BiSize, Protect, 200H * X86_PAGE_SIZE, Edi
	xor eax,eax
	cld
	mov ecx,sizeof(OBJECT_ATTRIBUTES)/4
	lea edi,ObjAttr
	mov dword ptr [SectionSize],100H * X86_PAGE_SIZE
	mov dword ptr [SectionSize + 4],eax
	rep stosd
	mov ObjAttr.uLength,sizeof(OBJECT_ATTRIBUTES)
; Create a section that is not backed by a file (file handle is NULL).
	invoke ZwCreateSection, addr SectionHandle, SECTION_ALL_ACCESS, addr ObjAttr, addr SectionSize, PAGE_EXECUTE_READWRITE, SEC_COMMIT, NULL
	%NTERR
	mov ViewBase,eax
	mov ViewSize,eax
	mov dword ptr [SectionOffset],eax
	mov dword ptr [SectionOffset + 4],eax
; Map a writable view of the section; the code is built into it.
	invoke ZwMapViewOfSection, SectionHandle, NtCurrentProcess, addr ViewBase, 0, 0, addr SectionOffset, addr ViewSize, ViewShare, NULL, PAGE_EXECUTE_READWRITE
	%NTERR
	push ViewBase	; push edi
	push esi
	push Snapshot.GpLimit
	push Snapshot.GpBase
	%GPCALL GP_BUILD_GRAPH
	%NTERR
; Drop the writable view and map the same section again as R/E only.
	invoke ZwUnmapViewOfSection, NtCurrentProcess, ViewBase
	%NTERR
	invoke ZwMapViewOfSection, SectionHandle, NtCurrentProcess, addr ViewBase, 0, 0, addr SectionOffset, addr ViewSize, ViewShare, NULL, PAGE_EXECUTE_READ
	%NTERR
; Call the code through the R/E view.
	push MB_OK
	push offset $Title
	push offset $Message
	push eax
	call ViewBase
	ret
Ep endp

It is possible to generate a series of Gs prefixes automatically. This makes the debugger useless.

The same manipulation can be done by remapping the image.

Maximus

December 2nd, 2010, 20:21

Hi Indy - thanks a lot.

Unfortunately I'm on an x64 system, and I am scared to even think of using WinDBG for anything different from crash analysis... I'll re-setup a VM and put something like Syser or such on it for examining the calls you suggest.

nice shared library, by the way

Indy

December 19th, 2010, 06:24

The sample displays a message. The code is automatically rebuilt with a series of Gs prefixes added. Tracing is not possible.
