RSA studying'n reversing'

Data

by Evilcry
 

...
Web: www.cryptorev.da.ru / http://abcd6.interfree.it
E-mail: evilcry@virgilio.it
Channels IRC:  irc.azzurranet.org   #cryptorev #crack-it

...

Difficulty

NewBies () Intermediate (X) Advanced () Master ()
 

RSA studying 'n reversing
Written by Evilcry.

Introduction

This article is based on my experience and other articles on the web.

Tools

-Hiew
-OllyDbg
-RSA Tool 2
-Calc.exe
-Some decrypting tool(as Powmod)

URL or FTP of program

www.google.com

Name of program

Lockless 3 CM (if you want to play a bit).

Essay

RSA overview

From some time I began studying the RSA cryptosystem, for two foundamental reasons, one because it is a great source for gaining knowledge in math & cryptography, and also because I encoutered RSA in some protection systems and initially I did not know that this was RSA!. RSA is a public_key_cipher conceived at M.I.T. in 1978 thanks to a collaboration between Ron Rivest, Adi Shamir and Les Andleman (indeed RSA derives from their names). This is an asymmetric cryptosystem which bases its "power" on a series of features that prime numbers have. This system founded a large field of use in computer security. Now we will see an hypothetical crypting session, keep your eyes open! this is mainly a mathematical discusion. The first step to crypt a message is:

n=p*q

N as we can see from the previous equation, is a product between p and q that are two prime numbers. With a big attention we can affirm that minor are P and Q and minor is the security, so the best is to use a module N properly big. Now is time to determine Ô(n), we can do this with:

Ô(n)=(p-1)*(q-1)

This is an implementation or more precisely an extension of the Euler Theorem. Pratically Ô(n), is a number RELATIVELY prime to (p-1)*(q-1). But at this point someone could pose this question: "What is the real meaning of the term *relatively prime*?", a number is relatively prime to another when these two numbers have no factor in common, except obviously 1. For example 8 and 21 aren't prime numbers, but 8 and 21 will be relatively prime to another number..obviously is 1. Is 1 because this is the only common factor between 8 and 21. The factors of 8 are (1,2,4,8), the factors of 21 are (1,3,7,21). I hope that the concept of "relatively prime" is now clearer.

Another step and we have "finished". Very easly E could the retrived from:

e =Ô(n)

E represents the key (or better the public exponent) thank to which our message will be encrypted. So E should be a prime number. The decryption key (D) could be obtained from:

d = e^(-1) mod ((p-1)*(q-1)) ------->So we whould know 'e', 'p' e 'q'

OK, now we know in general how a session of RSA works, as obviously you have seen during this session there are some values that we know and others that we should discover (in the case of an attack). Now we can build a table that will help us.

P and Q Are prime numbers Not known
N N product of p*q Known
Ô(n) Ô(n)=(p-1)*(q-1) Not known
e Crypto key Not known
d Decrypt key Known

At this table we also add X (the plaintext) and Y (the ciphertext). Now we are into the heart of the algorithm!, the famous:

C=M^e mod n

Where M is the Message to crypt and C is the Crypted message. I think that you are all able to obtain the inverse formula, but to be complete:

M=C^d mod n

We are at the end of our RSA overview, as surely you have noticed, this is a discussion based mainly on the math, but I tried to not include the factorisation systems as the Method of the Elliptic Curve (a bit slow with the big numbers) there are also Pollard's Monte Carlo Algorithm, Continued Fraction Algorithm, Trial Division (this is the oldest system that I know). If we want to perform operations as the trial division, we can use the Miracl libraries.

Pratical RSA

In this section we will begin to use RsaTool2 (of The Egoiste TMG) which will be very useful for the reversing of RSA targets. Firstly, we need to set the NumberBase of our RsaTool to 10 (indeed we work in base decimal), and next we will verify if the formula previously studied is true:

n=p*q

In the field "Modulus N" insert the number 47 and next press the button "Factor N", at this point we will adviced that this is a prime numeber, also 53 is a prime number. So we retrive N=2491 from the multiplication of (47*53). Now if we insert 2491 in "Modulus N" and next factorize this number, we will reobtain P and Q!. Now we have P and Q, and if we analyze the formula:

d = e^(-1) mod ((p-1)*(q-1))

we can see that we should obtain from this formula the decryption key D. To obtain "D" we need only of the public Exponent "E" and with P and Q, there is a specific button which is "Calc D". So we know also the decryption key, if we want to decrypt a text easily, we can use the formula:

M=(C^d) mod n

If the numbers considered are little, we can make some attempt of encryption/decryption just using only Windows calculator, but the best is to find a program which uses the Miracl libraries.

The factorization (addendum)

This little section, is not of fundamental importance to understanding RSA, but could help you to understand which are the major problem that can emerge in a hypothetical RSA attack. As I hope you understand, to be able to overcome a code protected with RSA we need to FACTORIZE the modulus N. If this number is not so big there aren't problems, but if N is really big we need to use probabilistic algorithms. The algorithms for the factorisation are divided in two groups:

-Deterministic algorithms.

-Montecarlo's algorithms (which are based on a semi-casual choice of the initial data).

Precisely the deterministic algorithms are divised in three subgroup:

-Algorithm that returns from each successful attempt, a divisor (prime or not).

-Algorithm that from each successful attempt returns ALWAYS a prime divisor.

-Algorithm that from each successful attempt returns divisor common to the starting number.

The Euclidean theorem, is the most used for this kind of attacks.

x=(x.a)(x.b)/(x.a.b) mod N

For example, N=777 if we divide this num by 3 we have 259 (which is >32), 259 could be divided by 7, and we obtain 37 (which is <72). The prime divisors of 259 are obviously (3,7,37). If you try to factorize this with RsaTool2 you will obtain these same values. This kind of deterministic algorithms could be good (for good you should think "speed of resolution") for little numbers, but for big numbers are used Montecarlo's algorithms, which are probabilistic.

Reversing corner

Now is time to see something of practice. Firstly we will begin with a simple crackme, Lockless's 3CM that you can download from Lockless web site. Firstly we put a bpx GetDlgItemTextA and GetWindowTextA. After that SoftICE popped, we should be here:

Reference To: USER32.GetDlgItemTextA

004137FE Call dword ptr [00428910]

00413804 jmp 00413819 ; Jump down

Referenced by a (U)nconditional or (C)onditional Jump at Address:

00413819 pop ebp

0041381A ret 000C

Called from:

004029B0 push FFFFFFFF

004029B2 push 00419E18

004029B7 mov eax, dword ptr fs:[00000000]

004029BD push eax

004029BE mov dword ptr fs:[00000000], esp

004029C5 sub esp, 00000650

004029CB push esi

004029CC push edi

Possible StringData Ref from Data Obj ->"9901" ; This is a very important value

004029CD push 004200DC

004029D2 lea ecx, dword ptr [esp+000000E4]

004029D9 call 00401130 ; In this call the number 9901 is copied to another location

Possible StringData Ref from Data Obj ->"12790891"; Also this num is really important

004029DE push 004200D0

004029E3 lea ecx, dword ptr [esp+1C]

004029E7 mov dword ptr [esp+00000664], 00000000

004029F2 call 00401130 ; In this call is transferred 12790891

Possible StringData Ref from Data Obj ->"8483678" ; Another very important value

004029F7 push 004200C8

004029FC lea ecx, dword ptr [esp+00000274]

00402A03 mov byte ptr [esp+00000664], 01

00402A0B call 00401130 ;Idem come sopra

Possible StringData Ref from Data Obj ->"5666933" ; Last but not least value

00402A10 push 004200C0

00402A15 lea ecx, dword ptr [esp+000001AC]

00402A1C mov byte ptr [esp+00000664], 02

00402A24 call 00401130 ; Is called here the same call

00402A29 mov edx, dword ptr [esp+00000668] ; Relative address of the serial

00402A30 or esi, FFFFFFFF ;

00402A33 mov edi, edx ;| Common method used to retrive the length of a string

00402A35 mov ecx, esi ;|

00402A37 xor eax, eax ;|

00402A39 mov byte ptr [esp+00000660], 03 ;|

00402A41 repnz ;|

00402A42 scasb ;|

00402A43 not ecx ;|

00402A45 dec ecx ;|

00402A46 cmp ecx, 0000000E ; if the serial isn't Eh (14 chars)

00402A49 jne 00402BB2 ; the jump to beggar off, else go

00402A4F xor ecx, ecx

00402A51 mov al, byte ptr [ecx+edx] ; char relative to the serial

00402A54 cmp al, 30

00402A56 jl 00402BB2 ; Jump to beggar off, if is less tha 30h (0)

00402A5C cmp al, 39

00402A5E jg 00402BB2 ; Jump to beggar off, if is greater than 39h (9d)

00402A64 inc ecx ; Inc the counter

00402A65 cmp ecx, 0000000E

00402A68 jl 00402A51 ;If is less than Eh, go to the next cycle

A bit of analysis is necessary, to fully understand this piece of code. At the begin a serial is taken, and not the name, this should make to think that an algorithm will encrypt/decrypt our serial. In this case the things are a bit easier because we know that this is RSA. Next we can see the presence of four fixed numbers, and immediately we think of E and N. I remind you of the formula:

C=M^e mod n

At 00402A51 starts a routine that checks if the serial inserted is a number, if isn't a number then jump to an error message. From this cycle we also untderstand that the NumberBase is 10. At this point it is better to leave the routine-analysis that follow the previous routine, because it is a bit long and boring. But I will try to resume what happens:

-The serial should be of 14 chars, but during the routine this number (14) will be divided into two equal parts each one of 7 characters. At each piece of serial is now applied a series of operations, these operations involve also the numbers that we have previously seen (9901 and 12790891). Finally these two parts of serial are united for the final check. From the operations that happen we can understand that 12790891 represents the modulus N and that 9901 is the public exponent E.

Now our work is to decrypt these two part of serial. With a bit of tracing through this routine, we can easly understand that the values 5666933 and 8483678 are the two correct parts of the serial, but obviously crypted. So we can build a scheme of the current situation:

-We know the correct but crypted serial, in other words we know C

-We know the modulus N, that was used to crypt our serial.

-We know E (with E and P,Q we can found D).

We have all for a correct decrypting session. To decrypt the two part of serial we need to have further C, also N and D. If you remember N is N=P*Q, so all we need is to factorize N. In RsaTool2 we insert the supposed N and next check "Factor N", but our prog tell us that the number entered is prime. If this number is prime it isn't N (which is composed by the multiplication of two primes), but can be our E. Now we retry to factorize 12790891 from which we obtain P (1667) * Q( 7673). At this point we have E,P and Q and we can retrieve the decryption key D, that derives from this formula:

d = e^(-1) mod ((p-1)*(q-1))

The value of D is 10961333, so finally we have all elements to be allowed to apply:

M=C^d mod n

A hint for you, don't try to use windows calculator, there are specific tools to decrypt this number ;-).

M1=(8483678^10961333) mod 12790891

M2=(5666933^10961333) mod 12790891

From which we obtain:

M1=7167622

M2=3196885

These are the two parts of decrypted serial, if you remember our routine split the serial in two equal parts, so we make the inverse Correct_Serial=(M1+M2). That finally is 71676223196885.

How to recognize RSA.

To recognize RSA in a protection system is not always simple, because it could have many and various forms, or in the worst case could be modified. But here are a list of the common features that a routine *could have*:

-The serial is encrypted with C=(M^E) mod N, where M is our serial.

-As consequence we should see two fix values, the modulus N and the public exponent E.

Now i wanted to insert here a piece of code taken from a crackme, that could clarify your ideas:

0040118D sub_40118D proc near

0040118D mov ebx, eax ; In eax we have our serial

0040118F mov ecx, dword_4030A8 ; In ecx is copyed a constant value

00401195 mov esi, dword_4030A4 ; In esi another costant value

0040119B mov eax, 1

004011A0 loc_4011A0:

004011A0 cdq ; ConvertDoubleToQuad, before a division

04011A1 mul ebx ; ebx*eax

004011A3 div esi ; Division between esi and eax (in esi there is a costanr value)

004011A5 mov eax, edx ; Remainder in eax

004011A7 loop loc_4011A0 ; Loop these operations, this cycle is repeated 1109 times, indeed in ecx there is 1109!

After there is a comparision between the correct serial, and the serial (crypted) that we typed.

VALUES:

004030A4= 0EAD2C511h so 3939681553

004030A8=455h so 1109

This routine is very easy, in EBX we have our typed serial, which is multiplied by EAX (first time, this contains 1) and next is divided by ESI (ESI = 3939681553). In the next step is taken the reminder of the division (in other words is this perform a MOD operation!), all these operation are iterated 1109 times!. With a bit of math analysis we can see what really happen:

EAX=EAX * EBX

Repeated 1109 times, and that means:

EAX=(EBX ^ 1109)

Next we have EAX = EAX / ESI, but only the reminder is taken, and we can consider it as:

EAX MOD ESI (don't forget that ESI is 3939681553)

So finally we see this, C=(Serial ^ 1109) MOD 3939681553

Great!, this is really the end ;-), if you have any questions, ideas or other suggestions...mail m.

References

-"Cenni di Aritmetica Superiore per la Crittologia", of Marco Frigerio

-Other docs on the web.

Greets

To all my reversing friends, especially: Quequero, Ironspark, kratorius, LonelyWolf, ZaiRoN, AndreaGeddon, and AntiCrack people.