Cracking ECC FLEXlm

"Well to be honest, I'd known about this vulnerability for quite a while ;-), however the RNG has since been secured, so publishing this now, will not hurt any developers (unless of course they forgot to upgrade). Slightly edited by CrackZ".

Introduction

This document describes the methods used to extract seeds 3 and 4 from ECC FLEXlm protected targets. The essay alone should be sufficient, however my tools, if provided, will greatly assist you if you are attempting to crack a real target.

Tools required

NuMega SoftICE
Datarescue IDA (4.15 or later)
FLEXlm v7.2h SDK
Microsoft Visual C++
TASM (if you want to do some analysis of ripped code)

Overview of how the crack works

Standard FLEXlm on older versions required two 32 bit encryption seeds as secrets. These seeds are what provided the security for the standard product. The seeds could be extracted from the target program, so the programs could be keygenned by extracting the seeds, then building lmcrypt (the FLEXlm key generator) that would allow the cracker to generate valid keys for the target program. ECC FLEXlm requires two additional seeds. These seeds are used to generate the public/private key pair, and only the public key is stored in the distributed executables. The Certicom Security Builder ECC library which is used to do this seems to be a secure implementation. I could find no correlation between the bits used as seeds for the key pair generator, and the public key. In order to brute force out the seeds, it would require a maximum search of 2^64 trials, which is far too many to search if a key is to be found.

Globetrotter made the error of using seeds 3 and 4 for other purposes in the code. The values for seeds 3 and 4 cannot be completely recovered, however enough information can be recovered to make it possible to greatly reduce the keyspace, making it possible to recover FLEXlm seeds 3 and 4 within a reasonable length of time.

Partial recovery of seeds 3 and 4

The weakness occurs in the routine _do_handshake. Here seeds 3 and 4 are used in a validation routine which makes it so that fake vendor daemons won't work unless these values are correct. Here's the part of the routine where the seeds are used.

		mov	edx,  dword ptr ds:_seed3
		xor	edx,  dword ptr ds:_seed4
		shr	edx, 10h
		and	edx, 0FFFFh
		push	edx
		mov	eax,  dword ptr ds:_seed3
		add	eax,  dword ptr ds:_seed4
		and	eax, 0FFFFh
		push	eax
		mov	ecx,  dword ptr ds:_seed3
		xor	ecx,  dword ptr ds:_seed4
		and	ecx, 0FFFFh
		push	ecx
		call	_srand16
		add	esp, 0Ch
		call	_rand16
		call	_rand16
		call	_rand16
		call	_rand16
		mov	[ebp+var_60], eax ; "random" numbers are used as constants in lm_new :)
		call	_rand16
		mov	[ebp+var_5C], eax
		call	_rand16
		mov	[ebp+var_58], eax
		call	_rand16
		mov	[ebp+var_54], eax

If we could recover the seed values fed to srand16, then we could know a lot about what the values for seeds 3 and 4 were. With only one of the values generated by rand16, it would not be possible to recover the state information of the pseudorandom number generator. In order to figure out information about the original values fed into the generator, the srand16 and rand16 calls need to be examined.

/*------------------------------------------------------------------------*
 *
 * my_srand16: emulate the srand16 routine.
 * seeds: seed values.
 *------------------------------------------------------------------------*/
int my_srand16(int seed1, int seed2, int seed3)
{
	sdata_0 = seed1 % 32363;
	sdata_1 = seed2 % 31727;
	sdata_2 = seed3 % 31657;

	if (sdata_0 == 0) sdata_0 += 32362;
	if (sdata_1 == 0) sdata_1 += 31726;
	if (sdata_2 == 0) sdata_2 += 31656;
	return(0);
}

/*------------------------------------------------------------------------*
 *
 * my_rand16: emulate the rand16 routine.
 *------------------------------------------------------------------------*/
int my_rand16()
{

	int var_8;
	int var_4;
	int edxval;
	int eaxval;

	var_4 = sdata_0 / 206;
	edxval = (sdata_0 / 206) * 206;
	eaxval = sdata_0 - edxval;
	sdata_0 = eaxval * 157 - (var_4 * 21);
	if (sdata_0 <= 0)
	{
		sdata_0 += 32363;
	}

	var_4 = sdata_1 / 217;
	edxval = var_4 * 217;
	eaxval = sdata_1 - edxval;
	sdata_1 = eaxval * 146 - (var_4 * 45);
	if (sdata_1 <= 0)
	{
		sdata_1 += 31727;
	}

	var_4 = sdata_2 / 222;
	edxval = var_4 * 222;
	eaxval = sdata_2 - edxval;
	sdata_2 = eaxval * 142 - (var_4 * 133);
	if (sdata_2 <= 0)
	{
		sdata_2 += 31657;
	}
	var_8 = sdata_0 - sdata_1;
	if (var_8 > 706)
	{
		var_8 = var_8 - 32362;
	}
	var_8 = var_8 + sdata_2;
	if (var_8 <= 0)
	{
		var_8 += 32362;
	}
	return(var_8 >> 1);
}

Although it looks like a mess, the workings of the pseudorandom number generator are pretty straightforward. The algorithm is the same one as the one in Bruce Schneier's book "Applied Cryptography" for the linear congruential generator. In the book he mentions that the algorithm isn't cryptographically secure, and references to how other people have broken this generator. The way it works is that there are three generators, each of which has a period of differing length. Initial values are loaded into each generator, and as numbers are generated, each generator is incremented to the next state. Given an output value, and the values of the first two states, the state of the third generator can be obtained.

The first random number extracted from the target is used, and the first two LCGs of the RNG are cycled, with the third LCG set to the value required to derive the value extracted from the target. If the sequence matches over the 4 values we got from the target, we have found the state of the RNG. The full cycles of the three LCGs were extracted into large arrays, so it would be possible to step forwards and backwards through the states, and no calculation save pointer arithmetic was required. Once the state required to match the 4 values was found, the LCGs were stepped backwards until the original values fed into the PRNG were found. There were multiple possible input values to the LCG though, because of the modulo operation performed before the values were loaded into the LCGs. All of permutations of the valid inputs to srand were cycled. After this work, we know the following things about the seeds.

High 16 bits of seed3 ^ seed4 == value1
Low 16 bits of seed3 + seed4 & 0xffff == value2
Low 16 bits of seed3 ^ seed4 == value3

By knowing these things, the keyspace can be typically reduced to around 2^30 operations, depending on the values of value, value2, and value3. At this point, the brute forcing part comes into play. There are three ECC public keys stored in the FLEXlm target. The smallest public key is selected, since the generation of the public/private key pair from the input seeds is fastest for this value. The seeds are minimally obscured by modifying the values with bytes obtained from the vendor name.

int reveal_data(sb_keydata_t *inkey, char *inname)
{
        int i;
        char *p; 

        p = inname; 
        for (i = 0; i < inkey->nbytes; i++)
        { 
                if (*p == '\0') p = inname;
                if (i & 0x0000001) 
                { 
                        if (i%3)
                        { 
                                inkey->keydata[i] = inkey->keydata[i] + *p; 
                        } 
                        else 
                        { 
                                inkey->keydata[i] = inkey->keydata[i] ^ *p;
                        }
                }
                else
                { 
                        inkey->keydata[i] = inkey->keydata[i] - *p;
                }
                p++;
        }
        return(0);
}

Once the real public key is revealed, we can use the security builder routines to cycle through the possible seeds 3 and 4, and we're done there. I should make a brief mention of seeds 1 and 2. They are hidden using the same old methods that were used before in earlier versions of FLEXlm, so the methods outlined in previous essays of combining values from the job structure and the vendorcode structure still work.

Nolan Blender.