[reSIProcate] Random.cxx and MultiCore systems

Byron Campen bcampen at estacado.net
Wed Mar 19 12:14:18 CDT 2008


	I wouldn't re-implement Random::getRandom() with getCryptoRandom(),
since the contract on it is to provide cheap, pseudo-random
numbers. It would be more reasonable to change the code that
generates transaction-ids and tags (in fact, the code that generates
Call-Ids has already been tweaked to help with this very problem that
you're seeing). The tweak in the Call-Id generation code involves throwing
the thread-id into the generated bits, which solves the collision
issue you're seeing. Maybe we could alter Random::getRandom() to xor
the current thread-id with every value it returns (this would be in
keeping with "cheap, pseudo-random numbers")? Or maybe we could add a
Random::getRandomReentrant() function?
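
	To make the two options concrete, here is a rough sketch. Everything
in it is an assumption for illustration: mixThreadId(), getRandomXorThread()
and getRandomReentrant() are hypothetical free functions, not existing rutil
code, the thread-id comes from C++11 std::thread, and the reentrant variant
leans on POSIX rand_r() with caller-owned state.

```cpp
#include <cstddef>
#include <functional>   // std::hash
#include <stdlib.h>     // random(), rand_r() (POSIX)
#include <thread>       // std::this_thread::get_id()

// Hypothetical helper: fold a thread-id hash into a PRNG value.
// Kept as a pure function so the mixing step can be checked on its own.
inline unsigned int mixThreadId(unsigned int value, std::size_t threadHash)
{
   return value ^ static_cast<unsigned int>(threadHash);
}

// Option 1 (sketch): a cheap PRNG value xor'ed with the calling thread's id.
// Threads that draw the same value from random() get different results
// as long as the low bits of their id hashes differ.
inline unsigned int getRandomXorThread()
{
   const std::size_t tid =
      std::hash<std::thread::id>()(std::this_thread::get_id());
   return mixThreadId(static_cast<unsigned int>(random()), tid);
}

// Option 2 (sketch): a reentrant variant. The caller owns the generator
// state, so no hidden global state is shared between callers.
inline unsigned int getRandomReentrant(unsigned int* state)
{
   return static_cast<unsigned int>(rand_r(state));
}
```

	Worth noting: the xor only distinguishes values produced on different
threads; calls made from one thread all get the same constant mixed in. The
reentrant variant pushes seeding onto the caller, which would have to give
each state a distinct seed.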

	Anyone have an opinion on this?

Best regards,
Byron Campen

> So this bug report concerns a very strange issue that we noticed on
> our brand-new dual quad-core machine (8 CPUs) involving duplicate
> Call-Ids, Transaction-IDs and Tags being generated for
> independent INVITEs. This behavior would then result in assert
> failures all over the stack.
>
>
>
> We have a single instance of DUM/Resiprocate running on its own  
> thread. Our application generates 4 independent INVITE requests at  
> the same exact time which results in sequential calls eventually  
> being made to Random.cxx and then glibc’s random() function. Of the
> four calls we get the following random values returned:
>
>
>
> Call 1: aaaaaaaaaaa
>
> Call 2: bbbbbbbbbb
>
> Call 3: aaaaaaaaaaa   (same exact sequence of random values as the  
> first call)
>
> Call 4: bbbbbbbbbb  (same exact sequence of random values as the  
> second call)
>
>
>
> Sometime later, various assert failures would occur due to  
> duplicate TID values and all sorts of other issues.
>
>
>
> If we pause or sleep the thread for 1 ms, the problem
> disappears. So what the heck is going on?
>
>
>
> We think that the DUM thread is being migrated across CPUs between the
> different invocations of glibc’s random() function and the “seed”
> value is stale in one of the CPU caches.
>
>
>
> So how do we fix this? When we dug into the resiprocate Random.cxx
> code we noticed that although we had linked against OpenSSL, the
> OpenSSL random functions were not being used at all. They would be
> used to initialize the seed but not to actually generate the
> random values.
>
>
>
> If we used the crypto versions of the functions, the repetition
> issue went away completely.
>
>
>
> Here is a small patch which will use the crypto version if  
> USE_OPENSSL is defined
>
>
>
> --- rutil/Random.cxx.orig              2008-03-14 23:21:29.000000000 -0700
> +++ rutil/Random.cxx    2008-03-15 00:26:59.000000000 -0700
> @@ -149,8 +149,9 @@
>  Random::getRandom()
>  {
>     initialize();
> -
> -#ifdef WIN32
> +#if USE_OPENSSL
> +   return getCryptoRandom();
> +#elif WIN32
>     assert( RAND_MAX == 0x7fff );
>     int r1 = rand();
>     int r2 = rand();
>
> -Aron
>
>
>
> Aron Rosenberg
>
> SightSpeed
>
