[reSIProcate] Random.cxx and MultiCore systems
Byron Campen
bcampen at estacado.net
Wed Mar 19 12:14:18 CDT 2008
I wouldn't re-implement Random::getRandom() with getCryptoRandom(),
since the contract on it is for providing cheap, pseudo-random
numbers. It would be more reasonable to change the code that
generates transaction-ids and tags (in fact, the code that generates
Call-Ids has been tweaked to help with this very problem that you're
seeing). The tweak in the Call-Id generation code involves throwing
the thread-id into the generated bits, which solves the collision
issue you're seeing. Maybe we could alter Random::getRandom() to XOR
the current thread-id into everything it returns (this would be in
keeping with "cheap, pseudo-random numbers")? Or maybe we could add a
Random::getRandomReentrant() function?
Anyone have an opinion on this?
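
To make the XOR idea concrete, here is a rough sketch. It is not taken from rutil: mixWithThreadId() and cheapRandomForThisThread() are hypothetical names, std::rand() only stands in for whatever cheap generator Random::getRandom() wraps, and it assumes a C++11 compiler for std::thread and std::hash.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <functional>
#include <thread>

// Hypothetical helper (not part of rutil): folds a thread id into a value
// that came from the cheap PRNG.
inline std::uint32_t mixWithThreadId(std::uint32_t raw, std::thread::id tid)
{
   // std::hash<std::thread::id> is standard C++11; how well it spreads
   // ids is implementation-defined.
   const std::uint64_t h = std::hash<std::thread::id>()(tid);
   // Fold the (possibly 64-bit) hash down to 32 bits, then XOR it in.
   return raw ^ static_cast<std::uint32_t>(h ^ (h >> 32));
}

// Hypothetical wrapper: std::rand() is an assumption standing in for the
// generator behind Random::getRandom().
inline std::uint32_t cheapRandomForThisThread()
{
   return mixWithThreadId(static_cast<std::uint32_t>(std::rand()),
                          std::this_thread::get_id());
}
```

Note that the XOR only separates values produced on different threads. Two calls on the same thread that get the same raw value from the generator still produce the same result.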
Best regards,
Byron Campen
> So this bug report concerns a very strange issue that we noticed on
> our brand-new dual quad-core machine (8 CPUs) involving duplicate
> Call-Ids, Transaction-IDs and Tags being generated for
> independent INVITEs. This behavior would then result in assert
> failures all over the stack.
>
> We have a single instance of DUM/reSIProcate running on its own
> thread. Our application generates 4 independent INVITE requests at
> exactly the same time, which results in sequential calls eventually
> being made to Random.cxx and then glibc’s random() function. Of the
> four calls, we get the following random values returned:
>
> Call 1: aaaaaaaaaaa
> Call 2: bbbbbbbbbb
> Call 3: aaaaaaaaaaa (same exact sequence of random values as the
> first call)
> Call 4: bbbbbbbbbb (same exact sequence of random values as the
> second call)
>
> Sometime later, various assert failures would occur due to
> duplicate TID values and all sorts of other issues.
>
> If we pause or sleep the thread for 1 ms, the problem
> disappears. So what the heck is going on?
>
> We think that the DUM thread is being migrated across CPUs between
> the different invocations of glibc’s random() function and the “seed”
> value is stale in one of the CPU caches.
>
> So how do we fix this? When we dug into the reSIProcate Random.cxx
> code, we noticed that although we had linked against OpenSSL, the
> OpenSSL random functions were barely being used: they would be
> used to initialize the seed but not to actually generate the
> random values.
>
> If we used the crypto versions of the functions, the repetition
> issue went away completely.
>
> Here is a small patch which will use the crypto version if
> USE_OPENSSL is defined:
>
> --- rutil/Random.cxx.orig 2008-03-14 23:21:29.000000000 -0700
> +++ rutil/Random.cxx 2008-03-15 00:26:59.000000000 -0700
> @@ -149,8 +149,9 @@
>  Random::getRandom()
>  {
>  initialize();
> -
> -#ifdef WIN32
> +#if USE_OPENSSL
> + return getCryptoRandom();
> +#elif WIN32
>  assert( RAND_MAX == 0x7fff );
>  int r1 = rand();
>  int r2 = rand();
>
> -Aron
>
> Aron Rosenberg
> SightSpeed
> _______________________________________________
> resiprocate-devel mailing list
> resiprocate-devel at resiprocate.org
> https://list.resiprocate.org/mailman/listinfo/resiprocate-devel