[reSIProcate] Random.cxx and MultiCore systems
Alan Hawrylyshen
alan at polyphase.ca
Wed Mar 19 14:30:27 CDT 2008
More to Byron's point. If you substitute the crypto random function,
you will notice a significant increase in latency. A cheap pseudo-random
solution that doesn't exhibit this effect is needed. We stumbled
across something similar years ago when we ported to a then-exotic
dual-processor machine. The problem occurred very rarely then, but with
4+ cores and today's speeds I believe the problem will happen
predictably, as you have seen, Aron.
Alan
--
Sorry this is terse; it is from my handheld device.
On 19 Mar 2008, at 10:14, Byron Campen <bcampen at estacado.net> wrote:
> I wouldn't re-implement Random::getRandom() with getCryptoRandom(),
> since the contract on it is for providing cheap, pseudo-random
> numbers. It would be more reasonable to change the code that
> generates transaction-ids and tags (in fact, the code that generates
> Call-Ids has been tweaked to help with this very problem that you're
> seeing). The tweak in the Call-Id generation code involves throwing
> the thread-id into the generated bits, which solves the collision
> issue you're seeing. Maybe we could alter Random::getRandom() to XOR
> the current thread-id with everything it returns (this would be in
> keeping with "cheap, pseudo-random numbers")? Or maybe we could add
> a Random::getRandomReentrant() function?
>
> Anyone have an opinion on this?
>
> Best regards,
> Byron Campen
>
>> So this bug report concerns a very strange issue that we noticed on
>> our brand-new dual quad-core machine (8 CPUs) involving duplicate
>> Call-Ids, Transaction-IDs and Tags being generated for
>> independent INVITEs. This behavior would then result in assert
>> failures all over the stack.
>>
>>
>>
>> We have a single instance of DUM/Resiprocate running on its own
>> thread. Our application generates 4 independent INVITE requests at
>> exactly the same time, which results in sequential calls eventually
>> being made to Random.cxx and then glibc’s random() function. Of the
>> four calls, we get the following random values returned:
>>
>>
>>
>> Call 1: aaaaaaaaaaa
>>
>> Call 2: bbbbbbbbbb
>>
>> Call 3: aaaaaaaaaaa (same exact sequence of random values as the
>> first call)
>>
>> Call 4: bbbbbbbbbb (same exact sequence of random values as the
>> second call)
>>
>>
>>
>> Sometime later, various assert failures would occur due to
>> duplicate TID values and all sorts of other issues.
>>
>>
>>
>> If we pause or sleep the thread for 1 ms then the problem
>> disappears. So what the heck is going on?
>>
>>
>>
>> We think that the DUM thread is being migrated across CPUs between
>> the different invocations of glibc’s random() function and the
>> “seed” value is stale in one of the CPU caches.
>>
>>
>>
>> So how do we fix this? When we dug into the resiprocate Random.cxx
>> code we noticed that although we had linked against OpenSSL, the
>> OpenSSL random functions were not being used at all. They would be
>> used to initialize the seed but not to actually generate the
>> random values.
>>
>>
>>
>> If we used the crypto versions of the functions the repeatedness
>> issue went away completely.
>>
>>
>>
>> Here is a small patch which will use the crypto version if
>> USE_OPENSSL is defined
>>
>>
>>
>> --- rutil/Random.cxx.orig 2008-03-14 23:21:29.000000000 -0700
>> +++ rutil/Random.cxx 2008-03-15 00:26:59.000000000 -0700
>> @@ -149,8 +149,9 @@
>> Random::getRandom()
>> {
>> initialize();
>> -
>> -#ifdef WIN32
>> +#if USE_OPENSSL
>> + return getCryptoRandom();
>> +#elif WIN32
>> assert( RAND_MAX == 0x7fff );
>> int r1 = rand();
>> int r2 = rand();
>>
>>
>>
>>
>>
>> -Aron
>>
>>
>>
>> Aron Rosenberg
>>
>> SightSpeed
>>
>> _______________________________________________
>> resiprocate-devel mailing list
>> resiprocate-devel at resiprocate.org
>> https://list.re
> _______________________________________________
> resiprocate-devel mailing list
> resiprocate-devel at resiprocate.org
> https://list.resiprocate.org/mailman/listinfo/resiprocate-devel
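
For illustration only, the two cheap alternatives floated in the thread
(XOR-ing the thread-id into the returned value, or adding a reentrant
getter) might look roughly like the sketch below. This is not code from
rutil/Random.cxx. The names mixWithThreadId and getRandomReentrantSketch
are hypothetical, and the use of C++11 thread_local and std::mt19937 is
an assumption about what "reentrant" could mean here, not something the
thread specifies.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <random>
#include <thread>

// Hypothetical helpers; neither name exists in reSIProcate.

// Idea 1: fold a hash of the calling thread's id into a cheap value.
// This only separates values drawn on *different* threads: two equal
// inputs on the same thread still produce equal outputs.
inline std::uint32_t mixWithThreadId(std::uint32_t cheapValue)
{
   const std::size_t tid =
      std::hash<std::thread::id>{}(std::this_thread::get_id());
   // Fold the upper bits down so they also influence the low 32 bits.
   return cheapValue ^ static_cast<std::uint32_t>(tid ^ (tid >> 16));
}

// Idea 2: give every thread its own generator state, so no PRNG state
// is shared between threads. Each thread's engine is seeded once from
// std::random_device the first time that thread calls the function.
inline std::uint32_t getRandomReentrantSketch()
{
   thread_local std::mt19937 engine{std::random_device{}()};
   return static_cast<std::uint32_t>(engine());
}
```

A caller could write mixWithThreadId(someCheapValue) around an existing
cheap source, or call getRandomReentrantSketch() directly. Neither is a
cryptographic generator; both aim only at the "cheap, pseudo-random"
contract discussed above.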