One other option we looked at was using the srandom_r()/random_r() functions, which keep the seed in an
application variable that is passed back in on every call. It was a little scary to make that change, since the entire Random
class would have to be reworked to be non-static.

-Aron

From: Byron Campen [mailto:bcampen@xxxxxxxxxxxx]

I wouldn't
re-implement Random::getRandom() with getCryptoRandom(), since the contract on
it is to provide cheap, pseudo-random numbers. It would be more reasonable
to change the code that generates transaction-ids and tags (in fact, the code
that generates Call-Ids has already been tweaked to help with the very problem
you're seeing). The tweak in the Call-Id generation code involves throwing the
thread-id into the generated bits, which solves the collision issue you're
seeing. Maybe we could alter Random::getRandom() to xor the current thread-id
with everything it returns (this would be in keeping with "cheap,
pseudo-random numbers")? Or maybe we could add a
Random::getRandomReentrant() function? Anyone have an
opinion on this?

Best regards,
Byron Campen
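
For discussion, here is a minimal sketch of the two ideas floated above: xor-ing the thread-id into each returned value, and a re-entrant variant where the caller owns the generator state. None of this is the actual resiprocate code. The names cheapRandom(), mixThreadId(), RandomState and the free function getRandomReentrant() are hypothetical, and std::minstd_rand merely stands in for whatever cheap generator Random.cxx really uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <random>
#include <thread>

// Hypothetical stand-in for a cheap PRNG (not the resiprocate implementation).
// Each thread gets its own generator, so no state is shared between threads.
inline std::uint32_t cheapRandom()
{
   thread_local std::minstd_rand gen(std::random_device{}());
   return static_cast<std::uint32_t>(gen());
}

// Idea 1: xor the calling thread's id into a value.
// Note this only separates values produced on *different* threads.
inline std::uint32_t mixThreadId(std::uint32_t value)
{
   const std::size_t tid =
      std::hash<std::thread::id>{}(std::this_thread::get_id());
   return value ^ static_cast<std::uint32_t>(tid);
}

inline std::uint32_t randomMixedWithThreadId()
{
   return mixThreadId(cheapRandom());
}

// Idea 2: a re-entrant variant. The caller owns the state and passes it in,
// so there is no hidden static seed inside the function.
struct RandomState
{
   explicit RandomState(std::uint32_t seed) : gen(seed) {}
   std::minstd_rand gen;
};

inline std::uint32_t getRandomReentrant(RandomState& state)
{
   return static_cast<std::uint32_t>(state.gen());
}

// Usage sketch: identically seeded states replay the same sequence, which is
// why each caller would need its own, distinctly seeded state.
inline void usageExample()
{
   RandomState a(42);
   RandomState b(42);
   assert(getRandomReentrant(a) == getRandomReentrant(b));
}
```

The second form has the same shape as glibc's random_r(), which also takes caller-supplied state; the cost is that every caller has to create, seed and carry that state around.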

So this bug report concerns a very strange issue that we noticed on our brand-new
dual quad-core machine (8 CPUs) involving duplicate Call-Ids,
Transaction-IDs and Tags being generated for independent INVITEs. This
behavior would then result in assert failures all over the stack.

We have a single instance of DUM/Resiprocate running on its own thread. Our
application generates 4 independent INVITE requests at exactly the same time,
which results in sequential calls eventually being made to Random.cxx and then
glibc's random() function. Of the four calls, we get the following random values
returned:

Call 1: aaaaaaaaaaa
Call 2: bbbbbbbbbb
Call 3: aaaaaaaaaaa (same exact sequence of random values as the first call)
Call 4: bbbbbbbbbb (same exact sequence of random values as the second call)

Sometime later, various assert failures would occur due to duplicate TID values and all
sorts of other issues.

If we pause or sleep the thread for 1 ms, the problem disappears. So what the
heck is going on?

We think that the DUM thread is being migrated across CPUs between the different
invocations of glibc's random() function and the "seed" value is stale in
one of the CPU caches.

So how do we fix this? When we dug into the resiprocate Random.cxx code we noticed
that although we had linked against OpenSSL, the OpenSSL random functions were
not being used at all. They would be used to initialize the seed but not
to actually generate the random values.

If we used the crypto versions of the functions, the repeated-values issue went away
completely.

Here
is a small patch which will use the crypto version if USE_OPENSSL is
defined:

--- rutil/Random.cxx.orig	2008-03-14 23:21:29.000000000 -0700
+++ rutil/Random.cxx	2008-03-15 00:26:59.000000000 -0700
@@ -149,8 +149,9 @@
 Random::getRandom()
 {
    initialize();
-
-#ifdef WIN32
+#if USE_OPENSSL
+   return getCryptoRandom();
+#elif WIN32
    assert( RAND_MAX == 0x7fff );
    int r1 = rand();
    int r2 = rand();

-Aron

Aron Rosenberg
SightSpeed

_______________________________________________
resiprocate-devel mailing list