Re: [reSIProcate] [Fwd: Re: [reSIProcate-users]Helper::computeCallId returns the same value]
Glibc's rand() uses a seed; how is that seed accessed and protected?
-Aron
-----Original Message-----
From: Bruce Lowekamp [mailto:bbl@xxxxxxxxxxxx]
Sent: Friday, November 07, 2008 8:23 AM
To: Aron Rosenberg
Cc: Adam Roach; resiprocate-devel
Subject: Re: [reSIProcate] [Fwd: Re: [reSIProcate-users]Helper::computeCallId
returns the same value]
I'm baffled by how this could be happening. Looking at random.o in
libc on the Fedora and Ubuntu machines I have handy right now (neither
is ia64, but both are SMP), the lock in random() is implemented as:

17: f0 0f b1 0d 00 00 00 lock cmpxchg %ecx,0x0

My understanding is that this operation is considered SMP-safe. So
unless I'm missing something very basic about the guarantees that
operation provides (which is entirely possible; I don't claim to be an
expert on low-level memory operations), I don't understand how random
could be causing the problem. I don't see how anything else coming in
from makeInviteSession could be causing it, either. I'd be interested
in whether a collision is still seen if you logged the callId in
BaseCreator.cxx right after computeCallId is called, but of course
that might change behavior...
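
For what it's worth, here is a minimal sketch of the kind of collision
check I mean. The function name and its placement next to the
computeCallId call site in BaseCreator.cxx are hypothetical, and it
uses plain standard C++ locking rather than rutil's own:

   #include <iostream>
   #include <mutex>
   #include <set>
   #include <string>

   // Hypothetical diagnostic: record every generated Call-ID and warn
   // on a repeat. The check itself must be thread-safe, or it would
   // introduce a race of its own.
   void checkCallIdCollision(const std::string& callId)
   {
      static std::set<std::string> seen;
      static std::mutex seenMutex;
      std::lock_guard<std::mutex> lock(seenMutex);
      if (!seen.insert(callId).second)
      {
         std::cerr << "Duplicate Call-ID generated: " << callId << std::endl;
      }
   }
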
There are some possible race conditions that have never been fixed in
Condition.cxx, and it's possible to do some stupid things with
pointers to temporaries with some of the code in rutil (Data::c_str
being the best example), but I don't see any of that involved in
makeInviteSession.
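
For illustration, the temporary-lifetime pitfall with Data::c_str looks
roughly like this; badUsage is a hypothetical example, not a call site
I found in makeInviteSession:

   #include "rutil/Data.hxx"
   using resip::Data;

   Data makeGreeting() { return Data("hello"); }

   void badUsage()
   {
      // The temporary Data returned by makeGreeting() is destroyed at
      // the end of the full expression, so 'p' dangles immediately.
      const char* p = makeGreeting().c_str();
      // printf("%s", p);   // undefined behavior: reads freed memory
   }
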
Bruce
On Thu, Nov 6, 2008 at 5:53 PM, Aron Rosenberg
<arosenberg@xxxxxxxxxxxxxx> wrote:
> We see the issue with the stock glibc 2.6.1 on Gentoo, on a dual
> quad-core Intel server.
>
> -Aron
>
> Aron Rosenberg
>
> -----Original Message-----
> From: resiprocate-devel-bounces@xxxxxxxxxxxxxxx
> [mailto:resiprocate-devel-bounces@xxxxxxxxxxxxxxx] On Behalf Of Bruce
> Lowekamp
> Sent: Wednesday, November 05, 2008 11:53 AM
> To: Adam Roach
> Cc: resiprocate-devel
> Subject: Re: [reSIProcate] [Fwd: Re:
> [reSIProcate-users]Helper::computeCallId returns the same value]
>
> I spent a little bit of time looking at this, but it's left me more
> confused than I was before.
>
> Have you determined which platforms people are actually seeing the
> CallID problem on? In particular, what libc are they using? To get a
> duplicate callid, it looks like two sequences of 4 consecutive calls
> to random() would have to return identical results. The only way I
> can see that happening is if two threads run their calls in parallel
> starting from the same state, without either thread's updates to that
> state being visible to the other.
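>
> To illustrate the failure mode I have in mind, here is a toy sketch
> (hypothetical code, nothing from resip or glibc): two threads stepping
> an unprotected generator from the same state can emit identical
> sequences.
>
>    #include <cstdio>
>    #include <thread>
>
>    // Toy linear congruential generator with shared, unprotected state.
>    static unsigned long g_state = 12345;
>
>    unsigned long unsafeRandom()
>    {
>       // Unsynchronized read-modify-write: two threads can both read
>       // the same g_state, compute the same successor, and return
>       // identical values.
>       g_state = g_state * 1103515245 + 12345;
>       return (g_state >> 16) & 0x7fff;
>    }
>
>    int main()
>    {
>       auto worker = []{
>          for (int i = 0; i < 4; ++i)
>             std::printf("%lu\n", unsafeRandom());
>       };
>       std::thread t1(worker), t2(worker);
>       t1.join(); t2.join();   // overlapping runs may print duplicates
>    }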
>
> With glibc, I believe this is virtually impossible. The glibc
> implementation of rand and random imposes a mutex around all of the
> calls that access the static state:
> http://sourceware.org/cgi-bin/cvsweb.cgi/libc/stdlib/random.c?rev=1.18&content-type=text/x-cvsweb-markup&cvsroot=glibc
> So unless there's something I'm not seeing, like a peculiar cache
> setting being used for the lock and the memory random() uses, I don't
> see how this problem is possible there.
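>
> Conceptually, what glibc does there is equivalent to the following
> sketch (a paraphrase using std::mutex; the real random.c uses glibc's
> internal __libc_lock macros and generator, not this stand-in):
>
>    #include <mutex>
>
>    // One process-wide lock serializes every read-modify-write of the
>    // shared generator state, so no two threads can step the same
>    // state concurrently.
>    static std::mutex g_randomLock;
>    static unsigned long g_randomState = 1;
>
>    long lockedRandom()
>    {
>       std::lock_guard<std::mutex> guard(g_randomLock);
>       g_randomState = g_randomState * 1103515245 + 12345;  // stand-in step
>       return static_cast<long>((g_randomState >> 16) & 0x7fffffff);
>    }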
>
> Based on that, I'm wondering if a different libc implementation is
> being used here, and whether the reason switching to SSL fixes the
> problem is that the openssl implementation actually forces thread
> safety (ssleay_rand_bytes does locking, and it is ultimately the
> default rand function in openssl). My conclusion would be that the
> right thing to do is to add a mutex to getRandom() that is engaged
> when an unsafe C library is in use (I'm not entirely sure how to
> check for that, but we could probably identify a set of known-safe C
> libraries that can be detected). That way, the concern about other
> uses of Random that aren't being caught goes away.
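>
> A minimal sketch of that idea, assuming a RESIP_UNSAFE_LIBC define
> that the build system would have to set; the macro name is
> hypothetical, not existing resip configuration:
>
>    #include "rutil/Mutex.hxx"
>    #include "rutil/Lock.hxx"
>    #include <cstdlib>
>
>    // Wrap random() in a mutex only when the C library's random() is
>    // not known to be thread-safe.
>    int guardedGetRandom()
>    {
>    #ifdef RESIP_UNSAFE_LIBC       // hypothetical compile-time switch
>       static resip::Mutex randomMutex;
>       resip::Lock lock(randomMutex);
>    #endif
>       return static_cast<int>(random());
>    }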
>
> Bruce
>
>
> 2008/10/13 Adam Roach <adam@xxxxxxxxxxx>:
>> As we've seen in the past, the Call-ID generation code that DUM uses
>> (resip/stack/Helper.cxx:625 on head) can generate colliding Call-IDs
>> under high-load conditions. The current code looks like this:
>>
>> Data
>> Helper::computeCallId()
>> {
>>    static Data hostname = DnsUtil::getLocalHostName();
>>    Data hostAndSalt(hostname + Random::getRandomHex(16));
>> #ifndef USE_SSL // .bwc. None of this is necessary if we're using openssl
>> #if defined(__linux__) || defined(__APPLE__)
>>    pid_t pid = getpid();
>>    hostAndSalt.append((char*)&pid, sizeof(pid));
>> #endif
>> #ifdef __APPLE__
>>    pthread_t thread = pthread_self();
>>    hostAndSalt.append((char*)&thread, sizeof(thread));
>> #endif
>> #ifdef WIN32
>>    DWORD proccessId = ::GetCurrentProcessId();
>>    DWORD threadId = ::GetCurrentThreadId();
>>    hostAndSalt.append((char*)&proccessId, sizeof(proccessId));
>>    hostAndSalt.append((char*)&threadId, sizeof(threadId));
>> #endif
>> #endif // of USE_SSL
>>    return hostAndSalt.md5().base64encode(true);
>> }
>>
>> I spoke to Byron just now, and he thinks the comment about "USE_SSL"
>> is not accurate. (It would be if the code under getRandomHex() called
>> into OpenSSL -- currently, it does not).
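>>
>> For concreteness, here is a hedged sketch of what routing the
>> randomness through OpenSSL could look like. RAND_bytes is the real
>> OpenSSL API; the surrounding function is hypothetical, not current
>> rutil code, and it assumes Data's hex() helper:
>>
>>    #include <openssl/rand.h>
>>    #include "rutil/Data.hxx"
>>
>>    // Hypothetical OpenSSL-backed variant of Random::getRandomHex():
>>    // fill a buffer from OpenSSL's internally locked PRNG, then
>>    // hex-encode it.
>>    resip::Data
>>    getRandomHexViaOpenSSL(unsigned int numBytes)
>>    {
>>       unsigned char buf[64];            // assume numBytes <= 64 here
>>       if (RAND_bytes(buf, numBytes) != 1)
>>       {
>>          // error handling elided in this sketch
>>       }
>>       return resip::Data(reinterpret_cast<const char*>(buf),
>>                          numBytes).hex();
>>    }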
>>
>> To help refresh memories, we've visited this problem in detail
>> before, most recently here:
>>
>> http://list.resiprocate.org/archive/resiprocate-devel/msg06605.html
>>
>> The conclusion of that thread left me confused -- Alan demonstrated
>> that we'll have collisions (albeit rarely) on just about any
>> architecture, and that such collisions don't require multithreading
>> to occur. From my read of things, Aron's problem (and Ilana's; see
>> http://list.resiprocate.org/archive/resiprocate-users/msg00642.html)
>> occurs far more frequently than Alan's test program would predict.
>>
>> It seems to me that there are a few things we can do to try and
>> address this:
>>
>> 1. If we're using OpenSSL, make computeCallId call through to OpenSSL
>>    for its random numbers (there are a few paths to get there, so
>>    I'm just throwing out the general idea at this point).
>> 2. Remove the "#ifndef USE_SSL" guards from computeCallId() -- is
>>    this sufficient?
>> 3. Do #2, but also salt in a 32-bit thread-local serial number to
>>    prevent intra-thread collisions (see the sketch just below).
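>>
>> A minimal sketch of #3, assuming a C++ thread_local counter; the
>> portable per-thread storage resip would actually use may differ:
>>
>>    // Hypothetical variant for option 3: salt in a 32-bit
>>    // thread-local serial number so two calls on the same thread can
>>    // never hash identical input, even if random() repeats.
>>    Data
>>    Helper::computeCallId()
>>    {
>>       static Data hostname = DnsUtil::getLocalHostName();
>>       static thread_local uint32_t serial = 0;
>>       ++serial;
>>       Data hostAndSalt(hostname + Random::getRandomHex(16));
>>       hostAndSalt.append((char*)&serial, sizeof(serial));
>>       // ... pid/thread-id salting as in the current code ...
>>       return hostAndSalt.md5().base64encode(true);
>>    }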
>>
>> Thoughts? (If no one expresses an opinion in a reasonable amount of
>> time, I'll probably do #3.)
>>
>> [It occurs to me that we must have a similar problem with tags and
>> branch IDs, albeit without any assert()s being triggered -- I would
>> presume that any fix made to Call-ID generation should also be made
>> to Helper::computeUniqueBranch() and Helper::computeTag().]
>>
>> /a
>>
>>
>> ---------- Forwarded message ----------
>> From: Adam Roach <adam@xxxxxxxxxxx>
>> To: Ilana Polyak <Ilana.Polyak@xxxxxxxxxxxxxx>
>> Date: Mon, 13 Oct 2008 09:49:28 -0500
>> Subject: Re: [reSIProcate-users] Helper::computeCallId returns the
>> same value
>>
>> This issue has been previously seen, but we haven't been able to pin
>> it down.
>>
>> Previous reports can be found here:
>>
>> http://list.resiprocate.org/archive/resiprocate-devel-old/msg03200.html
>> http://list.resiprocate.org/archive/resiprocate-devel/msg06605.html
>>
>> Aron's solution -- shunting "getRandom" over to "getCryptoRandom" --
>> worked for him. Of course, you impose a higher load on your CPU when
>> you do so, so you may want to try tracking the problem down and
>> addressing it in a more efficient way.
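>>
>> Roughly, Aron's workaround amounts to the following; getRandom and
>> getCryptoRandom are the real rutil entry points, but the exact
>> forwarding shown is a sketch, not the patch he applied:
>>
>>    // In rutil/Random.cxx (sketch): forward the cheap PRNG entry
>>    // point to the cryptographic generator. This removes the
>>    // collisions at the cost of more CPU per call.
>>    int
>>    Random::getRandom()
>>    {
>>       return Random::getCryptoRandom();
>>    }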
>>
>> The problem does not seem to surface except when using DUM.
>>
>> /a
>>
>>
>> Ilana Polyak wrote:
>>>
>>> Hello
>>>
>>> I have just started to use dum in our application and noticed that
>>> if I run calls at a very high rate, the Call-ID repeats itself.
>>>
>>> What am I doing wrong? I have a separate thread that calls
>>> buildFdSet, stack process, and dum process. There is a semaphore
>>> before it, and a semaphore for all the API calls that come from my
>>> application.
>>>
>>> I have called computeCallId from the same thread (the thread that
>>> runs the dum and stack), and the value returned seems to be fine.
>>> But when it gets called from the API makeInviteSession, which is
>>> called from the context of my application thread, the value repeats
>>> itself for around 8 calls.
>>>
>>> The calls are created one after another at a very high volume. If
>>> the calls are created at a low volume (let's say one per second),
>>> everything is fine.
>>>
>>> Has anyone seen this problem?
>>>
>>> Thanks
>>>
>>> Ilana Polyak
>>> Senior Software Engineer, Protocol Group
>>> Blade Business Line
>>>
>>> AudioCodes USA, Inc.
>>> 27 World's Fair Drive
>>> Somerset, NJ 08873
>>> Tel: 732-469-0880 ext. 137
>>> Fax: 732-469-2298
>>> Direct: 732-652-4677
>>> Corporate URL: http://www.audiocodes.com
>>> Blade Business Line URL: http://www.audiocodes.com/blades
>>>
> _______________________________________________
> resiprocate-devel mailing list
> resiprocate-devel@xxxxxxxxxxxxxxx
> https://list.resiprocate.org/mailman/listinfo/resiprocate-devel
>