
Re: [reSIProcate] Random.cxx and MultiCore systems

From: "Aron Rosenberg" <arosenberg@xxxxxxxxxxxxxx>
Date: Thu, 20 Mar 2008 10:44:54 -0700

First run – Count was around 70

mp-test ~ # ./a.out
tot: 778262873
l1: 2060261465
l2: 2060261465
Aborted

Second Run – Count was at 400

mp-test ~ # ./a.out
tot: 4033371507
l1: 1314891622
l2: 1314891622
Aborted

Third Run – Count was at 130

mp-test ~ # ./a.out
tot: 1427405301
l1: 475005228
l2: 475005228
Aborted

mp-test ~ # ./a.out
tot: 1309167503
l1: 71029242
l2: 71029242
Aborted

-Aron

From: Alan Hawrylyshen [mailto:alan@xxxxxxxxxxxx]
Sent: Thursday, March 20, 2008 11:39 AM
To: Aron Rosenberg
Cc: Byron Campen; resiprocate-devel
Subject: Re: [reSIProcate] Random.cxx and MultiCore systems

I am still quite tempted to prove what the failure is with a minimal test driver. I fear that it might be something slightly more insidious. So, once we can cause this to happen at will, we can address the appropriate root cause. Is this something that can be checked easily? Anyone?

I have a test driver that fails on a dual-core Intel platform (gcc 4.0.1, Mac OS X 10.5.2).

This will fail around the 100 mark in the progress output (but I have waited much longer).

Let it run for a while and see.

This will abort when two successive calls to random() match.

I would expect this to be unlikely, but should we check this on a single processor / single core system?
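As a rough sanity check on how unlikely a match is, here is a minimal sketch of the arithmetic. It assumes (this is an assumption, not something measured in this thread) that random() returns independent, uniformly distributed values in the range 0 to 2^31 - 1, so each comparison of two successive values matches with probability 1/2^31. The helper names are hypothetical and exist only for this illustration.

```c
/* Assumption: random() yields independent, uniform values in [0, 2^31 - 1]. */
#define RANDOM_RANGE 2147483648.0      /* 2^31 possible return values */
#define CALLS_PER_TICK 10000000.0      /* the driver's progress interval */

/* Expected number of back-to-back matches after n successive comparisons. */
static double expected_matches(unsigned long long n)
{
    return (double)n / RANDOM_RANGE;
}

/* Progress count at which one match is expected on average. */
static double expected_ticks_to_match(void)
{
    return RANDOM_RANGE / CALLS_PER_TICK;
}
```

Under that assumption, one match is expected about every 2^31 (roughly 2.1 billion) calls, which is a progress count of about 215. A failure somewhere around the 100 mark is therefore of the same order as what chance alone would produce, so the driver aborting does not by itself separate a multicore problem from an ordinary coincidence. Comparing time-to-failure on single-core and multicore machines would still be informative.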

Does it happen more often on dual core or SMP systems?

Aron - can you try this on your platform?

Please run it a LOT and see if the time-to-run varies greatly or if it fails reliably.

Thanks

Alan

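#include <stdio.h>
#include <time.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h>

int
main()
{
    unsigned long long t = 0;                    /* comparisons made so far */
    unsigned long l1 = (unsigned long)random();  /* first draw, taken before seeding */
    srandom((unsigned int)time(0));
    unsigned long l2 = 0UL;
    const unsigned long long modulator = 10000000ULL;  /* progress interval */

    while (1)
    {
        l2 = (unsigned long)random();
        if ( l1 == l2 ){
            /* two successive calls to random() returned the same value */
            printf("tot: %llu\nl1: %lu\nl2: %lu\n",t,l1,l2);
            abort();
        }
        l1 = l2;
        t++;
        if (!(t % modulator)) {
            /* progress output: one tick per 10,000,000 calls */
            printf("%llu...\r",(t/modulator));
            fflush(stdout);
        }
    }
    return 0;
}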

Alan

On 19-Mar-08, at 15:56 , Aron Rosenberg wrote:

The only thing that I could think of is to use the new random_r and srandom_r functions instead of random and srand. The glibc _r variants force the application to hold the “seed” value (the generator state) itself, which might make it immune to the caching problem.

The issue with this approach is that the entire Random() class is static, although you could just add a class-wide static variable to hold the new userland data.
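A minimal sketch of what holding the generator state in a static variable could look like with glibc's reentrant calls. This is not the reSIProcate code: the names state_buf, rng_state, seed_private_rng and next_private_random are hypothetical, and the sketch assumes a glibc system, since random_r, srandom_r and initstate_r are glibc extensions that other platforms (Mac OS X, for example) may not provide.

```c
#define _DEFAULT_SOURCE          /* expose the glibc *_r declarations */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical application-owned state, playing the role of the
   "class-wide static variable" mentioned above. */
static char state_buf[128];
static struct random_data rng_state;
static int rng_ready = 0;

/* Seed the private generator; returns 0 on success, -1 on failure. */
static int seed_private_rng(unsigned int seed)
{
    /* glibc requires the random_data struct to be zeroed before first use. */
    memset(&rng_state, 0, sizeof rng_state);
    if (initstate_r(seed, state_buf, sizeof state_buf, &rng_state) != 0)
        return -1;
    rng_ready = 1;
    return 0;
}

/* Draw the next value (0 .. 2^31 - 1) from the private state. */
static int32_t next_private_random(void)
{
    int32_t value = 0;
    if (!rng_ready) {
        int rc = seed_private_rng(1u);   /* fallback seed if never seeded */
        assert(rc == 0);
        (void)rc;
    }
    random_r(&rng_state, &value);
    return value;
}
```

Seeding once with something like seed_private_rng((unsigned int)time(0)) and then calling next_private_random() would replace the srandom()/random() pair. Note that a single static state is still shared by every caller, so threads using it concurrently would need a lock or one state per thread; the sketch only moves the state out of libc, it does not make it thread-safe.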
