
[reSIProcate] The Million User Dilemma


Hi guys,

I have a fairly ambitious goal: I want to be able to build a network of
repro instances that can support millions upon millions of users.  I
consider myself a standards-based guy, and I want to take as much of
everyone's input as possible along the design path.  In return,
everything will be made available for free under the same Vovida and/or
BSD licensing that is already available.


Several key areas of concern are the following:

Reliability:
How do we get many repro nodes to work together across a large
geographic topology, and allow calls to continue to be processed in the
event of an attack or a failure?

Scalability:
If you've ever run the testStack application on a modern computer,
you'll notice that no matter how many cores you have, or even what the
clock rate of your processor is, there seems to be a magic threshold
around 6500 TPS for non-INVITE scenarios.  For calls, I can get about a
third of that.  Those tests were done with TCP; when you add in UDP, you
can watch it suck up memory like it's its job.  Based on what Byron has
shown me, the stack that Estacado/Tekelec has built and modified from
the main resiprocate tree can perform over 12000 TPS for non-INVITE
transactions in a single thread, and on inferior hardware at that.  This
means there is a lot of room for improvement beyond just adding
concurrency.
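
To put those numbers in perspective, here is a rough back-of-the-envelope
sketch of how per-node throughput translates into node count for
registration traffic alone.  The user count, refresh interval and
headroom factor are hypothetical values picked for illustration (they
are not measurements), and it assumes each registration refresh costs
exactly one non-INVITE transaction.

```python
import math

# Throughput figures quoted in this post (transactions per second).
NON_INVITE_TPS = 6500   # ceiling observed with testStack
TUNED_TPS = 12000       # single-thread figure for the Estacado/Tekelec stack


def registrations_per_second(users: int, refresh_seconds: int) -> float:
    """Steady-state REGISTER rate if every user refreshes once per interval."""
    return users / refresh_seconds


def nodes_needed(load_tps: float, node_tps: float, headroom: float = 0.5) -> int:
    """Nodes needed to carry load_tps using only `headroom` of each node's ceiling."""
    return math.ceil(load_tps / (node_tps * headroom))


# Hypothetical inputs (assumptions, not from the post):
# 10 million users, each refreshing its registration every 60 seconds.
load = registrations_per_second(10_000_000, 60)

print(nodes_needed(load, NON_INVITE_TPS))  # 52
print(nodes_needed(load, TUNED_TPS))       # 28
```

Under those assumptions, roughly doubling single-thread throughput
nearly halves the number of nodes, before any call traffic is counted.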

Security:
Resiprocate supports TLS fairly well.  I would like to be able to take
advantage of that in any reliability mechanism put forth, to help meet
HIPAA-style requirements that all data stored to disk be encrypted and
all data in transit be encrypted.  Thankfully, part of this problem can
be resolved more easily by keeping more state in memory.

NAT Traversal:
Jeremy Geras and Scott Godin among others have worked very hard to
provide NAT traversal mechanisms for calls and registrations and so
forth through reTurn, reflow, and recon.  Jeremy's branch of recon
utilizes an outdated stack, but supports ICE to a large degree.  It is
missing support for ICE with TURN and has some other quirks that I've
managed to work out.

In my research around these key areas, I have come up with several ideas
of my own to deal with these issues.  However, I would like to open this
up to the community and discuss these areas in an open forum where
everyone can participate and have their input taken seriously.

Thanks guys,
Dan