
[reSIProcate] Request for help with crashes in header processing


I have been struggling for a week with crashes in resiprocate that I am
unable to debug.  Any help from someone more familiar with the code
would be *most* appreciated, as I am rapidly approaching a major
deployment.  I am using the 0.9.0-5019 public tarball and I am not using
dum, only the stack.  I have run under FC2 and FC3 with the same
results.

These crashes happen under a load test, and they always seem to occur
while the stack is iterating through headers and/or header parameters.
Here is one that has happened a few times:

#0  0x002404d5 in memcpy () from /lib/tls/libc.so.6
#1  0x006cc77e in std::basic_streambuf<char, std::char_traits<char> >::xsputn ()
   from /usr/lib/libstdc++.so.6
#2  0x006c3f45 in std::ostream::write () from /usr/lib/libstdc++.so.6
#3  0x0812604f in resip::operator<< (strm=@0xf45f6abc, d=@0x0) at os/Data.cxx:1559
#4  0x081d1e8a in resip::DataParameter::encode (this=0x8207cac, stream=@0xf45f6abc)
    at DataParameter.cxx:76
#5  0x0815f4cc in resip::ParserCategory::encodeParameters (this=0x9334d00, str=@0xf45f6abc)
    at stl_iterator.h:614
#6  0x08143030 in resip::NameAddr::encodeParsed (this=0x9334d00, str=@0xf45f6abc) at NameAddr.cxx:273
#7  0x08179b93 in resip::ParserContainer<resip::NameAddr>::encode (this=0x94119a0,
    headerName=@0x8271a64, str=@0xf45f6abc) at stl_iterator.h:614
#8  0x081e5e7b in resip::HeaderFieldValueList::encode (this=0x9411978, headerEnum=0, str=@0xf45f6abc)
    at HeaderFieldValueList.hxx:25
#9  0x08166481 in resip::SipMessage::encode (this=0x989abb0, str=@0xf45f6abc, isSipFrag=false)
    at SipMessage.cxx:550
#10 0x0816658b in resip::SipMessage::encode (this=0x989abb0, str=@0xf45f6abc) at SipMessage.cxx:525
#11 0x081a0c3e in resip::TransportSelector::transmit (this=0xf47c9c20, msg=0x989abb0,
    target=@0xf45f8cd0) at TransportSelector.cxx:664
#12 0x0818a3f8 in resip::TransactionState::sendToWire (this=0xe147d818, msg=0x8271824, resend=false)
    at TransactionState.cxx:1481
#13 0x0818d9c7 in resip::TransactionState::processServerInvite (this=0xe147d818, msg=0x989abb0)
    at TransactionState.cxx:976
#14 0x081920ec in resip::TransactionState::process (controller=@0xf47c9b90) at TransactionState.cxx:171
#15 0x08186489 in resip::TransactionController::process (this=0xf47c9b90, fdset=@0xf45ff260)
    at TransactionController.cxx:91
#16 0x0817f2ea in resip::SipStack::process (this=0xf47a6018, fdset=@0xf45ff260) at SipStack.cxx:421


I had assumed at first that these were caused by some multi-threading
problem, but I have mutex'd the code like crazy without solving anything.
Besides, once inside SipStack::process, I would assume there is no
interaction with any other threads.

I have tried dumping out various resip objects in the stack to see what
may be polluted, but between the template instantiations and iterators,
it is very difficult to know what to look for.  I have tons of gdb
output that makes no sense to me.

For instance, in frame 3, the reference passed to operator<< is invalid
(the trace shows d=@0x0).  This code is called in frame 4 from the
following line in DataParameter.cxx (line 76):
      return stream << getName() << Symbols::EQUALS << mValue;
In frame 4, the DataParameter object is trash:
    (gdb) p *this
    warning: can't find linker symbol for virtual table for `resip::DataParameter' value
    $8 = {
      <resip::Parameter> = {
        _vptr.Parameter = 0x826c5e0,
        mType = 136346804
      },
      members of resip::DataParameter:
      mValue = {
        mPreBuffer = "N5resip19ParserCo",
        mSize = 1919250025,
        mBuf = 0x65736142 <Address 0x65736142 out of bounds>,
        mCapacity = 69,
        mMine = resip::Data::Share
      },
      mQuoted = false
    }
In frame 5, the calling code is in the stl iterator, called from 
ParserCategory::encodeParameters:
      (*it)->encode(str);
The ParserCategory object looks fine:
    (gdb) p *this
    $9 = {
      <resip::LazyParser> = {
        _vptr.LazyParser = 0x820b708,
        mHeaderField = 0x0,
        mIsMine = true,
        mIsParsed = true
      },
      members of resip::ParserCategory:
      mParameters = {
        <std::_Vector_base<resip::Parameter*,std::allocator<resip::Parameter*> >> = {
          _M_impl = {
            <std::allocator<resip::Parameter*>> = {
              <__gnu_cxx::new_allocator<resip::Parameter*>> = {<No data fields>}, <No data fields>},
            members of std::_Vector_base<resip::Parameter*,std::allocator<resip::Parameter*> >::_Vector_impl:
            _M_start = 0x94119b8,
            _M_finish = 0x94119bc,
            _M_end_of_storage = 0x94119bc
          }
        }, <No data fields>},
      mUnknownParameters = {
        <std::_Vector_base<resip::Parameter*,std::allocator<resip::Parameter*> >> = {
          _M_impl = {
            <std::allocator<resip::Parameter*>> = {
              <__gnu_cxx::new_allocator<resip::Parameter*>> = {<No data fields>}, <No data fields>},
            members of std::_Vector_base<resip::Parameter*,std::allocator<resip::Parameter*> >::_Vector_impl:
            _M_start = 0x0,
            _M_finish = 0x0,
            _M_end_of_storage = 0x0
          }
        }, <No data fields>},
      mHeaderType = resip::Headers::To
    }

So here I am lost: there seems to be nothing wrong with this object
(which I think is a NameAddr).  The mParameters collection looks fine.
The first (and, since _M_finish is only 4 bytes past _M_start, only)
Parameter in that collection is a tag:
    (gdb) p *(this->mParameters._M_impl._M_start)
    $10 = (class resip::Parameter *) 0x93986b0
    (gdb) p *(*(this->mParameters._M_impl._M_start))
    $12 = {
      _vptr.Parameter = 0x82169a8,
      mType = resip::ParameterTypes::tag
    }
But somehow DataParameter::encode is being called with a this pointer
of 0x8207cac, not the 0x93986b0 I see in the collection.  The iterator
is pointing at the tag Parameter, so why would encode not be called on
it correctly?
    (gdb) p it
    $14 = {
      _M_current = 0x94119b8
    }
    

I am definitely not an STL expert, but I would think all of this is
fairly thread-safe, so unless I am stomping on memory, how could this
code go wrong?  And if I were stomping on memory, why does it look fine
in the debugger?  The only explanation I can come up with is that
DataParameter::encode or something in its calling tree trashed the stack
and stomped on the this pointer.  But how would I find that, and how
could it happen?

Two last pieces of interesting data (among many others I don't quite
get): in frame 11, TransportSelector::transmit uses a DataStream on the
stack that in turn uses the mEncoded member of the SipMessage being
processed.  Inside that frame, if I look at the address of the stream,
it is a different value than the stack trace shows being passed down the
calling tree:
    (gdb) p &encodeStream
    $22 = (class resip::DataStream *) 0xf45f6a90
Why does that get passed as 0xf45f6abc?  Furthermore, the
SipMessage::mEncoded member looks like it is too small (both the mSize
and mCapacity values are smaller than the length of the string pointed
to by mBuf):
    (gdb) p encoded
    $20 = (resip::Data &) @0x989add4: {
      mPreBuffer = "SIP/2.0 180 Ring",
      mSize = 48,
      mBuf = 0xe162ce80 "SIP/2.0 180 Ringing\r\nTo: <sip:8002026080@xxxxxxxxx:5062>;tag=01\r\nFrom: <sip:callDriver@xxxxxxxxxx",
      mCapacity = 96,
      mMine = resip::Data::Take
    }


I know this is a lot to swallow, but I am really lost here.  Thanks to
all who got this far for just slogging through it :)

    
Dennis Dupont
Senior Systems Architect
Intelemedia Communications, Inc.