Re: [reSIProcate] resiprocate crash due to no free memory when receiving sip message with large header
On Jul 10, 2008, at 10:39 , Byron Campen wrote:
Actually, this patch will cause heap corruption, since mMessage has
taken ownership of mBuffer. A better one would be:
Index: resip/stack/ConnectionBase.cxx
===================================================================
--- resip/stack/ConnectionBase.cxx (revision 8120)
+++ resip/stack/ConnectionBase.cxx (working copy)
@@ -176,16 +176,25 @@
else
{
// ...but some of the chunk must be shifted into the
next one.
- size_t size = numUnprocessedChars*3/2;
- if (size < ConnectionBase::ChunkSize)
+ if (numUnprocessedChars >= ConnectionBase::ChunkSize)
{
- size = ConnectionBase::ChunkSize;
+ // .bwc. We have lots of unprocessed chars here;
this can be
+ // caused if we get a really, really large header
field value.
+ // We can safely reject this, I think.
I don't think you can reject this -- what if the header is very large?
Legitimately?
I think that in general we should detect that we have an unprocessed
number of bytes > chunkSize and grow chunkSize to accommodate this
case (doubling or 3/2s at a time) as was the intent of the original
code...
// ...but some of the chunk must be shifted into the
next one.
size_t size = numUnprocessedChars*3/2;
if (size < ConnectionBase::ChunkSize)
{
size = ConnectionBase::ChunkSize;
}
char* newBuffer =
MsgHeaderScanner::allocateBuffer(size);
memcpy(newBuffer, unprocessedCharPtr,
numUnprocessedChars);
mBuffer = newBuffer;
mBufferPos = numUnprocessedChars;
mBufferSize = size;
}
The above is my 'main' branch ConnectionBase code....
Note that the size of the buffer will grow by 3/2s the number of
unprocessed chars. In the case where the header has a value that
continues beyond 2 buffers in size we SHOULD see the following behavior:
[ a 60 char value for the 3 char header HDR ]
HDR : valuevalu1valuevalu2valuevalu3valuevalu4valuevalu5valuevalu6
With a ChunkSize of 16 (for simplicity's sake)
0 1 2 3 4 5 6 7 8 9 A B C D E F
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
0000| H | D | R | | : | | v | a | l | u | e | v | a | l | u | 1 |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
0010| v | a | l | u | e | v | a | l | u | 2 | v | a | l | u | e | v |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
0020| a | l | u | 3 | v | a | l | u | e | v | a | l | u | 4 | v | a |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
0030| l | u | e | v | a | l | u | 5 | v | a | l | u | e | v | a | l |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
0040| u | 6 |
+---+---+
chunk0:
0 1 2 3 4 5 6 7 8 9
+---+---+---+---+---+---+---+---+---+---+
0000| H | D | R | | : | | v | a | l | u |
+---+---+---+---+---+---+---+---+---+---+
^ ^
`-- Unknown(l=3) `-- unprocessedCharPtr
After the first scanner pass, the SIP message has a header pointer
(Unknown Header in this case), that points to chunk0+0x0 and has a
length of 3. Therefore, some of chunk0's contents are being used by
the message The unprocessedCharPtr has value chunk0 + 0x6 in this
case.
The code will then allocate a new buffer, necessarily copying chunk0
into the new buffer, of size unproc*3/2 (6 in this case) -- this is a
bit odd since we would expect better allocation related to chunksize,
not # unprocessed chars. The latter is random based on the framing of
the headers within the chunk.
chunk1 ends up being:
0 1 2 3 4 5
+---+---+---+---+---+---+
0000| v | a | l | u | e | v |
+---+---+---+---+---+---+
^
`- unprocessedCharPtr
Another round (which requires that we save chunk1 gets a 9 char buffer:
0 1 2 3 4 5 6 7 8
+---+---+---+---+---+---+---+---+---+
0000| v | a | l | u | e | v | a | l | u |
+---+---+---+---+---+---+---+---+---+
Another few rounds and we have: chunk(size)
chunk2 (13), chunk3(19),
chunk4 (19), chunk5(28),
chunk6 (42), chunk7(63)
Chunk7 is finally large enough to hold the entire header:
0 1 2 3 4 5 6 7 8 9 A B C D E F
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
0000| v | a | l | u | e | v | a | l | u | 1 | v | a | l | u | e | v |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
0010| a | l | u | 2 | v | a | l | u | e | v | a | l | u | 3 | v | a |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
0020| l | u | e | v | a | l | u | 4 | v | a | l | u | e | v | a | l |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
0030| u | 5 | v | a | l | u | e | v | a | l | u | 6 |
+---+---+---+---+---+---+---+---+---+---+---+---+
This was a really terrible growth mechanism since it too 7 or so
copies to get the data for the header in a contiguous buffer. Worse,
as you point out, chunks 1 through 6 are wasted memory.
Here is what we have floating around...
chunk0:
0 1 2 3 4 5 6 7 8 9
+---+---+---+---+---+---+---+---+---+---+
0000| H | D | R | | : | | v | a | l | u |
+---+---+---+---+---+---+---+---+---+---+
^ ^
`-- Unknown(l=3) `-- unprocessedCharPtr
chunk1:
0 1 2 3 4 5
+---+---+---+---+---+---+
0000| v | a | l | u | e | v |
+---+---+---+---+---+---+
2
0 1 2 3 4 5 6 7 8
+---+---+---+---+---+---+---+---+---+
0000| v | a | l | u | e | v | a | l | u |
+---+---+---+---+---+---+---+---+---+
3
0 1 2 3 4 5 6 7 8 9 A B C
+---+---+---+---+---+---+---+---+---+---+---+---+---+
0000| v | a | l | u | e | v | a | l | u | 1 | v | a | l |
+---+---+---+---+---+---+---+---+---+---+---+---+---+
4
0 1 2 3 4 5 6 7 8 9 A B C D E F
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
0000| v | a | l | u | e | v | a | l | u | 1 | v | a | l | u | e | v |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
0010| a | l | u |
+---+---+---+
5
0 1 2 3 4 5 6 7 8 9 A B C D E F
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
0000| v | a | l | u | e | v | a | l | u | 1 | v | a | l | u | e | v |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
0010| a | l | u | 2 | v | a | l | u | e | v | a | l |
+---+---+---+---+---+---+---+---+---+---+---+---+
6
0 1 2 3 4 5 6 7 8 9 A B C D E F
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
0000| v | a | l | u | e | v | a | l | u | 1 | v | a | l | u | e | v |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
0010| a | l | u | 2 | v | a | l | u | e | v | a | l | u | 3 | v | a |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
0020| l | u | e | v | a | l | u | 4 | v | a |
+---+---+---+---+---+---+---+---+---+---+
If you look in the code ...
//.jacob. I've discarded the "assigned" concept.
Once upon a time there was a concept of 'assigned' on a Preparse
buffer. If this was set when you had a fragmentation issue, you could
safely release the buffer since NOTHING in the SIP header is using the
buffer. NOTE WELL: THe first buffer is being used for the header field
name. The second through n-1 buffers are useless duplication.
I think this is a bug that was introduced with the MHS implementation
when the assigned state was taken out of the FSM. Not a big deal and
we can fix it.
Additionally, it would be a great idea for us to change the test cases
for connected transports to use a rediculously small chunkSize to
validate that this all works in all cases ( I haven't checked that we
do this yet ).
Thanks,
Alan Hawrylyshen
Early reSIProcate Developer and Lurker