< Previous by Date	Date Index	Next by Date >
< Previous in Thread	Thread Index

Re: [reSIProcate] Uri escaping

From: Jeremy Geras <jgeras@xxxxxxxxxxxxxxx>
Date: Fri, 9 Apr 2010 13:07:32 -0400

I guess that would change the default to always escaping anything outside of US ASCII (be it valid UTF-8 bytes or garbage). Sounds good to me.

Thanks,

Jeremy

From: slgodin@xxxxxxxxx [mailto:slgodin@xxxxxxxxx] On Behalf Of Scott Godin
Sent: April-09-10 6:53 AM
To: Jeremy Geras
Cc: resiprocate-devel@xxxxxxxxxxxxxxx
Subject: Re: [reSIProcate] Uri escaping

Why not just fix Uri::shouldEscapeUserChar so that if c < 0 then return true, instead of false?

inline bool

Uri::shouldEscapeUserChar(char c)

{

if(!mEncodingReady)

{

initialiseEncodingTables();

}

if(c < 0)

{

return false;

}

return mUriEncodingUserTable[c];

}

Scott

2010/4/9 Jeremy Geras <jgeras@xxxxxxxxxxxxxxx>

Here's an attempt at fixing escapeToStream(..) which seems to work OK...

template<class Predicate> EncodeStream&

Data::escapeToStream(EncodeStream& str, Predicate shouldEscape) const

{

static char hex[] = "0123456789ABCDEF";

if (empty())

{

return str;

}

const char* p = mBuf;

const char* e = mBuf + mSize;

while (p < e)

{

// ?abr? Why is this special cased? Removing this code

// does not change the behavior of this method.

if (*p == '%'

&& e - p > 2

&& isHex(*(p+1))

&& isHex(*(p+2)))

{

str.write(p, 3);

p+=3;

}

else if (shouldEscape(*p))

{

int hi = (*p & 0xF0)>>4;

int low = (*p & 0x0F);

str << '%' << hex[hi] << hex[low];

p++;

}

// BEGIN NEW

else if ((*p) < 0)

{

// count the number of UTF-8 bytes

if (((*p) | 0x1F) == ~0x20)

{

Data utf8(Data::Borrow, p, 2);

utf8.urlEncode(str);

p+=2;

}

else if (((*p) | 0x0F) == ~0x10)

{

Data utf8(Data::Borrow, p, 3);

utf8.urlEncode(str);

p+=3;

}

else if (((*p) | 0x07) == ~0x08)

{

Data utf8(Data::Borrow, p, 4);

utf8.urlEncode(str);

p+=4;

}

else

{

str.put(*p++);

}

// END NEW

else

{

str.put(*p++);

}

return str;

}

From: resiprocate-devel-bounces@xxxxxxxxxxxxxxx [mailto:resiprocate-devel-bounces@xxxxxxxxxxxxxxx] On Behalf Of Jeremy Geras
Sent: April-08-10 4:05 PM

To: Scott Godin
Cc: resiprocate-devel@xxxxxxxxxxxxxxx

Subject: Re: [reSIProcate] Uri escaping

So my latest thinking is that my attempted fix is definitely bad, since it doesn't take into account the special 3261 rules (like + being a valid user character).

So I guess the real fix is to fix escapeToStream(..) so that it can detect UTF-8 characters and % HEX HEX encode them...?

From: Jeremy Geras
Sent: April-08-10 2:41 PM
To: 'Scott Godin'
Cc: resiprocate-devel@xxxxxxxxxxxxxxx
Subject: RE: [reSIProcate] Uri escaping

Hmmm... Won't Uri::shouldEscapeUserChar return false with non-US ASCII characters? (That's what I'm seeing... c == -60) when the URI contains a character like Ă.

From: slgodin@xxxxxxxxx [mailto:slgodin@xxxxxxxxx] On Behalf Of Scott Godin
Sent: April-08-10 2:31 PM
To: Jeremy Geras
Cc: resiprocate-devel@xxxxxxxxxxxxxxx
Subject: Re: [reSIProcate] Uri escaping

I agree that RFC3261 specifies the user part should be escaped (ie. % HEXDIG HEXDIG):

user             =  1*( unreserved / escaped / user-unreserved )

However looking at the Data::escapeToStream method - it should be escaping using (% HEXDIG HEXDIG) - it's not really clear to me how you are ending up with non-escaped characters in the encoding. We should figure out why escapeToStream is not working for you.

Scott

On Thu, Apr 8, 2010 at 4:59 PM, Jeremy Geras <jgeras@xxxxxxxxxxxxxxx> wrote:

Hi,

General question: How should URIs be escaped?

See the attached trace for an example of what we're seeing with resip -- note the Contact header in the NOTIFY, and the subsequent Request-URI in the re-SUBSCRIBE. Should the URI in the request line be URL-encoded? (Pouring through the BNF in 3261 leads me to think that it should...)

Note that this is a trace from a deployment where the re-SUBSCRIBE actually fails because of the lack of proper escaping. I tried making the following change in Uri.cxx, and with that in place the re-SUBSCRIBE works. However I'm not sure if this is correct, or if this is the right spot. Thoughts?

Thanks,

Jeremy

// should not encode user parameters unless its a tel?

EncodeStream&

Uri::encodeParsed(EncodeStream& str) const

{

str << mScheme << Symbols::COLON;

if (!mUser.empty())

{

#ifdef HANDLE_CHARACTER_ESCAPING

//mUser.escapeToStream(str, shouldEscapeUserChar); <== ORIGINAL CODE

mUser.urlEncode(str);

#else

str << mUser;

#endif

... snip ...

_______________________________________________
resiprocate-devel mailing list
resiprocate-devel@xxxxxxxxxxxxxxx
https://list.resiprocate.org/mailman/listinfo/resiprocate-devel

References:
- [reSIProcate] Uri escaping
  - From: Jeremy Geras
- Re: [reSIProcate] Uri escaping
  - From: Scott Godin
- Re: [reSIProcate] Uri escaping
  - From: Jeremy Geras
- Re: [reSIProcate] Uri escaping
  - From: Jeremy Geras
- Re: [reSIProcate] Uri escaping
  - From: Scott Godin

Prev by Date: Re: [reSIProcate] Uri escaping
Next by Date: [reSIProcate] Resip History Visualization
Previous by thread: Re: [reSIProcate] Uri escaping
Next by thread: [reSIProcate] Resip History Visualization
Index(es):
- Date
- Thread