< Previous by Date | Date Index | Next by Date > |
< Previous in Thread | Thread Index |
I guess that would change the default to always escaping
anything outside of US ASCII (be it valid UTF-8 bytes or garbage). Sounds good
to me. Thanks, Jeremy From: slgodin@xxxxxxxxx
[mailto:slgodin@xxxxxxxxx] On Behalf Of Scott Godin Why not just fix Uri::shouldEscapeUserChar so that if c <
0 then return true, instead of false? inline bool Uri::shouldEscapeUserChar(char c) { if(!mEncodingReady) { initialiseEncodingTables(); } if(c < 0) { return false; } return mUriEncodingUserTable[c]; } Scott 2010/4/9 Jeremy Geras <jgeras@xxxxxxxxxxxxxxx> Here's an attempt at fixing
escapeToStream(..) which seems to work OK... template<class Predicate> EncodeStream& Data::escapeToStream(EncodeStream&
str, Predicate shouldEscape) const {
static char
hex[] = "0123456789ABCDEF";
if (empty())
{
return str;
}
const char* p =
mBuf;
const char* e =
mBuf + mSize;
while (p < e)
{
// ?abr? Why is this special cased? Removing this
code
// does not change the behavior of this method.
if (*p == '%'
&& e - p > 2
&& isHex(*(p+1))
&& isHex(*(p+2)))
{
str.write(p, 3);
p+=3;
}
else if
(shouldEscape(*p))
{
int hi = (*p & 0xF0)>>4;
int low = (*p & 0x0F);
str << '%' << hex[hi] <<
hex[low];
p++;
} // BEGIN NEW
else if ((*p)
< 0)
{
// count the number of UTF-8 bytes
if (((*p) | 0x1F) == ~0x20)
{
Data utf8(Data::Borrow, p, 2);
utf8.urlEncode(str);
p+=2;
}
else if (((*p)
| 0x0F) == ~0x10)
{
Data utf8(Data::Borrow, p, 3);
utf8.urlEncode(str);
p+=3;
}
else if (((*p)
| 0x07) == ~0x08)
{
Data utf8(Data::Borrow, p, 4);
utf8.urlEncode(str);
p+=4;
}
else
{
str.put(*p++);
}
} // END NEW
else
{
str.put(*p++);
}
}
return str; } From: resiprocate-devel-bounces@xxxxxxxxxxxxxxx
[mailto:resiprocate-devel-bounces@xxxxxxxxxxxxxxx] On Behalf Of Jeremy
Geras
Subject: Re: [reSIProcate] Uri escaping So my latest thinking is that my
attempted fix is definitely bad, since it doesn't take into account the special
3261 rules (like + being a valid user character). So I guess the real fix is to fix
escapeToStream(..) so that it can detect UTF-8 characters and % HEX HEX encode
them...? From: Jeremy Geras Hmmm... Won't
Uri::shouldEscapeUserChar return false with non-US ASCII characters?
(That's what I'm seeing... c == -60) when the URI contains a character like Ă. From: slgodin@xxxxxxxxx
[mailto:slgodin@xxxxxxxxx]
On Behalf Of Scott Godin I
agree that RFC3261 specifies the user part should be escaped (ie. % HEXDIG
HEXDIG): user = 1*( unreserved / escaped / user-unreserved ) However
looking at the Data::escapeToStream method - it should be escaping using (%
HEXDIG HEXDIG) - it's not really clear to me how you are ending up with
non-escaped characters in the encoding. We should figure out why
escapeToStream is not working for you. Scott On
Thu, Apr 8, 2010 at 4:59 PM, Jeremy Geras <jgeras@xxxxxxxxxxxxxxx>
wrote: Hi, General question: How should URIs be escaped? See the attached trace for an example of what we're seeing with
resip -- note the Contact header in the NOTIFY, and the subsequent Request-URI
in the re-SUBSCRIBE. Should the URI in the request line be URL-encoded?
(Pouring through the BNF in 3261 leads me to think that it should...) Note that this is a trace from a deployment where the re-SUBSCRIBE
actually fails because of the lack of proper escaping. I tried making the
following change in Uri.cxx, and with that in place the re-SUBSCRIBE
works. However I'm not sure if this is correct, or if this is the right
spot. Thoughts? Thanks, Jeremy // should not encode user parameters unless its a tel? EncodeStream&
Uri::encodeParsed(EncodeStream&
str) const {
str << mScheme << Symbols::COLON;
if (!mUser.empty())
{ #ifdef
HANDLE_CHARACTER_ESCAPING
//mUser.escapeToStream(str, shouldEscapeUserChar);
<== ORIGINAL CODE
mUser.urlEncode(str); #else str << mUser; #endif ... snip ...
|