[reSIProcate] Uri escaping

Scott Godin sgodin at sipspectrum.com
Fri Apr 9 08:52:34 CDT 2010


Why not just fix Uri::shouldEscapeUserChar so that if c < 0 then return
true, instead of false?

inline bool
Uri::shouldEscapeUserChar(char c)
{
   if(!mEncodingReady)
   {
      initialiseEncodingTables();
   }

   if(c < 0)
   {
      return false;
   }

   return mUriEncodingUserTable[c];
}

Scott

2010/4/9 Jeremy Geras <jgeras at counterpath.com>

>  Here's an attempt at fixing escapeToStream(..) which seems to work OK...
>
>
>
> template<class Predicate> EncodeStream&
>
> Data::escapeToStream(EncodeStream& str, Predicate shouldEscape) const
>
> {
>
>    static char hex[] = "0123456789ABCDEF";
>
>
>
>    if (empty())
>
>    {
>
>       return str;
>
>    }
>
>
>
>    const char* p = mBuf;
>
>    const char* e = mBuf + mSize;
>
>
>
>    while (p < e)
>
>    {
>
>       // ?abr? Why is this special cased? Removing this code
>
>       // does not change the behavior of this method.
>
>       if (*p == '%'
>
>           && e - p > 2
>
>           && isHex(*(p+1))
>
>           && isHex(*(p+2)))
>
>       {
>
>          str.write(p, 3);
>
>          p+=3;
>
>       }
>
>       else if (shouldEscape(*p))
>
>       {
>
>          int hi = (*p & 0xF0)>>4;
>
>          int low = (*p & 0x0F);
>
>
>
>          str << '%' << hex[hi] << hex[low];
>
>          p++;
>
>       }
>
> *// BEGIN NEW*
>
>       else if ((*p) < 0)
>
>       {
>
>          // count the number of UTF-8 bytes
>
>          if (((*p) | 0x1F) == ~0x20)
>
>          {
>
>             Data utf8(Data::Borrow, p, 2);
>
>             utf8.urlEncode(str);
>
>             p+=2;
>
>          }
>
>          else if (((*p) | 0x0F) == ~0x10)
>
>          {
>
>             Data utf8(Data::Borrow, p, 3);
>
>             utf8.urlEncode(str);
>
>             p+=3;
>
>          }
>
>          else if (((*p) | 0x07) == ~0x08)
>
>          {
>
>             Data utf8(Data::Borrow, p, 4);
>
>             utf8.urlEncode(str);
>
>             p+=4;
>
>          }
>
>          else
>
>          {
>
>             str.put(*p++);
>
>          }
>
>       }
>
> *// END NEW*
>
>       else
>
>       {
>
>          str.put(*p++);
>
>       }
>
>    }
>
>    return str;
>
> }
>
>
>
> *From:* resiprocate-devel-bounces at resiprocate.org [mailto:
> resiprocate-devel-bounces at resiprocate.org] *On Behalf Of *Jeremy Geras
> *Sent:* April-08-10 4:05 PM
>
> *To:* Scott Godin
> *Cc:* resiprocate-devel at resiprocate.org
> *Subject:* Re: [reSIProcate] Uri escaping
>
>
>
> So my latest thinking is that my attempted fix is definitely bad, since it
> doesn't take into account the special 3261 rules (like + being a valid user
> character).
>
>
>
> So I guess the real fix is to fix escapeToStream(..) so that it can detect
> UTF-8 characters and % HEX HEX encode them...?
>
>
>
> *From:* Jeremy Geras
> *Sent:* April-08-10 2:41 PM
> *To:* 'Scott Godin'
> *Cc:* resiprocate-devel at resiprocate.org
> *Subject:* RE: [reSIProcate] Uri escaping
>
>
>
> Hmmm... Won't Uri::shouldEscapeUserChar return false with non-US ASCII
> characters?  (That's what I'm seeing... c == -60) when the URI contains a
> character like Ă.
>
>
>
> *From:* slgodin at gmail.com [mailto:slgodin at gmail.com] *On Behalf Of *Scott
> Godin
> *Sent:* April-08-10 2:31 PM
> *To:* Jeremy Geras
> *Cc:* resiprocate-devel at resiprocate.org
> *Subject:* Re: [reSIProcate] Uri escaping
>
>
>
> I agree that RFC3261 specifies the user part should be escaped (ie. %
> HEXDIG HEXDIG):
>
>
>
> user             =  1*( unreserved / escaped / user-unreserved )
>
>  However looking at the Data::escapeToStream method - it should be
> escaping using (% HEXDIG HEXDIG) - it's not really clear to me how you are
> ending up with non-escaped characters in the encoding.  We should figure out
> why escapeToStream is not working for you.
>
>
>
> Scott
>
>
>
> On Thu, Apr 8, 2010 at 4:59 PM, Jeremy Geras <jgeras at counterpath.com>
> wrote:
>
> Hi,
>
>
>
> General question:  How should URIs be escaped?
>
>
>
> See the attached trace for an example of what we're seeing with resip --
> note the Contact header in the NOTIFY, and the subsequent Request-URI in the
> re-SUBSCRIBE.  Should the URI in the request line be URL-encoded?  (Pouring
> through the BNF in 3261 leads me to think that it should...)
>
>
>
> Note that this is a trace from a deployment where the re-SUBSCRIBE actually
> fails because of the lack of proper escaping.  I tried making the following
> change in Uri.cxx, and with that in place the re-SUBSCRIBE works.  However
> I'm not sure if this is correct, or if this is the right spot.  Thoughts?
>
>
>
> Thanks,
>
> Jeremy
>
>
>
> // should not encode user parameters unless its a tel?
>
> EncodeStream&
>
> Uri::encodeParsed(EncodeStream& str) const
>
> {
>
>    str << mScheme << Symbols::COLON;
>
>    if (!mUser.empty())
>
>    {
>
> #ifdef HANDLE_CHARACTER_ESCAPING
>
>       //mUser.escapeToStream(str, shouldEscapeUserChar); <== ORIGINAL CODE
>
>       mUser.urlEncode(str);
>
> #else
>
>       str << mUser;
>
> #endif
>
>
>
>  ... snip ...
>
>
>
>
>
>
> _______________________________________________
> resiprocate-devel mailing list
> resiprocate-devel at resiprocate.org
> https://list.resiprocate.org/mailman/listinfo/resiprocate-devel
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://list.resiprocate.org/pipermail/resiprocate-devel/attachments/20100409/a03de738/attachment.htm>


More information about the resiprocate-devel mailing list