< Previous by Date Date Index Next by Date >
< Previous in Thread Thread Index Next in Thread >

Re: [reSIProcate] DNS graylist issue with non-INVITE transaction failures


Any objections to the following type of change? It would modify the signature of the current MarkListener::onMark callback to allow changing of the entries via pass-by-reference. I could always add a new callback onMarkCreated if breaking source compatibility is bad.

If one of the listeners doesn't want the mark to be added they would just set expiry to 0.

--- resip/stack/MarkListener.hxx (revision 9087)
+++ resip/stack/MarkListener.hxx (working copy)
@@ -9,7 +9,7 @@
 {
    public:
       virtual ~MarkListener() {}
-      virtual void onMark(const Tuple& target,TupleMarkManager::MarkType mark)= 0;
+      virtual void onMark(const Tuple& target,UInt64& expiry, TupleMarkManager::MarkType& mark)= 0;
 };
 }
 
--- resip/stack/TupleMarkManager.cxx (revision 9087)
+++ resip/stack/TupleMarkManager.cxx (working copy)
@@ -25,7 +25,9 @@
       {
          mList.erase(i);
          // ?bwc? Should we do this?
-         notifyListeners(tuple,OK);
+         UInt64 expiry = 0;
+         MarkType mark = OK;
+         notifyListeners(tuple,expiry,mark);
       }
    }
    
@@ -34,10 +36,11 @@
  void TupleMarkManager::mark(const Tuple& tuple,UInt64 expiry,MarkType mark)
 {
+   // .amr. Notify listeners first so they can change the entry if they want
+   notifyListeners(tuple,expiry,mark);
    ListEntry entry(tuple,expiry);
    resip::Lock g(mListMutex);
    mList[entry]=mark;
-   notifyListeners(tuple,mark);
 }
 
 void TupleMarkManager::registerMarkListener(MarkListener* listener)
@@ -51,11 +54,11 @@
 }
 
 void
-TupleMarkManager::notifyListeners(const resip::Tuple& tuple, MarkType mark)
+TupleMarkManager::notifyListeners(const resip::Tuple& tuple, UInt64& expiry, MarkType& mark)
 {
    for(Listeners::iterator i = mListeners.begin(); i!=mListeners.end(); ++i)
    {
-      (*i)->onMark(tuple,mark);
+      (*i)->onMark(tuple,expiry,mark);
    }
 }
 
--- resip/stack/TupleMarkManager.hxx (revision 9087)
+++ resip/stack/TupleMarkManager.hxx (working copy)
@@ -57,9 +57,7 @@
       typedef std::set<MarkListener*> Listeners;
       Listeners mListeners;
       
-      void notifyListeners(const resip::Tuple& tuple, MarkType mark);
-
-
+      void notifyListeners(const resip::Tuple& tuple, UInt64& expiry, MarkType& mark);
 };
 
 }


Aron Rosenberg
Sr. Director, Engineering,
LifeSize, a division of Logitech




On Mon, Oct 3, 2011 at 2:16 PM, Scott Godin <sgodin@xxxxxxxxxxxxxxx> wrote:
Extensibility is good.  : )


On Mon, Oct 3, 2011 at 4:11 PM, Aron Rosenberg <arosenberg@xxxxxxxxxxxx> wrote:
What do you think about modifying the MarkListener class to give the App Writer control over inserts to black/gray list?


Aron Rosenberg
Sr. Director, Engineering,
LifeSize, a division of Logitech




On Mon, Oct 3, 2011 at 11:50 AM, Scott Godin <sgodin@xxxxxxxxxxxxxxx> wrote:
This problem is discussed in the following RFC's:  http://tools.ietf.org/html/rfc4320http://tools.ietf.org/html/rfc4321

RFC Statements:  

Without a provisional, a late final response is the same as no response at all and will likely result in blacklisting the late- responding element ([3]). If an element is delaying its final response at all, sending a 100 Trying after Timer E reaches T2 prevents this blacklisting without damaging recovery from unreliable transport failure.

4. Normative Updates to RFC 3261

4.1. Action 1

An SIP element MUST NOT send any provisional response with a Status- Code other than 100 to a non-INVITE request. An SIP element MUST NOT respond to a non-INVITE request with a Status-Code of 100 over any unreliable transport, such as UDP, before the amount of time it takes a client transaction's Timer E to be reset to T2. An SIP element MAY respond to a non-INVITE request with a Status-Code of 100 over a reliable transport at any time. Without regard to transport, an SIP element MUST respond to a non- INVITE request with a Status-Code of 100 if it has not otherwise responded after the amount of time it takes a client transaction's Timer E to be reset to T2.



I believe the correct solution is for the proxy to send a 100 trying using the guidelines above.

Scott

On Mon, Oct 3, 2011 at 1:45 PM, Aron Rosenberg <arosenberg@xxxxxxxxxxxx> wrote:
I have the following topology setup.

Focus A  <--> Proxy A <--> Proxy B <--> User B

User B does a SUBSCRIBE and INVITE to Focus A

At some point, User B loses connectivity due to a WiFi dropout or other such network interface change that is transient.
If Focus A sends a NOTIFY to User B, the NOTIFY to User B will fail with a locally generated 408 before Proxy B sends back a 503 or 504 message. Also because this is a non-INVITE transaction, no 100 Trying is sent by Proxy A or Proxy B.

Now because this was a locally generated 408 error a DNS graylist entry is added for the IP Address of Proxy A since it was an in-dialog message that was sent based on a numerical IP Route header.

This then causes all of Focus A's other traffic which is Routed through Proxy A to temporarily fail including new transactions.

Another interesting side effect is that even though the original Route header did not include a transport= value (Route: x.x.x.x:port), resip will try to contact that IP:Port with TCP even though TCP was never part of the original DNS lookup (not in the SRV records).

I can see a few possible ways to fix this issue:

- Change Proxy A to send 100 Trying for non-INVITE transactions. This will change the TransactionState from "Trying" to "Connecting" which then won't trigger a graylist (requires Proxy changes)
- Change TransactionState class to only graylist entries for INVITE transactions based on "mMachine==ClientInvite" since those are the only ones which seem to send 100 Trying messages.
- Allow Application writer control over black/gray list.
   - If we don't want to break source compatibility, we could modify the MarkListener class to add a new virtual function "before" insertion which could return true/false or allow modifying the Mark or Expirery.
   - If we wanted to break source compatibility, we could modify the current onMark callback to have a new signature and be called before insertion allowing changing of the values.

If anybody is interested, I have a full resip debug log showing this issue from Focus A's perspective.

Aron Rosenberg
Sr. Director, Engineering,
LifeSize, a division of Logitech



_______________________________________________
resiprocate-devel mailing list
resiprocate-devel@xxxxxxxxxxxxxxx
https://list.resiprocate.org/mailman/listinfo/resiprocate-devel



_______________________________________________
resiprocate-devel mailing list
resiprocate-devel@xxxxxxxxxxxxxxx
https://list.resiprocate.org/mailman/listinfo/resiprocate-devel