Bug 415

Summary: OLSR's broken dynamic behaviour?
Product: ns-3 Reporter: Egemen <egemen.cetinkaya>
Component: routing Assignee: ns-bugs <ns-bugs>
Status: RESOLVED FIXED    
Severity: major CC: gjcarneiro
Priority: P1    
Version: ns-3.2   
Hardware: PC   
OS: Linux   
Attachments: olsr dynamism test file
interface setdown patch

Description Egemen 2008-11-19 13:12:09 EST
Created attachment 305 [details]
olsr dynamism test file

We were testing some scenarios where, after bringing down links (via the interface setdown procedure), the traffic from source to destination does not follow the alternate path. Since OLSR is a dynamic routing protocol, and in the ns-3 environment it can be used on both wired and wireless links, it should redirect the traffic through the alternate path. Using the attached test file, we could not observe this in the trace files.

I am also attaching the patch that sets down the interfaces to simulate a down link.
Comment 1 Egemen 2008-11-19 13:12:58 EST
Created attachment 306 [details]
interface setdown patch
Comment 2 Gustavo J. A. M. Carneiro 2008-12-01 07:35:06 EST
OLSR is based on timers.  After bringing down an interface, you have to wait some time (6 seconds IIRC) until neighbor tuples on that interface expire and a new routing table is calculated.

Although I would agree a possible optimization is to force the tuple expiration as soon as the interface comes down.  But I am not sure NS-3 can give the OLSR agent any kind of notification that an interface has gone down; not without new core API anyway...
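
For reference, the 6-second figure matches the default constants proposed in RFC 3626 (a HELLO interval of 2 s and a neighbor hold time of 3 times the 2 s refresh interval). A minimal Python sketch of that arithmetic follows. The constant and function names are illustrative only and are not ns-3 identifiers, and the ns-3 OLSR model may be configured with different values.

```python
# Default OLSR timer values as proposed in RFC 3626, section 18 (seconds).
# Names are illustrative only; they are not ns-3 attributes.
HELLO_INTERVAL = 2.0
REFRESH_INTERVAL = 2.0
NEIGHB_HOLD_TIME = 3 * REFRESH_INTERVAL  # 6.0 s


def neighbor_expiry_time(last_hello_received: float) -> float:
    """Latest time at which a neighbor tuple refreshed at
    `last_hello_received` is still considered valid."""
    return last_hello_received + NEIGHB_HOLD_TIME


# Assumption: the last HELLO arrives just as the interface goes down at t = 15 s.
expiry = neighbor_expiry_time(15.0)
print(expiry)
```

Under that assumption the neighbor tuple would expire at t = 21 s, which is where the "6 seconds" above comes from.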
Comment 3 Egemen 2008-12-01 17:45:21 EST
(In reply to comment #2)
> OLSR is based on timers.  After bringing down an interface, you have to wait
> some time (6 seconds IIRC) until neighbor tuples on that interface expire and a
> new routing table is calculated.
> 
> Although I would agree a possible optimization is to force the tuple expiration
> as soon as the interface comes down.  But I am not sure NS-3 can give the OLSR
> agent any kind of notification that an interface has gone down; not without new
> core API anyway...
> 

In the attached test file, the on/off application starts at 10 s and stops at 30 s (for both the sink and the source). The interface is set down at 15 s, and from 15 s to 30 s there is no traffic. According to your comment, there should be traffic from 21 s to 30 s, which is not the case.
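
The expectation above can be written out as a small interval calculation. This is only a sketch of the reasoning in this comment, assuming the 6-second expiry from comment #2 is the only delay before traffic resumes; the function name is hypothetical.

```python
def expected_traffic_window(app_start, app_stop, link_down, reroute_delay):
    """Return (start, stop) of the interval in which traffic is expected
    to flow again after the link failure, or None if the application
    stops before rerouting completes."""
    resume = max(app_start, link_down + reroute_delay)
    if resume >= app_stop:
        return None
    return (resume, app_stop)


# Values from the attached test scenario: the application runs from 10 s
# to 30 s, the interface goes down at 15 s, and the assumed delay is 6 s.
window = expected_traffic_window(10.0, 30.0, 15.0, 6.0)
print(window)  # (21.0, 30.0)
```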
Comment 4 Gustavo J. A. M. Carneiro 2008-12-02 14:01:07 EST
http://code.nsnam.org/ns-3-dev/rev/8658841e4782

Details of changes are in the log message.  It should be easy to backport the patch to ns-3.2.

I should warn, though, that it takes a very long time for OLSR to redirect the flow through the new path.  This is kind of a pathological case, I guess.  In this network, normally none of the nodes are MPRs and so they do not ever emit TC messages.  At second 15 a link comes down, and the following has to take place before the new route is acquired:

  1- OLSR only notices the link is down after 3 lost HELLOs, i.e. 6 seconds;

  2- Only in next HELLO that node 2 sends (up to 2 seconds wait) will node 2 inform node 1 that it (node 1) has been selected as MPR of node 2;

  3- In the next TC emission time (wait of up to 5 seconds) node 1 will finally send its first TC;

  4- Node 2 finally receives a TC and now has the topology information needed to compute a route to node 0.

In my tests, only around simulation time 27s is the new route acquired.  I know the performance isn't great, but as far as I know it's the way OLSR is.
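
The steps above add up as follows. This sketch simply sums the worst-case waits listed in this comment (6 s, up to 2 s, up to 5 s) and ignores propagation and processing delays, so step 4 adds no wait of its own. All names are illustrative.

```python
LINK_DOWN_AT = 15.0  # seconds, from the test scenario

# (description, worst-case wait in seconds), as listed in the steps above
STEPS = [
    ("link loss detected after 3 lost HELLOs", 6.0),
    ("node 2 advertises node 1 as MPR in its next HELLO", 2.0),
    ("node 1 emits its first TC", 5.0),
]


def worst_case_timeline(start, steps):
    """Return a list of (time, description) pairs, accumulating each wait."""
    timeline = []
    t = start
    for description, wait in steps:
        t += wait
        timeline.append((t, description))
    return timeline


timeline = worst_case_timeline(LINK_DOWN_AT, STEPS)
for t, description in timeline:
    print(f"t = {t:4.1f} s: {description}")

worst_case = timeline[-1][0]  # upper bound on when node 2 can receive the TC
```

Under these assumptions the upper bound is 28 s, so a route acquired at around 27 s is consistent with the listed steps.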

Closing, please reopen if the fix doesn't work for you, or open a new bug report if you find any other unrelated problem.
Comment 5 Gustavo J. A. M. Carneiro 2008-12-03 12:32:33 EST
(In reply to comment #4)
> http://code.nsnam.org/ns-3-dev/rev/8658841e4782
> 
> Details of changes are in the log message.  It should be easy to backport the
> patch to ns-3.2.
> 
> I should warn, though, that it takes a very long time for OLSR to redirect the
> flow through the new path.  This is kind of a pathological case, I guess.  In
> this network, normally none of the nodes are MPRs and so they do not ever emit
> TC messages.  At second 15 a link comes down, and the following has to take
> place before the new route is acquired:
> 
>   1- OLSR only notices the link is down after 3 lost HELLOs, i.e. 6 seconds;
> 
>   2- Only in next HELLO that node 2 sends (up to 2 seconds wait) will node 2
> inform node 1 that it (node 1) has been selected as MPR of node 2;
> 
>   3- In the next TC emission time (wait of up to 5 seconds) node 1 will finally
> send its first TC;
> 
>   4- Node 2 finally receives a TC and now has the topology information needed
> to compute a route to node 0.

Well, actually, I just realized that OLSR is supposed to compute the routing table from two-hop neighbors as well, so in theory the time without a correct route should be only 6 seconds (3 HELLO intervals), not 11 seconds.  I must be missing something, but unfortunately I have no time right now to look into it in more detail.
Comment 6 Egemen 2008-12-06 13:38:47 EST
Gustavo,

I was/am/will be busy as well. Worst case, we will look into it after the Dec. 17 release (either by opening a new bug report or by continuing on this bug).

Regards,

Egemen