Bug 555 - DCF immediate access bug
: DCF immediate access bug
Status: NEW
: ns-3
wifi
: ns-3-dev
: All All
: P2 critical
Assigned To:
:
:
:
:
  Show dependency treegraph
 
Reported: 2009-04-24 03:52 EDT by
Modified: 2009-12-11 06:08 EDT (History)


Attachments
DCF-collision-bug.patch (1.98 KB, patch)
2009-04-24 03:52 EDT, TimoB
Details | Diff
DCF collision (10.85 KB, patch)
2009-05-11 11:10 EDT, TimoB
Details | Diff
DCF collision (14.82 KB, patch)
2009-06-08 11:08 EDT, TimoB
Details | Diff
New patch to work with revision 5762:ae78a8de6f5f (18.25 KB, patch)
2009-11-20 10:49 EDT, Faker Moatamri
Details | Diff
Updated patch against 5768:a07d60db8448 (18.22 KB, patch)
2009-11-23 13:16 EDT, TimoB
Details | Diff


Note

You need to log in before you can comment on or make changes to this bug.


Description From 2009-04-24 03:52:29 EDT
Created an attachment (id=427) [details]
DCF-collision-bug.patch

DcfManager handes immediate access to the medium incorrectly:
That the backoff counter == 0 is an incorrect criterion, the backoff end must
have elapsed.

This occurs for two subsequenct unACKed frames, where the second gets 0 backoff
time. Then the access mechanisms incorrectly signals an immediate collision
instead of waiting AIFS.

Furthermore, immediate medium access should be possible at simulator start,
because all backoffs should be regarded as elapsed. This is easily fixed by
setting negative start times.

The patch applies these fixes. It modifies wifi-test.cc validation and two
ref-traces.
------- Comment #1 From 2009-04-24 04:33:04 EDT -------
(In reply to comment #0)
> Created an attachment (id=427) [details] [details]
> DCF-collision-bug.patch
> 
> DcfManager handes immediate access to the medium incorrectly:
> That the backoff counter == 0 is an incorrect criterion, the backoff end must
> have elapsed.
> 
> This occurs for two subsequenct unACKed frames, where the second gets 0 backoff
> time. Then the access mechanisms incorrectly signals an immediate collision
> instead of waiting AIFS.
> 
> Furthermore, immediate medium access should be possible at simulator start,
> because all backoffs should be regarded as elapsed. This is easily fixed by
> setting negative start times.
> 
> The patch applies these fixes. It modifies wifi-test.cc validation and two
> ref-traces.

I think that you are saying that this breaks validation of dcf-manager-test.cc.
If so, you need to provide a patch to fix it. Adding a testcase to
DcfManagerTest which demonstrates the old buggy behavior and the new correct
behavior would also be needed.

I should point out that there are a couple of unneeded changes in this patch
(Changing MicroSeconds (0) to Seconds (0.0)): This piece of code is really
tricky and critical so, patches to it should really really be minimal: if you
wish to make non-functional changes later to change MicroSeconds to Seconds,
that is fine with me.
------- Comment #2 From 2009-04-24 04:39:26 EDT -------
(In reply to comment #0)
> Created an attachment (id=427) [details] [details]
> DCF-collision-bug.patch
> 
> DcfManager handes immediate access to the medium incorrectly:
> That the backoff counter == 0 is an incorrect criterion, the backoff end must
> have elapsed.
> 
> This occurs for two subsequenct unACKed frames, where the second gets 0 backoff
> time. Then the access mechanisms incorrectly signals an immediate collision
> instead of waiting AIFS.

I am sorry but I really don't see what is the relationship between a 0 backoff
time and an immediate collision (I would expect that if the backoff is zero,
there is no immediate collition). To re-iterate my previous comment, you really
need to first provide a testcase for that bug in dcf-manager-test.cc with your
fix.

> 
> Furthermore, immediate medium access should be possible at simulator start,
> because all backoffs should be regarded as elapsed. This is easily fixed by
> setting negative start times.

This is really an arbitrary decision and I don't believe it matters in any way
so, if I had to choose between the current behavior and the new behavior, I
would pick the one which minimizes the number of changes to the code. Either
way, you really have to leave this change out of what you describe as a more
real bug for 2 subsequent unacked frames.
------- Comment #3 From 2009-05-11 11:09:13 EDT -------
Okay, the situation is even more complex than I thought.

I've extended DcfManagerTest to pinpoint the bug. To do that I had to add alot
of code, because DcfManagerTest didnt perform a backoff for successful
transmission. So that added could write the test case for the bug:

  // Test case where two packets are queued at DcaTxop, the first send starts
  // immediately and the second is also immediately indicated but has to wait
  // for its backoff.
  //
  //  20          60     66      70   80
  //   | tx        | sifs | aifs  | tx |
  //   |20: request access for both packets

Where the backoff of the second tx is 0. In original ns-3-dev, this test yields
a collision for the second tx at 20, where no collision should be indicated,
because the medium is busy from the first tx.

The DcfManagerTest patch contains a second test case with backoff=2 which
doesnt throw a collision, to highlight the problem with =0.

The patch I previously submitted creates more problems than fixes.

I've added a new patch, which adds a bool m_backoffElapsed to DcfState. I have
not found a way to do it without a new attribute. With the new patch the added
test cases pass ok.

See below about the simulator start.

(In reply to comment #2)
> > 
> > Furthermore, immediate medium access should be possible at simulator start,
> > because all backoffs should be regarded as elapsed. This is easily fixed by
> > setting negative start times.
> 
> This is really an arbitrary decision and I don't believe it matters in any way
> so, if I had to choose between the current behavior and the new behavior, I
> would pick the one which minimizes the number of changes to the code. Either
> way, you really have to leave this change out of what you describe as a more
> real bug for 2 subsequent unacked frames.

Yes, it is independent from the first issue. Of course "no changes" are always
the mimimum number of changes :). It's not really an arbitrary decision to make
simulator start a special point in time for backoff management.
------- Comment #4 From 2009-05-11 11:10:06 EDT -------
Created an attachment (id=438) [details]
DCF collision
------- Comment #5 From 2009-06-08 11:08:21 EDT -------
Created an attachment (id=461) [details]
DCF collision

Updated for current ns-3-dev.
The problem still persists. I also corrected the last of the new ACK test
cases.
------- Comment #6 From 2009-06-23 07:38:16 EDT -------
kirill, I noticed that this patch is changing one of the tests you added for
ack timeout:

// this situation is not supposed to fire an internal or external  collison.    
// ExpectCollision (39, 2, 0); // backoff: 2 slot                               

what do you think about this change ?
------- Comment #7 From 2009-06-23 07:53:19 EDT -------
(In reply to comment #6)
> kirill, I noticed that this patch is changing one of the tests you added for
> ack timeout:
> 
> // this situation is not supposed to fire an internal or external  collison.    
> // ExpectCollision (39, 2, 0); // backoff: 2 slot                               
> 
> what do you think about this change ?
> 
Ok, I agree with this changes: we should just wait till medium becomes idle
rather than think that it was collision.
------- Comment #8 From 2009-06-23 08:02:14 EDT -------
I have to confess that I really don't understand the purpose of this test:

  // Test case where two packets are queued at DcaTxop, the first send starts   
  // immediately and the second is also immediately indicated but has to wait   
  // for its backoff.                                                           
  //                                                                            
  //  20          60     66      70   80                                        
  //   | tx        | sifs | aifs  | tx |                                        
  //   |20: request access for both packets                                     
  StartTest (4, 6 , 10);
  AddDcfState (1);
  AddAccessRequest (20, 40, 20, 0, 0);
  AddAccessRequest (20, 10, 70, 0, 0);
  EndTest ();

If I read the code of dca-txop.cc correctly, I think that this set of access
requests is impossible. i.e., DcaTxop::StartAccessIfNeeded appears to be
checking for IsAccessRequested before calling RequestAccess so, it should not
be possible for a specific queue to request access again before the first
access has been resolved. 

Timo, could you explain what you are trying to test here and why you think that
the above is a sequence of events which can happen ?
------- Comment #9 From 2009-06-24 05:34:41 EDT -------
(In reply to comment #8)
> Timo, could you explain what you are trying to test here and why you think that
> the above is a sequence of events which can happen ?

Yes, it is true that RequestAccess() is called exactly once for each
transmission. However it can still happen, that it is called twice at the same
simulated time point.

Consider a DcaTxop with two packets Enqueued(), both are put into the queue and
RequestAccess() is called. When NotifyAccessGranted() is invoked, the first
packet transmission starts immediately via MacLow. Still in the same procedure
StartAccessIfNeeded() is called in NotifyAccessGranted() and RequestAccess() is
signaled for the second packet.

Thus two RequestAccess() happen the same time, if the medium is idle for the
first one and transmission can start immediately.
------- Comment #10 From 2009-06-24 05:44:34 EDT -------
(In reply to comment #9)
> (In reply to comment #8)
> > Timo, could you explain what you are trying to test here and why you think that
> > the above is a sequence of events which can happen ?
> 
> Yes, it is true that RequestAccess() is called exactly once for each
> transmission. However it can still happen, that it is called twice at the same
> simulated time point.
> 
> Consider a DcaTxop with two packets Enqueued(), both are put into the queue and
> RequestAccess() is called. When NotifyAccessGranted() is invoked, the first
> packet transmission starts immediately via MacLow. Still in the same procedure
> StartAccessIfNeeded() is called in NotifyAccessGranted() and RequestAccess() is
> signaled for the second packet.
> 
> Thus two RequestAccess() happen the same time, if the medium is idle for the
> first one and transmission can start immediately.
> 

So, if I understand you well, you are saying that, in this case, we bogusly
generate a collision, right ?
------- Comment #11 From 2009-06-24 06:08:15 EDT -------
> So, if I understand you well, you are saying that, in this case, we bogusly
> generate a collision, right ?

The bug is that there should be _no_ collision indicated.

Without the patch this situation signals a collision if backoff == 0 for the
second packet.

The bad condition is: IsBusy() and backoff == 0. Occurs because the first
packet obviously is in TX and if the second gets backoff == 0, a collision is
raised. And that is wrong.
------- Comment #12 From 2009-06-25 02:38:18 EDT -------
Would it not be possible to instead fix this bug my adding an end of tx signal
to MacLow, forward it to DcaTxop, and invoke RequestAccessIfNeeded from there
at the end of tx for broadcast/multicast messages ?

This would add yet another event, but, at least, it seems pretty
straightforward: the way the current patch works makes my brain hurt.
------- Comment #13 From 2009-06-25 13:05:09 EDT -------
(In reply to comment #12)
> Would it not be possible to instead fix this bug my adding an end of tx signal
> to MacLow, forward it to DcaTxop, and invoke RequestAccessIfNeeded from there
> at the end of tx for broadcast/multicast messages ?
> 
> This would add yet another event, but, at least, it seems pretty
> straightforward: the way the current patch works makes my brain hurt.

No, it does not change anything about the bug reported.
If you delay RequestAccessIfNeeded() to after the TX the test case changes to:

  // Test case where two packets are queued at DcaTxop, the first send starts
  // immediately and the second is also immediately indicated but has to wait
  // for its backoff.
  //
  //  20          60     66      70   80
  //   | tx        | sifs | aifs  | tx |
  //   |20: request access for first packet
  //               |60: request access for second packet

and we still need to differentate this case from

  //  20          60     66              70      75
  //   | tx        | sifs | aifs+backoff  | idle | tx |
  //   |20: request access for first packet
  //                                             |75: request access for second
packet

I don't see what's wrong with adding a "backoff elapsed" flag to DcaTxop.
------- Comment #14 From 2009-10-02 14:18:19 EDT -------
On the critical path for ns-3.7
------- Comment #15 From 2009-11-20 10:49:55 EDT -------
Created an attachment (id=665) [details]
New patch to work with revision 5762:ae78a8de6f5f

This is an updated patch that should apply to revision 5762:ae78a8de6f5f of
ns-3-dev. I disabled the last two tests in the dcf-manager-test.cc because they
kept on failing.
------- Comment #16 From 2009-11-23 13:16:03 EDT -------
Created an attachment (id=670) [details]
Updated patch against 5768:a07d60db8448

Updated by pulling into old hg repository.

Main problem with your patch was:
+  NS_TEST_EXPECT_MSG_EQ (Simulator::Now (), MicroSeconds
(expected.m_expectedGrantTime), "Expected access grant is now");
-  NS_TEST_EXPECT_MSG_EQ (Simulator::Now (), MicroSeconds (expected.second),
"Expected access grant is now");

wrong value there.

Updated everything else. The updated test methods allow (require) specification
of the backoff slots after each AIFS.

Timo