Bug 2926 - SSRC and SLRC mechanism not fully aligned to the standard
SSRC and SLRC mechanism not fully aligned to the standard
Status: RESOLVED FIXED
Product: ns-3
Classification: Unclassified
Component: wifi
ns-3.28
All All
: P3 normal
Assigned To: sebastien.deronne
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2018-06-04 11:53 EDT by Muhammad Iqbal CR
Modified: 2018-07-03 07:00 EDT (History)
3 users (show)

See Also:


Attachments
Proposed fix: unify and fix need retransmission check (8.51 KB, patch)
2018-06-04 19:11 EDT, Muhammad Iqbal CR
Details | Diff
reworked patch (10.51 KB, patch)
2018-06-13 05:25 EDT, sebastien.deronne
Details | Diff
Fix SSRC and SLRC reset and increment behavior (26.74 KB, patch)
2018-06-30 15:12 EDT, Muhammad Iqbal CR
Details | Diff
Fix SSRC and SLRC reset and increment behavior (reworked to passing packet size) (27.63 KB, patch)
2018-07-01 08:24 EDT, Muhammad Iqbal CR
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Muhammad Iqbal CR 2018-06-04 11:53:28 EDT
I found that some part of MAC retransmission regarding SSRC and SLRC is outdated. According to IEEE-802.11-2012std, SSRC (short retry count) is used for any frame (control, management, or data) which size <= the dot11RtsThreshold size, and SLRC (long retry count) is used for any frame which size > dot11RtsThreshold. The current code uses SSRC only for RTS packets, and SLRC for any non-RTS packet.

I discussed this with Tom and he thought that it was left out-of-date when 11n/ac features were added, and only the RTS feature were added.
Comment 1 Muhammad Iqbal CR 2018-06-04 19:11:55 EDT
Created attachment 3107 [details]
Proposed fix: unify and fix need retransmission check
Comment 2 sebastien.deronne 2018-06-13 00:00:49 EDT
I've just read the standard section 10.3.4.4 Recovery procedures and retransmit limits, and indeed we have this:

- If RTS frame or DATA or MGMT <= dot11RtsThreshold => use SSRC
if DATA or MGMT > dot11RtsThreshold => use SLRC

I will have a look at the patch and check for regression.
Comment 3 sebastien.deronne 2018-06-13 05:25:04 EDT
Created attachment 3110 [details]
reworked patch

Some tests were crashing because you forgot to adapt Minstrel and MinstrelHt classes.

Still some tests are failing but I guess this is expected. I suggest to postpone until issue with test randomness is fixed before trying to fix those.
Comment 4 Alexander Krotov 2018-06-14 10:31:16 EDT
Logically short and long retry counters should be managed by the queue (DCA or EDCA), not by the remote station manager. There is only one counter per queue according to the standard, and it is not tied to the frame or station.

It also means that retry counter is not reset when the frame is dropped, see official standard interpretations from http://standards.ieee.org/findstds/interps/index.html
Relevant interpretation is in http://standards.ieee.org/findstds/interps/802.11-1999_interp.zip, copied here for convenience:

Interpretation Request #9

1-09/03 (Contention window and retry counters) Topic: Reset of contention window to
CWmin Relevant Clauses: 9.2.4, 9.2.5.3 Classification: unambiguous

According to the above sub-clauses, the Station {Short, Long} Retry counters are in-
cremented every time a (Short/Long) retry counter associated with an MSDU is incre-
mented. They are only reset upon successful transmission of an MPDU (of appropriate
length).
The CW is ‘controlled’ by the Station counters, increasing in size every time either of the
Station counters increases. It is reset to CWmin only

• after a successful MSDU transmission, or
• when either of the Station counters reaches their respective limit.

Consider a scenario where a station *continually* fails to transmit successfully: The
CW will increase, until condition b) above is met, at which point it will revert to CWmin.
However, the Station retry counters are not reset at this point, and will continue to in-
crease; condition b) will not be met again, and CW will increase (or remain at CWmax)
for all subsequent (failed) attempts, regardless of the state of the respective MSDU
counters.

Is this the intended behavior? It seems odd that the CW will be reset after the first fail-
ure (when b) is met, and the MSDU is discarded), but not for subsequent MSDUs.
Comment 5 sebastien.deronne 2018-06-18 05:54:53 EDT
(In reply to Alexander Krotov from comment #4)
> Logically short and long retry counters should be managed by the queue (DCA
> or EDCA), not by the remote station manager. There is only one counter per
> queue according to the standard, and it is not tied to the frame or station.

I do not agree with that if I read this:

Every STA shall maintain a STA short retry count (SSRC) as well as a STA long retry count (SLRC), both of which shall take an initial value of 0.

Note that in Linux this is also not managed by the queue.

> 
> It also means that retry counter is not reset when the frame is dropped, see
> official standard interpretations from
> http://standards.ieee.org/findstds/interps/index.html
> Relevant interpretation is in
> http://standards.ieee.org/findstds/interps/802.11-1999_interp.zip, copied
> here for convenience:

I do read this:
The SSRC shall be incremented when any short retry count (SRC) associated with any MPDU with the Type subfield equal to Data is incremented
That means, if we start transmission for a new MPDU, it shall be reset.

> 
> Interpretation Request #9
> 
> 1-09/03 (Contention window and retry counters) Topic: Reset of contention
> window to
> CWmin Relevant Clauses: 9.2.4, 9.2.5.3 Classification: unambiguous
> 
> According to the above sub-clauses, the Station {Short, Long} Retry counters
> are in-
> cremented every time a (Short/Long) retry counter associated with an MSDU is
> incre-
> mented. They are only reset upon successful transmission of an MPDU (of
> appropriate
> length).
> The CW is ‘controlled’ by the Station counters, increasing in size every
> time either of the
> Station counters increases. It is reset to CWmin only
> 
> • after a successful MSDU transmission, or
> • when either of the Station counters reaches their respective limit.
> 
> Consider a scenario where a station *continually* fails to transmit
> successfully: The
> CW will increase, until condition b) above is met, at which point it will
> revert to CWmin.
> However, the Station retry counters are not reset at this point, and will
> continue to in-
> crease; condition b) will not be met again, and CW will increase (or remain
> at CWmax)
> for all subsequent (failed) attempts, regardless of the state of the
> respective MSDU
> counters.

I do not clearly find the mentioned sections in the revised 2016 version of the standard, which version are you using?

> 
> Is this the intended behavior? It seems odd that the CW will be reset after
> the first fail-
> ure (when b) is met, and the MSDU is discarded), but not for subsequent
> MSDUs.

Not sure I understand what you identified as a problem, could you clarify?

In any case, I think this patch should be pushed, but if we identify another mismatch between our implementation and the standard, I suggest to open a new item in the tracker.
Comment 6 Alexander Krotov 2018-06-18 06:30:10 EDT
> Every STA shall maintain a STA short retry count (SSRC) as well as a STA long retry count (SLRC), both of which shall take an initial value of 0.

It does not say that "retry counter shall be maintainer for every STA", it says that "STA shall maintain" it. There is one pair of counters for DCF, one pair of counters for each EDCAF (one per AC), and one pair of counters for each PDU.

EDCA queue counters, one per AC, are called QSRC[AC] and QLRC[AC]. They are increased on failure and reset on success. CW[AC] is reset to CWmin[AC] when one of those reaches retry limit value. See page 1380 of the 2016 edition of the standard.

DCF counters are called SSRC and SLRC.

PDUs have their own retry counters called SRC and LRC, which do not affect CW.

> I do not clearly find the mentioned sections in the revised 2016 version of the standard, which version are you using?

The references in interpretation request are for 1999 version. In 2016 version 10.3.3 "Random backoff time" corresponds to 9.2.4, and 10.3.4.4 "Recovery procedures and retransmit limits" corresponds to 9.2.5.3.

> Not sure I understand what you identified as a problem, could you clarify

NS-3 implements one pair of counters per remote station instead of one pair of counters per queue. As a result, it is impossible to reproduce a scenario mentioned in the interpretation of the standard: if you continuously fail to transmit frames, they are removed from the queue, but CW resets only once and then reaches CWmax and stays there.

> In any case, I think this patch should be pushed, but if we identify another mismatch between our implementation and the standard, I suggest to open a new item in the tracker.

Ok, maybe we should assign another number for this issue.
Comment 7 sebastien.deronne 2018-06-24 06:54:06 EDT
(In reply to Alexander Krotov from comment #6)
> > Every STA shall maintain a STA short retry count (SSRC) as well as a STA long retry count (SLRC), both of which shall take an initial value of 0.
> 
> It does not say that "retry counter shall be maintainer for every STA", it
> says that "STA shall maintain" it. There is one pair of counters for DCF,
> one pair of counters for each EDCAF (one per AC), and one pair of counters
> for each PDU.
> 
> EDCA queue counters, one per AC, are called QSRC[AC] and QLRC[AC]. They are
> increased on failure and reset on success. CW[AC] is reset to CWmin[AC] when
> one of those reaches retry limit value. See page 1380 of the 2016 edition of
> the standard.
> 
> DCF counters are called SSRC and SLRC.
> 
> PDUs have their own retry counters called SRC and LRC, which do not affect
> CW.
> 
> > I do not clearly find the mentioned sections in the revised 2016 version of the standard, which version are you using?
> 
> The references in interpretation request are for 1999 version. In 2016
> version 10.3.3 "Random backoff time" corresponds to 9.2.4, and 10.3.4.4
> "Recovery procedures and retransmit limits" corresponds to 9.2.5.3.
> 
> > Not sure I understand what you identified as a problem, could you clarify
> 
> NS-3 implements one pair of counters per remote station instead of one pair
> of counters per queue. As a result, it is impossible to reproduce a scenario
> mentioned in the interpretation of the standard: if you continuously fail to
> transmit frames, they are removed from the queue, but CW resets only once
> and then reaches CWmax and stays there.
> 
> > In any case, I think this patch should be pushed, but if we identify another mismatch between our implementation and the standard, I suggest to open a new item in the tracker.
> 
> Ok, maybe we should assign another number for this issue.

Alexander, please open a new thread for this, I do not plan to change this for the next release, but I still would like to push Muhammad's patch for ns-3.29 (I still need to fix regression failures).
Comment 8 sebastien.deronne 2018-06-26 13:33:10 EDT
(In reply to sebastien.deronne from comment #7)
> (In reply to Alexander Krotov from comment #6)
> > > Every STA shall maintain a STA short retry count (SSRC) as well as a STA long retry count (SLRC), both of which shall take an initial value of 0.
> > 
> > It does not say that "retry counter shall be maintainer for every STA", it
> > says that "STA shall maintain" it. There is one pair of counters for DCF,
> > one pair of counters for each EDCAF (one per AC), and one pair of counters
> > for each PDU.
> > 
> > EDCA queue counters, one per AC, are called QSRC[AC] and QLRC[AC]. They are
> > increased on failure and reset on success. CW[AC] is reset to CWmin[AC] when
> > one of those reaches retry limit value. See page 1380 of the 2016 edition of
> > the standard.
> > 
> > DCF counters are called SSRC and SLRC.
> > 
> > PDUs have their own retry counters called SRC and LRC, which do not affect
> > CW.
> > 
> > > I do not clearly find the mentioned sections in the revised 2016 version of the standard, which version are you using?
> > 
> > The references in interpretation request are for 1999 version. In 2016
> > version 10.3.3 "Random backoff time" corresponds to 9.2.4, and 10.3.4.4
> > "Recovery procedures and retransmit limits" corresponds to 9.2.5.3.
> > 
> > > Not sure I understand what you identified as a problem, could you clarify
> > 
> > NS-3 implements one pair of counters per remote station instead of one pair
> > of counters per queue. As a result, it is impossible to reproduce a scenario
> > mentioned in the interpretation of the standard: if you continuously fail to
> > transmit frames, they are removed from the queue, but CW resets only once
> > and then reaches CWmax and stays there.
> > 
> > > In any case, I think this patch should be pushed, but if we identify another mismatch between our implementation and the standard, I suggest to open a new item in the tracker.
> > 
> > Ok, maybe we should assign another number for this issue.
> 
> Alexander, please open a new thread for this, I do not plan to change this
> for the next release, but I still would like to push Muhammad's patch for
> ns-3.29 (I still need to fix regression failures).

I opened bug 2926 to track this.
Comment 9 sebastien.deronne 2018-06-26 13:33:58 EDT
(In reply to sebastien.deronne from comment #8)
> (In reply to sebastien.deronne from comment #7)
> > (In reply to Alexander Krotov from comment #6)
> > > > Every STA shall maintain a STA short retry count (SSRC) as well as a STA long retry count (SLRC), both of which shall take an initial value of 0.
> > > 
> > > It does not say that "retry counter shall be maintainer for every STA", it
> > > says that "STA shall maintain" it. There is one pair of counters for DCF,
> > > one pair of counters for each EDCAF (one per AC), and one pair of counters
> > > for each PDU.
> > > 
> > > EDCA queue counters, one per AC, are called QSRC[AC] and QLRC[AC]. They are
> > > increased on failure and reset on success. CW[AC] is reset to CWmin[AC] when
> > > one of those reaches retry limit value. See page 1380 of the 2016 edition of
> > > the standard.
> > > 
> > > DCF counters are called SSRC and SLRC.
> > > 
> > > PDUs have their own retry counters called SRC and LRC, which do not affect
> > > CW.
> > > 
> > > > I do not clearly find the mentioned sections in the revised 2016 version of the standard, which version are you using?
> > > 
> > > The references in interpretation request are for 1999 version. In 2016
> > > version 10.3.3 "Random backoff time" corresponds to 9.2.4, and 10.3.4.4
> > > "Recovery procedures and retransmit limits" corresponds to 9.2.5.3.
> > > 
> > > > Not sure I understand what you identified as a problem, could you clarify
> > > 
> > > NS-3 implements one pair of counters per remote station instead of one pair
> > > of counters per queue. As a result, it is impossible to reproduce a scenario
> > > mentioned in the interpretation of the standard: if you continuously fail to
> > > transmit frames, they are removed from the queue, but CW resets only once
> > > and then reaches CWmax and stays there.
> > > 
> > > > In any case, I think this patch should be pushed, but if we identify another mismatch between our implementation and the standard, I suggest to open a new item in the tracker.
> > > 
> > > Ok, maybe we should assign another number for this issue.
> > 
> > Alexander, please open a new thread for this, I do not plan to change this
> > for the next release, but I still would like to push Muhammad's patch for
> > ns-3.29 (I still need to fix regression failures).
> 
> I opened bug 2926 to track this.

I meant bug 2939 :-)
Comment 10 sebastien.deronne 2018-06-26 13:37:14 EDT
Final patch pushed in changeset 13663:ed75ddbe630d
Comment 11 Muhammad Iqbal CR 2018-06-30 15:12:24 EDT
Created attachment 3119 [details]
Fix SSRC and SLRC reset and increment behavior

I found out that I forgot to manipulate the SSRC and SLRC count correctly when transmission failed or success. The offending block of codes are at functions: ReportData{Ok,Failed} and ReportFinalDataFailed. Since the packet is not exist in the context of those functions, I have to pass it to the function for the RtsCtsThreshold check. I also changed power-rate-adaptation-test.cc accordingly and it still PASS, though I haven't check other tests yet.

I also considered merging the functions with infix of Data and Rts as one, similar to the previous patch with Need{Data,Rts}Transmission, but not doing it since RTS packets size is small anyway and it may break more tests.
Comment 12 Muhammad Iqbal CR 2018-06-30 15:14:25 EDT
Reopen for additional bug fix
Comment 13 sebastien.deronne 2018-07-01 04:54:28 EDT
(In reply to Muhammad Iqbal CR from comment #11)
> Created attachment 3119 [details]
> Fix SSRC and SLRC reset and increment behavior
> 
> I found out that I forgot to manipulate the SSRC and SLRC count correctly
> when transmission failed or success. The offending block of codes are at
> functions: ReportData{Ok,Failed} and ReportFinalDataFailed. Since the packet
> is not exist in the context of those functions, I have to pass it to the
> function for the RtsCtsThreshold check. I also changed
> power-rate-adaptation-test.cc accordingly and it still PASS, though I
> haven't check other tests yet.
> 
> I also considered merging the functions with infix of Data and Rts as one,
> similar to the previous patch with Need{Data,Rts}Transmission, but not doing
> it since RTS packets size is small anyway and it may break more tests.

Thanks, but can we not pass the size of the packet instead of the packet?
Comment 14 Muhammad Iqbal CR 2018-07-01 08:24:26 EDT
Created attachment 3120 [details]
Fix SSRC and SLRC reset and increment behavior (reworked to passing packet size)

(In reply to sebastien.deronne from comment #13)
> Thanks, but can we not pass the size of the packet instead of the packet?

Sure, here is the reworked patch using packet size. I have no problem passing the packet itself since (correct me if I'm wrong) it passes reference, so it won't have impact to the code performance.
Comment 15 sebastien.deronne 2018-07-03 07:00:18 EDT
Pushed Muhammad's patch and fixed regression.
Comment 16 sebastien.deronne 2018-07-03 07:00:34 EDT
(In reply to sebastien.deronne from comment #15)
> Pushed Muhammad's patch and fixed regression.

changeset 13671:7f5c9561dcdb