Bug 2041 - TCP RTO needs unit tests
TCP RTO needs unit tests
Status: RESOLVED FIXED
Product: ns-3
Classification: Unclassified
Component: internet
ns-3-dev
All All
: P3 enhancement
Assigned To: natale.patriciello
:
Depends on: 1405
Blocks:
  Show dependency treegraph
 
Reported: 2015-01-21 00:49 EST by Tommaso Pecorella
Modified: 2015-10-27 05:50 EDT (History)
4 users (show)

See Also:


Attachments
Bug fix (?) (9.16 KB, patch)
2015-01-22 03:43 EST, natale.patriciello
Details | Diff
Patch + doxygen + rescan bindings (18.28 KB, patch)
2015-01-23 05:30 EST, Tommaso Pecorella
Details | Diff
fixing the doubling rto for non retransmission packet (6.25 KB, patch)
2015-03-02 21:27 EST, Nico Saputro
Details | Diff
Removing the doubling of rto when segment is not a retransmission (1.08 KB, patch)
2015-08-06 07:08 EDT, natale.patriciello
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Tommaso Pecorella 2015-01-21 00:49:23 EST
The discussion is here:
https://groups.google.com/forum/#!topic/ns-3-users/YR78eyVkg08

T.
Comment 1 natale.patriciello 2015-01-22 03:43:40 EST
Created attachment 1947 [details]
Bug fix (?)

The patch does two thing, while (IMHO) be ready for the inclusion in the next release, as it doesn't change the logic, but only where doing things: 

- Add a timestampEnabled attribute to RttEstimator. If true, it will check the timestamp option inside the header. If it is present, the ts will be used as source of the rtt. If not (or if timestampEnabled is false) the old method of rttHistory is used. Meanwhile, m_rttHistory is updated, even when timestamp is used, as a graceful fallback.

- Change the API, using as input parameter the TcpHeader directly, in order to use the TimestampOption.

Side effect:
This patch will fasten a bit more the TCP and RTT calculation, while we want to split this relationship.
Comment 2 natale.patriciello 2015-01-22 04:04:37 EST
mmm, doxygen documentation missing, and also missing update of comments.. if it is ok, I'll make the v2 with documentation revised.
Comment 3 Tommaso Pecorella 2015-01-23 05:30:01 EST
Created attachment 1949 [details]
Patch + doxygen + rescan bindings

+1 from my side
Comment 4 Tom Henderson 2015-01-29 12:23:30 EST
The current patch for this can be found in bug 1405.

However, that patch does not address testing of RTO.

I'd like to also expose the rtt estimate and rtt variation estimate as traced values of the TcpSocketBase class.

So from my perspective, this can be closed by accepting the current patch to bug 1405 and then by writing a test suite that tests that RTO is getting correctly updated with and without timestamps usage, which probably means writing a controlled TCP exchange with forced losses.
Comment 5 Tommaso Pecorella 2015-01-30 17:13:40 EST
The original bug has been fixed with Bug 1405.

This bug name and classification are changed to reflect the new needs:
1) unit testing, and
2) rtt variation estimate traced value (rtt estimate is already a traced value).


(In reply to Tom Henderson from comment #4)
> The current patch for this can be found in bug 1405.
> 
> However, that patch does not address testing of RTO.
> 
> I'd like to also expose the rtt estimate and rtt variation estimate as
> traced values of the TcpSocketBase class.
> 
> So from my perspective, this can be closed by accepting the current patch to
> bug 1405 and then by writing a test suite that tests that RTO is getting
> correctly updated with and without timestamps usage, which probably means
> writing a controlled TCP exchange with forced losses.
Comment 6 Nico Saputro 2015-02-28 19:41:12 EST
Hi,
I am not sure whether I need to create a new bug or discuss it in here.

I saw that the patch has been implemented in ns3.22. 
But I found two issues here :
1. The isRetransmission in the SendDataPacket function is always true since (seq == m_txBuffer->HeadSequence ()) is always true (see the DoRetransmit function)
2. The RTO timer is doubled even if the packet is not a re-transmission packet. 
    I used the basic on-off application to send a packet every 15secs. See the result below.
    I think this is because the m_retxEvent for a new packet is not set and therefore it is always pass
    If isRetransmission can work properly, we may also use that for the RTO calculation for the non-retransmission packet.

Thanks,
Nico

20.4037s 14 OnOffApplication:SendPacket(): [INFO ]  Tx 512 10.1.1.1 Uid 15887 Sequence Number: 0 Time 20.4037
20.4037s 15 20.4037 [node 15] TcpSocketBase:Send(0xcc1b50, 0xccd710)
20.4037s 15 20.4037 [node 15] TcpSocketBase:Send(): [LOGIC] txBufSize=512 state ESTABLISHED
20.4037s 15 20.4037 [node 15] TcpSocketBase:SendPendingData(0xcc1b50, 1)
20.4037s 15 20.4037 [node 15] TcpSocketBase:AvailableWindow()
20.4037s 15 20.4037 [node 15] TcpSocketBase:UnAckDataCount(0xcc1b50)
20.4037s 15 20.4037 [node 15] TcpSocketBase:AvailableWindow(): [LOGIC] UnAckCount=0, Win=536
20.4037s 15 20.4037 [node 15] TcpSocketBase:SendPendingData(): [LOGIC] TcpSocketBase 0xcc1b50 SendPendingData w 536 rxwin 131072 segsize 536 nextTxSeq 1 highestRxAck 1 pd->Size 512 pd->SFS 512
20.4037s 15 20.4037 [node 15] TcpSocketBase:SendDataPacket(0xcc1b50, 1, 536, 1)
20.4037s 15 20.4037 [node 15] TcpSocketBase:AddOptions(0xcc1b50, 49153 > 9100 [ ACK ] Seq=1 Ack=1 Win=32768)
20.4037s 15 20.4037 [node 15] TcpSocketBase:AddOptionTimestamp(0xcc1b50, 49153 > 9100 [ ACK ] Seq=1 Ack=1 Win=32768)
20.4037s 15 20.4037 [node 15] TcpSocketBase:AddOptionTimestamp(): [INFO ] 15 Add option TS, ts=20403 echo=9416
20.4037s 15 20.4037 [node 15] TcpSocketBase:SendDataPacket(): [INFO ] 20.403715 old rto +1001000000.0ns new rto +2002000000.0nsacc rto 2002000# rto 1
20.4037s 15 20.4037 [node 15] TcpSocketBase:SendDataPacket(): [LOGIC] 0xcc1b50 SendDataPacket Schedule ReTxTimeout at time 20.4037 to expire at time 22.4057
20.4037s 15 20.4037 [node 15] TcpSocketBase:SendDataPacket(): [LOGIC] Send packet via TcpL4Protocol with flags 0x10
20.4037s 15 20.4037 [node 15] TcpSocketBase:SendPendingData(): [LOGIC] SendPendingData sent 1 packets
...
35.4073s 14 OnOffApplication:SendPacket(): [INFO ]  Tx 512 10.1.1.1 Uid 27272 Sequence Number: 1 Time 35.4073
35.4073s 15 35.4073 [node 15] TcpSocketBase:Send(0xcc1b50, 0xd26ed0)
35.4073s 15 35.4073 [node 15] TcpSocketBase:Send(): [LOGIC] txBufSize=512 state ESTABLISHED
35.4073s 15 35.4073 [node 15] TcpSocketBase:SendPendingData(0xcc1b50, 1)
35.4073s 15 35.4073 [node 15] TcpSocketBase:AvailableWindow()
35.4073s 15 35.4073 [node 15] TcpSocketBase:UnAckDataCount(0xcc1b50)
35.4073s 15 35.4073 [node 15] TcpSocketBase:AvailableWindow(): [LOGIC] UnAckCount=0, Win=1072
35.4073s 15 35.4073 [node 15] TcpSocketBase:SendPendingData(): [LOGIC] TcpSocketBase 0xcc1b50 SendPendingData w 1072 rxwin 130560 segsize 536 nextTxSeq 513 highestRxAck 513 pd->Size 512 pd->SFS 512
35.4073s 15 35.4073 [node 15] TcpSocketBase:SendDataPacket(0xcc1b50, 513, 536, 1)
35.4073s 15 35.4073 [node 15] TcpSocketBase:AddOptions(0xcc1b50, 49153 > 9100 [ ACK ] Seq=513 Ack=1 Win=32768)
35.4073s 15 35.4073 [node 15] TcpSocketBase:AddOptionTimestamp(0xcc1b50, 49153 > 9100 [ ACK ] Seq=513 Ack=1 Win=32768)
35.4073s 15 35.4073 [node 15] TcpSocketBase:AddOptionTimestamp(): [INFO ] 15 Add option TS, ts=35407 echo=20454
35.4073s 15 35.4073 [node 15] TcpSocketBase:SendDataPacket(): [INFO ] 35.407315 old rto +1001000000.0ns new rto +2002000000.0ns acc rto 4004000# rto 2
35.4073s 15 35.4073 [node 15] TcpSocketBase:SendDataPacket(): [LOGIC] 0xcc1b50 SendDataPacket Schedule ReTxTimeout at time 35.4073 to expire at time 37.4093
35.4073s 15 35.4073 [node 15] TcpSocketBase:SendDataPacket(): [LOGIC] Send packet via TcpL4Protocol with flags 0x10
35.4073s 15 35.4073 [node 15] TcpSocketBase:SendPendingData(): [LOGIC] SendPendingData sent 1 packets
Comment 7 natale.patriciello 2015-03-01 05:18:22 EST
(In reply to Nico Saputro from comment #6)
> Hi,
> I am not sure whether I need to create a new bug or discuss it in here.

Hi, before of all let me thank you, because I always love people who go in details and share their experience. Of course, I think this is the right place to discuss such thing.

 
> 1. The isRetransmission in the SendDataPacket function is always true since
> (seq == m_txBuffer->HeadSequence ()) is always true (see the DoRetransmit
> function)

I'm not sure if I understand.. clearly, when DoRetransmit calls SendDataPacket, it should be a retansmission.. so isRetransmission should be true.
Anyway, I have run tcp-variants-comparison with a debug print, in order to see how many times that variable is true. I got a long long debug file, full of "is false". 

> 2. The RTO timer is doubled even if the packet is not a re-transmission
> packet. 
> I used the basic on-off application to send a packet every 15secs. See
> the result below.
>     I think this is because the m_retxEvent for a new packet is not set and
> therefore it is always pass
>     If isRetransmission can work properly, we may also use that for the RTO
> calculation for the non-retransmission packet.

Initially, I though you were wrong somewhere, because the code looks right.. or in other words, it set/cancel retx on right time. But then I understand the problem :)

You are scheduling packets with 15s of delay, so:

0    Send
0.1  NewAck

15   Send
15.1 NewAck

but the retx event is set in the NewAck method, and not in Send one. Clearly a bug, because the next NewAck is 15s forward in time, and retx event is expired for sure.

Bad news: patching this behavior is not simple as it may seems (or maybe it's only my brain which is slow sunday morning :D). I need to read the RFC..

Thank you for reporting!

Nat
Comment 8 Nico Saputro 2015-03-02 21:27:53 EST
Created attachment 1984 [details]
fixing the doubling rto for non retransmission packet
Comment 9 Nico Saputro 2015-03-02 21:41:26 EST
Hi Nat,

The minRTO is 1s according to the RFC, so I expect that when I send the packet, the RTO timer is initially set to 1s (or higher if the RTT calculation is higher than the minRTO), then when it is expired, the timer is doubled. In my experiment with 15s delay, the RTO timer is initially set to 2s (or higher if the RTT calculation is higher than the minRTO).
Therefore, I used the isRetransmission to fix the bug, but it did not work. The isRetransmission is never false in my experiment with 15s delay.

After you mentioned about tcp-variant-comparison, I understand now that this is due to the frequency of the sending packet. Tcp-variant comparison uses bulk transfer so it is possible that there is queue in the m_txbuffer and isRetransmission becomes false. In my case, there is no queue in the m_txbuffer since only one packet is present every 15s.

The main issue is because the m_retxEvent is scheduled in the NewAck and when the m_txbuffer is empty this event is cancelled immediately in the NewAck as well. 
When a new packet arrives, there is no m_retxEvent scheduled for this new packet and then the m_rto is doubled.
I agree with you that moving it from the NewAck is not simple. Therefore I made a quick fix for this problem.
Basically this patch moves the isRetransmission from the SendDataPacket.
In DoRetransmit, m_isRetransmission is set to true, and in the SendDataPacket, only when m_isRetransmission == true, the m_rto is doubled and then m_isRetransmission is set to false.

This patch works well in my case, but I don't have a chance to do thorough test yet, such as using the bulk transfer. 

Nico
Comment 10 natale.patriciello 2015-08-06 07:08:00 EDT
Created attachment 2108 [details]
Removing the doubling of rto when segment is not a retransmission

This patch fixes the problem (at least, under my test case, but I think it is correct for every cases) without adding an extra member variable.
Comment 11 natale.patriciello 2015-10-27 05:50:38 EDT
Fixed in 11712:3a4c022fac53. Reopen if the problem persist (but I hope this isn't the case :P)