Bug 1571 – TCP zero-window and flow control window updates by the receiver


Summary:	TCP zero-window and flow control window updates by the receiver

Status:	RESOLVED FIXED

Product:	ns-3
Classification:	Unclassified
Component:	tcp
Version:	ns-3.23
Hardware:	All All

Importance:	P5 major
Assigned To:	Adrian S.-W. Tam

URL:
Keywords:

Depends on:
Blocks:

Reported:	2013-01-21 05:53 EST by Florian Tschorsch
Modified:	2015-10-27 05:51 EDT
CC List:	5 users

See Also:

Attachments
test case 1 (7.63 KB, application/empty) 2015-08-31 05:46 EDT, Mahdi
test case 2 (12.01 KB, application/empty) 2015-08-31 05:49 EDT, Mahdi
Replying correctly to window update segments (4.05 KB, patch) 2015-09-01 11:31 EDT, natale.patriciello

Description Florian Tschorsch 2013-01-21 05:53:51 EST

TCP uses flow control to avoid overloading a receiver. The receiver maintains a sliding window and advertises its size to the sender. When the receiver advertises a window size of 0, the sender stops sending and starts a so-called persist timer, which protects TCP from deadlock. When the timer expires, the sender transmits a small probe packet.

In modern TCP implementations, however, the receiver actively sends a window update as soon as it has freed a "significant" amount of buffer space, so the window reopens. This lets the sender resume quickly without waiting for the persist timer to expire. The timer is still necessary because window updates can be lost.
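
As a rough illustration of the receiver-side behavior described above, the following C++ sketch decides when an unsolicited window update is worth sending after the application has read data. It is not ns-3 code: `RxWindowState`, `OnApplicationRead`, and the threshold of min(MSS, half the buffer) are assumptions chosen for illustration (the threshold mirrors the receiver-side silly-window-avoidance rule in RFC 1122). Only the read path is modeled; arriving data is ignored.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical receiver-side state; none of these names come from ns-3.
struct RxWindowState
{
  uint32_t maxBufferSize;   // total receive buffer capacity in bytes
  uint32_t bufferedBytes;   // bytes received but not yet read by the application
  uint32_t lastAdvertised;  // window value carried in the last ACK that was sent
};

// Free space the receiver could advertise right now.
inline uint32_t AvailableWindow (const RxWindowState &s)
{
  assert (s.bufferedBytes <= s.maxBufferSize);
  return s.maxBufferSize - s.bufferedBytes;
}

// Called after the application reads `bytesRead` bytes from the socket.
// Returns true when the window has grown by a "significant" amount since
// the last advertisement, i.e. when an active window update should be sent.
inline bool OnApplicationRead (RxWindowState &s, uint32_t bytesRead, uint32_t mss)
{
  s.bufferedBytes -= std::min (bytesRead, s.bufferedBytes);

  // Assumed meaning of "significant": one MSS or half the buffer,
  // whichever is smaller.
  const uint32_t threshold = std::min (mss, s.maxBufferSize / 2);
  const uint32_t avail = AvailableWindow (s);

  if (avail > s.lastAdvertised && avail - s.lastAdvertised >= threshold)
    {
      s.lastAdvertised = avail;  // the update ACK would carry this value
      return true;
    }
  return false;
}
```

With a full 4000-byte buffer and an MSS of 1000 bytes, reading 100 bytes is not enough to trigger an update under this rule, while reading a further 1000 bytes is.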

In ns-3, active window updates are missing. After fixing Bug 1565 [1], the same test script (sws.cc) can be used here.

To implement this enhancement, I see two places in the code that need to be revised. First, after the application reads from the socket (TcpSocketBase::Recv) in a zero-window situation, the receiver should trigger an additional ACK carrying the updated window value. I think this can be a little tricky because TcpRxBuffer is quite separate from TcpSocketBase. Second, the sender needs to recognize that ACK as a window update. To the sender it can also look like a duplicate ACK, in which case it increments the dup_ack_counter and keeps waiting.
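
The sender-side distinction can be sketched as follows. This is a hypothetical classifier, not the ns-3 implementation: `SenderView`, `AckInfo`, and `ClassifyAck` are invented names, and sequence-number wraparound is ignored. It follows the duplicate-ACK definition in RFC 5681, under which an ACK whose advertised window differs from the previous one does not count as a duplicate.

```cpp
#include <cassert>
#include <cstdint>

enum class AckKind { NewAck, WindowUpdate, DuplicateAck, Other };

// Hypothetical snapshot of the sender's state before the ACK is processed.
struct SenderView
{
  uint32_t highestAck;     // greatest Ack number received so far (SND.UNA)
  uint32_t lastWindow;     // window advertised in the previous ACK
  uint32_t bytesInFlight;  // outstanding, unacknowledged bytes
};

// Hypothetical summary of the incoming segment.
struct AckInfo
{
  uint32_t ackNumber;
  uint32_t window;
  uint32_t payloadSize;
  bool synOrFin;
};

inline AckKind ClassifyAck (const SenderView &snd, const AckInfo &ack)
{
  if (ack.ackNumber > snd.highestAck)
    {
      return AckKind::NewAck;  // acknowledges new data
    }
  if (ack.ackNumber == snd.highestAck)
    {
      if (ack.window != snd.lastWindow)
        {
          // Same Ack number but a different window: a window update,
          // which must not be counted as a duplicate ACK.
          return AckKind::WindowUpdate;
        }
      if (snd.bytesInFlight > 0 && ack.payloadSize == 0 && !ack.synOrFin)
        {
          return AckKind::DuplicateAck;
        }
    }
  return AckKind::Other;  // old ACK or a segment with nothing new
}
```

In the zero-window case from this report, an ACK that repeats the last Ack number but raises the window from 0 to a non-zero value is classified as `WindowUpdate`, so the sender can resume instead of incrementing its duplicate-ACK counter.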

During this week I will try to provide a patch that covers the above enhancements. Advice on how to implement it, especially concerning the difficulties mentioned, is welcome.

--
[1] Bug 1565: https://www.nsnam.org/bugzilla/show_bug.cgi?id=1565

Comment 1 Mahdi 2015-08-31 05:32:59 EDT

In regular TCP communication, when the receiver cannot handle more data, it advertises a zero window. This causes the sender to pause and wait for the receiver to be ready again. When the receiver has freed its buffer and is ready again, it sends a window update packet to the sender, stating that it is ready to receive data again. However, what I am observing is that the receiver does not send this packet in ns-3. Furthermore, I produced this window update packet artificially and sent it to the sender. This caused the sender to stop probing the receiver's window, but it did not resume the transmission. It just stopped and became silent, which caused the simulation to end abruptly. I assume this is probably a bug.

Two test cases that reproduce these behaviors follow. Version 3.23 was used for both of them. The topology of the test cases is as follows:

n0 (sender) ------- n1 ------- n2 (receiver)

test case 1: The receiver does not send a window update actively when it is ready again. Only when it receives the zero window probes (ZWPs) and acknowledges them with a non-zero window does the sender realize that the receiver's window is open again. In the test scenario, in order to produce a zero window state, the window size is set to the size of one packet. The receiver goes to a zero window state and advertises a zero window at t=0.08 seconds. However, transmission resumes only after the sender sends the first ZWP six seconds later (default persist timeout = 6 s) and the receiver acknowledges it. Note that the Ack number of that acknowledgment is one byte higher than the Seq number of the ZWP. I don't know if this is correct. The pcap file of the sender is also included in the attachment.
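
The Ack number being one byte higher than the probe's Seq number is what one would expect if the probe carries a single payload byte and the receiver has room for it by then. A minimal sketch of that reasoning, assuming a one-byte probe (`ProbeReceiver`, `AckReply`, and `OnZeroWindowProbe` are hypothetical names, not ns-3 code):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a receiver reacting to a one-byte zero-window probe.
struct ProbeReceiver
{
  uint32_t rcvNxt;     // next in-order sequence number expected
  uint32_t freeSpace;  // free receive-buffer space in bytes
};

struct AckReply
{
  uint32_t ackNumber;
  uint32_t window;
};

// The probe is assumed to carry exactly one payload byte at `probeSeq`.
inline AckReply OnZeroWindowProbe (ProbeReceiver &r, uint32_t probeSeq)
{
  if (probeSeq == r.rcvNxt && r.freeSpace > 0)
    {
      // The byte fits: it is accepted, so the ACK covers probeSeq + 1.
      r.rcvNxt += 1;
      r.freeSpace -= 1;
    }
  // Otherwise the byte is dropped and the ACK repeats the old Ack number.
  return AckReply{r.rcvNxt, r.freeSpace};
}
```

Under this model, a probe that arrives while the window is still closed is answered with the unchanged Ack number and a zero window, while a probe that arrives after buffer space has been freed is answered with Seq + 1 and a non-zero window.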

test case 2: The sender also does not recognize this window update packet. In a scenario similar to the first test case, at an arbitrary time such as t=3 seconds, the intermediary node creates this packet artificially on behalf of the receiver and sends it to the sender. This packet is the same as the one the receiver used to advertise a zero window; in other words, their Ack numbers are the same. The only difference is that the new packet's window size is no longer zero. Please note that with this Ack number the receiver is not acknowledging the ZWP; it is acknowledging the data just before the ZWP. When the sender receives this packet, it must resume the connection. Instead, it stops sending ZWPs but does not resume the transmission. It just becomes silent, which causes the simulation to end abruptly. It seems that only a new Ack number will trigger the sender's transmission, which is wrong.
To create and send the update packet, I modified the regular point-to-point netdevice and installed it on the intermediary node. As in the first test case, the pcap file of the sender is included in the attachment.

Comment 2 Mahdi 2015-08-31 05:46:24 EDT

Created attachment 2129 [details]
test case 1

Comment 3 Mahdi 2015-08-31 05:49:03 EDT

Created attachment 2130 [details]
test case 2

Comment 4 natale.patriciello 2015-09-01 09:35:00 EDT

(In reply to Mahdi from comment #1)
> In a regular TCP communication when the receiver can not handle more data,
> it advertises a zero window. This causes the sender to pause and wait for
> the receiver to be ready again. When the receiver has freed its buffer and
> is ready again, it sends a window update packet to the sender, stating that
> it's ready again to receive data. However, what I am observing is that the
> receiver does not send that in ns3. Furthermore, I produced this window
> update packet artificially and sent to the sender. This caused the sender to
> stop probing the receiver's window but it didn't resume the transmission. It
> just stopped and became silent which caused the simulation to end abruptly.
> I assume this is probably a bug.

This situation no longer occurs in ns-3, given the resolution of bug 2159 (https://www.nsnam.org/bugzilla/show_bug.cgi?id=2159), which will be included in ns-3 after the .24 release. More details below.


> test case 1: The receiver does not send a window update actively when it is
> ready again. Only when it receives the zero window probes (ZWPs) and
> acknowledges them with a non-zero window, the sender realizes that the
> receiver's window is open again. In the test scenario, in order to produce a
> zero window state, window size is set to the size of one packet. The
> receiver goes to a zero window state and advertises a zero window at t=0.08
> seconds.

In this scenario, the receiver advertises RWND = 1 segment. The sender may only send bytes from SND.UNA up to SND.UNA + min(CWND, AWND); when AWND is the smaller of the two, this means it may send only the bytes from SND.UNA up to SND.UNA + AWND. Here that range is from 0 to 1 segment, so the sender transmits one segment and then stops. When the receiver gets the data, it sends back an ACK, and when that ACK reaches the sender, the sender sends out another segment.
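
The arithmetic can be checked with a small sketch. `BytesAllowed` is a hypothetical helper, not an ns-3 function; sequence-number wraparound is ignored and the 536-byte segment size is only an example value.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Number of additional bytes the sender may transmit right now, given that
// it may only send bytes in [SND.UNA, SND.UNA + min(CWND, AWND)).
inline uint32_t BytesAllowed (uint32_t sndUna, uint32_t sndNxt,
                              uint32_t cwnd, uint32_t awnd)
{
  assert (sndNxt >= sndUna);  // nothing is sent before the oldest unacked byte
  const uint32_t limit = sndUna + std::min (cwnd, awnd);
  return (sndNxt >= limit) ? 0 : limit - sndNxt;
}
```

With AWND equal to one 536-byte segment and a larger CWND, `BytesAllowed (0, 0, 5360, 536)` allows one segment, `BytesAllowed (0, 536, 5360, 536)` allows nothing until the ACK arrives, and `BytesAllowed (536, 536, 5360, 536)` allows the next segment once SND.UNA has advanced.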

You see this problem because the rWnd is currently calculated as

m_rxBuffer->MaxBufferSize () - m_rxBuffer->Size ()

while the correct value is

m_rxBuffer->MaxBufferSize ()


Check bug 2159 for more information.

However, I'll try to write a test case for a bad receiver (one that advertises a zero window). In my view, this bug will be fixed once the resolution of bug 2159 is included.

Comment 5 natale.patriciello 2015-09-01 11:31:12 EDT

Created attachment 2134 [details]
Replying correctly to window update segments

This patch moves some code in order to correctly manage window update segments. From the commit description:

Moved the entering of persistent state in DoForwardUp(), right before
the processing of the packet (if the rWnd is 0, schedule a persist
timeout). This fixes the entering of a zerowindow persistent state right after the first segment is received (e.g. window=0 in the received SYN-ACK).
   
Right after the processing, check if rWnd is different from 0 but
the persistent event is still active: in this case (triggered when an
update window packet is received) exit from the zerowindow persist state
and try to send data.
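
The control flow described in the commit message can be sketched roughly as follows. This is a simplified, hypothetical model and not the patch itself: `PersistSketch` and its members are invented names, a boolean stands in for the scheduled persist event, and a counter stands in for the attempt to send pending data.

```cpp
#include <cassert>
#include <cstdint>

// Simplified model of the sender-side handling of one incoming segment.
struct PersistSketch
{
  uint32_t rWnd = 0;            // last window advertised by the peer
  bool persistPending = false;  // stands in for an active persist event
  uint32_t sendAttempts = 0;    // stands in for "try to send data"

  void OnSegment (uint32_t advertisedWindow)
  {
    rWnd = advertisedWindow;

    // Before processing the segment: a zero window means entering the
    // persist state, even if this is the very first segment received.
    if (rWnd == 0 && !persistPending)
      {
        persistPending = true;  // a persist timeout would be scheduled here
      }

    // ... regular processing of the segment would happen here ...

    // After processing: a non-zero window while the persist event is still
    // active means a window update arrived, so leave the persist state.
    if (rWnd != 0 && persistPending)
      {
        persistPending = false;  // the persist timeout would be cancelled here
        ++sendAttempts;          // and pending data would be sent
      }
  }
};
```

Feeding this model a zero-window segment followed by a segment with a non-zero window leaves `persistPending` false and `sendAttempts` at 1, which is the resume behavior that test case 2 expected.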

Comment 6 natale.patriciello 2015-10-27 05:51:08 EDT

Fixed in 11713:2a16c0d9a62e and 11714:3606c0336d1d