- 1 Project Overview
- 2 Technical Approach
- 3 Milestones and Deliverables
- 4 Weekly Plan
- 4.1 Community Bonding Period (6 May - 26 May 2019)
- 4.2 Week 1 (27 May - 2 June 2019)
- 4.3 Week 2 (3 June - 9 June 2019)
- 4.4 Week 3 (10 June - 16 June 2019)
- 4.5 Week 4 (17 June - 23 June 2019)
- 4.6 Week 5 (24 June - 30 June 2019)
- 4.7 Week 6 (1 July - 7 July 2019)
- 4.8 Week 7 (8 July - 14 July 2019)
- 4.9 Week 8 (15 July - 21 July 2019)
- 4.10 Week 9 (22 July - 28 July 2019)
- 4.11 Week 10 (29 July - 4 August 2019)
- 4.12 Week 11 (5 August - 11 August 2019)
- 4.13 Week 12 (13 August - 19 August 2019)
- 5 Weekly Progress
- 6 GSoC Project Summary
Project Overview
- Project Name: TCP Testing and Alignment
- Student: Apoorva Bhargava
- Mentors: Tom Henderson, Vivek Jain
- Abstract: This project aims to align ns-3 TCP with the Linux kernel so that ns-3 has a more realistic TCP implementation, with the remaining differences properly documented. The TCP features to be aligned are ECN, RACK, SACK, DSACK, and Paced Chirping. To achieve this, ns-3 DCE (Direct Code Execution) will be used. DCE is a framework that allows users to run kernel-space protocols inside ns-3 without changing their source code.
- Code: to be added
- About Me: I am a 2nd-year postgraduate student at the National Institute of Technology Karnataka, India. During my first year of postgraduation, I worked on the implementation of TCP Jersey in ns-3 and the implementation of Cautious Adaptive RED in ns-3. Currently, I am working on the alignment and validation of ns-3 TCP with Linux TCP using the ns-3 Direct Code Execution (DCE) framework.
Technical Approach
Many features of ns-3 TCP are aligned with the RFCs but not with the TCP implementation in Linux. The main goal of this project is to have a Linux-like TCP implementation in ns-3 and to cover the main components of TCP Prague, i.e. ECN, DCTCP, RACK and Paced Chirping. I have already aligned the Slow Start and Congestion Avoidance phases of ns-3 New Reno with Linux TCP Reno and validated them using ns-3 DCE in a simple dumbbell topology with one sender and one receiver. Results can be found here. Currently, I am working on aligning the PRR recovery algorithm of ns-3 with Linux using ns-3 DCE, as PRR is the default recovery algorithm in the Linux kernel. Next, I will work on the alignment of ECN, followed by DCTCP, since DCTCP relies on the ECN feature of TCP. After that, I will align RACK, which also involves aligning SACK and DSACK because these two are its prerequisites. Lastly, Paced Chirping will be aligned. All the aligned features of ns-3 TCP will be validated using ns-3 DCE, and all the differences found will be documented. If time permits, I will also try to align the ns-3 implementations of TCP Cubic and TCP BBR with the Linux kernel.
Milestones and Deliverables
The entire GSoC period is divided into 2 phases, and the deliverables at the end of each phase will be as follows:
- Align Explicit Congestion Notification (ECN) implementation of ns-3 with Linux
- Align Data Center TCP (DCTCP) implementation of ns-3 with Linux
- Validate the alignment of ECN in dumbbell topology using ns-3 DCE
- Validate the alignment of DCTCP in data center topology using ns-3 DCE
- Align ns-3 implementation of Selective Acknowledgement (SACK) and Duplicate Selective Acknowledgement (DSACK) with Linux
- Align ns-3 implementation of Recent Acknowledgement (RACK) with Linux
- Validate the alignment of SACK, DSACK, and RACK using ns-3 DCE
- Align ns-3 implementation of Paced Chirping with Linux
- Validate the alignment of Paced Chirping using ns-3 DCE
Weekly Plan
Community Bonding Period (6 May - 26 May 2019)
- Contact the mentors and update weekly plan according to their suggestions
- Setting up a git repository for the project
- Get suggestions on the testing scenarios which will be used for the validation
- Work on the alignment of PRR, as it is the default recovery algorithm in Linux
Week 1 (27 May - 2 June 2019)
- Start understanding the codebase of ECN in ns-3 as well as in Linux
- Document the differences observed in ECN code of ns-3 and Linux
Week 2 (3 June - 9 June 2019)
- Align the differences found in ECN and validate the implementation in dumbbell topology using DCE
Week 3 (10 June - 16 June 2019)
- Validate Data Center TCP in data center topology using DCE
Week 4 (17 June - 23 June 2019)
- Start understanding the codebase of SACK in ns-3 as well as Linux
- Document the differences observed
Week 5 (24 June - 30 June 2019)
- Align the differences found in SACK and validate the implementation using DCE
Week 6 (1 July - 7 July 2019)
- Study the codebase of DSACK in ns-3 and Linux
- Document the differences observed
Week 7 (8 July - 14 July 2019)
- Align the differences found in DSACK and validate the implementation using DCE
Week 8 (15 July - 21 July 2019)
- Study the codebase of RACK in ns-3 as well as in Linux
- Document the differences.
Week 9 (22 July - 28 July 2019)
- Align the differences found in RACK and validate the implementation using DCE
Week 10 (29 July - 4 August 2019)
- Study the codebase of Paced Chirping in ns-3 as well as Linux
- Document the differences
Week 11 (5 August - 11 August 2019)
- Align the differences found in Paced Chirping and validate the implementation using DCE
Week 12 (13 August - 19 August 2019)
- Submit all the required patches
Weekly Progress
Community Bonding Period
- Communicated with the mentors over a call
- Set up my git repository
- Reported a bug by creating a merge request, which was merged into the ns-3 mainline.
- Added the code for a new TCP variant called TcpLinuxReno, which contains a Linux-like implementation of TCP New Reno. This work was done before GSoC started.
- The next feature I took up for alignment was Proportional Rate Reduction (PRR) for TCP. While checking the alignment of PRR in ns-3 with Linux, I observed an issue related to the handling of SACK blocks in the PRR algorithm and reported it.
- Submitted a merge request for the issue of handling SACK blocks with the PRR algorithm.
- Tested PRR in a single packet loss scenario using ns-3 DCE and aligned the observed differences. The differences arose because ns-3 handles everything in terms of bytes, whereas Linux works in terms of packets. According to RFC 6937, PRR calculates a variable called "sndcnt", which indicates exactly how many bytes should be sent in response to each ACK. The following equation is used to calculate sndcnt in ns-3:
sendCount = std::ceil (m_prrDelivered * tcb->m_ssThresh * 1.0 / m_recoveryFlightSize) - m_prrOut;
Since ns-3 works in bytes, the above equation did not yield a value of sendCount that is a multiple of the segment size. This was not the case in Linux, which calculates sndcnt in terms of packets.
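The byte-versus-packet difference can be illustrated with a minimal sketch. The two functions below are hypothetical helpers (they are not part of ns-3 or Linux) that evaluate only the proportional part of the RFC 6937 formula, CEIL(prr_delivered * ssthresh / RecoverFS) - prr_out, once with all quantities in bytes and once with all quantities in whole segments.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: proportional PRR send count with every quantity in
// bytes, using an integer ceiling in place of std::ceil on a double.
int64_t SendCountBytes(uint64_t prrDeliveredBytes, uint64_t ssThreshBytes,
                       uint64_t recoveryFlightSizeBytes, uint64_t prrOutBytes)
{
    assert(recoveryFlightSizeBytes > 0);
    uint64_t target = (prrDeliveredBytes * ssThreshBytes +
                       recoveryFlightSizeBytes - 1) / recoveryFlightSizeBytes;
    return static_cast<int64_t>(target) - static_cast<int64_t>(prrOutBytes);
}

// Hypothetical helper: the same formula with every quantity in segments,
// so the result is always a whole number of segments.
int64_t SendCountSegments(uint64_t prrDeliveredSegs, uint64_t ssThreshSegs,
                          uint64_t recoveryFlightSizeSegs, uint64_t prrOutSegs)
{
    assert(recoveryFlightSizeSegs > 0);
    uint64_t target = (prrDeliveredSegs * ssThreshSegs +
                       recoveryFlightSizeSegs - 1) / recoveryFlightSizeSegs;
    return static_cast<int64_t>(target) - static_cast<int64_t>(prrOutSegs);
}
```

Assuming 1000-byte segments, an ssthresh of 5 segments, a recovery flight size of 10 segments and one segment delivered so far, the byte-based form allows 500 bytes (half a segment), whereas the segment-based form allows 1 segment (1000 bytes).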
- Also documented the results. Another difference observed was that ns-3 and Linux update cwnd differently on exiting the recovery phase.
- Had a discussion with the mentors on how to handle differences between Linux and the pure RFC standards. It was agreed that ns-3 should have both implementations, but that the default should be the Linux one, as this will give users more realistic results.
- It was finalized with the mentors to have a Linux-like PRR implementation in ns-3 as a separate class.
- Implemented a new class for the Linux-like PRR implementation.
- Validated the implementation in a bulk traffic scenario and updated the results in the Google Doc. Following is the overlapping cwnd graph obtained in the bulk traffic scenario:
- Created a new repo which contains examples, patches and scripts. 
- Decided with mentors the following test scenarios for PRR:
- pipe < ssthresh
- pipe > ssthresh
- Tested a default initial congestion window of 10 segments against the existing test cases: 2 tests failed and 1 test crashed.
- Did unit testing of the alignment of the TcpLinuxPrrRecovery class with the Linux implementation of PRR.
- Added two test cases, pipe > ssthresh and pipe < ssthresh, and documented them.
- Looked into the implementation of the div_u64 () method in Linux and observed that it is an architecture-dependent division operation. On a 64-bit architecture a normal division is performed, whereas on a 32-bit architecture an optimized 64-bit division is performed. It was decided that implementing this method in ns-3 is not required.
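As a rough illustration of why no separate helper is needed, the sketch below shows a hypothetical C++ stand-in with the same shape as the kernel's div_u64 (a 64-bit dividend and a 32-bit divisor). The name DivU64 is an assumption made for this example and is not an ns-3 API. In standard C++ the built-in operator already performs 64-bit division on any target, with the compiler supplying the 32-bit fallback.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the kernel helper: plain 64-bit by 32-bit
// unsigned division, truncating toward zero.
inline uint64_t DivU64(uint64_t dividend, uint32_t divisor)
{
    assert(divisor != 0);
    return dividend / divisor;
}
```

For instance, a PRR-style dividend of 14 with a divisor of 10 yields 1, the same result as writing the division inline.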
- Fixed a few issues in the merge request that was submitted earlier. 
- Completed the alignment of TcpLinuxPrrRecovery class with Linux.
- Tested the alignment in a scenario where 20 packets are sent from the sender to the receiver and the 3rd, 5th, 6th, 7th and 8th packets are dropped.
- Discussed with mentors the limitation in ns-3 to support the Linux variant.
- Observed and fixed the following issue in the implementation of PRR:
In the scenario described above (20 packets sent, with the 3rd, 5th, 6th, 7th and 8th dropped), it was observed that on receiving a partial ACK for the 5th packet, 2 packets were ACKed (the 3rd and 4th), one of which (the 4th) had already been SACKed. In this case, the calculation of prr_delivered (total bytes delivered during recovery) should count only the 3rd packet and not the 4th, because the 4th was already counted in prr_delivered on receiving the dupack for the 3rd packet with a SACK block for the 4th packet. For this reason, I changed the data type of lastSackedBytes to int so that it can hold a negative value and subtract the already-SACKed bytes in the prr_delivered calculation.
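The accounting can be sketched as follows, in the spirit of RFC 6937's DeliveredData = change_in(snd.una) + change_in(SACKd). DeliveredBytes and its parameters are hypothetical names chosen for this example, and a 1000-byte segment size is assumed. The sketch is not the ns-3 code itself.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: bytes newly delivered by one ACK. The SACK delta is
// signed, so bytes that move from "SACKed" to "cumulatively ACKed" are not
// counted a second time.
int64_t DeliveredBytes(uint32_t cumulativelyAckedBytes,
                       uint32_t sackedBytesBefore,
                       uint32_t sackedBytesAfter)
{
    int64_t sackedDelta = static_cast<int64_t>(sackedBytesAfter) -
                          static_cast<int64_t>(sackedBytesBefore);
    int64_t delivered = static_cast<int64_t>(cumulativelyAckedBytes) + sackedDelta;
    assert(delivered >= 0);  // an ACK should never "un-deliver" data
    return delivered;
}
```

With these assumptions, the dupack carrying a SACK block for the 4th packet gives DeliveredBytes(0, 0, 1000), i.e. 1000 bytes. The later partial ACK covering the 3rd and 4th packets gives DeliveredBytes(2000, 1000, 0), which is again 1000 bytes rather than 2000.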
- Discussed the future plans for the project with the mentors. It was decided that unit testing and system testing of LinuxReno should be completed first, with proper documentation, and that a merge request should then be created for it. After LinuxReno is completed, the same should be done for LinuxPRR.
- Updated the Google Doc containing the details of the unit tests and system tests. 
- Tested PRR in a SACK disabled scenario.
- Reported an issue related to extra retransmission on receiving a partial ACK. 
- Discussed and decided on the design of the unit tests and system tests with the mentors. I am working on unit tests for the following two conditions:
- Growth in cwnd due to byte counting (rather than ACK counting) in the slow start and congestion avoidance phases.
- cwnd is maintained in segments in Linux, but in bytes in ns-3. Due to the rounding of cwnd in ns-3, TCP New Reno in ns-3 is less aggressive than Linux.
- Implemented the unit tests for the behavior of TcpLinuxReno in ns-3. The tests check that the slow start and congestion avoidance behavior matches the Linux behavior as follows:
1) in both the slow start and congestion avoidance phases, the presence or absence of delayed ACKs does not alter the window growth
2) in the congestion avoidance phase, the arithmetic for counting the number of segments ACKed and deciding when to increment the congestion window (i.e. following the Linux function tcp_cong_avoid_ai()) is followed.
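The segment-counting arithmetic checked by the second condition can be sketched as below. This is a simplified model written from my reading of the Linux function tcp_cong_avoid_ai(); the struct and function names are hypothetical, the window clamp is omitted, and it is not the TcpLinuxReno code itself.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical state: congestion window and ACK credit, both in segments.
struct CwndState
{
    uint32_t cwnd;     // congestion window
    uint32_t cwndCnt;  // ACKed segments accumulated towards the next increase
};

// Additive increase: grow cwnd by one segment for every w segments ACKed.
void CongestionAvoidanceAi(CwndState& s, uint32_t w, uint32_t acked)
{
    assert(w > 0);
    // Credits accumulated at a higher w are applied gently.
    if (s.cwndCnt >= w)
    {
        s.cwndCnt = 0;
        s.cwnd++;
    }
    s.cwndCnt += acked;
    if (s.cwndCnt >= w)
    {
        uint32_t delta = s.cwndCnt / w;
        s.cwndCnt -= delta * w;
        s.cwnd += delta;
    }
}
```

Starting from a window of 10 segments, ten ACKs covering one segment each and five delayed ACKs covering two segments each both end with a window of 11 segments. This is the property the first condition tests: window growth depends on the number of segments ACKed, not on the number of ACKs.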
- Tested the slow start and congestion avoidance phases with delayed ACK counts of 1 and 2.
- Also tested the slow start and congestion avoidance phases with a smaller segment size, i.e. 524 bytes, and a larger segment size, i.e. 1500 bytes.
- Worked on the system testing, for which the following configuration was decided:
Topology: Dumbbell (One sender and one receiver)
AQM at the router: FIFO queue disc and PIE queue disc
Bottleneck link bandwidth: 1 Mbps
Edge link bandwidth: 10 Mbps
Initial cwnd: 10 segments
RTT: 10 to 100 ms
Later, however, it was decided with the mentors to put the system testing on hold for now and to start testing other features of TCP such as PRR, SACK, and CUBIC.
- Merged two separate test suites for the slow start and congestion avoidance phase of Linux Reno into a single test suite. 
- Added comments to the source code. 
- Mentors helped me improve the level of logging and commenting in the tests.
- Added documentation for Linux Reno to the tcp.rst file.
- Added an example for Linux Reno to the examples/tcp directory.
- Created a new branch named "rate-sample-prr", which contains the rate sample code rebased onto the latest ns-3-dev with the Linux PRR code added on top of it. The reason for using the rate sample code is that the current implementation of PRR uses lastSackedBytes to calculate the m_prrDelivered variable, and it was observed that in some scenarios, such as the arrival of a partial ACK, lastSackedBytes came out negative. Using the rate sample avoids negative values and makes the calculation of m_prrDelivered more straightforward.
- Created a merge request for the Linux Reno congestion model in ns-3-dev.
- Fixed some issues in Linux PRR:
1) Resolved the issue of extra retransmissions in ns-3, which had already been reported on ns-3-dev.
2) Also, there was a difference in the behavior of ns-3 and Linux on exiting the recovery state. Linux increases the congestion window after exiting the recovery state whereas ns-3 does not. Tried resolving this issue, but the fix needs further improvement.
- Found a bug related to timeouts in ns-3. In ns-3, the RTO is reset only for the first partial ACK. So in a scenario where the TCP sender receives multiple partial ACKs, ns-3 resets the RTO only for the first one, and a timeout becomes possible. After fixing this bug, our validation results for Linux Reno overlapped more closely. Following is the overlapping cwnd graph for Linux Reno after the bug fix.
- Fixed all the PRR-related issues and validated Linux PRR using ns-3 DCE. The cwnd traces of ns-3 Linux PRR were validated against the cwnd traces of DCE Linux PRR, and the following cwnd graph was obtained.
- Worked on more PRR-related issues.
- Completed the documentation of Linux PRR.
- Started working on the alignment of ns-3 TCP CUBIC with Linux
- Created a final patch for Linux PRR. 
- Following results were obtained for the validation of ns-3 TCP CUBIC against Linux CUBIC using ns-3 DCE. 
- Following cwnd graph was obtained after aligning the observed differences.
- Changing the default value of beta and setting the delayed acknowledgment to 2 segments gave better results.
- Tried changing the packet size to 1500 bytes, but with the ns-3 stack on the receiver side, the Linux stack sends packets of only 578 bytes. With the Linux stack on the receiver side, the Linux sender sets the delayed acknowledgment dynamically.
- Improve documentation
- Code review from mentors
- Prepare a final report
GSoC Project Summary
Link to Phase 1 tasks
Implementation and Testing of Linux Reno in ns-3: https://gitlab.com/apoorvabhargava/ns-3-dev/tree/LinuxRenoMergeRequest
Link to Phase 2 tasks
Implementation and Testing of Linux PRR in ns-3: https://gitlab.com/apoorvabhargava/ns-3-dev/tree/LinuxPRRMergeRequest
Link to Phase 3 tasks
Testing and Aligning ns-3 CUBIC with Linux: https://gitlab.com/apoorvabhargava/tcp_testing_and_alignment