From Nsnam
Revision as of 22:49, 7 June 2015 by Teto (Talk | contribs) (On the MPTCP subject)

Return to GSoC 2015 Accepted Projects page.

Project overview

  • Project: Implementation of MPTCP (Multipath TCP) + Implementation of per-node clocks
  • Student: Matthieu Coudron
  • Mentors: Tom Henderson, Vedran Miletic, Tommaso Pecorella, Peter Barnes
  • Code: (check out the different branches)
  • About me: I am a PhD student working on multipath communications. I have a background in network and system security.

On the MPTCP subject

Thanks to last year's TCP option GSoC project, it is possible to implement Multipath TCP (MPTCP) in a clean way. MPTCP is an extension to TCP that is increasingly popular: it is used by Apple's voice recognition system Siri, embedded in OS X Yosemite and in some Citrix products, and will soon be embedded in Proximus products. MPTCP is available in some (possibly out-of-tree) kernels (Linux, Mac OS, FreeBSD) and works even with adversarial middleboxes (contrary to SCTP), which was an important challenge. The second challenge is still pending: how to make the best use of path diversity? How to do better than TCP without being more aggressive than TCP at bottlenecks? There is no solution in the literature that answers this in a robust way. I hope that being able to run MPTCP in a simulator will foster research on that particular subject, since doing it with kernel code or creating a multihomed (3G/wired) setup can be complex (the MPTCP kernel code is being refactored, and implementation is quite time consuming).


Here is a diff file of the beginning of one incomplete MPTCP implementation based on ns-3.19. It was generated with this kind of command (I just discovered the filterdiff utility, pretty cool):

    diff -ENwbur ~/ns3off/src/internet src/internet > test.patch
    cat test.patch | filterdiff -p0 -X toexclude.txt > final.patch

To help reviewers focus on the architecture, I removed some unnecessary files (but this is still a huge diff), and I have added some comments below about MPTCP and the code. To sum up, the main files to check are mptcp-socket-base.* and mptcp-subflow.*, along with the modifications made to tcp-socket-base.*.

1/ First of all, MPTCP doesn't require applications to be modified, and neither does this implementation: it just appears as another TCP variant, so the MPTCP socket works with all the code that can work with a TcpSocket.

2/ MPTCP is a TCP extension; all the signaling is done through TCP options.

3/ The application sees a *fake* TCP socket, usually called the "meta socket". This socket then dispatches the data to send among the different TCP connections of the MPTCP connection (usually called subflows).

TcpSocketBase
|-MpTcpSocketBase (this is the "meta socket", a logical socket that dispatches the send buffer between the different MpTcpSubflows, and reorders the segments received on the different subflows for the application to see)
|
|-MpTcpSubflow (this is a copy/paste of TcpNewReno except that it handles MPTCP logic, adding/popping options when necessary)

4/ The standard demands that MPTCP not be greedier than TCP, so there are congestion control algorithms specific to MPTCP. In the diff you will only find mptcp-cc-uncoupled. The way it is implemented, you subclass both MpTcpSocketBase and MpTcpSubflow into MpTcpSocketBaseUncoupled and MpTcpSubflowUncoupled.
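To illustrate what "not more greedy than TCP" means in practice, here is a minimal sketch of the coupled window increase of LIA as specified in RFC 6356, next to the uncoupled (plain TCP) increase. It is not taken from the diff: SubflowState, TotalCwnd, LiaAlpha and LiaIncrease are hypothetical names, windows are plain doubles in bytes, and only the congestion-avoidance increase is covered.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-subflow state (not the classes from the diff).
struct SubflowState {
  double cwndBytes;   // congestion window, in bytes
  double rttSeconds;  // round-trip time estimate, in seconds
  double mssBytes;    // maximum segment size, in bytes
};

// Sum of the congestion windows of all subflows.
double TotalCwnd(const std::vector<SubflowState>& flows) {
  double total = 0.0;
  for (const SubflowState& f : flows) total += f.cwndBytes;
  return total;
}

// Aggressiveness factor from RFC 6356:
// alpha = cwnd_total * max_i(cwnd_i / rtt_i^2) / (sum_i(cwnd_i / rtt_i))^2
double LiaAlpha(const std::vector<SubflowState>& flows) {
  assert(!flows.empty());
  double best = 0.0;
  double sum = 0.0;
  for (const SubflowState& f : flows) {
    assert(f.cwndBytes > 0.0 && f.rttSeconds > 0.0);
    best = std::max(best, f.cwndBytes / (f.rttSeconds * f.rttSeconds));
    sum += f.cwndBytes / f.rttSeconds;
  }
  return TotalCwnd(flows) * best / (sum * sum);
}

// Window increase (in bytes) for subflow i when bytesAcked bytes are acked.
// The result never exceeds what a regular TCP flow would gain.
double LiaIncrease(const std::vector<SubflowState>& flows, std::size_t i,
                   double bytesAcked) {
  assert(i < flows.size());
  const SubflowState& f = flows[i];
  double coupled = LiaAlpha(flows) * bytesAcked * f.mssBytes / TotalCwnd(flows);
  double uncoupled = bytesAcked * f.mssBytes / f.cwndBytes;
  return std::min(coupled, uncoupled);
}
```

With a single subflow alpha is 1 and the increase equals the regular TCP one; with two identical subflows alpha is 0.5, so each subflow grows at a quarter of the uncoupled rate.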

5/ The path management files/classes are not used in the implementation, and they do not have the same meaning as in the Linux kernel: these classes are meant to assign the unique IDs for each possible subflow (as required by the standard). In the Linux kernel, path management modules implement the policies that decide whether or not the meta socket should establish a new subflow.

6/ MPTCP has a global sequence number (to reassemble packets in order at the receiver) that is conveyed through a TCP option. Every TCP sequence number should be mapped to an "MPTCP sequence number". There are strict rules concerning these mappings: once a mapping is sent, it can't be changed, and the data has to be sent, and resent if needed, even if it was already received on another subflow. The mappings are responsible for much of the complexity of the code: a mapping can't be removed until all of its data has been received, and the data can't be passed to the upper layer before then because there may be a checksum covering the whole mapping.
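The mapping rules can be pictured with a small table that translates subflow sequence numbers into MPTCP (data) sequence numbers and refuses to alter a range once it has been mapped. This is only a sketch under simplifying assumptions: DssMapping and MappingTable are hypothetical names (not classes from the diff), sequence number wraparound is ignored, and checksums and mapping removal are left out.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One mapping between a subflow byte range and the connection-level stream.
struct DssMapping {
  std::uint64_t dsn;     // first data (MPTCP-level) sequence number
  std::uint32_t ssn;     // first subflow sequence number
  std::uint16_t length;  // number of bytes covered by the mapping

  bool Covers(std::uint32_t s) const { return s >= ssn && s - ssn < length; }
};

class MappingTable {
 public:
  // Mappings are immutable once announced: a new mapping that overlaps an
  // existing subflow range is rejected instead of replacing it.
  bool Add(const DssMapping& m) {
    const std::uint64_t mEnd = static_cast<std::uint64_t>(m.ssn) + m.length;
    for (const DssMapping& e : m_mappings) {
      const std::uint64_t eEnd = static_cast<std::uint64_t>(e.ssn) + e.length;
      const bool disjoint = (mEnd <= e.ssn) || (eEnd <= m.ssn);
      if (!disjoint) return false;
    }
    m_mappings.push_back(m);
    return true;
  }

  // Translate a subflow sequence number; returns false if no mapping covers it.
  bool ToDsn(std::uint32_t s, std::uint64_t& dsn) const {
    for (const DssMapping& e : m_mappings) {
      if (e.Covers(s)) {
        dsn = e.dsn + (s - e.ssn);
        return true;
      }
    }
    return false;
  }

 private:
  std::vector<DssMapping> m_mappings;
};
```

A linear scan is enough for a sketch; an implementation that keeps many mappings alive would more likely keep them sorted by subflow sequence number.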

Features required from ns3:

  - Need to decorrelate the TCP sender's unacknowledged head (SND.UNA) from TcpTxBuffer
  - MPTCP demultiplexing is not done on the 5-tuple but on the key embedded in the MPTCP capable option
  - It should be possible to set a memory size for the meta buffer and to share this space with subflows, i.e. it should be possible for TCP buffers
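As an illustration of the demultiplexing requirement, here is a minimal sketch of a lookup keyed on an MPTCP identifier instead of the 5-tuple. In RFC 6824 a joining subflow carries a 32-bit token derived from the key by hashing; that derivation is deliberately left out, and TokenDemux and MetaSocket are hypothetical names rather than classes from the diff.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Stand-in for the meta socket; only counts its subflows here.
struct MetaSocket {
  int subflowCount;
};

class TokenDemux {
 public:
  // Register a connection under its token; fails if the token is taken.
  bool Register(std::uint32_t token, MetaSocket* meta) {
    return m_byToken.emplace(token, meta).second;
  }

  // Find the connection a token refers to, or nullptr if it is unknown.
  MetaSocket* Lookup(std::uint32_t token) const {
    auto it = m_byToken.find(token);
    return it == m_byToken.end() ? nullptr : it->second;
  }

  // A new subflow is attached to the connection named by its token,
  // whatever addresses and ports it arrived with.
  bool AttachJoin(std::uint32_t token) {
    MetaSocket* meta = Lookup(token);
    if (meta == nullptr) return false;
    ++meta->subflowCount;
    return true;
  }

 private:
  std::map<std::uint32_t, MetaSocket*> m_byToken;
};
```

The table stores raw pointers only to keep the sketch short; ownership of the sockets is assumed to live elsewhere.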

One critical aspect of multipath protocols is the reordering problem, which usually requires larger buffers to get the same performance as single-path protocols. In my opinion, the main challenge in simulating MPTCP correctly is to mimic the Linux buffer mechanisms.

NB: MPTCP has many mechanisms to deal with middleboxes and the like, but I don't believe they are interesting to have in ns3, which should be used to analyze the algorithmic part. Thus none of the failure mechanisms are implemented (e.g. fallback to TCP in case the server is not MPTCP compliant).

On the per node clock

I would like to start implementing per-node clocks to be able to simulate time distribution protocols. Right now all nodes are perfectly synchronized in ns3 (they share the simulator clock). My goal is to be able to run ntpd in ns3-dce over ns3 nodes with drifting clocks. Time distribution experiments are hard to do in practice (do you control 2 or more stratum 1 NTP servers, and the traffic between them?), so I believe it makes sense, and I know of no simulator that does it. This proposal is a follow-up to my email to the dev mailing list:

Expected deliverables

While working on the previous projects I also intend to send patches to improve some parts of the ns3 code (such as the waf upgrade I sent last week). I plan to work on the MPTCP code during the first half and then on the per-node clock integration. The MPTCP code has priority, though, since I believe it is the most awaited feature.


I intend to validate MPTCP against DCE. This may require some synergy with the TCP validation project.

Week 1 - Step 1

  • Modify tcp-option.h to support MPTCP
  • (de)Serialization of the numerous MPTCP options
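As an example of what (de)serializing one of these options involves, here is a sketch for the MP_CAPABLE option as it appears in a SYN, following the RFC 6824 layout (kind 30, length 12, 4-bit subtype 0 and 4-bit version, 8 flag bits, 64-bit sender key). It works on a plain byte array rather than on the ns-3 TcpOption and buffer classes, and the MpCapable struct and function names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::uint8_t kMptcpKind = 30;          // TCP option kind for MPTCP
constexpr std::uint8_t kMpCapableSubtype = 0;    // MP_CAPABLE subtype
constexpr std::size_t kMpCapableSynLength = 12;  // length when sent in a SYN

struct MpCapable {
  std::uint8_t version;     // 4-bit protocol version
  std::uint8_t flags;       // flag bits A..H
  std::uint64_t senderKey;  // 64-bit key of the sender
};

using MpCapableBytes = std::array<std::uint8_t, kMpCapableSynLength>;

// Write the option in network byte order.
MpCapableBytes Serialize(const MpCapable& opt) {
  MpCapableBytes out{};
  out[0] = kMptcpKind;
  out[1] = static_cast<std::uint8_t>(kMpCapableSynLength);
  out[2] = static_cast<std::uint8_t>((kMpCapableSubtype << 4) |
                                     (opt.version & 0x0F));
  out[3] = opt.flags;
  for (int i = 0; i < 8; ++i) {
    out[4 + i] = static_cast<std::uint8_t>(opt.senderKey >> (56 - 8 * i));
  }
  return out;
}

// Parse the option; returns false if kind, length or subtype do not match.
bool Deserialize(const MpCapableBytes& in, MpCapable& opt) {
  if (in[0] != kMptcpKind || in[1] != kMpCapableSynLength ||
      (in[2] >> 4) != kMpCapableSubtype) {
    return false;
  }
  opt.version = in[2] & 0x0F;
  opt.flags = in[3];
  opt.senderKey = 0;
  for (int i = 0; i < 8; ++i) {
    opt.senderKey = (opt.senderKey << 8) | in[4 + i];
  }
  return true;
}
```

A round trip through Serialize and Deserialize should give back the same version, flags and key, which is the kind of property a test suite for the options can check.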

Week 2 - Deliverable for Step 1; start of Step 2

  • Add MPTCP crypto

The following was the initial plan but it may be postponed:

  • Adapt TcpSocketBase to be more flexible (making all functions virtual, overloading some functions to take TcpHeader parameters instead of flags, etc.)
  • Same for TcpXxBuffer

Week 3 - Step 2

  • Addition of test scripts, to trace buffers

Week 4 - Step 2

  • Put the DCE infrastructure in place

Week 5 - Deliverable for Step 2 and Step 3

  • Implement the Linux MPTCP schedulers to be able to compare against them
  • Implement OLIA/LIA congestion controls

Week 6 - Step 3

  • MPTCP may still need some polishing at this point

Week 7 - Deliverable for Step 3; start of Step 4

  • Addition of a Clock m_clock member in each Node.
    • (Peter) Consider adding the clock by aggregation instead. I haven't thought this through, but I think aggregation will make it easier to manipulate the clock through the Config system, for example.
  • Addition of a perfect clock (default behavior won't change)
  • Addition of a drifting clock with initial offset
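A rough sketch of what the perfect and drifting clocks listed above could look like. Everything here is an assumption made for illustration: the Clock interface, the use of plain doubles in seconds instead of ns-3 time types, and a constant drift expressed in parts per million. How the clock is attached to a Node (member or aggregation) is left open.

```cpp
#include <cassert>

// Hypothetical interface: converts simulator time to a node's local time.
class Clock {
 public:
  virtual ~Clock() = default;
  virtual double LocalTime(double simSeconds) const = 0;
};

// Default behavior: the node sees exactly the simulator time.
class PerfectClock : public Clock {
 public:
  double LocalTime(double simSeconds) const override { return simSeconds; }
};

// Local time starts with an offset and runs fast (or slow) at a fixed rate.
class DriftingClock : public Clock {
 public:
  DriftingClock(double offsetSeconds, double driftPpm)
      : m_offset(offsetSeconds), m_rate(1.0 + driftPpm * 1e-6) {}

  double LocalTime(double simSeconds) const override {
    return m_offset + simSeconds * m_rate;
  }

 private:
  double m_offset;  // initial offset from simulator time, in seconds
  double m_rate;    // local seconds elapsed per simulator second
};
```

For example, a clock with a 0.5 s offset and a 100 ppm drift reads about 1000.6 s when the simulator is at 1000 s. Scheduling events in local time would also need the inverse conversion, which is not shown.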

Week 8 - Step 4

  • making ntpd work in DCE

Week 9 - Step 4

  • making ntpd work in DCE (indeed: that looks complex)

Week 10 - Step 4 and Deliverables for Step 3 and 4

  • test the whole thing
  • Add some tests/documentation

Weekly progress

Week 1 - Step 1

Sticking to the plan. In summary, this week has delivered the following:

  - (de)Serialization of the 7/8 mptcp options with their documentation
  - Associated testsuite

The implementation of these messages can be found in the repository [1]. For more details, check the wiki [2]. During next week, while waiting for a clearer schedule for the MPTCP work, I plan to:

  - Add the pending mptcp crypto testsuite (depends on the discussion)
  - Continue the work I've started in the background on netlink export from DCE. This is something I started long ago, but it is proving quite difficult: Wireshark can't dissect raw netlink, it expects it to be contained within a "cooked Linux" header that libpcap generates but DCE does not (yet). The current DCE netlink implementation does not work with ntpd, which is why I am looking into it. (Netlink is the Linux communication protocol between kernel and userspace.)
  - Send some patches to DCE to support ntpd

[1] [2]

Final review