Difference between revisions of "GSOC2021DCE"

(Project Overview)
(Artifacts for Workshop on ns-3 publication)
 
(10 intermediate revisions by 2 users not shown)
Line 21: Line 21:
 
** Ubuntu 20.04 support
 
** Ubuntu 20.04 support
 
*** Goal is that most capabilities presently available for Ubuntu 16 DCE will be available for Ubuntu 20.04 (native)
 
*** Goal is that most capabilities presently available for Ubuntu 16 DCE will be available for Ubuntu 20.04 (native)
*** Also produce Docker image and documentation to ease installation process
 
 
*** Also contact glibc maintainers about a non-patched solution
 
*** Also contact glibc maintainers about a non-patched solution
 +
*** More about this [https://docs.google.com/document/d/1o3xsukgDN9e4-q8n6KbLDX2c9fTxKIhRhimDsn4ivr8/edit#bookmark=id.8ttuw5i22nza here]
 
** Upgrade net-next-nuse Linux kernel support to recent kernel
 
** Upgrade net-next-nuse Linux kernel support to recent kernel
 
*** Focus is on the Google BBRv2 kernel (5.10 base): https://github.com/google/bbr
 
*** Focus is on the Google BBRv2 kernel (5.10 base): https://github.com/google/bbr
Line 29: Line 29:
 
*** More about this [https://docs.google.com/document/d/1o3xsukgDN9e4-q8n6KbLDX2c9fTxKIhRhimDsn4ivr8/edit#bookmark=id.ixtd8h2ia3hf here]
 
*** More about this [https://docs.google.com/document/d/1o3xsukgDN9e4-q8n6KbLDX2c9fTxKIhRhimDsn4ivr8/edit#bookmark=id.ixtd8h2ia3hf here]
 
** Investigate SMP architecture for LKL by querying the LKL developers list
 
** Investigate SMP architecture for LKL by querying the LKL developers list
*** More about this [https://docs.google.com/document/d/1o3xsukgDN9e4-q8n6KbLDX2c9fTxKIhRhimDsn4ivr8/edit#bookmark=kix.h6e9x86v84qy here]
+
*** More about this (also includes Technical specifications and reasons for the problems) [https://docs.google.com/document/d/1o3xsukgDN9e4-q8n6KbLDX2c9fTxKIhRhimDsn4ivr8/edit#bookmark=kix.h6e9x86v84qy here]
 
* '''Phase 2'''
 
* '''Phase 2'''
** To be determined based on Phase 1 results
+
** Re-Implement Python Bindings build in DCE
 +
** Adding --apiscan feature to DCE
 +
** Debug dce script failures from Phase 1
 +
** Implement and test fopencookie based FILE stream redirection
 
* '''Phase 3'''
 
* '''Phase 3'''
** To be determined
+
** Introduce Github Actions CI to ns-3-dce
 +
** Collaborate with mentors and project maintainers on resolving existing issues
 +
** Produce Docker image and documentation to ease installation process
 +
** Produce final DCE releases
  
 
= Weekly Reports =
 
= Weekly Reports =
Line 81: Line 87:
 
** dce-umip-nemo : Identified that socket polling events are repeatedly enqueued to and dequeued from the rr task scheduler's m_active queue, so the poll never returns. It returns only in specific cases such as RngRun=2, and never returns under gdb regardless of conditions.
 
** dce-umip-nemo : Identified that socket polling events are repeatedly enqueued to and dequeued from the rr task scheduler's m_active queue, so the poll never returns. It returns only in specific cases such as RngRun=2, and never returns under gdb regardless of conditions.
 
** Also, the LinuxSockImpl::Poll(...) which should probably be called whenever the polling event returns back with an event(POLLIN, POLLOUT, etc.), which would further call applications like PacketSink to increase the number of RX bytes, is never being called.
 
** Also, the LinuxSockImpl::Poll(...) which should probably be called whenever the polling event returns back with an event(POLLIN, POLLOUT, etc.), which would further call applications like PacketSink to increase the number of RX bytes, is never being called.
 +
* '''Week 8''' (July 26 - Aug 2)
 +
** Discussed the DCE releases and their order of priority with mentors.
 +
** Started work on DCE-1.11 release with Tom Sir
 +
** Design Decisions and discussions with mentors in progress for docker based DCE setup
 +
** Tom Sir and I discussed the bug in dce-umip-nemo and DCE's LTE script failures, and the possible causes behind them.
 +
** Started work on the possible fix suggested by Tom Sir to DCE's LTE script failures [https://github.com/direct-code-execution/ns-3-dce/issues/122 #122]
 +
* '''Week 9''' (Aug 2 - Aug 9)
 +
** Debugged dce-iperf script failures when using the Linux stack. The Ipv4Linux implementation in DCE works differently from ns-3's Ipv4L3Protocol. More technical details about why the script fails can be found [https://github.com/direct-code-execution/ns-3-dce/issues/125 here]
 +
** Tom Sir suggested a patch for the dce-iperf example script to avoid static node index references while using GetAddress(). PR [https://github.com/direct-code-execution/ns-3-dce/pull/126 #126]
 +
** Tom Sir and I debugged the LTE script issues with Mac48Address usage in DCE(constraint due to net-next-nuse's implementation). [https://github.com/direct-code-execution/ns-3-dce/issues/122 #122]
 +
** Discussed DCE docker build design and implemented a prototype. [https://github.com/ParthPratim/dce-docker-beta/tree/main Docker Repo]
 +
** We tried to close all (known) issues for a possible DCE-1.11 release
 +
** Started work on getting a DCE-1.12 release for Ubuntu-20.04
 +
* '''Week 10''' (Aug 9 - Aug 16)
 +
** Participated in code review in a new PR on fopencookie based FILE stream callback redirection [https://github.com/direct-code-execution/ns-3-dce/pull/128 #128]
 +
** Figured out an execve based edge case where the implementation failed
 +
** Also, tried to provide technical specifications and reasons for the failure above. [https://github.com/direct-code-execution/ns-3-dce/pull/128#discussion_r687856219 Details here]
 +
** Opened PR for adding --apiscan feature to DCE [https://github.com/direct-code-execution/ns-3-dce/pull/129 #129]
 +
** Followed up with possible fixes on suggestions from Tom Sir on my PR #129
 +
= Final Project Evaluation Report =
 +
The final project evaluation report is hosted on : https://ns-3-dce-linux-upgrade.github.io/
 +
 +
Please read through the report which covers details of all the work done during GSoC 2021 on this project.
 +
 +
= Artifacts for Workshop on ns-3 publication =
 +
 +
A [https://dl.acm.org/doi/10.1145/3532577.3532606 Workshop on ns-3 paper] based on the work in this project was published in June 2022.  This section of the wiki page describes how to reproduce the simulation data presented in that paper.
 +
 +
Figures 4, 5, and 6 all provide performance results that are dependent on both how the software was compiled (optimizations) and on the underlying machine CPU.  It will be difficult to reproduce precisely the same results on a different machine, but similar trends in the data should be reproducible.
 +
 +
It should also be noted that in Figures 4 and 5, the curves labeled 'Tazaki' are experimental data points that were copied from a previous paper by the author and not reproduced by us.
 +
 +
Figure 4 and 5 data were generated on a machine running Ubuntu 16.04 with an Intel Core i7-4770 3.40 GHz CPU.  Figure 6 data was generated on a machine running Ubuntu 20.04 with an Intel Core i7-1065G7 1.30 GHz CPU.
 +
 +
Both ns-3-dce and ns-3.34 were built in optimized mode, and other software (net-next-nuse-4.4.0 and ELF loader) was built with the standard Bake configuration.
 +
 +
DCE software used is found on the [https://github.com/tomhenderson/ns-3-dce/tree/performance performance] branch of Tom Henderson's ns-3-dce repository.  This branch as of commit a13b443 (Feb. 18, 2022) was used.  This branch is consistent with the DCE 1.11 release, except that additional Bash scripts were added to generate each curve, and the program `example/dce-udp-perf.cc` was slightly modified as compared with the DCE 1.11 release.
 +
 +
For DCE 1.11 on Ubuntu 16.04, the ELF loader (elf-loader) must be compiled and used for some curves.  ELF loader is not available for Ubuntu 20.04 at the time of this writing.
 +
 +
== Figure 4 ==
 +
 +
The curve labeled 'DCE-1.11 ELF' (which uses DCE kernel mode) can be reproduced using the 'figure3-elf.sh' script, which will generate an output file called 'output-fig3-elf'.  This figure used to be numbered Figure 3 in an earlier version of the paper.  The data points plotted are generated by dividing the values in column 4 (Packets) by the values in column 5 (seconds).  The curve labeled 'Tazaki' was transcribed from the referenced publication.
 +
 +
== Figure 5 ==
 +
 +
Figure 5 provides four curves of generated data, plus a curve labeled 'Tazaki' which was transcribed from the referenced publication.  A Bash script was used for each curve.  The curve labeled 'ns-3' was generated using the 'run-rate-vs-time-ns3.sh' script, yielding a 'rate-vs-time.ns3.dat' file; the curve plotted is the plot of column 3 (rate) vs. column 5 (execution time).  Similarly, the curve labeled 'DCE-1.11 ELF' was generated using the 'run-rate-vs-time-elf-kernel.sh' script, the curve labeled 'DCE-1.11 ELF user' was generated using the 'run-rate-vs-time-elf-user.sh' script, and the curve 'DCE-1.11' was generated using the 'run-rate-vs-time-kernel.sh' script.
 +
 +
== Figure 6 ==
 +
 +
Figure 6 was generated similarly to Figure 5 but on a different machine (Ubuntu 20.04 with different hardware) and with software labelled DCE 1.12 which is a precursor to the actual DCE 1.12 release.  The scripts used to generate Figure 5 data can also be slightly modified to generate Figure 6 data.
 +
 +
Unlike Figures 4 and 5, which were based on software in Tom Henderson's 'performance' branch, Figure 6 is based on software staged in Parth Pratim Chatterjee's repositories, which can be built as follows.
 +
 +
* '''DCE-1.12 (Native Build, Linux Kernel-4.4.0)'''
 +
<div style="margin-left:5%;">
 +
<code> git clone https://gitlab.com/ParthPratim1/bake.git</code><br>
 +
<code>cd bake</code><br>
 +
<code>git checkout dce-1.12</code><br>
 +
<code>./bake.py configure -e dce-linux-1.12</code><br>
 +
<code>./bake.py download</code><br>
 +
<code>./bake.py build</code>
 +
</div>
 +
* '''DCE-1.12 with net-next-nuse-5.10 (Native Build, Linux Kernel-5.10.0)'''
 +
<div style="margin-left:5%;">
 +
<code> git clone https://gitlab.com/ParthPratim1/bake.git</code><br>
 +
<code>cd bake</code><br>
 +
<code>git checkout dce-1.12-linux-5.10</code><br>
 +
<code>./bake.py configure -e dce-linux-1.12</code><br>
 +
<code>./bake.py download</code><br>
 +
<code>./bake.py build</code>
 +
</div>
 +
* '''DCE-1.12 with net-next-nuse-5.10 on Docker'''
 +
** To avoid running docker as root, follow the first sub-section from the [https://docs.docker.com/engine/install/linux-postinstall/ Docker post-installation article]
 +
<div style="margin-left:5%;">
 +
<code>git clone https://github.com/ParthPratim/dce-docker-beta.git</code><br>
 +
<code>cd dce-docker-beta</code><br>
 +
<code>sudo docker-compose up -d</code><br>
 +
<code>sudo docker exec -it ns-3-dce /bin/bash</code><br>
 +
<code>cd /home/bake</code><br>
 +
<code>git init</code><br>
 +
<code>git remote add origin https://gitlab.com/ParthPratim1/bake.git</code><br>
 +
<code>git fetch origin</code><br>
 +
<code>git checkout docker-new</code><br>
 +
<code>./bake.py configure -e dce-linux-1.12</code><br>
 +
<code>./bake.py download</code><br>
 +
<code>./bake.py build</code>
 +
</div>

Latest revision as of 23:35, 21 June 2022

Project Overview

  • Project Name: Direct Code Execution Modernization
  • Student: Parth Pratim Chatterjee
  • Mentors: Tom Henderson, Apoorva Bhargava, Vivek Jain
  • Project Goals: DCE currently uses net-next-nuse to expose Linux kernel internals, such as the networking stack, to host applications, but the project has not kept up with newer Linux kernel releases. As the kernel evolved, much of its source code changed. Some init calls and function usage changed significantly, so the existing glue code is incompatible with newer network stack implementations and migration is non-trivial. This project aims to enable support for the latest Linux kernel features and toolchains in the DCE environment, such as the socket networking stack, sysctl interfaces, and system call access, without any changes to the user APIs currently used by host applications. It aims to incorporate an upgraded net-next-nuse port for Linux v5.10+ or LKL (Linux Kernel Library) into the DCE environment, so that host applications can use Linux kernel stacks with minimal or no changes to existing simulation scripts.
  • Repository:
  • About Me: I'm a freshman Computer Science undergraduate student at Kalinga Institute of Industrial Technology, Bhubaneshwar, India. I have a keen interest in Linux internals and computer networking. I was a grand prize winner at Google Code-In 2018 for the ns-3 organization, which first introduced me to DCE. I have an aptitude for Competitive Programming and make heavy use of C/C++, STL and other OOP concepts in solving algorithmic puzzles. I have more than 3 years of experience with C/C++ and Python, working on projects for numerous Hackathons.

Milestones and Deliverables

The overall project goal is to update DCE such that the latest Linux systems are supported and the latest Linux kernel code could be used.

  • Detailed Project Plan (will be continuously updated throughout the GSoC program duration)
  • Phase 1
    • Ubuntu 20.04 support
      • Goal is that most capabilities presently available for Ubuntu 16 DCE will be available for Ubuntu 20.04 (native)
      • Also contact glibc maintainers about a non-patched solution
      • More about this here
    • Upgrade net-next-nuse Linux kernel support to recent kernel
      • Focus is on the Google BBRv2 kernel (5.10 base): https://github.com/google/bbr
      • Borrow from net-next-nuse-4.4.0 and LKL as appropriate to try to get a new version of net-next-nuse
      • Review existing tests and define/write new tests
      • More about this here
    • Investigate SMP architecture for LKL by querying the LKL developers list
      • More about this (also includes Technical specifications and reasons for the problems) here
  • Phase 2
    • Re-Implement Python Bindings build in DCE
    • Adding --apiscan feature to DCE
    • Debug dce script failures from Phase 1
    • Implement and test fopencookie based FILE stream redirection
  • Phase 3
    • Introduce Github Actions CI to ns-3-dce
    • Collaborate with mentors and project maintainers on resolving existing issues
    • Produce Docker image and documentation to ease installation process
    • Produce final DCE releases

Weekly Reports

  • Community Bonding Period (May 17 - June 7)
    • Figured out possible Scheduling bottlenecks in LKL in blocking network calls. Section 4.5
    • Developed a beta docker port for ns-3-dce. Section 1.4
    • Discussed the bright side of porting the latest Linux kernel using the net-next-nuse architecture.
    • Discussed possible regression tests for verifying both performance and results.
  • Week 1 (June 7 - June 14)
    • Implemented the first Linux Kernel-5.12 port for DCE. Section 5
    • Passed 5 tests/examples. Section 5.12
    • Initiated talks with the LKL team to get reviews on a possible SMP port of LKL. Section 4.5
  • Week 2 (June 14 - June 21)
    • Opened PR to integrate Github Actions Workflow for DCE #118
    • Opened PR to the DCE repo for adding support for custom Glibc build #117
    • Opened PR to Bake to support pulling and building required dependencies !9
    • Got patches on pyViz dependency checks and configure_arguments attribute for depends_on field, merged into upstream bake : !8 and !7
    • Initiated discussions on issues with current Bake environment and fixes to problems like regexp based file lookups
    • Debugging Linux kernel timekeeping inconsistencies, net-device xmit packet loss, and untimely socket connection request timeouts over custom registered net-devices (P2P, Csma, Wifi, etc.)
  • Week 3 (June 21 - June 28)
    • Integrated ns-3-dev way of generating Python Bindings for DCE. (PR yet to be made)
    • Identified possible positions where packets are being dropped in the IP Layer for the UDP protocol, in the Linux Kernel.
  • Week 4 (June 28 - July 5)
    • Figured out the specific setuptools (50.3.2) and setuptools_scm (5.0.0) versions that support building Python bindings on Ubuntu-16.04
    • Identified the Linux kernel commit which validates packets with CHECKSUM_PARTIAL (possibly linked to hardware offloading)
    • Linux-4.9.273 (LTS release with BBR support) : 35 tests passed
    • Linux-5.10.47 (LTS release) : Fixed timeout issues, spotted inconsistent results and one-way traffic due to probable packet drops.
    • Repos :
  • Week 5 (July 5 - July 12)
    • DCE-Linux-5.10.47 : Fixed NULL current task pointer : commit
    • Linux-5.10.47 : Dropped down to net-next-nuse's iterative sysctl interface, fixed failure on encountering a link in the table
    • Linux-5.10.47 : Set up ethtool, tested offloading and checksum operations, and fetched kernel device properties
    • Linux-5.10.47 : Patch for HW checksum BUG_ON and xmit skb_checksum_help : commit
    • Python --apiscan argument for DCE : Repo for pybind-apiscan
    • Python Bindings for DCE : Pull Request #120
  • Week 6 (July 12 - July 19)
  • Week 7 (July 19 - July 26)
    • Fixed bug caused by this glibc commit for 2.30+, by erasing the DF_1_PIE bit, set in the flags entry of the Dynamic Linking Section of the ELF header of the probable PIE executable : commit
    • PIE flag removal is automatic and is triggered only when a file is copied into elf-cache/0 and is a valid ELF file (starts with \177ELF\002) with a valid ELF header.
    • dce-umip-nemo : Identified that socket polling events are repeatedly enqueued to and dequeued from the rr task scheduler's m_active queue, so the poll never returns. It returns only in specific cases such as RngRun=2, and never returns under gdb regardless of conditions.
    • Also, the LinuxSockImpl::Poll(...) which should probably be called whenever the polling event returns back with an event(POLLIN, POLLOUT, etc.), which would further call applications like PacketSink to increase the number of RX bytes, is never being called.
  • Week 8 (July 26 - Aug 2)
    • Discussed the DCE releases and their order of priority with mentors.
    • Started work on DCE-1.11 release with Tom Sir
    • Design Decisions and discussions with mentors in progress for docker based DCE setup
    • Tom Sir and I discussed the bug in dce-umip-nemo and DCE's LTE script failures, and the possible causes behind them.
    • Started work on the possible fix suggested by Tom Sir to DCE's LTE script failures #122
  • Week 9 (Aug 2 - Aug 9)
    • Debugged dce-iperf script failures when using the Linux stack. The Ipv4Linux implementation in DCE works differently from ns-3's Ipv4L3Protocol. More technical details about why the script fails can be found here
    • Tom Sir suggested a patch for the dce-iperf example script to avoid static node index references while using GetAddress(). PR #126
    • Tom Sir and I debugged the LTE script issues with Mac48Address usage in DCE(constraint due to net-next-nuse's implementation). #122
    • Discussed DCE docker build design and implemented a prototype. Docker Repo
    • We tried to close all (known) issues for a possible DCE-1.11 release
    • Started work on getting a DCE-1.12 release for Ubuntu-20.04
  • Week 10 (Aug 9 - Aug 16)
    • Participated in code review in a new PR on fopencookie based FILE stream callback redirection #128
    • Figured out an execve based edge case where the implementation failed
    • Also, tried to provide technical specifications and reasons for the failure above. Details here
    • Opened PR for adding --apiscan feature to DCE #129
    • Followed up with possible fixes on suggestions from Tom Sir on my PR #129
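
The Week 7 report describes checking that a file starts with \177ELF\002 before the PIE flag is removed. Those five bytes are the ELF magic number followed by the 64-bit class byte. The following is a minimal shell sketch of that check only. The function name `is_elf64` is hypothetical, and this is not the code DCE itself uses.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the file begins with \177ELF\002,
# i.e. the ELF magic number followed by the 64-bit class byte.
is_elf64() {
    # Dump the first 5 bytes as hex and strip spaces and newlines.
    magic=$(od -An -tx1 -N5 "$1" 2>/dev/null | tr -d ' \n')
    [ "$magic" = "7f454c4602" ]
}

# Example usage (path is an assumption):
#   is_elf64 /bin/ls && echo "64-bit ELF"
```

This only inspects the leading bytes. It does not validate the rest of the ELF header, which the report says is also checked.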

Final Project Evaluation Report

The final project evaluation report is hosted on : https://ns-3-dce-linux-upgrade.github.io/

Please read through the report which covers details of all the work done during GSoC 2021 on this project.

Artifacts for Workshop on ns-3 publication

A Workshop on ns-3 paper based on the work in this project was published in June 2022. This section of the wiki page describes how to reproduce the simulation data presented in that paper.

Figures 4, 5, and 6 all provide performance results that are dependent on both how the software was compiled (optimizations) and on the underlying machine CPU. It will be difficult to reproduce precisely the same results on a different machine, but similar trends in the data should be reproducible.

It should also be noted that in Figures 4 and 5, the curves labeled 'Tazaki' are experimental data points that were copied from a previous paper by the author and not reproduced by us.

Figure 4 and 5 data were generated on a machine running Ubuntu 16.04 with an Intel Core i7-4770 3.40 GHz CPU. Figure 6 data was generated on a machine running Ubuntu 20.04 with an Intel Core i7-1065G7 1.30 GHz CPU.

Both ns-3-dce and ns-3.34 were built in optimized mode, and other software (net-next-nuse-4.4.0 and ELF loader) was built with the standard Bake configuration.

DCE software used is found on the performance branch of Tom Henderson's ns-3-dce repository. This branch as of commit a13b443 (Feb. 18, 2022) was used. This branch is consistent with the DCE 1.11 release, except that additional Bash scripts were added to generate each curve, and the program `example/dce-udp-perf.cc` was slightly modified as compared with the DCE 1.11 release.

For DCE 1.11 on Ubuntu 16.04, the ELF loader (elf-loader) must be compiled and used for some curves. ELF loader is not available for Ubuntu 20.04 at the time of this writing.

Figure 4

The curve labeled 'DCE-1.11 ELF' (which uses DCE kernel mode) can be reproduced using the 'figure3-elf.sh' script, which will generate an output file called 'output-fig3-elf'. This figure used to be numbered Figure 3 in an earlier version of the paper. The data points plotted are generated by dividing the values in column 4 (Packets) by the values in column 5 (seconds). The curve labeled 'Tazaki' was transcribed from the referenced publication.
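
The division described above can be sketched with awk. This assumes that 'output-fig3-elf' is a whitespace-separated text file with one run per line. The function name `packets_per_second` is hypothetical, and the exact file layout should be checked against the script's real output.

```shell
#!/bin/sh
# Hypothetical helper: print column 4 (Packets) divided by column 5 (seconds)
# for every line whose fifth column is a positive number.
packets_per_second() {
    awk 'NF >= 5 && $5 + 0 > 0 { printf "%.3f\n", $4 / $5 }' "$1"
}

# Only run when the output file from figure3-elf.sh is present.
if [ -f output-fig3-elf ]; then
    packets_per_second output-fig3-elf
fi
```

Lines with a non-numeric or zero fifth column, such as a header, are skipped rather than causing a division error.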

Figure 5

Figure 5 provides four curves of generated data, plus a curve labeled 'Tazaki' which was transcribed from the referenced publication. A Bash script was used for each curve. The curve labeled 'ns-3' was generated using the 'run-rate-vs-time-ns3.sh' script, yielding a 'rate-vs-time.ns3.dat' file; the curve plotted is the plot of column 3 (rate) vs. column 5 (execution time). Similarly, the curve labeled 'DCE-1.11 ELF' was generated using the 'run-rate-vs-time-elf-kernel.sh' script, the curve labeled 'DCE-1.11 ELF user' was generated using the 'run-rate-vs-time-elf-user.sh' script, and the curve 'DCE-1.11' was generated using the 'run-rate-vs-time-kernel.sh' script.
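
To prepare a curve for plotting, the rate and execution-time columns can be pulled out of a data file. This is a sketch that assumes the '.dat' files are whitespace-separated text. The function name `rate_vs_time` is hypothetical. Only 'rate-vs-time.ns3.dat' is named above, so the output file names of the other three scripts need to be checked before reusing it.

```shell
#!/bin/sh
# Hypothetical helper: print "rate execution_time" pairs (columns 3 and 5),
# skipping comment lines and lines where either column is not numeric.
rate_vs_time() {
    awk '!/^#/ && NF >= 5 && $3 == $3 + 0 && $5 == $5 + 0 { print $3, $5 }' "$1"
}

# Only run when the output of run-rate-vs-time-ns3.sh is present.
if [ -f rate-vs-time.ns3.dat ]; then
    rate_vs_time rate-vs-time.ns3.dat
fi
```

The two-column output can be fed directly to a plotting tool of your choice.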

Figure 6

Figure 6 was generated similarly to Figure 5 but on a different machine (Ubuntu 20.04 with different hardware) and with software labeled DCE 1.12, which is a precursor to the actual DCE 1.12 release. The scripts used to generate Figure 5 data can be slightly modified to generate Figure 6 data.

Unlike Figures 4 and 5, which were based on software in Tom Henderson's 'performance' branch, Figure 6 is based on software staged in Parth Pratim Chatterjee's repositories, which can be built as follows.

  • DCE-1.12 (Native Build, Linux Kernel-4.4.0)

git clone https://gitlab.com/ParthPratim1/bake.git
cd bake
git checkout dce-1.12
./bake.py configure -e dce-linux-1.12
./bake.py download
./bake.py build

  • DCE-1.12 with net-next-nuse-5.10 (Native Build, Linux Kernel-5.10.0)

git clone https://gitlab.com/ParthPratim1/bake.git
cd bake
git checkout dce-1.12-linux-5.10
./bake.py configure -e dce-linux-1.12
./bake.py download
./bake.py build

  • DCE-1.12 with net-next-nuse-5.10 on Docker
    • To avoid running docker as root, follow the first sub-section from the Docker post-installation article

git clone https://github.com/ParthPratim/dce-docker-beta.git
cd dce-docker-beta
sudo docker-compose up -d
sudo docker exec -it ns-3-dce /bin/bash
cd /home/bake
git init
git remote add origin https://gitlab.com/ParthPratim1/bake.git
git fetch origin
git checkout docker-new
./bake.py configure -e dce-linux-1.12
./bake.py download
./bake.py build
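
The two native builds above differ only in the bake branch that is checked out. The following is a sketch of a wrapper that takes the branch as a parameter. The function names `run` and `build_dce` and the `DRY_RUN` variable are hypothetical conveniences, not part of bake. The commands themselves are the ones listed above.

```shell
#!/bin/bash
# Hypothetical helper: print the command when DRY_RUN=1, otherwise run it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical wrapper around the native build steps.
# $1 is the bake branch, e.g. dce-1.12 or dce-1.12-linux-5.10.
build_dce() {
    local branch="$1"
    run git clone https://gitlab.com/ParthPratim1/bake.git &&
    run cd bake &&
    run git checkout "$branch" &&
    run ./bake.py configure -e dce-linux-1.12 &&
    run ./bake.py download &&
    run ./bake.py build
}

# Example usage:
#   DRY_RUN=1 build_dce dce-1.12-linux-5.10   # print the steps only
#   build_dce dce-1.12                        # run the build
```

Chaining the steps with `&&` stops the build at the first failing command, and the dry-run mode makes it easy to confirm the branch before a long build starts.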