GSoC 2023 ns3-ai



Back to GSoC 2023 projects


Project Overview

About the Project

The objective of this proposed project is to enhance the ns3-ai module, which facilitates the connection between ns-3 and Python-based ML frameworks using shared memory. The main focus of this enhancement is to optimize performance and improve usability.

To accomplish this goal, the project will introduce additional APIs that support data structures such as std::vector in shared-memory IPC, effectively reducing the required interaction between C++ and Python and thereby improving performance. The project will also integrate a Gymnasium API, similar to ns3-gym's but with a shared-memory-based backend, turning ns-3 into an environment that agents can interact with efficiently and seamlessly.
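To illustrate the kind of shared-memory vector these APIs build on, here is a minimal Boost.Interprocess sketch (the segment and object names are illustrative, not the actual ns3-ai API):

  // Minimal Boost.Interprocess sketch: a std::vector-like container living in
  // shared memory, visible to both the C++ and the Python process.
  #include <boost/interprocess/allocators/allocator.hpp>
  #include <boost/interprocess/containers/vector.hpp>
  #include <boost/interprocess/managed_shared_memory.hpp>

  namespace bip = boost::interprocess;
  using ShmAlloc = bip::allocator<double, bip::managed_shared_memory::segment_manager>;
  using ShmVector = bip::vector<double, ShmAlloc>;

  int main()
  {
      // The C++ side creates the named segment; the Python side opens it by name.
      bip::shared_memory_object::remove("ns3_ai_demo");
      bip::managed_shared_memory segment(bip::create_only, "ns3_ai_demo", 65536);
      ShmVector* vec = segment.construct<ShmVector>("env_vector")(
          ShmAlloc(segment.get_segment_manager()));
      vec->push_back(3.14); // now visible to the peer process without copying
      bip::shared_memory_object::remove("ns3_ai_demo");
      return 0;
  }

Because the data lives in a named segment rather than being serialized and sent over a socket, both sides can read and write it in place.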

In addition, the project will enhance the existing examples, documentation and tutorials, while also integrating new examples that cover scenarios like Multi-BSS in VR. By doing so, users will have more comprehensive resources at their disposal.

Furthermore, the project aims to provide examples utilizing pure C++-based ML frameworks, offering researchers more options for ML integration. The overall aim of the project is to expand and accelerate the capabilities of the ns3-ai module, enabling users to simulate and analyze network-related algorithms with greater efficiency and flexibility.

About Me

As a junior at Huazhong University of Science and Technology, I am majoring in electronic engineering. I am proud to be a member of the Undergraduate Program for Advanced Project-based Information Science Education, also known as the Seed Class, and currently serve as the class monitor. Additionally, I am a project leader in the Dian group, where I engage in extracurricular technical projects. In terms of relevant coursework, I have excelled in network programming through courses such as C programming language and computer network, in both of which I achieved a perfect grade point of 4.0. These courses have equipped me with a strong foundation in network programming, which I believe will enable me to contribute effectively to relevant projects. I am a motivated and skilled undergraduate student with a passion for network programming and a track record of academic excellence.

During my academic journey, I have had the opportunity to explore computer networking through labs and projects. In particular, in the labs for the computer networking course, I gained valuable insights into how different parameters, such as the number of STAs, CW range, and packet arrival rate, can impact network throughput in the WiFi DCF protocol. In addition, I have worked with Prof. Yayu Gao on a project that leverages ns-3 as a simulation platform. Through this project, I gained practical experience in simulating WiFi MAC rate control algorithms, which further solidified my understanding of ns-3's usage and its object-oriented programming approach. Overall, my hands-on experience in both labs and projects has allowed me to apply theoretical concepts to practical scenarios and has enhanced my network simulation and analysis skills.

Milestones

Based on my proposal, I divide my project into two phases, listed below.

Phase one (before midterm evaluation)

Enhancements for the interface

std::vector support

Introduce APIs for storing data structures like std::vector in shared memory, to reduce the interaction between C++ and Python.

gym-like interface

Introduce a gym-like interface to enable users to train RL models directly in ns-3 with OpenAI Gym.

Enhancements for existing examples

Bring all previous examples up to date with the CMake build system introduced in ns-3.36, and provide a new example to benchmark the running time of vectors.

Phase two (after midterm evaluation)

Integration of ns-3 and C++-based ML frameworks

Apply TensorFlow C++ APIs and PyTorch C++ APIs to the examples that currently use Python-based ML frameworks. Also, provide CMake configurations that work on both Linux and macOS, along with documentation on building and running.

Finishing new examples and benchmarking test

Finish new examples using the Gym interface and the vector-based message interface. Compare the Gym interface's performance with ns3-gym's, and the vector-based message interface's performance with the struct-based message interface's.

Weekly Report

Week 1 (May 29 - June 4)

Achievements

  1. Got familiar with the Boost library and the syntax of Cython pyx files. I am using Boost to support dynamic allocation and synchronization in shared memory, and Cython to wrap C++ code for Python.
  2. Created the interface supporting std::vector in shared memory, and wrote a new a-plus-b example to demonstrate its usage. It is still in development and currently works only on macOS.
  3. (Update on June 3) I am now using pybind11 instead of Cython for the Python binding, because pybind11 offers similar performance with cleaner code, and it makes installing the Python module via CMake easier (see the sketch below).
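For illustration, a minimal pybind11 sketch of the binding approach (the module and function names are hypothetical, not the actual ns3-ai binding):

  // Minimal pybind11 sketch (module and function names are hypothetical):
  // exposing a C++ function that takes a std::vector to Python.
  #include <pybind11/pybind11.h>
  #include <pybind11/stl.h> // automatic std::vector <-> Python list conversion

  #include <vector>

  int SumVector(const std::vector<int>& v)
  {
      int sum = 0;
      for (int x : v)
      {
          sum += x;
      }
      return sum;
  }

  PYBIND11_MODULE(ns3ai_demo, m)
  {
      m.def("sum_vector", &SumVector, "Sum the elements of a list");
  }

After the module is built and installed, Python code can simply call ns3ai_demo.sum_vector([1, 2, 3]).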

Problems

  1. The code is quite naive and may include extra interactions that lower performance.
  2. I have not tested the new interface on Linux.
  3. (Update on June 3) The new interface has hardcoded paths in setup.py: users must explicitly specify their Boost include and library paths.
  4. (Update on June 3) Although there is currently only one example, once there are more, users will need to call setup.py repeatedly to install the modules, which is inefficient.

Todo next week

  1. Use the new interface in an existing example such as rl-tcp, and compare its running time with the old interface to better understand its performance.
  2. Switch to a new branch called "improvements" instead of "cmake", which better reflects the project goal.
  3. (Update on June 3) Modify CMakeLists.txt to pass the result of find_package(Boost...) to setup.py, and remove the hardcoded paths.
  4. (Update on June 3) Make "pip install . --user" a CMake target, so that users can install Python modules more easily, e.g. "./ns3 build ns3ai_interfaces".
  5. If time permits, test the code on Linux.

Week 2 (June 5 - June 11)

Achievements

  1. Updated the Thompson Sampling example to use the new interface. Previously it used a simple packed structure for information sharing; now it uses the first element of a shared std::vector (essentially the same structure as before).
  2. Measured the running time of the Thompson Sampling example, old interface vs. new interface. Results: the old takes about 5 seconds, the new about 12 minutes.

Problems

  1. The benchmarking result above shows that, when passing small amounts of data in each interaction, the new interface is about 150 times slower than the old one.

Todo next week

  1. Measure the running time of another example (the new Multi-BSS example) that passes large amounts of data in each interaction, to check whether the new interface improves performance in that case. If the new interface outperforms the old, the two can coexist for different use cases; otherwise, I will consider modifying the implementation.
  2. Alternatively, try to optimize the code to make small-data interaction faster.

Week 3 (June 12 - June 18)

Achievements

  1. Accelerated data interaction by using a spinlock-based semaphore for synchronization (see the sketch after this list). The running time of the Thompson Sampling example dropped to 6 seconds on my machine, which means the performance of small-data interaction is now close to the previous interface's.
    • I tried eliminating data-copying operations by using references instead, but this barely improved the running time.
    • I guessed that a semaphore would spin instead of sleep, which saves time (although it wastes CPU). So in the synchronization code I replaced the Boost.Interprocess condition variable with a Boost.Interprocess semaphore, but there was no improvement. Profiling with CLion's built-in profiler showed that sleeping took a large portion of the running time. Reading the Boost source code, I found that a waiting semaphore does not purely spin: it puts the process to sleep once the spin count reaches a small threshold. Commenting out the spin-counting code to force pure spinning reduced the running time considerably.
    • To avoid modifying library code, I created my own semaphore. My implementation is similar to Boost's, but while waiting it only spins and never goes to sleep. This significantly accelerates the interaction between Python and C++, reducing the running time to 6 seconds.
  2. Updated the A Plus B and Constant Rate examples. Examples currently using the new interface: A Plus B, Thompson Sampling, Constant Rate.
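The sketch below shows the idea behind the always-spinning semaphore. It is a simplified, single-process illustration using std::atomic; the real implementation mirrors Boost's interprocess semaphore so it can live in shared memory (where std::atomic works only if it is lock-free on the platform):

  // Simplified sketch of an always-spinning semaphore.
  #include <atomic>

  class SpinSemaphore
  {
    public:
      explicit SpinSemaphore(unsigned initial)
          : m_count(initial)
      {
      }

      void Post()
      {
          m_count.fetch_add(1, std::memory_order_release);
      }

      void Wait()
      {
          // Busy-wait and never yield to the OS scheduler: this wastes CPU but
          // avoids the sleep/wakeup latency that dominated the profile above.
          while (true)
          {
              unsigned c = m_count.load(std::memory_order_acquire);
              if (c > 0 && m_count.compare_exchange_weak(c, c - 1,
                                                         std::memory_order_acquire))
              {
                  return;
              }
          }
      }

    private:
      std::atomic<unsigned> m_count;
  };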

Problems

  1. The examples have not been tested on Linux yet; that will happen next week.

Todo next week

  1. Start working on ns3-gym-like interface, which is one of the milestones.
  2. Work with Hao to release the previous version of ns3-ai.
  3. Test the three currently available examples on Linux system.

Week 4 (June 19 - June 25)

Achievements

  1. Following my mentors' suggestions, I added an interface for sharing a single structure, to reduce complexity when a vector is unnecessary. Previously, sharing a single structure (as in the Thompson Sampling and Constant Rate examples) required a vector of which only the first element was used.
  2. Read the ns3-gym paper and tried running its code to become more familiar with the OpenAI Gym interface. I am now developing the Gym interface.
  3. Tested the examples on Linux.

Problems

  1. The ns3-gym README says it has some issues with the new OpenAI Gym framework, making the gym.make() API unavailable. Is there any way to solve that? Or is it perhaps only an issue for ns3-gym and not a problem for ns3-ai?

Todo next week

  1. Continue developing Gym interface.

Week 5 (June 26 - July 2)

Achievements

  1. Completed the a-plus-b example of Gym interface.

Problems

Todo next week

  1. Continue developing other examples using Gym interface.

Week 6 - 7 (July 3 - July 16)

About interface naming: for clarity, I call the interface that uses Boost shared memory directly (in which users define the shared structures or vectors) the "msg interface", and the interface built on top of it that provides Gym APIs the "Gym interface". The former is low-level, requires more coding, and has stronger capabilities (such as std::vector sharing); the latter is high-level and easier to code, but has limited functionality (RL with Gym).
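As an illustration of the msg-interface style, here is a hedged sketch in which the user defines the shared structures; all structure and method names below are hypothetical, not the actual ns3-ai API:

  // Hypothetical sketch of the msg-interface style: the user defines plain C++
  // structures to be placed in shared memory (names are illustrative).
  #include <cstdint>

  struct TcpRlEnv
  {
      uint32_t segmentSize;   // written by C++ (ns-3), read by Python
      uint32_t bytesInFlight;
  };

  struct TcpRlAct
  {
      uint32_t newCwnd; // written by Python (the agent), read by C++
  };

  // Hypothetical C++-side flow; each Begin/End pair brackets one synchronized
  // access to the shared memory region:
  //   interface->CppSendBegin();   /* fill a TcpRlEnv */   interface->CppSendEnd();
  //   interface->CppRecvBegin();   /* read a TcpRlAct */   interface->CppRecvEnd();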

Achievements

  1. Following my mentors' suggestions, I modified the Gym interface so that it provides a base class that users can derive from to create their own environment (see the sketch after this list). It is essentially a fork of ns3-gym's interface, but at the low level it uses Boost shared memory instead of ZeroMQ for interprocess communication.
  2. Completed the RL-TCP example using both the Gym interface and the msg interface, and the A Plus B example using the Gym interface.
  3. Refactored the existing code, separating the different interfaces into different directories and modifying the CMakeLists files, for a clearer project structure and easier usage.
  4. Updated all READMEs, which contain step-by-step instructions for building and running the examples.
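A hedged sketch of deriving an environment from the Gym interface's base class. The class and method names follow ns3-gym's OpenGymEnv, which this interface forks; the exact ns3-ai header and signatures may differ:

  #include "ns3/opengym-module.h" // ns3-gym's header; ns3-ai's equivalent may differ

  using namespace ns3;

  class MyGymEnv : public OpenGymEnv
  {
    public:
      Ptr<OpenGymSpace> GetObservationSpace() override
      {
          // One float observation in [0, 1]
          return CreateObject<OpenGymBoxSpace>(0.0, 1.0, std::vector<uint32_t>{1}, "float");
      }

      Ptr<OpenGymSpace> GetActionSpace() override
      {
          return CreateObject<OpenGymDiscreteSpace>(2); // two discrete actions
      }

      Ptr<OpenGymDataContainer> GetObservation() override
      {
          auto box = CreateObject<OpenGymBoxContainer<float>>(std::vector<uint32_t>{1});
          box->AddValue(0.5); // would come from the simulation state
          return box;
      }

      float GetReward() override { return 0.0; }
      bool GetGameOver() override { return false; }
      std::string GetExtraInfo() override { return ""; }

      bool ExecuteActions(Ptr<OpenGymDataContainer> action) override
      {
          return true; // apply the agent's action to the simulation here
      }
  };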

Problems

  1. Proper destruction of the msg interface. In the RL-TCP example, I had a reference counting issue (the reference count never reached zero, so an object was not destroyed) and fixed it by replacing some Ptr<> with raw pointers. There may be better ways to solve this.
  2. Because the msg interface must have exactly one instance providing synchronized access to the shared memory segment, I use a file-local variable so that functions in different classes can access the single interface. I noticed that ns-3 provides a Singleton class; would that be a better way to define the msg interface? (A sketch of that approach follows this list.)
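A sketch of the Singleton approach in question, using ns-3's ns3::Singleton from ns3/singleton.h (the MsgInterface class here is illustrative):

  #include "ns3/singleton.h"

  class MsgInterface
  {
      // Allow ns3::Singleton<MsgInterface>::Get() to call the private constructor.
      friend class ns3::Singleton<MsgInterface>;

    public:
      void Send(/* shared structure */) { /* write to shared memory */ }

    private:
      MsgInterface() = default; // not constructible elsewhere
  };

  // Anywhere in the simulation code:
  //   ns3::Singleton<MsgInterface>::Get()->Send(...);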

Todo next week

  1. Provide some initial benchmark of Gym interface with ns3-gym.
  2. Do midterm evaluation.

Week 8 (July 17 - July 23)

Achievements

  1. Successfully finished the midterm evaluation. Thank you to my mentors Collin and Hao for your guidance!
  2. Benchmarked the running time of the RL-TCP example. In a scenario with 2 nodes, bottleneck_bandwidth=2Mbps, bottleneck_delay=0.01ms, access_bandwidth=10Mbps, and access_delay=20ms, I simulated 1000s. The results show that ns3-ai is slightly faster than ns3-gym: ns3-ai takes 26 seconds and ns3-gym takes 27 seconds.

Problems

  1. My mentor suggested that this benchmark doesn't show ns3-ai's advantage, because it measures total running time rather than C++-Python interaction time. Interaction is where ns3-ai and ns3-gym differ most, so interaction time is more likely to show a large difference. Also, knowing the interaction time and its share of the total time makes it easier to design examples that emphasize interaction and better demonstrate ns3-ai's performance.

Todo next week

  1. Conduct benchmarking of interaction time on RL-TCP example.

Week 9 (July 24 - July 30)

Achievements

  1. Benchmarked the RL-TCP example (the ns3-gym version and ns3-ai's Gym interface version) by C++-Python interaction time, i.e., the transmission time of the byte buffer containing serialized Gym states or actions. For accuracy, I count CPU cycles (the x86 rdtsc instruction) rather than wall-clock time: each sample is the cycle count at the end of a transmission minus the cycle count at its start, and the mean and standard deviation are computed over the samples (see the sketch below). The results show that in both directions, C++ to Python and Python to C++, ns3-ai's interaction time is approximately 15 times shorter than ns3-gym's.
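A minimal sketch of the cycle-counting measurement (x86 only; __rdtsc comes from <x86intrin.h> on GCC/Clang, and the surrounding transmission code is omitted):

  #include <cstdint>
  #include <x86intrin.h>

  uint64_t MeasureTransmissionCycles()
  {
      uint64_t start = __rdtsc();
      // ... transmit the serialized Gym state or action here ...
      uint64_t end = __rdtsc();
      return end - start; // one sample; mean and stddev are computed over many
  }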

Problems

Todo next week

  1. Begin developing the Multi-BSS example, which can demonstrate the usage of vectors in the message interface.

Week 10 (July 31 - Aug 6)

Achievements

  1. Updated the lte-cqi example to use the msg interface.
  2. Working on the multi-bss example, based on Juan's branch of ns-3-dev.

Problems

Todo next week

  1. Finish the multi-bss example. This will include making it compile with the latest ns-3, porting it to ns3-ai's new interface, and changing some of the directory structure (moving the tgax code from src to contrib).

Week 11 (Aug 7 - Aug 13)

Achievements

  1. Got familiar with the RL algorithm in the Multi-BSS example, based on the slides and the code.
  2. Finished the Python binding and CMake configuration of the Multi-BSS example, with some problems remaining.
  3. Wrote a quick start guide for the ns3-ai interface, and thoroughly updated the READMEs in the repository.

Problems

  1. The Multi-BSS Python script is incompatible with the C++ code, having slightly different data definitions.
  2. The algorithm should automatically adjust the CCA, but the CCA value never changes (always -82).

Todo next week

  1. Complete the Multi-BSS example's code and do some benchmarking.

Week 12 (Aug 14 - Aug 20)

Achievements

  1. Finished the code for Multi-BSS (it runs now, and the CCA value changes to approximately -70)
  2. Updated documents for the future release by Hao
  3. Enhanced some CMake configurations for better usability
    • Added a protobuf-generate function for protobuf installations that don't provide it
    • Changed the CMake target from 'ns3-ai' to 'ai'. The ns3-ai module can now be built with ./ns3 build ai. (Custom modules with an 'ns3-' prefix cannot be built directly with ./ns3 build, due to some settings in the ./ns3 script.)

Problems

  1. In the Multi-BSS example, at 1 minute of simulation, VR throughput ≈ 5 Mbps, which does not meet the requirement (50 Mbps). Occasionally VR delay ≈ 0 and throughput ≈ 1e8, possibly due to a statistics error.

Todo next week

  1. Adjust the parameters of the RL algorithm to meet the VR throughput requirements
  2. Benchmark (compare running time with the previous interface)

Week 13 (Aug 21 - Aug 27)

Achievements

  1. Finished a simple example using the C++-based ML framework TensorFlow.
  2. Continued modifying Multi-BSS parameters to meet the VR requirements.

Problems

  1. The C++-based example is incomplete due to limitations of the TensorFlow C API, as described in its current-status document. The example will become available when the TensorFlow C API provides the Gradients and Neural Networks libraries.
  2. The Multi-BSS example still doesn't meet the VR requirements after training, although I changed some parameters. I may have to leave its optimization to future work (after GSoC). The benchmarking is not affected.

Todo next week

  1. Finish the last C++-based ML example (RL-TCP using the PyTorch C++ API)
  2. Update the vector-based message interface's benchmarking results.

Week 14 (Aug 28 - Sep 3)

Achievements

  1. Finished the RL-TCP example using the pure C++ interface (see the sketch after this list)
  2. Benchmarked pure C++ (pure C++ vs. the msg interface, in terms of processing time) and the vector-based interface (vector vs. struct, in terms of data transmission time)
    • Pure C++ is twice as fast as the msg interface, thanks to the removal of interprocess communication.
    • Unfortunately, the vector-based message interface is slower (1.2x in CPU cycle count) than the struct-based interface, possibly due to slow access to the vector on the Python side.
  3. Updated many documents for the PR.
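For illustration, a minimal LibTorch (PyTorch C++ API) sketch of the kind of network a pure-C++ example can train in-process; the layer sizes are illustrative, not the actual RL-TCP model:

  #include <torch/torch.h>

  struct PolicyNet : torch::nn::Module
  {
      torch::nn::Linear fc1{nullptr};
      torch::nn::Linear fc2{nullptr};

      PolicyNet()
      {
          fc1 = register_module("fc1", torch::nn::Linear(5, 64)); // 5 observation features
          fc2 = register_module("fc2", torch::nn::Linear(64, 3)); // 3 candidate actions
      }

      torch::Tensor forward(torch::Tensor x)
      {
          return fc2(torch::relu(fc1(x)));
      }
  };

  // Usage inside the simulation process (no interprocess communication involved):
  //   PolicyNet net;
  //   torch::Tensor q = net.forward(torch::rand({1, 5}));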

Problems

  1. As mentioned above, the vector interface is slow. In the future, we may need to integrate C++ linear algebra libraries like Eigen, whose Python bindings (provided by open source projects such as eigenpy) are faster than pybind11's std::vector binding.