
GSOC2023ns3-aiFinalReport

2023-09-14T15:07:33Z

Muyuan: update slides

{{TOC}}

Back to [[GSOC2023ns3-ai]] (page containing my weekly updates, not the final report)

Slides: [https://drive.google.com/file/d/1mK03ABBM1r-CMOp6RLiuDYPwyNTyIVaP/view?usp=share_link final report slides] (for reference)

= Project Overview =

* '''Project Name:''' ns3-ai enhancements
* '''Student:''' Muyuan Shen
* '''Mentors:''' Collin Brady and Hao Yin

== Project Goals ==

The main focus of this project is to '''optimize performance''' and '''improve usability''' of the '''ns3-ai module''', which facilitates the connection between ns-3 and Python-based ML frameworks using shared memory.

To accomplish this goal, the project will '''introduce additional APIs that support data structures such as vector''' in shared memory IPC. This will reduce the number of interactions required between C++ and Python, resulting in improved performance.
Also, the project will '''integrate a Gymnasium API''' like ns3-gym's, but with a shared-memory-based backend, to turn ns-3 into an environment that agents can interact with efficiently and seamlessly.
In addition, the project will '''enhance the existing examples, documentation and tutorials''', while also integrating new examples that cover scenarios like Multi-BSS in VR. By doing so, users will have more comprehensive resources at their disposal.
Furthermore, the project aims to '''provide examples utilizing pure C++-based ML frameworks'''. This will offer researchers more options for integrating with ML.

The overall aim of the project is to expand and accelerate the capabilities of the ns3-ai module, enabling users to simulate and analyze network-related algorithms with enhanced efficiency and flexibility.

= Merge Requests and Commits =

Throughout the project, my development was based on the [https://github.com/ShenMuyuan/ns3-ai/tree/cmake cmake branch] of [https://github.com/hust-diangroup/ns3-ai ns3-ai].
I created a single MR that contains all my work, to be merged into the [https://github.com/hust-diangroup/ns3-ai/tree/cmake upstream cmake branch]. In this MR, there are 110+ commits by me, with author name 'ShenMuyuan', 'Mu-YuanShen' or 'eicsmy'. '''The MR has been merged.'''
* Why the branch is named "cmake": one of my early tasks was to add CMake support for ns3-ai (to be compatible with ns-3.36+). During GSoC I worked on another branch named "improvements", which was eventually merged into the cmake branch.

{| class="wikitable"
|+ Merge Requests
|-
! No. !! Name !! Status
|-
| [1] || [https://github.com/hust-diangroup/ns3-ai/pull/97 merge to cmake branch] || Merged
|}

= Project Details =

'''Note: Each URL shown below, if it is for my source code, points to contents as of my last commit during the GSoC period.'''

== Community Bonding Period ==

During the community bonding period, I started bi-weekly meetings with my mentors and we decided on the project plan, which was
to prioritize the development of new interfaces, and then develop more examples and enhance the documentation.

The two new interfaces are the vector interface (later called the vector-based message interface, as it shares some fundamentals with
the struct-based message interface) and the Gym interface. We also discussed some details of new examples like LTE-handover and Multi-BSS.

I also read the ns3-ai code thoroughly to understand its IPC principles and learned some reinforcement learning basics.

== Phase 1 ==

=== std::vector support ===

Adding std::vector to shared memory is not easy with ns3-ai's original design, because Python's ctypes library does not support STL templates
(it only supports C structures and functions). In order to support vector, I refactored the original model completely, replacing ctypes with the Boost C++ library, which is more flexible for interprocess communication. My work includes:
* Utilized Boost's '''boost::interprocess::managed_shared_memory''' to store data (as well as synchronization variables) in shared memory. This shared segment can be used for '''data transmission between C++ and Python'''. The two directions, C++-to-Python and Python-to-C++, occupy two different regions in shared memory. It also supports a '''custom memory allocator for STL''', an instance of boost::interprocess::allocator, which ensures that when an STL container allocates new memory, that memory comes from the shared memory rather than from the ordinary heap.
** The shared memory creation can be found in the constructor of Ns3AiMsgInterfaceImpl: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-msg-interface.h#L60 code]
* Developed a '''spinlock-based semaphore''' to synchronize read and write operations in shared memory. The original synchronization method works, but the "version number" concept and the "control block" data structures may confuse and distract beginners. Also, the "version number" is just a complex implementation of the well-known semaphore. To improve ease of use and code readability, I created a semaphore, based on Boost's semaphore, that '''only spins and does not sleep while waiting'''. Its performance is comparable to the original, with '''better readability and usability'''.
** The semaphore operations and their implementation can be found in structure Ns3AiSemaphore: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-semaphore.h#L28 code]
** The usage of the semaphore in Ns3AiMsgInterfaceImpl: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-msg-interface.h#L194 code] and more usage below
* Built the '''vector-based''' interface with '''multiple configurable options'''. The vector interface parallels the struct interface in terms of creation and usage, and there is an attribute that users can set early in their code to '''choose one of the interfaces'''. If the vector interface is chosen, the C++-to-Python and Python-to-C++ vectors are created in shared memory and initially contain no elements. Users must call resize or push_back to adjust their length before use. Another attribute is whether the interface '''handles simulation end'''. If that attribute is set, the interface performs a simple protocol to notify the Python side when the C++ side simulation finishes. Other configurable attributes include the memory segment size and the names of objects constructed in shared memory.
** Note: the attributes are not part of ns-3 attribute system, because Ns3AiMsgInterface is a Singleton rather than Object.
** Attributes setting in Ns3AiMsgInterface: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-msg-interface.h#L306 code] and more setting below
** How the protocol works when the interface is destroyed: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-msg-interface.h#L126 code]
* Provided '''Python binding boilerplate code''' in examples. The Python side accesses the shared memory and the objects in it (vectors or structs) via C++ functions exposed to Python. The '''exposure of C++ class functions and members''' is achieved with '''Pybind11''', a lightweight Python binding library. The C++ binding code, linked with Pybind11, is compiled into a dynamically linked library that Python can import as a module. Because the C++ side interface is template-based and Python does not support templates natively, the Python binding module needs to be generated separately for every program (the creation is done by a CMake target dependency, so it is seamless). Although the binding contains many lines of C++ code and is difficult to write from scratch, users can '''modify an existing binding''' to generate Python binding modules quickly, and I provide plenty of boilerplate for that (the *_py.cc files in all examples).
** Some of the example boilerplate code: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/a-plus-b/use-msg-stru/apb_py.cc#L29 binding code for struct-based message interface in A-Plus-B example], [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/multi-bss/multi_bss_py.cc#L33 binding code for vector-based message interface in Multi-BSS example]
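The synchronization idea above can be illustrated with a minimal Python sketch. It is not the ns3-ai implementation: the real semaphore lives in Boost-managed shared memory and synchronizes two processes, whereas this sketch uses two threads in one process, plain lists as stand-ins for the two shared-memory regions, and hypothetical names (SpinSemaphore, run_exchange). It only shows how a polling semaphore per direction can order the writes and reads.

```python
import threading
import time


class SpinSemaphore:
    """Counting semaphore whose wait() polls in a loop instead of blocking.

    Hypothetical stand-in for a spin-only semaphore; not ns3-ai's Ns3AiSemaphore.
    """

    def __init__(self, count=0):
        self._count = count
        self._guard = threading.Lock()  # protects the counter only

    def post(self):
        with self._guard:
            self._count += 1

    def wait(self):
        while True:
            with self._guard:
                if self._count > 0:
                    self._count -= 1
                    return
            # Give the peer thread a chance to run, then poll again.
            time.sleep(0)


def run_exchange(rounds):
    cpp2py = []  # stand-in for the C++-to-Python region
    py2cpp = []  # stand-in for the Python-to-C++ region
    cpp2py_ready = SpinSemaphore(0)
    py2cpp_ready = SpinSemaphore(0)
    results = []

    def simulator():  # plays the role of the C++ (ns-3) side
        for i in range(rounds):
            cpp2py[:] = [i, i + 1]      # the vector starts empty and is filled here
            cpp2py_ready.post()         # signal: data for Python is ready
            py2cpp_ready.wait()         # poll until Python has answered
            results.append(py2cpp[0])

    def agent():  # plays the role of the Python side
        for _ in range(rounds):
            cpp2py_ready.wait()
            py2cpp[:] = [sum(cpp2py)]
            py2cpp_ready.post()

    threads = [threading.Thread(target=simulator), threading.Thread(target=agent)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run_exchange(3))
```

Each direction has its own region and its own semaphore, so a side never reads a region while the peer is still writing it.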

=== Gymnasium API ===

The [https://gymnasium.farama.org/index.html Gymnasium API] for ns3-ai is designed to be based on shared memory rather than socket communication, which can provide faster data exchange than [https://github.com/tkn-tub/ns3-gym ns3-gym] does. While much of the [https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface Gym interface] code is from ns3-gym's repository, I made some substantial changes in order for it to have a shared memory backend. My work includes:
* Modified OpenGymInterface to '''use Ns3AiMsgInterface for IPC'''. OpenGymInterface was created by the ns3-gym developers and provides code to create Gym-compatible environments in ns-3. It contains functions to get the state and action spaces, observe the environment in ns-3 and execute the actions (possibly changing parameters in the simulation). Those functions use callbacks registered by OpenGymEnv at runtime. For the callbacks to work, a custom environment must inherit from OpenGymEnv and implement class methods such as GetActionSpace, GetObservationSpace, GetObservation and ExecuteActions. All states and actions are serialized with Google's Protocol Buffers, then transmitted and de-serialized by the peer. My change was to replace the ZeroMQ socket's send and receive functions with Ns3AiMsgInterface's send and receive functions, and to ensure that Ns3AiMsgInterface is properly initialized. The underlying message interface for transmitting serialized messages is struct-based. The struct contains a buffer (uint8_t array) and its capacity.
** Example of my changed part: [https://github.com/tkn-tub/ns3-gym/blob/6007f4b3811af0cffcacf9a6151e5b9d2f4ef3ae/model/opengym_interface.cc#L190 before (in ns3-gym's repo)], [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/cpp/ns3-ai-gym-interface.cc#L104 after (in my ns3-ai repo)]
** Initialization of Ns3AiMsgInterface: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/cpp/ns3-ai-gym-interface.cc#L56 code]
** Note: in the above configuration, handling finish is set to false because the protocol of notifying Python side that C++ side has finished is unnecessary for Gym. Gym interface has its own protocol for handling finish, which is [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/cpp/ns3-ai-gym-env.cc#L78 NotifySimulationEnd on C++ side] and then [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L270 'done' becoming true] when [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L294 Python steps].
* Created a '''Python binding''' for accessing the shared structure containing the '''serialized message string'''. Binding a structure that contains an array is similar to binding an ordinary structure, except that the array is specially treated to expose its contents as a Python '''memoryview'''. With memoryview, the Python side can read and write the array seamlessly, much as one would in C++ with std::array.
** Obtaining the memoryview in binding: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/msg_py_binding.cc#L33 code]
** Note: arrays of different lengths need different memoryview objects on the Python side. In the above code, get_buffer returns the part of the buffer that is actually used (for reading), while get_buffer_full returns the buffer at its full length (for writing). Example usage in Ns3Env (the Python side Gym environment created with gym.make): [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L115 array read] and [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L130 array write]
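The difference between the two buffer views can be sketched in plain Python. This is an illustration only: the class MsgBuffer, its size field and the helper functions are hypothetical, and a bytearray stands in for the uint8_t array that the real binding exposes from shared memory. Only the names get_buffer and get_buffer_full come from the binding described above.

```python
class MsgBuffer:
    """Hypothetical stand-in for the shared struct: fixed-capacity bytes plus used size."""

    def __init__(self, capacity):
        self._data = bytearray(capacity)  # plays the role of the uint8_t array
        self.size = 0                     # number of valid bytes (assumed field)

    def get_buffer(self):
        # View of only the bytes currently in use: suitable for reading a message.
        return memoryview(self._data)[: self.size]

    def get_buffer_full(self):
        # View of the whole array: suitable for writing a new message.
        return memoryview(self._data)


def write_message(msg, payload):
    """Copy a serialized payload into the buffer and record its length."""
    full = msg.get_buffer_full()
    if len(payload) > len(full):
        raise ValueError("payload exceeds buffer capacity")
    full[: len(payload)] = payload  # writes through to the underlying array, no copy of the array
    msg.size = len(payload)


def read_message(msg):
    """Return the valid part of the buffer as an immutable bytes object."""
    return bytes(msg.get_buffer())


if __name__ == "__main__":
    msg = MsgBuffer(16)
    write_message(msg, b"hello")
    print(read_message(msg), len(msg.get_buffer()), len(msg.get_buffer_full()))
```

A view sized to the used length lets a deserializer consume exactly the message bytes, while the full-length view leaves room to write a message of any size up to the capacity.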

== Phase 2 ==

=== Examples and documentation update ===

To demonstrate the usage of the message interface and the Gym interface, all existing examples were updated to use the new interfaces. Also, a new example, "Multi-BSS", was created to benchmark the performance of the vector interface. All of them can be built with the "./ns3 build" command using the updated CMake files, without needing to copy the examples to the scratch folder. The updated examples and the interfaces they support are listed below:
* A-Plus-B example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/a-plus-b directory]): In this example, C++ side starts by setting 2 random numbers between 0 and 10 in shared memory. Then, Python side gets the numbers and sets the sum of the numbers in shared memory (in another region). Finally, C++ gets the sum that Python set. The procedure is analogous to C++ passing RL states to Python and Python passing RL actions back to C++, and is repeated many times. Supported interfaces:
** Struct-based message interface
** Vector-based message interface
** Gym interface
* LTE-CQI example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/lte-cqi directory]): CQI prediction example. The original work was based on the 5G NR branch of ns-3, and previous developers adapted it to also run on the LTE codebase in ns-3 mainline. Supported interfaces:
** Struct-based message interface
* Multi-BSS example ('''new example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/multi-bss directory]): The example is based on and modified from [https://gitlab.com/juanvleonr/ns-3-dev/-/tree/clean-tgax?ref_type=heads juanvleonr's clean-tgax branch]. The C++ side simulates a VR gaming scenario in which 4 BSSs operate in separate apartments in a 2 by 2 grid. Each BSS contains 1 AP and 4 STAs. One of the STAs in the first BSS is a VR device generating bursty UL traffic, while the other devices have normal UL traffic. Supported interfaces:
** Struct-based message interface (available at [https://github.com/ShenMuyuan/ns3-ai/tree/dd8dd3a489f8faf8a380841b73c250d23c1a3710/examples/multi-bss the benchmarking branch])
** Vector-based message interface
* Rate-Control example ('''updated example''') (including constant rate & Thompson Sampling) ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/rate-control directory]): There are existing models of constant rate and Thompson sampling algorithms in Wi-Fi module. Here they are implemented in Python to show how to develop a new rate control algorithm for the Wi-Fi module using ns3-ai. Supported interfaces:
** Struct-based message interface
* RL-TCP example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/rl-tcp directory]): This example applies Q-learning algorithms (Q-learning and deep Q-learning) to TCP congestion control so that it adapts to real-time changes in the network transmission environment. By letting the learning agent manage the sliding window and threshold size, the network can achieve better throughput and smaller delay. Supported interfaces:
** Struct-based message interface
** Gym interface
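The data flow of the A-Plus-B example in the list above can be sketched as follows. This is a single-process illustration under stated assumptions, not the example's code: a bytearray stands in for the shared segment, the offsets and the uint32 layout are invented for the sketch, and all function names are hypothetical. Synchronization between the two sides is left out.

```python
import random
import struct

# Assumed layout for this sketch: two uint32 operands, then one uint32 result.
PAIR = struct.Struct("<II")    # C++-to-Python region
SUM = struct.Struct("<I")      # Python-to-C++ region
CPP2PY_OFFSET = 0
PY2CPP_OFFSET = PAIR.size


def cpp_write_operands(segment, a, b):
    """Role of the C++ side: place two numbers in its outgoing region."""
    PAIR.pack_into(segment, CPP2PY_OFFSET, a, b)


def py_compute_sum(segment):
    """Role of the Python side: read the operands, write the sum to the other region."""
    a, b = PAIR.unpack_from(segment, CPP2PY_OFFSET)
    SUM.pack_into(segment, PY2CPP_OFFSET, a + b)


def cpp_read_sum(segment):
    """Role of the C++ side: read back the result that Python wrote."""
    return SUM.unpack_from(segment, PY2CPP_OFFSET)[0]


def run(rounds, seed=0):
    rng = random.Random(seed)
    segment = bytearray(PAIR.size + SUM.size)  # stand-in for the shared segment
    history = []
    for _ in range(rounds):
        a, b = rng.randint(0, 10), rng.randint(0, 10)
        cpp_write_operands(segment, a, b)
        py_compute_sum(segment)
        history.append((a, b, cpp_read_sum(segment)))
    return history


if __name__ == "__main__":
    for a, b, total in run(3):
        print(a, "+", b, "=", total)
```

The operands play the role of RL states and the sum plays the role of an RL action, which is why the same loop shape carries over to the other examples.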

Documents are updated along with the examples. Apart from all the README.md in example directories, I added [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/docs/install.md instruction for installation], [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/README.md message interface tutorial] and [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/README.md Gym interface tutorial] as separate documents linked to the [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/README.md updated root README.md].

=== Pure C++ example ===

In developing a pure C++-based ML framework example, I tried to rewrite the LTE-CQI example (originally using TensorFlow as the Python-based ML framework) to use the [https://www.tensorflow.org/install/lang_c TensorFlow C API], and the RL-TCP example (originally using PyTorch as the Python-based ML framework) to use the [https://pytorch.org/cppdocs/ PyTorch C++ API]. Unfortunately, only the latter succeeded. The pure C++ version of LTE-CQI failed because TensorFlow's C API has limited support for gradients and neural networks. So, for TensorFlow C I only provide [https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/lte-cqi/pure-cpp an example that checks libtensorflow version]. Although I succeeded in converting the Python code to C++ in the [https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/rl-tcp/pure-cpp RL-TCP example], the process was slow and difficult due to the lack of official documents and examples. For instance, the C++ API does not provide the useful load_state_dict function for copying policy net parameters to the target net. It took me a while to find the equivalent C++ approach (torch::save and torch::load, with the module defined using the TORCH_MODULE macro).

I also wrote a guide on how to use C++-based ML frameworks in ns-3 (by installing in ns3-ai): [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/docs/using-pure-cpp.md here]

=== Benchmarking ===

I benchmarked three items:

# '''Gym interface vs. ns3-gym''' in terms of '''transmission time''': This benchmark is based on the RL-TCP example. It measures the CPU cycle count during C++-to-Python and Python-to-C++ data transmissions and compares the mean and standard deviation of the cycle counts. Results show that in both directions, the transmission time of ns3-ai's Gym interface is '''more than 15 times shorter''' than that of ns3-gym ('''shorter is better''').
# '''Vector-based vs. struct-based''' message interface in terms of '''transmission time''': The benchmark is based on the Multi-BSS example, on the benchmark_vector branch. Unfortunately, in terms of action transmission time (from the beginning of the C++ write to the completion of the Python read), the vector-based interface is '''1.2 times longer''' than the struct-based interface ('''shorter is better'''). The extra time is caused by Python's slow reading of vectors. Measurements show that when reading rxPower (received power at nodes in the first BSS) on the Python side, the vector interface spent 20% to 50% more time than the struct interface.
#* To deal with the slow vector access on Python side in the future, '''one possible solution is to integrate Eigen''' on C++ side and use existing Eigen-Python bindings like pybind11's Eigen support or eigenpy to convert linear algebra types into numpy or scipy types.
# '''Pure C++ vs. struct-based message interface''' in terms of '''processing time''': The benchmark is based on the pure C++ (libtorch) and message interface (PyTorch) versions of the RL-TCP example. We compare the processing time (i.e. transmission time + DRL algorithm time for the message interface, DRL algorithm time only for pure C++) of the two implementations, including the mean and the standard deviation. Results show that the processing time of the pure C++ implementation is '''less than half''' that of the message interface implementation ('''shorter is better''').
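The comparison method used in these benchmarks (collect per-transfer timings, then compare mean and standard deviation) can be sketched in Python. This is an illustration under assumptions: the benchmarks above count CPU cycles, while this sketch uses time.perf_counter_ns as a stand-in, and the function names and the dummy transfer are hypothetical.

```python
import statistics
import time


def time_transfers(transfer, payload, repeats=1000):
    """Time `transfer(payload)` repeatedly; return (mean_ns, stdev_ns).

    `repeats` must be at least 2 so that a sample standard deviation exists.
    """
    samples = []
    for _ in range(repeats):
        start = time.perf_counter_ns()
        transfer(payload)
        samples.append(time.perf_counter_ns() - start)
    return statistics.mean(samples), statistics.stdev(samples)


def speedup(baseline_mean, candidate_mean):
    """How many times shorter the candidate is than the baseline."""
    return baseline_mean / candidate_mean


if __name__ == "__main__":
    # Dummy "transfer": copying a 1 KiB payload, used only to exercise the harness.
    mean_ns, stdev_ns = time_transfers(bytes, bytearray(1024), repeats=200)
    print(f"mean = {mean_ns:.0f} ns, stdev = {stdev_ns:.0f} ns")
```

Reporting the standard deviation alongside the mean shows whether a difference between two interfaces is larger than the run-to-run jitter.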

See [https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/docs/benchmarking ns3-ai benchmarking documentation] for more detailed information.

Overall, the Gym interface is much faster than ns3-gym, and the pure C++ interface is more efficient than message interface. The vector interface needs to be enhanced in the future, especially in the optimization of Python side access.

= Build and Run the Code =

A detailed guide on how to set up the ns3-ai module is [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/docs/install.md here]. You must install ns-3 before installing ns3-ai. To test ns3-ai, you can build and run the provided examples (listed in the 'Phase 2' section above) according to their documentation.

= Proposal vs. Actual Work =

A few things were mentioned in the proposal but were not completed in my actual work:

# The LTE-handover example, which was intended to be an example using the vector interface, similar to Multi-BSS. Work on this example has not started because of limited time.
# Support for std::string in shared memory. Development of this support was postponed because the vector interface had the highest priority. Also, it was not considered a 'must do' in the project.
# Pure C++ ML example using the TensorFlow C API. This failed because the C API's support for gradients and neural networks is inadequate, as mentioned above in the 'Pure C++ example' section.

= Future Works =

# Add more examples, such as LTE-handover, to ns3-ai to better demonstrate the tool.
# Optimize the vector-based message interface so that it reaches its full potential in transmitting vectors or matrices of data.

= Acknowledgments =

I extend my heartfelt gratitude to my mentors Hao and Collin for their invaluable suggestions and comments that have guided me through the challenges during the GSoC 2023. Collaborating with the ns-3 community has been an enriching experience, expanding not only my technical knowledge but also fostering my skills in communication and oral presentation. At the same time, I am also very grateful to my teachers Prof. Yayu Gao and Prof. Xiaojun Hei at HUST, who have provided me with a lot of encouragement when I encountered difficulties. Additionally, I would like to express my appreciation to Google for offering this remarkable opportunity.

GSOC2023ns3-aiFinalReport

2023-09-14T03:09:59Z

Muyuan:

{{TOC}}

Back to [[GSOC2023ns3-ai]] (page containing my weekly updates, not the final report)

Slides: [https://drive.google.com/file/d/12ewBfqhkkjuOG2cI0PvVywaP1C9MRlFZ/view?usp=share_link final report slides] (for reference)

= Project Overview =

* '''Project Name:''' ns3-ai enhancements
* '''Student:''' Muyuan Shen
* '''Mentors:''' Collin Brady and Hao Yin

== Project Goals ==

The main focus of this project is to '''optimize performance''' and '''improve usability''' of the '''ns3-ai module''', which facilitates the connection between ns-3 and Python-based ML frameworks using shared memory.

To accomplish this goal, the project will '''introduce additional APIs that support data structures such as vector''' in shared memory IPC. This will effectively reduce the required interaction between C++ and Python, resulting in improved performance.
Also, the project will '''integrate Gymnasium API''' like ns3-gym's but has a shared-memory-based backend, to turn ns-3 into a environment that agents can efficiently and seamlessly interact with.
In addition, the project will '''enhance the existing examples, documentation and tutorials''', while also integrating new examples that cover scenarios like Multi-BSS in VR. By doing so, users will have more comprehensive resources at their disposal.
Furthermore, the project aims to '''provide examples utilizing pure C++-based ML frameworks'''. This will offer researchers more options for integrating with ML.

The overall aim of the project is to expand and accelerate the capabilities of the ns3-ai module, enabling users to simulate and analyze network related algorithms with enhanced efficiency and flexibility.

= Merge Requests and Commits =

Throughout the project, my development is based [https://github.com/ShenMuyuan/ns3-ai/tree/cmake cmake branch] branch of [https://github.com/hust-diangroup/ns3-ai ns3-ai].
I created a single MR that contain all my works to be merged into the [https://github.com/hust-diangroup/ns3-ai/tree/cmake upstream cmake branch]. In this MR, there are 110+ commits by me, with author name 'ShenMuyuan' or 'Mu-YuanShen' or 'eicsmy'. '''The MR has been merged.'''
* Why the branch is named "cmake": because one of my early tasks was to add Cmake support for ns3-ai (to be compatible with ns-3.36+). During GSoC I have worked on another branch named "improvements", and it was eventually merged into cmake branch.

{| class="wikitable"
|+ Merge Requests
|-
! No. !! Name !! Status
|-
| [1] || [https://github.com/hust-diangroup/ns3-ai/pull/97 merge to cmake branch] || Merged
|}

= Project Details =

'''Note: Each URL showed below, if it is for my source code, points to contents as of my last commit during GSoC period.'''

== Community Bonding Period ==

During community bonding period, I started bi-weekly meetings with my mentors and we decided on the project plan, which is
prioritizing the development of new interfaces, than develop more examples & enhance documentations.

There are two new interfaces, including vector interface (later, we called it vector-based message interface, as it shared some fundamentals with
the struct-based message interface) and Gym interface. Also, we talked about some details of new examples like LTE-handover and Multi-BSS.

I also read the ns3-ai code thoroughly to understand its IPC principles and learned some reinforcement learning basics.

== Phase 1 ==

=== std::vector support ===

To add std::vector into shared memory is not easy with ns3-ai's original design, because Python's ctypes library does not provide STL templates
support (it can only support C structures and functions). In order to support vector, I refactored the original model completely, replacing ctypes with Boost C++ library which is more flexible for interprocess communication. My works include:
* Utilized Boost's '''boost::interprocess::managed_shared_memory''' to store data (as well as synchronization variables) in shared memory. This shared segment can be used for '''data transmission between C++ and Python'''. The two directions, C++-to-Python and Python-to-C++, occupies two different regions in shared memory. It also supports '''custom memory allocator for STL''', a instance of boost::interprocess::allocator, which ensures that when STL allocates new memory, that memory is come from the shared memory rather than other heap memory.
** The shared memory creation can be found in the constructor of Ns3AiMsgInterfaceImpl: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-msg-interface.h#L60 code]
* Developed '''spinlock-based semaphore''' to synchronize reads & writes operations in shared memory. The original synchronization method works, but the "version number" concept and the "control block" data structures may cause confusion and distraction for beginners. Also, the "version number" is just a complex implementation of the well-known semaphore. To improve ease of use and enhance code readability, I created a semaphore that '''only spins but does not sleep while waiting''' based on Boost's semaphore. It has performance comparable to the original with '''better readability and usability'''.
** The semaphore operations and their implementation can be found in structure Ns3AiSemaphore: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-semaphore.h#L28 code]
** The usage of the semaphore in Ns3AiMsgInterfaceImpl: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-msg-interface.h#L194 code] and more usage below
* Built the '''vector-based''' interface with '''multiple configurable options'''. The vector interface is in parallel with the struct interface in terms of creation and usage, and there is an attribute that users can set in early code in order to '''choose one of the interfaces'''. If the vector interface is chosen, the C++-to-Python and Python-to-C++ vectors are created in shared memory and will contain no elements. It requires users to call resize or push_back to adjust their length before use. Another attribute is whether the interface '''handles simulation end'''. If that attribute is set, the interface will perform a simple protocol to notify Python side when C++ side simulation finishes. Other configurable attributes include memory segment size and names of objects constructed in shared memory.
** Note: the attributes are not part of ns-3 attribute system, because Ns3AiMsgInterface is a Singleton rather than Object.
** Attributes setting in Ns3AiMsgInterface: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-msg-interface.h#L306 code] and more setting below
** How the protocol works when the interface is destroyed: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-msg-interface.h#L126 code]
* Provided '''Python binding boilerplate code''' in examples. Python side accesses the shared memory and the objects in it (vectors or structs) via C++ functions exposed to Python. The '''exposure of C++ class functions and members''' is achieved with '''Pybind11''', a lightweight python binding library. The C++ binding code, linked with Pybind11, is compiled into dynamically-linked library that Python can import as a module. Because the C++ side interface is template-based and Python does not support template natively, the Python binding module needs to be separately generated for every program (the creation is done by a cmake target dependency so it's seamless). Although the binding contains many lines of C++ code and is difficult to write from scratch, users can '''modify from an existing binding code''' to generate Python binding modules quickly, and I provide many boilerplate on that (the *_py.cc files in all examples).
** Some of the example boilerplate code: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/a-plus-b/use-msg-stru/apb_py.cc#L29 binding code for struct-based message interface in A-Plus-B example], [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/multi-bss/multi_bss_py.cc#L33 binding code for vector-based message interface in Multi-BSS example]

=== Gymnasium API ===

The [https://gymnasium.farama.org/index.html Gymnasium API] for ns3-ai is aimed to be based on shared memory rather than sockets communication, which can provide faster data exchange than [https://github.com/tkn-tub/ns3-gym ns3-gym] does. While many of the [https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface Gym interface] code is from ns3-gym's repository, I made some substantial changes in order for it to have a shared memory backend. My works include:
* Modified OpenGymInterface to '''use Ns3AiMsgInterface for IPC'''. OpenGymInterface is created by ns3-gym developers, providing code to create Gym-compatible environments in ns-3. It contains functions to get state or action spaces, observe the environment in ns-3 and execute the actions (maybe changing parameters in simulation). Those function use callbacks registered by OpenGymEnv at runtime. To make callbacks work well, custom environment must inherit from OpenGymEnv and implement the class methods such as GetActionSpace, GetObservationSpace, GetObservation and ExecuteActions. All states and actions are serialized by Google's Protocol Buffers and then transmitted and de-serialized by the peer. What I did is changing the ZeroMQ socket's send & receive functions to Ns3AiMsgInterface's send & receive functions, and ensuring that Ns3AiMsgInterface is properly initialized. The underlying message interface for transmitting serialized messages is struct-based. The struct contains a buffer (uint8_t array) and its capacity.
** Example of my changed part: [https://github.com/tkn-tub/ns3-gym/blob/6007f4b3811af0cffcacf9a6151e5b9d2f4ef3ae/model/opengym_interface.cc#L190 before (in ns3-gym's repo)], [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/cpp/ns3-ai-gym-interface.cc#L104 after (in my ns3-ai repo)]
** Initialization of Ns3AiMsgInterface: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/cpp/ns3-ai-gym-interface.cc#L56 code]
** Note: in the above configuration, finish handling is disabled because the message interface's protocol for notifying the Python side that the C++ side has finished is unnecessary for Gym. The Gym interface has its own finish protocol: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/cpp/ns3-ai-gym-env.cc#L78 NotifySimulationEnd on the C++ side], followed by [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L270 'done' becoming true] when [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L294 Python steps].
* Created a '''Python binding''' for accessing the shared structure that contains the '''serialized message string'''. Binding a structure that contains an array is similar to binding an ordinary structure, except that the array is exposed to Python as a '''memoryview'''. With a memoryview, the Python side can read from and write to the array directly, much as C++ code can with std::array.
** Obtaining the memoryview in binding: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/msg_py_binding.cc#L33 code]
** Note: arrays of different lengths need separate memoryview objects on the Python side. In the above code, get_buffer returns the part of the buffer that is actually in use (for reading), while get_buffer_full returns the buffer at its full length (for writing). Example usage in Ns3Env (the Python-side Gym environment created with gym.make): [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L115 array read] and [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L130 array write]
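
The read/write split between get_buffer and get_buffer_full can be illustrated with a minimal pure-Python sketch. This is not the actual binding: SharedMsgBuffer, write_message and read_message are hypothetical names, and a bytearray stands in for the uint8_t array that the real binding exposes from shared memory.

```python
class SharedMsgBuffer:
    """Hypothetical stand-in for the struct shared between C++ and Python."""

    def __init__(self, capacity: int) -> None:
        self._storage = bytearray(capacity)  # plays the role of the uint8_t array
        self.size = 0                        # number of bytes currently in use

    def get_buffer_full(self) -> memoryview:
        # Full-capacity view, used when writing a new message.
        return memoryview(self._storage)

    def get_buffer(self) -> memoryview:
        # View limited to the bytes in use, used when reading a message.
        return memoryview(self._storage)[: self.size]


def write_message(msg: SharedMsgBuffer, payload: bytes) -> None:
    full = msg.get_buffer_full()
    if len(payload) > len(full):
        raise ValueError("payload does not fit in the shared buffer")
    full[: len(payload)] = payload  # writes through to the underlying storage
    msg.size = len(payload)


def read_message(msg: SharedMsgBuffer) -> bytes:
    # Copy out only the bytes that belong to the current message.
    return bytes(msg.get_buffer())


msg = SharedMsgBuffer(capacity=16)
write_message(msg, b"serialized")
print(read_message(msg))  # b'serialized'
```

Because a memoryview does not copy, the write goes straight into the underlying storage; in the real interface that storage is the shared segment, so no extra copy is needed between the two processes.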

== Phase 2 ==

=== Examples and documentation update ===

To demonstrate the message interface and the Gym interface, all existing examples were updated to use the new interfaces. In addition, a new example, "Multi-BSS", was created to benchmark the performance of the vector interface. All of them build with the "./ns3 build" command using the updated CMake files, without copying the examples to the scratch folder. The examples and the interfaces they support are listed below:
* A-Plus-B example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/a-plus-b directory]): In this example, the C++ side starts by writing two random numbers between 0 and 10 to shared memory. The Python side then reads the numbers and writes their sum to another region of shared memory. Finally, the C++ side reads the sum that Python wrote. The procedure, which is repeated many times, is analogous to C++ passing RL states to Python and Python passing RL actions back to C++. Supported interfaces:
** Struct-based message interface
** Vector-based message interface
** Gym interface
* LTE-CQI example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/lte-cqi directory]): A CQI prediction example. The original work was based on the 5G NR branch of ns-3, and previous developers adapted it to also run on the LTE codebase in ns-3 mainline. Supported interfaces:
** Struct-based message interface
* Multi-BSS example ('''new example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/multi-bss directory]): The example is based on and modified from [https://gitlab.com/juanvleonr/ns-3-dev/-/tree/clean-tgax?ref_type=heads juanvleonr's clean-tgax branch]. The C++ side simulates a VR gaming scenario in which 4 BSSs operate in separate apartments in a 2 by 2 grid. Each BSS contains 1 AP and 4 STAs. One of the STAs in the first BSS is a VR device generating bursty UL traffic, while the other devices generate normal UL traffic. Supported interfaces:
** Struct-based message interface (available at [https://github.com/ShenMuyuan/ns3-ai/tree/dd8dd3a489f8faf8a380841b73c250d23c1a3710/examples/multi-bss the benchmarking branch])
** Vector-based message interface
* Rate-Control example ('''updated example''') (including constant rate & Thompson Sampling) ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/rate-control directory]): The Wi-Fi module already includes models of the constant-rate and Thompson sampling algorithms. Here they are reimplemented in Python to show how to develop a new rate control algorithm for the Wi-Fi module using ns3-ai. Supported interfaces:
** Struct-based message interface
* RL-TCP example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/rl-tcp directory]): This example applies reinforcement learning (Q-learning and deep Q-learning) to TCP congestion control so that it adapts to real-time changes in the network environment. By using reinforcement learning to manage the sliding window and threshold size, the network can achieve better throughput and lower delay. Supported interfaces:
** Struct-based message interface
** Gym interface
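
The A-Plus-B exchange described above can be sketched in a single Python process. This is only an illustration under stated assumptions: an anonymous mmap stands in for the shared memory segment, the byte layout and the function names (cpp_write_inputs, py_compute_sum, cpp_read_sum, run_rounds) are hypothetical, and the semaphore synchronization that the real interface performs between the two processes is omitted.

```python
import mmap
import random
import struct

# Assumed layout: one region per direction, as in the description above.
CPP_TO_PY = struct.Struct("<II")  # region written by the "C++" side: a, b
PY_TO_CPP = struct.Struct("<I")   # region written by the "Python" side: sum
PY_TO_CPP_OFFSET = CPP_TO_PY.size


def cpp_write_inputs(shm, a, b):
    # "C++" side: put the two numbers into the C++-to-Python region.
    CPP_TO_PY.pack_into(shm, 0, a, b)


def py_compute_sum(shm):
    # "Python" side: read the numbers, write their sum to the other region.
    a, b = CPP_TO_PY.unpack_from(shm, 0)
    PY_TO_CPP.pack_into(shm, PY_TO_CPP_OFFSET, a + b)


def cpp_read_sum(shm):
    # "C++" side: read back the result that the "Python" side wrote.
    (total,) = PY_TO_CPP.unpack_from(shm, PY_TO_CPP_OFFSET)
    return total


def run_rounds(rounds, seed=0):
    rng = random.Random(seed)
    results = []
    with mmap.mmap(-1, CPP_TO_PY.size + PY_TO_CPP.size) as shm:
        for _ in range(rounds):
            a, b = rng.randint(0, 10), rng.randint(0, 10)
            cpp_write_inputs(shm, a, b)
            py_compute_sum(shm)
            results.append((a, b, cpp_read_sum(shm)))
    return results


for a, b, total in run_rounds(3):
    print(f"{a} + {b} = {total}")
```

In an RL setting, the pair (a, b) plays the role of the state and the sum plays the role of the action, which is why the same pattern carries over to the other examples.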

Documentation was updated along with the examples. In addition to the README.md files in the example directories, I added [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/docs/install.md installation instructions], a [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/README.md message interface tutorial] and a [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/README.md Gym interface tutorial] as separate documents linked from the [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/README.md updated root README.md].

=== Pure C++ example ===

In developing a pure C++ ML framework example, I tried to rewrite the LTE-CQI example (originally using TensorFlow as its Python-based ML framework) to use the [https://www.tensorflow.org/install/lang_c TensorFlow C API], and the RL-TCP example (originally using PyTorch as its Python-based ML framework) to use the [https://pytorch.org/cppdocs/ PyTorch C++ API]. Unfortunately, only the latter succeeded. The pure C++ version of LTE-CQI failed because TensorFlow's C API has limited support for gradients and neural networks. So, for the TensorFlow C API I only provide [https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/lte-cqi/pure-cpp an example that checks libtensorflow version]. Although I succeeded in converting the Python code to C++ in the [https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/rl-tcp/pure-cpp RL-TCP example], the process was slow and difficult because of the lack of official documentation and examples. For instance, the C++ API doesn't provide the convenient load_state_dict function for copying the policy net's parameters to the target net. It took me a while to find an equivalent in C++ (torch::save and torch::load, with the module defined using the TORCH_MODULE macro).

I also wrote a guide on how to use C++-based ML frameworks in ns-3 (by installing them in ns3-ai): [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/docs/using-pure-cpp.md here]

=== Benchmarking ===

I benchmarked three items:

# '''Gym interface vs. ns3-gym''' in terms of '''transmission time''': This benchmark is based on the RL-TCP example. It measures the CPU cycle count during C++-to-Python and Python-to-C++ data transmissions and compares the mean and standard deviation of the cycle counts. Results show that in both directions, the transmission time of ns3-ai's Gym interface is '''more than 15 times shorter''' than that of ns3-gym ('''shorter is better''').
# '''Vector-based vs. struct-based''' message interface in terms of '''transmission time''': This benchmark is based on the Multi-BSS example, on the benchmark_vector branch. Unfortunately, in terms of action transmission time (from the start of the write in C++ to the completion of the read in Python), the vector-based interface is '''1.2 times longer''' than the struct-based interface ('''shorter is better'''). The extra time is caused by Python's slow reading of vectors. Measurements show that when reading rxPower (the received power at nodes in the first BSS) on the Python side, the vector interface spent 20% to 50% more time than the struct interface.
#* To deal with the slow vector access on the Python side in the future, '''one possible solution is to integrate Eigen''' on the C++ side and use existing Eigen-Python bindings, such as pybind11's Eigen support or eigenpy, to convert linear algebra types into numpy or scipy types.
# '''Pure C++ vs. struct-based message interface''' in terms of '''processing time''': This benchmark is based on the pure C++ (libtorch) and message interface (PyTorch) versions of the RL-TCP example. It compares the mean and standard deviation of the processing time for the two versions, where processing time means transmission time plus DRL algorithm time for the message interface, and DRL algorithm time alone for pure C++. Results show that the processing time of the pure C++ implementation is '''less than half''' that of the message interface implementation ('''shorter is better''').
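
The comparisons above all reduce to collecting per-transmission samples and comparing their mean and standard deviation. A minimal sketch of that bookkeeping follows. It is not the benchmarking code used in the project: the function names are hypothetical, and it times with Python's wall-clock nanosecond counter as an assumption, whereas the benchmarks above count CPU cycles.

```python
import statistics
import time


def time_call_ns(func, repeats):
    """Collect one elapsed-time sample (in ns) per call of func."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter_ns()
        func()
        samples.append(time.perf_counter_ns() - start)
    return samples


def summarize(samples):
    """Return (mean, sample standard deviation) of the timing samples."""
    return statistics.mean(samples), statistics.stdev(samples)


def speedup(baseline_samples, candidate_samples):
    """How many times shorter the candidate's mean is than the baseline's."""
    return statistics.mean(baseline_samples) / statistics.mean(candidate_samples)


# Usage with a placeholder workload standing in for one data transmission.
samples = time_call_ns(lambda: sum(range(1000)), repeats=5)
mean_ns, std_ns = summarize(samples)
print(f"mean = {mean_ns:.0f} ns, std = {std_ns:.0f} ns")
```

Reporting the standard deviation alongside the mean matters here because IPC timings can vary noticeably between iterations, so a ratio of means alone could hide overlapping distributions.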

See [https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/docs/benchmarking ns3-ai benchmarking documentation] for more detailed information.

Overall, the Gym interface is much faster than ns3-gym, and the pure C++ implementation is more efficient than the message interface. The vector interface needs further work, especially in optimizing Python-side access.

= Build and Run the Code =

A detailed guide on how to set up the ns3-ai module is [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/docs/install.md here]. You must install ns-3 before installing ns3-ai. To test ns3-ai, build and run the provided examples (listed in the 'Phase 2' section above) according to their documentation.

= Proposal vs. Actual Work =

A few items mentioned in the proposal were not completed in my actual work:

# The LTE-handover example, which was intended to be an example using the vector interface, similar to Multi-BSS. Work on this example was not started because of limited time.
# Support for std::string in shared memory. Development of this support was postponed because the vector interface had the highest priority. Also, it was not considered a 'must do' in the project.
# Pure C++ ML example using the TensorFlow C API. This failed because the C API provides inadequate support for gradients and neural networks, as mentioned above in the 'Pure C++ example' section.

= Future Works =

# Add more examples, such as LTE-handover, to ns3-ai for better demonstration of the tool.
# Optimize the vector-based message interface so that it reaches its full potential in transmitting vectors or matrices of data.

= Acknowledgments =

I extend my heartfelt gratitude to my mentors Hao and Collin for their invaluable suggestions and comments, which guided me through the challenges of GSoC 2023. Collaborating with the ns-3 community has been an enriching experience, not only expanding my technical knowledge but also fostering my skills in communication and oral presentation. I am also very grateful to my teachers Prof. Yayu Gao and Prof. Xiaojun Hei at HUST, who gave me a lot of encouragement when I encountered difficulties. Additionally, I would like to express my appreciation to Google for offering this remarkable opportunity.

GSOC2023ns3-aiFinalReport

2023-09-14T03:09:20Z

Muyuan: update merge status

{{TOC}}

Back to [[GSOC2023ns3-ai]] (page containing my weekly updates, not the final report)

Slides: [https://drive.google.com/file/d/12ewBfqhkkjuOG2cI0PvVywaP1C9MRlFZ/view?usp=share_link final report slides] (for reference)

= Project Overview =

* '''Project Name:''' ns3-ai enhancements
* '''Student:''' Muyuan Shen
* '''Mentors:''' Collin Brady and Hao Yin

== Project Goals ==

The main focus of this project is to '''optimize performance''' and '''improve usability''' of the '''ns3-ai module''', which facilitates the connection between ns-3 and Python-based ML frameworks using shared memory.

To accomplish this goal, the project will '''introduce additional APIs that support data structures such as vector''' in shared memory IPC. This will effectively reduce the required interaction between C++ and Python, resulting in improved performance.
Also, the project will '''integrate Gymnasium API''' like ns3-gym's but has a shared-memory-based backend, to turn ns-3 into a environment that agents can efficiently and seamlessly interact with.
In addition, the project will '''enhance the existing examples, documentation and tutorials''', while also integrating new examples that cover scenarios like Multi-BSS in VR. By doing so, users will have more comprehensive resources at their disposal.
Furthermore, the project aims to '''provide examples utilizing pure C++-based ML frameworks'''. This will offer researchers more options for integrating with ML.

The overall aim of the project is to expand and accelerate the capabilities of the ns3-ai module, enabling users to simulate and analyze network related algorithms with enhanced efficiency and flexibility.

= Merge Requests and Commits =

Throughout the project, my development is based [https://github.com/ShenMuyuan/ns3-ai/tree/cmake cmake branch] branch of [https://github.com/hust-diangroup/ns3-ai ns3-ai].
I created a single MR that contain all my works to be merged into the [https://github.com/hust-diangroup/ns3-ai/tree/cmake upstream cmake branch]. In this MR, there are 110+ commits by me, with author name 'ShenMuyuan' or 'Mu-YuanShen' or 'eicsmy'. '''The MR is merged.'''
* Why the branch is named "cmake": because one of my early tasks was to add Cmake support for ns3-ai (to be compatible with ns-3.36+). During GSoC I have worked on another branch named "improvements", and it was eventually merged into cmake branch.

{| class="wikitable"
|+ Merge Requests
|-
! No. !! Name !! Status
|-
| [1] || [https://github.com/hust-diangroup/ns3-ai/pull/97 merge to cmake branch] || Merged
|}

= Project Details =

'''Note: Each URL showed below, if it is for my source code, points to contents as of my last commit during GSoC period.'''

== Community Bonding Period ==

During community bonding period, I started bi-weekly meetings with my mentors and we decided on the project plan, which is
prioritizing the development of new interfaces, than develop more examples & enhance documentations.

There are two new interfaces, including vector interface (later, we called it vector-based message interface, as it shared some fundamentals with
the struct-based message interface) and Gym interface. Also, we talked about some details of new examples like LTE-handover and Multi-BSS.

I also read the ns3-ai code thoroughly to understand its IPC principles and learned some reinforcement learning basics.

== Phase 1 ==

=== std::vector support ===

To add std::vector into shared memory is not easy with ns3-ai's original design, because Python's ctypes library does not provide STL templates
support (it can only support C structures and functions). In order to support vector, I refactored the original model completely, replacing ctypes with Boost C++ library which is more flexible for interprocess communication. My works include:
* Utilized Boost's '''boost::interprocess::managed_shared_memory''' to store data (as well as synchronization variables) in shared memory. This shared segment can be used for '''data transmission between C++ and Python'''. The two directions, C++-to-Python and Python-to-C++, occupies two different regions in shared memory. It also supports '''custom memory allocator for STL''', a instance of boost::interprocess::allocator, which ensures that when STL allocates new memory, that memory is come from the shared memory rather than other heap memory.
** The shared memory creation can be found in the constructor of Ns3AiMsgInterfaceImpl: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-msg-interface.h#L60 code]
* Developed '''spinlock-based semaphore''' to synchronize reads & writes operations in shared memory. The original synchronization method works, but the "version number" concept and the "control block" data structures may cause confusion and distraction for beginners. Also, the "version number" is just a complex implementation of the well-known semaphore. To improve ease of use and enhance code readability, I created a semaphore that '''only spins but does not sleep while waiting''' based on Boost's semaphore. It has performance comparable to the original with '''better readability and usability'''.
** The semaphore operations and their implementation can be found in structure Ns3AiSemaphore: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-semaphore.h#L28 code]
** The usage of the semaphore in Ns3AiMsgInterfaceImpl: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-msg-interface.h#L194 code] and more usage below
* Built the '''vector-based''' interface with '''multiple configurable options'''. The vector interface is in parallel with the struct interface in terms of creation and usage, and there is an attribute that users can set in early code in order to '''choose one of the interfaces'''. If the vector interface is chosen, the C++-to-Python and Python-to-C++ vectors are created in shared memory and will contain no elements. It requires users to call resize or push_back to adjust their length before use. Another attribute is whether the interface '''handles simulation end'''. If that attribute is set, the interface will perform a simple protocol to notify Python side when C++ side simulation finishes. Other configurable attributes include memory segment size and names of objects constructed in shared memory.
** Note: the attributes are not part of ns-3 attribute system, because Ns3AiMsgInterface is a Singleton rather than Object.
** Attributes setting in Ns3AiMsgInterface: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-msg-interface.h#L306 code] and more setting below
** How the protocol works when the interface is destroyed: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-msg-interface.h#L126 code]
* Provided '''Python binding boilerplate code''' in examples. Python side accesses the shared memory and the objects in it (vectors or structs) via C++ functions exposed to Python. The '''exposure of C++ class functions and members''' is achieved with '''Pybind11''', a lightweight python binding library. The C++ binding code, linked with Pybind11, is compiled into dynamically-linked library that Python can import as a module. Because the C++ side interface is template-based and Python does not support template natively, the Python binding module needs to be separately generated for every program (the creation is done by a cmake target dependency so it's seamless). Although the binding contains many lines of C++ code and is difficult to write from scratch, users can '''modify from an existing binding code''' to generate Python binding modules quickly, and I provide many boilerplate on that (the *_py.cc files in all examples).
** Some of the example boilerplate code: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/a-plus-b/use-msg-stru/apb_py.cc#L29 binding code for struct-based message interface in A-Plus-B example], [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/multi-bss/multi_bss_py.cc#L33 binding code for vector-based message interface in Multi-BSS example]

=== Gymnasium API ===

The [https://gymnasium.farama.org/index.html Gymnasium API] for ns3-ai is aimed to be based on shared memory rather than sockets communication, which can provide faster data exchange than [https://github.com/tkn-tub/ns3-gym ns3-gym] does. While many of the [https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface Gym interface] code is from ns3-gym's repository, I made some substantial changes in order for it to have a shared memory backend. My works include:
* Modified OpenGymInterface to '''use Ns3AiMsgInterface for IPC'''. OpenGymInterface is created by ns3-gym developers, providing code to create Gym-compatible environments in ns-3. It contains functions to get state or action spaces, observe the environment in ns-3 and execute the actions (maybe changing parameters in simulation). Those function use callbacks registered by OpenGymEnv at runtime. To make callbacks work well, custom environment must inherit from OpenGymEnv and implement the class methods such as GetActionSpace, GetObservationSpace, GetObservation and ExecuteActions. All states and actions are serialized by Google's Protocol Buffers and then transmitted and de-serialized by the peer. What I did is changing the ZeroMQ socket's send & receive functions to Ns3AiMsgInterface's send & receive functions, and ensuring that Ns3AiMsgInterface is properly initialized. The underlying message interface for transmitting serialized messages is struct-based. The struct contains a buffer (uint8_t array) and its capacity.
** Example of my changed part: [https://github.com/tkn-tub/ns3-gym/blob/6007f4b3811af0cffcacf9a6151e5b9d2f4ef3ae/model/opengym_interface.cc#L190 before (in ns3-gym's repo)], [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/cpp/ns3-ai-gym-interface.cc#L104 after (in my ns3-ai repo)]
** Initialization of Ns3AiMsgInterface: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/cpp/ns3-ai-gym-interface.cc#L56 code]
** Note: in the above configuration, handling finish is set to false because the protocol of notifying Python side that C++ side has finished is unnecessary for Gym. Gym interface has its own protocol for handling finish, which is [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/cpp/ns3-ai-gym-env.cc#L78 NotifySimulationEnd on C++ side] and then [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L270 'done' becoming true] when [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L294 Python steps].
* Created '''Python binding''' for accessing the shared structure containing '''serialized message string'''. Binding that structure containing array is similar to binding a common structure, except that the array is specially treated to convert its contents to Python's '''memoryview'''. With memoryview, Python side can read and write to the array seamlessly, like what you can do in C++ with std::array.
** Obtaining the memoryview in binding: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/msg_py_binding.cc#L33 code]
** Note: different length of array must have different memoryview object for Python to deal with. In the above code, get_buffer returns the buffer that is actually used (for reading), while get_buffer_full returns the buffer that has the full length (for writing). Example usage in Ns3Env (the Python side Gym environment created with gym.make): [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L115 array read] and [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L130 array write]

== Phase 2 ==

=== Examples and documentation update ===

To demonstrate the usage of the message interface and Gym interface, all existing examples are updated to use the new interfaces. Also, a new example "Multi-BSS" is created to benchmark the performance of vector interface. All of them can be successfully built using the "./ns3 build" command with the updated Cmake files, without needing to copy the examples to scratch folder. Updated examples and the interfaces supported by them are listed below:
* A-Plus-B example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/a-plus-b directory]): In this example, C++ side starts by setting 2 random numbers between 0 and 10 in shared memory. Then, Python side gets the numbers and sets the sum of the numbers in shared memory (in another region). Finally, C++ gets the sum that Python set. The procedure is analogous to C++ passing RL states to Python and Python passing RL actions back to C++, and is repeated many times. Supported interfaces:
** Struct-based message interface
** Vector-based message interface
** Gym interface
* LTE-CQI example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/lte-cqi directory]): CQI prediction example. The original work is done based on 5G NR branch in ns-3, and previous developers have made some changes to make it also run in LTE codebase in ns-3 mainline. Supported interfaces:
** Struct-based message interface
* Multi-BSS example ('''new example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/multi-bss directory]): The example is based on and modified from [https://gitlab.com/juanvleonr/ns-3-dev/-/tree/clean-tgax?ref_type=heads juanvleonr's clean-tgax branch]. The C++ side simulates a VR gaming scenario showed below, in which 4 BSSs operate in separate apartments in a 2 by 2 grid. Each BSS contains 1 AP and 4 STAs. One of the STA in the first BSS is a VR device generating burst UL traffic, while other devices have normal UL traffic. Supported interfaces:
** Struct-based message interface (available at [https://github.com/ShenMuyuan/ns3-ai/tree/dd8dd3a489f8faf8a380841b73c250d23c1a3710/examples/multi-bss the benchmarking branch])
** Vector-based message interface
* Rate-Control example ('''updated example''') (including constant rate & Thompson Sampling) ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/rate-control directory]): There are existing models of constant rate and Thompson sampling algorithms in Wi-Fi module. Here they are implemented in Python to show how to develop a new rate control algorithm for the Wi-Fi module using ns3-ai. Supported interfaces:
** Struct-based message interface
* RL-TCP example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/rl-tcp directory]): This example applies Q-learning algorithms (Q-learning and deep Q-learning) to TCP congestion control for real-time changes in the environment of network transmission. By strengthening the learning management sliding window and threshold size, the network can get better throughput and smaller delay. Supported interfaces:
** Struct-based message interface
** Gym interface

Documents are updated along with the examples. Apart from all the README.md in example directories, I added [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/docs/install.md instruction for installation], [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/README.md message interface tutorial] and [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/README.md Gym interface tutorial] as separate documents linked to the [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/README.md updated root README.md].

=== Pure C++ example ===

In the development of a pure C++-based ML framework example, I tried to rewrite the LTE-CQI example (originally using tensorflow as Python-based ML framework) to utilize [https://www.tensorflow.org/install/lang_c TensorFlow C API], and the RL-TCP example (originally using torch as Python-based ML framework) to employ [https://pytorch.org/cppdocs/ PyTorch C++ API]. Unfortunately, only the latter succeed. The pure C++ version of LTE-CQI failed because there was limited support for gradients and neural networks in TensorFlow's C API. So, for TensorFlow C I only provide [https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/lte-cqi/pure-cpp an example that checks libtensorflow version]. Although I succeeded in converting Python code to C++ in [https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/rl-tcp/pure-cpp RL-TCP example], the process was slow and difficult due to the lack of official documents and examples. For instance, C++ API doesn't provide the useful load_state_dict function for copying policy net parameters to target net. It took me a while to find out the equivalent C++ function to do that (torch::save and torch::load, and the module must be defined with TORCH_MODULE macro).

I also wrote a guide on how to use C++-based ML frameworks in ns-3 (by installing in ns3-ai): [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/docs/using-pure-cpp.md here]

=== Benchmarking ===

I benchmarked three items:

# '''Gym interface vs ns3-gym''' in terms of '''transmission time''': This benchmark is based on the RL-TCP example, measuring the CPU cycle count during C++ to Python and Python to C++ data transmissions, and compare the mean and standard deviation of cycles. Results show that in both directions, the transmission time of ns3-ai's Gym interface is '''more than 15 times shorter''' than that of ns3-gym ('''shorter is better''').
# '''Vector-based vs. struct-based''' message interface in terms of '''transmission time''': The benchmark is based on Multi-BSS example, on benchmark_vector branch. Unfortunately, in terms of action transmission time (from C++'s beginning of write to Python's complete read), the vector-based is '''1.2 times longer''' than the struct-based ('''shorter is better'''). The extra time is caused by Python's slow reading of vectors. Measurements show that in reading rxPower (received power in nodes in first BSS) at Python side, vector interface spent 20% to 50% more time than struct interface.
#* To deal with the slow vector access on Python side in the future, '''one possible solution is to integrate Eigen''' on C++ side and use existing Eigen-Python bindings like pybind11's Eigen support or eigenpy to convert linear algebra types into numpy or scipy types.
# '''Pure C++ vs. struct-based message interface''' in terms of '''processing time''': The benchmark is based on the pure C++ (libtorch) and message interface (PyTorch) version of RL-TCP example. We compare the processing time (i.e. transmission time + DRL algorithm time for message interface, DRL algorithm time for pure C++) for the two interfaces, including the mean and the standard deviation. Results show that the processing time of pure C++ implementation is '''more than twice shorter''' than that of message interface implementation ('''shorter is better''').

See [https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/docs/benchmarking ns3-ai benchmarking documentation] for more detailed information.

Overall, the Gym interface is much faster than ns3-gym, and the pure C++ interface is more efficient than the message interface. The vector interface needs to be enhanced in the future, especially by optimizing Python-side access.
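The comparisons above rest on the mean and standard deviation of per-transmission CPU cycle counts. The following is a minimal sketch of that summary step only, not the benchmarking code used in the project: the names CycleStats, Summarize and MeanRatio are hypothetical, and the use of the population standard deviation is an assumption, since the report does not say which estimator was used.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical summary type; not part of ns3-ai.
struct CycleStats
{
    double mean;   // average cycle count per transmission
    double stddev; // population standard deviation (assumed estimator)
};

// Compute the mean and population standard deviation of cycle counts.
CycleStats Summarize(const std::vector<uint64_t>& cycles)
{
    if (cycles.empty())
    {
        return {0.0, 0.0};
    }
    double sum = 0.0;
    for (uint64_t c : cycles)
    {
        sum += static_cast<double>(c);
    }
    const double mean = sum / static_cast<double>(cycles.size());
    double squaredDev = 0.0;
    for (uint64_t c : cycles)
    {
        const double d = static_cast<double>(c) - mean;
        squaredDev += d * d;
    }
    return {mean, std::sqrt(squaredDev / static_cast<double>(cycles.size()))};
}

// Ratio of mean transmission times: how many times longer the baseline
// takes than the candidate (a value above 1 means the candidate is faster).
double MeanRatio(const CycleStats& baseline, const CycleStats& candidate)
{
    assert(candidate.mean > 0.0);
    return baseline.mean / candidate.mean;
}
```

Passing the cycle counts recorded for each interface to Summarize and the two results to MeanRatio gives a "times shorter" figure of the kind quoted in the list above.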

= Build and Run the Code =

A detailed guide on how to set up the ns3-ai module is [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/docs/install.md here]. You must install ns-3 before installing ns3-ai. To test ns3-ai, you can build and run the provided examples (listed in the 'Phase 2' section above) according to their documentation.

= Proposal vs. Actual Work =

A few things were mentioned in the proposal but were not completed in my actual work:

# The LTE-handover example, which was intended to be an example using the vector interface, similar to Multi-BSS. This example was not started because of limited time.
# Support for std::string in shared memory. Development of this support was postponed because the vector interface had the highest priority. Also, it was not considered a 'must do' in the project.
# Pure C++ ML example using the TensorFlow C API. This failed because the C API provides inadequate support for gradients and neural networks, as mentioned above in the 'Pure C++ example' section.

= Future Works =

# Add more examples, such as LTE-handover, to ns3-ai for better demonstration of the tool.
# Optimize the vector-based message interface so that it reaches its full potential in transmitting vectors or matrices of data.

= Acknowledgments =

I extend my heartfelt gratitude to my mentors Hao and Collin for their invaluable suggestions and comments, which guided me through the challenges of GSoC 2023. Collaborating with the ns-3 community has been an enriching experience, not only expanding my technical knowledge but also fostering my skills in communication and oral presentation. I am also very grateful to my teachers Prof. Yayu Gao and Prof. Xiaojun Hei at HUST, who gave me a lot of encouragement when I encountered difficulties. Additionally, I would like to express my appreciation to Google for offering this remarkable opportunity.

GSOC2023ns3-aiFinalReport

2023-09-12T15:36:07Z

Muyuan: update slides

{{TOC}}

Back to [[GSOC2023ns3-ai]] (page containing my weekly updates, not the final report)

Slides: [https://drive.google.com/file/d/12ewBfqhkkjuOG2cI0PvVywaP1C9MRlFZ/view?usp=share_link final report slides] (for reference)

= Project Overview =

* '''Project Name:''' ns3-ai enhancements
* '''Student:''' Muyuan Shen
* '''Mentors:''' Collin Brady and Hao Yin

== Project Goals ==

The main focus of this project is to '''optimize performance''' and '''improve usability''' of the '''ns3-ai module''', which facilitates the connection between ns-3 and Python-based ML frameworks using shared memory.

To accomplish this goal, the project will '''introduce additional APIs that support data structures such as vector''' in shared memory IPC. This will effectively reduce the required interaction between C++ and Python, resulting in improved performance.
Also, the project will '''integrate a Gymnasium API''' like ns3-gym's but with a shared-memory-based backend, to turn ns-3 into an environment that agents can efficiently and seamlessly interact with.
In addition, the project will '''enhance the existing examples, documentation and tutorials''', while also integrating new examples that cover scenarios like Multi-BSS in VR. By doing so, users will have more comprehensive resources at their disposal.
Furthermore, the project aims to '''provide examples utilizing pure C++-based ML frameworks'''. This will offer researchers more options for integrating with ML.

The overall aim of the project is to expand and accelerate the capabilities of the ns3-ai module, enabling users to simulate and analyze network-related algorithms with enhanced efficiency and flexibility.

= Merge Requests and Commits =

Throughout the project, my development was based on the [https://github.com/ShenMuyuan/ns3-ai/tree/cmake cmake branch] of [https://github.com/hust-diangroup/ns3-ai ns3-ai].
I created a single MR that contains all my work, to be merged into the [https://github.com/hust-diangroup/ns3-ai/tree/cmake upstream cmake branch]. In this MR, there are 110+ commits by me, with author name 'ShenMuyuan', 'Mu-YuanShen' or 'eicsmy'. '''The MR is being reviewed by my mentor (as of Sep. 11, 2023).'''
* Why the branch is named "cmake": one of my early tasks was to add CMake support for ns3-ai (to be compatible with ns-3.36+). During GSoC I worked on another branch named "improvements", which was eventually merged into the cmake branch.

{| class="wikitable"
|+ Merge Requests
|-
! No. !! Name !! Status
|-
| [1] || [https://github.com/hust-diangroup/ns3-ai/pull/97 merge to cmake branch] || Open (as of Sep. 11, 2023)
|}

= Project Details =

'''Note: Each URL shown below, if it is for my source code, points to the contents as of my last commit during the GSoC period.'''

== Community Bonding Period ==

During the community bonding period, I started bi-weekly meetings with my mentors and we decided on the project plan, which was
to prioritize the development of new interfaces, and then develop more examples and enhance the documentation.

There are two new interfaces: the vector interface (which we later called the vector-based message interface, as it shares some fundamentals with
the struct-based message interface) and the Gym interface. We also discussed some details of new examples such as LTE-handover and Multi-BSS.

I also read the ns3-ai code thoroughly to understand its IPC principles and learned some reinforcement learning basics.

== Phase 1 ==

=== std::vector support ===

Adding std::vector to shared memory is not easy with ns3-ai's original design, because Python's ctypes library does not support STL templates
(it only supports C structures and functions). To support vectors, I refactored the original model completely, replacing ctypes with the Boost C++ library, which is more flexible for interprocess communication. My work includes:
* Utilized Boost's '''boost::interprocess::managed_shared_memory''' to store data (as well as synchronization variables) in shared memory. This shared segment can be used for '''data transmission between C++ and Python'''. The two directions, C++-to-Python and Python-to-C++, occupy two different regions in shared memory. It also supports a '''custom memory allocator for STL''', an instance of boost::interprocess::allocator, which ensures that when an STL container allocates new memory, that memory comes from the shared memory rather than from the ordinary heap.
** The shared memory creation can be found in the constructor of Ns3AiMsgInterfaceImpl: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-msg-interface.h#L60 code]
* Developed a '''spinlock-based semaphore''' to synchronize read and write operations in shared memory. The original synchronization method works, but the "version number" concept and the "control block" data structures may confuse and distract beginners. Also, the "version number" is just a complex implementation of the well-known semaphore. To improve ease of use and code readability, I created a semaphore, based on Boost's semaphore, that '''only spins but does not sleep while waiting'''. It has performance comparable to the original with '''better readability and usability'''.
** The semaphore operations and their implementation can be found in structure Ns3AiSemaphore: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-semaphore.h#L28 code]
** The usage of the semaphore in Ns3AiMsgInterfaceImpl: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-msg-interface.h#L194 code] and more usage below
* Built the '''vector-based''' interface with '''multiple configurable options'''. The vector interface parallels the struct interface in terms of creation and usage, and there is an attribute that users can set early in their code to '''choose one of the interfaces'''. If the vector interface is chosen, the C++-to-Python and Python-to-C++ vectors are created in shared memory and initially contain no elements, so users must call resize or push_back to adjust their length before use. Another attribute is whether the interface '''handles simulation end'''. If that attribute is set, the interface performs a simple protocol to notify the Python side when the C++ side simulation finishes. Other configurable attributes include the memory segment size and the names of objects constructed in shared memory.
** Note: the attributes are not part of ns-3 attribute system, because Ns3AiMsgInterface is a Singleton rather than Object.
** Attributes setting in Ns3AiMsgInterface: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-msg-interface.h#L306 code] and more setting below
** How the protocol works when the interface is destroyed: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-msg-interface.h#L126 code]
* Provided '''Python binding boilerplate code''' in examples. The Python side accesses the shared memory and the objects in it (vectors or structs) via C++ functions exposed to Python. The '''exposure of C++ class functions and members''' is achieved with '''Pybind11''', a lightweight Python binding library. The C++ binding code, linked with Pybind11, is compiled into a dynamically linked library that Python can import as a module. Because the C++ side interface is template-based and Python does not support templates natively, the Python binding module needs to be generated separately for every program (this is done by a CMake target dependency, so it is seamless). Although the binding contains many lines of C++ code and is difficult to write from scratch, users can '''modify existing binding code''' to generate Python binding modules quickly, and I provide boilerplate for that (the *_py.cc files in all examples).
** Some of the example boilerplate code: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/a-plus-b/use-msg-stru/apb_py.cc#L29 binding code for struct-based message interface in A-Plus-B example], [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/multi-bss/multi_bss_py.cc#L33 binding code for vector-based message interface in Multi-BSS example]
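The idea of a semaphore that only spins while waiting can be sketched with standard C++ atomics. This is an illustration under stated assumptions, not the project's Ns3AiSemaphore: the class SpinSemaphore and the function PingPong are hypothetical names, the second thread merely stands in for the Python process, and the real implementation lives in shared memory and is based on Boost.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Hypothetical counting semaphore that busy-waits instead of sleeping.
class SpinSemaphore
{
  public:
    explicit SpinSemaphore(uint32_t initial = 0)
        : m_count(initial)
    {
    }

    // Release one unit; the release order publishes earlier writes.
    void Post()
    {
        m_count.fetch_add(1, std::memory_order_release);
    }

    // Acquire one unit, spinning until the count is positive.
    void Wait()
    {
        uint32_t current = m_count.load(std::memory_order_relaxed);
        while (true)
        {
            if (current == 0)
            {
                // Nothing available yet: reload and keep spinning.
                current = m_count.load(std::memory_order_relaxed);
                continue;
            }
            // On failure, compare_exchange_weak refreshes 'current'.
            if (m_count.compare_exchange_weak(current,
                                              current - 1,
                                              std::memory_order_acquire,
                                              std::memory_order_relaxed))
            {
                return;
            }
        }
    }

  private:
    std::atomic<uint32_t> m_count;
};

// One side writes a "state", the peer answers with an "action" (2 * state);
// two semaphores order the reads and writes. Returns the sum of all actions.
int PingPong(int rounds)
{
    assert(rounds >= 0);
    SpinSemaphore stateReady(0);
    SpinSemaphore actionReady(0);
    int state = 0;
    int action = 0;
    std::thread peer([&] {
        for (int i = 0; i < rounds; ++i)
        {
            stateReady.Wait();
            action = state * 2;
            actionReady.Post();
        }
    });
    int sum = 0;
    for (int i = 1; i <= rounds; ++i)
    {
        state = i;
        stateReady.Post();
        actionReady.Wait();
        sum += action;
    }
    peer.join();
    return sum;
}
```

Spinning avoids the latency of putting the waiter to sleep and waking it again, at the cost of keeping a CPU core busy while waiting, which suits the short, frequent exchanges between the simulation and the agent.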

=== Gymnasium API ===

The [https://gymnasium.farama.org/index.html Gymnasium API] for ns3-ai is designed to be based on shared memory rather than socket communication, which can provide faster data exchange than [https://github.com/tkn-tub/ns3-gym ns3-gym] does. While much of the [https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface Gym interface] code is from ns3-gym's repository, I made substantial changes to give it a shared memory backend. My work includes:
* Modified OpenGymInterface to '''use Ns3AiMsgInterface for IPC'''. OpenGymInterface was created by the ns3-gym developers and provides code to create Gym-compatible environments in ns-3. It contains functions to get the state or action spaces, observe the environment in ns-3 and execute the actions (possibly changing parameters in the simulation). Those functions use callbacks registered by OpenGymEnv at runtime. For the callbacks to work, a custom environment must inherit from OpenGymEnv and implement class methods such as GetActionSpace, GetObservationSpace, GetObservation and ExecuteActions. All states and actions are serialized with Google's Protocol Buffers, transmitted, and then de-serialized by the peer. I replaced the ZeroMQ socket's send and receive functions with Ns3AiMsgInterface's send and receive functions, and ensured that Ns3AiMsgInterface is properly initialized. The underlying message interface for transmitting serialized messages is struct-based. The struct contains a buffer (uint8_t array) and its capacity.
** Example of my changed part: [https://github.com/tkn-tub/ns3-gym/blob/6007f4b3811af0cffcacf9a6151e5b9d2f4ef3ae/model/opengym_interface.cc#L190 before (in ns3-gym's repo)], [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/cpp/ns3-ai-gym-interface.cc#L104 after (in my ns3-ai repo)]
** Initialization of Ns3AiMsgInterface: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/cpp/ns3-ai-gym-interface.cc#L56 code]
** Note: in the above configuration, handling finish is set to false because the protocol of notifying Python side that C++ side has finished is unnecessary for Gym. Gym interface has its own protocol for handling finish, which is [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/cpp/ns3-ai-gym-env.cc#L78 NotifySimulationEnd on C++ side] and then [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L270 'done' becoming true] when [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L294 Python steps].
* Created a '''Python binding''' for accessing the shared structure containing the '''serialized message string'''. Binding a structure that contains an array is similar to binding an ordinary structure, except that the array is specially treated to expose its contents as a Python '''memoryview'''. With memoryview, the Python side can read and write the array seamlessly, much as one can in C++ with std::array.
** Obtaining the memoryview in binding: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/msg_py_binding.cc#L33 code]
** Note: arrays of different lengths require different memoryview objects for Python to deal with. In the above code, get_buffer returns the buffer that is actually used (for reading), while get_buffer_full returns the buffer that has the full length (for writing). Example usage in Ns3Env (the Python side Gym environment created with gym.make): [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L115 array read] and [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L130 array write]
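The struct that carries serialized messages can be pictured as a fixed-capacity byte buffer plus a count of the bytes in use. The sketch below is an assumption-laden illustration, not the project's actual structure: SerializedMsg, kBufferCapacity, WriteMsg and ReadMsg are hypothetical names, the capacity value is arbitrary, and a plain std::string stands in for Protocol Buffers output.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string>

// Hypothetical capacity; the real value is chosen by the Gym interface.
constexpr std::size_t kBufferCapacity = 1024;

// Hypothetical fixed-layout message, suitable for a struct-based interface.
struct SerializedMsg
{
    uint8_t buffer[kBufferCapacity]; // full-length storage (written into)
    uint32_t size;                   // bytes actually in use (read back)
};

// Copy serialized bytes into the message, rejecting oversized payloads.
void WriteMsg(SerializedMsg& msg, const std::string& bytes)
{
    if (bytes.size() > kBufferCapacity)
    {
        throw std::length_error("message exceeds buffer capacity");
    }
    std::memcpy(msg.buffer, bytes.data(), bytes.size());
    msg.size = static_cast<uint32_t>(bytes.size());
}

// Return only the used part of the buffer, embedded zero bytes included.
std::string ReadMsg(const SerializedMsg& msg)
{
    assert(msg.size <= kBufferCapacity);
    return std::string(reinterpret_cast<const char*>(msg.buffer), msg.size);
}
```

Keeping the used length separate from the capacity mirrors the distinction drawn above between a view of the bytes actually used (for reading) and a view of the full-length buffer (for writing).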

== Phase 2 ==

=== Examples and documentation update ===

To demonstrate the usage of the message interface and the Gym interface, all existing examples were updated to use the new interfaces. Also, a new example, "Multi-BSS", was created to benchmark the performance of the vector interface. All of them can be built using the "./ns3 build" command with the updated CMake files, without needing to copy the examples to the scratch folder. The updated examples and the interfaces they support are listed below:
* A-Plus-B example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/a-plus-b directory]): In this example, C++ side starts by setting 2 random numbers between 0 and 10 in shared memory. Then, Python side gets the numbers and sets the sum of the numbers in shared memory (in another region). Finally, C++ gets the sum that Python set. The procedure is analogous to C++ passing RL states to Python and Python passing RL actions back to C++, and is repeated many times. Supported interfaces:
** Struct-based message interface
** Vector-based message interface
** Gym interface
* LTE-CQI example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/lte-cqi directory]): CQI prediction example. The original work is done based on 5G NR branch in ns-3, and previous developers have made some changes to make it also run in LTE codebase in ns-3 mainline. Supported interfaces:
** Struct-based message interface
* Multi-BSS example ('''new example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/multi-bss directory]): The example is based on and modified from [https://gitlab.com/juanvleonr/ns-3-dev/-/tree/clean-tgax?ref_type=heads juanvleonr's clean-tgax branch]. The C++ side simulates the VR gaming scenario shown below, in which 4 BSSs operate in separate apartments in a 2 by 2 grid. Each BSS contains 1 AP and 4 STAs. One of the STAs in the first BSS is a VR device generating burst UL traffic, while the other devices have normal UL traffic. Supported interfaces:
** Struct-based message interface (available at [https://github.com/ShenMuyuan/ns3-ai/tree/dd8dd3a489f8faf8a380841b73c250d23c1a3710/examples/multi-bss the benchmarking branch])
** Vector-based message interface
* Rate-Control example ('''updated example''') (including constant rate & Thompson Sampling) ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/rate-control directory]): There are existing models of constant rate and Thompson sampling algorithms in Wi-Fi module. Here they are implemented in Python to show how to develop a new rate control algorithm for the Wi-Fi module using ns3-ai. Supported interfaces:
** Struct-based message interface
* RL-TCP example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/rl-tcp directory]): This example applies Q-learning algorithms (Q-learning and deep Q-learning) to TCP congestion control, so that it adapts to real-time changes in the network transmission environment. By using reinforcement learning to manage the sliding window and threshold size, the network can achieve better throughput and smaller delay. Supported interfaces:
** Struct-based message interface
** Gym interface

The documentation was updated along with the examples. Apart from the README.md files in the example directories, I added [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/docs/install.md installation instructions], a [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/README.md message interface tutorial] and a [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/README.md Gym interface tutorial] as separate documents linked from the [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/README.md updated root README.md].

=== Pure C++ example ===

In developing a pure C++-based ML framework example, I tried to rewrite the LTE-CQI example (originally using TensorFlow as the Python-based ML framework) to use the [https://www.tensorflow.org/install/lang_c TensorFlow C API], and the RL-TCP example (originally using PyTorch as the Python-based ML framework) to use the [https://pytorch.org/cppdocs/ PyTorch C++ API]. Unfortunately, only the latter succeeded. The pure C++ version of LTE-CQI failed because TensorFlow's C API has limited support for gradients and neural networks. So, for the TensorFlow C API I only provide [https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/lte-cqi/pure-cpp an example that checks the libtensorflow version]. Although I succeeded in converting the Python code to C++ in the [https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/rl-tcp/pure-cpp RL-TCP example], the process was slow and difficult due to the lack of official documents and examples. For instance, the C++ API doesn't provide the useful load_state_dict function for copying policy net parameters to the target net. It took me a while to find the equivalent C++ functions (torch::save and torch::load, with the module defined using the TORCH_MODULE macro).

I also wrote a guide on how to use C++-based ML frameworks in ns-3 (by installing them in ns3-ai): [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/docs/using-pure-cpp.md here]

=== Benchmarking ===

I benchmarked three items:

# '''Gym interface vs. ns3-gym''' in terms of '''transmission time''': This benchmark is based on the RL-TCP example. It measures the CPU cycle count during C++-to-Python and Python-to-C++ data transmissions and compares the mean and standard deviation of the cycle counts. Results show that in both directions, the transmission time of ns3-ai's Gym interface is '''more than 15 times shorter''' than that of ns3-gym ('''shorter is better''').
# '''Vector-based vs. struct-based''' message interface in terms of '''transmission time''': This benchmark is based on the Multi-BSS example, on the benchmark_vector branch. Unfortunately, the action transmission time (from the beginning of the C++ write to the completion of the Python read) of the vector-based interface is '''1.2 times longer''' than that of the struct-based interface ('''shorter is better'''). The extra time is caused by Python's slow reading of vectors. Measurements show that when reading rxPower (the received power at nodes in the first BSS) on the Python side, the vector interface spent 20% to 50% more time than the struct interface.
#* To deal with the slow vector access on the Python side in the future, '''one possible solution is to integrate Eigen''' on the C++ side and use existing Eigen-Python bindings, such as pybind11's Eigen support or eigenpy, to convert linear algebra types into numpy or scipy types.
# '''Pure C++ vs. struct-based message interface''' in terms of '''processing time''': This benchmark is based on the pure C++ (libtorch) and message interface (PyTorch) versions of the RL-TCP example. We compare the mean and standard deviation of the processing time for the two interfaces (transmission time plus DRL algorithm time for the message interface, DRL algorithm time only for pure C++). Results show that the processing time of the pure C++ implementation is '''less than half''' that of the message interface implementation ('''shorter is better''').

See [https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/docs/benchmarking ns3-ai benchmarking documentation] for more detailed information.

Overall, the Gym interface is much faster than ns3-gym, and the pure C++ interface is more efficient than the message interface. The vector interface needs to be enhanced in the future, especially by optimizing Python-side access.

= Build and Run the Code =

A detailed guide on how to set up the ns3-ai module is [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/docs/install.md here]. You must install ns-3 before installing ns3-ai. To test ns3-ai, you can build and run the provided examples (listed in the 'Phase 2' section above) according to their documentation.

= Proposal vs. Actual Work =

A few things were mentioned in the proposal but were not completed in my actual work:

# The LTE-handover example, which was intended to be an example using the vector interface, similar to Multi-BSS. This example was not started because of limited time.
# Support for std::string in shared memory. Development of this support was postponed because the vector interface had the highest priority. Also, it was not considered a 'must do' in the project.
# Pure C++ ML example using the TensorFlow C API. This failed because the C API provides inadequate support for gradients and neural networks, as mentioned above in the 'Pure C++ example' section.

= Future Works =

# Add more examples, such as LTE-handover, to ns3-ai for better demonstration of the tool.
# Optimize the vector-based message interface so that it reaches its full potential in transmitting vectors or matrices of data.

= Acknowledgments =

I extend my heartfelt gratitude to my mentors Hao and Collin for their invaluable suggestions and comments, which guided me through the challenges of GSoC 2023. Collaborating with the ns-3 community has been an enriching experience, not only expanding my technical knowledge but also fostering my skills in communication and oral presentation. I am also very grateful to my teachers Prof. Yayu Gao and Prof. Xiaojun Hei at HUST, who gave me a lot of encouragement when I encountered difficulties. Additionally, I would like to express my appreciation to Google for offering this remarkable opportunity.

GSOC2023ns3-aiFinalReport

2023-09-12T09:04:00Z

Muyuan:

{{TOC}}

Back to [[GSOC2023ns3-ai]] (page containing my weekly updates, not the final report)

Slides: [https://drive.google.com/file/d/1s3kFlvSefTCQtve8AEq5yxGVTn75abTS/view?usp=share_link final report slides] (for reference)

= Project Overview =

* '''Project Name:''' ns3-ai enhancements
* '''Student:''' Muyuan Shen
* '''Mentors:''' Collin Brady and Hao Yin

== Project Goals ==

The main focus of this project is to '''optimize performance''' and '''improve usability''' of the '''ns3-ai module''', which facilitates the connection between ns-3 and Python-based ML frameworks using shared memory.

To accomplish this goal, the project will '''introduce additional APIs that support data structures such as vector''' in shared memory IPC. This will effectively reduce the required interaction between C++ and Python, resulting in improved performance.
Also, the project will '''integrate a Gymnasium API''' like ns3-gym's but with a shared-memory-based backend, to turn ns-3 into an environment that agents can efficiently and seamlessly interact with.
In addition, the project will '''enhance the existing examples, documentation and tutorials''', while also integrating new examples that cover scenarios like Multi-BSS in VR. By doing so, users will have more comprehensive resources at their disposal.
Furthermore, the project aims to '''provide examples utilizing pure C++-based ML frameworks'''. This will offer researchers more options for integrating with ML.

The overall aim of the project is to expand and accelerate the capabilities of the ns3-ai module, enabling users to simulate and analyze network-related algorithms with enhanced efficiency and flexibility.

= Merge Requests and Commits =

Throughout the project, my development was based on the [https://github.com/ShenMuyuan/ns3-ai/tree/cmake cmake branch] of [https://github.com/hust-diangroup/ns3-ai ns3-ai].
I created a single MR that contains all my work, to be merged into the [https://github.com/hust-diangroup/ns3-ai/tree/cmake upstream cmake branch]. In this MR, there are 110+ commits by me, with author name 'ShenMuyuan', 'Mu-YuanShen' or 'eicsmy'. '''The MR is being reviewed by my mentor (as of Sep. 11, 2023).'''
* Why the branch is named "cmake": one of my early tasks was to add CMake support for ns3-ai (to be compatible with ns-3.36+). During GSoC I worked on another branch named "improvements", which was eventually merged into the cmake branch.

{| class="wikitable"
|+ Merge Requests
|-
! No. !! Name !! Status
|-
| [1] || [https://github.com/hust-diangroup/ns3-ai/pull/97 merge to cmake branch] || Open (as of Sep. 11, 2023)
|}

= Project Details =

'''Note: Each URL shown below, if it is for my source code, points to the contents as of my last commit during the GSoC period.'''

== Community Bonding Period ==

During the community bonding period, I started bi-weekly meetings with my mentors and we decided on the project plan, which was
to prioritize the development of new interfaces, and then develop more examples and enhance the documentation.

There are two new interfaces: the vector interface (which we later called the vector-based message interface, as it shares some fundamentals with
the struct-based message interface) and the Gym interface. We also discussed some details of new examples such as LTE-handover and Multi-BSS.

I also read the ns3-ai code thoroughly to understand its IPC principles and learned some reinforcement learning basics.

== Phase 1 ==

=== std::vector support ===

Adding std::vector to shared memory is not easy with ns3-ai's original design, because Python's ctypes library does not support STL templates
(it only supports C structures and functions). To support vectors, I refactored the original model completely, replacing ctypes with the Boost C++ library, which is more flexible for interprocess communication. My work includes:
* Utilized Boost's '''boost::interprocess::managed_shared_memory''' to store data (as well as synchronization variables) in shared memory. This shared segment can be used for '''data transmission between C++ and Python'''. The two directions, C++-to-Python and Python-to-C++, occupy two different regions in shared memory. It also supports a '''custom memory allocator for STL''', an instance of boost::interprocess::allocator, which ensures that when an STL container allocates new memory, that memory comes from the shared memory rather than from the ordinary heap.
** The shared memory creation can be found in the constructor of Ns3AiMsgInterfaceImpl: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-msg-interface.h#L60 code]
* Developed a '''spinlock-based semaphore''' to synchronize read and write operations in shared memory. The original synchronization method works, but the "version number" concept and the "control block" data structures may confuse and distract beginners. Also, the "version number" is just a complex implementation of the well-known semaphore. To improve ease of use and code readability, I created a semaphore, based on Boost's semaphore, that '''only spins but does not sleep while waiting'''. It has performance comparable to the original with '''better readability and usability'''.
** The semaphore operations and their implementation can be found in structure Ns3AiSemaphore: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-semaphore.h#L28 code]
** The usage of the semaphore in Ns3AiMsgInterfaceImpl: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-msg-interface.h#L194 code] and more usage below
* Built the '''vector-based''' interface with '''multiple configurable options'''. The vector interface parallels the struct interface in terms of creation and usage, and there is an attribute that users can set early in their code to '''choose one of the interfaces'''. If the vector interface is chosen, the C++-to-Python and Python-to-C++ vectors are created in shared memory and initially contain no elements, so users must call resize or push_back to adjust their length before use. Another attribute is whether the interface '''handles simulation end'''. If that attribute is set, the interface performs a simple protocol to notify the Python side when the C++ side simulation finishes. Other configurable attributes include the memory segment size and the names of objects constructed in shared memory.
** Note: the attributes are not part of ns-3 attribute system, because Ns3AiMsgInterface is a Singleton rather than Object.
** Attributes setting in Ns3AiMsgInterface: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-msg-interface.h#L306 code] and more setting below
** How the protocol works when the interface is destroyed: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-msg-interface.h#L126 code]
* Provided '''Python binding boilerplate code''' in examples. The Python side accesses the shared memory and the objects in it (vectors or structs) via C++ functions exposed to Python. The '''exposure of C++ class functions and members''' is achieved with '''Pybind11''', a lightweight Python binding library. The C++ binding code, linked with Pybind11, is compiled into a dynamically linked library that Python can import as a module. Because the C++ side interface is template-based and Python does not support templates natively, the Python binding module needs to be generated separately for every program (this is done by a CMake target dependency, so it is seamless). Although the binding contains many lines of C++ code and is difficult to write from scratch, users can '''modify existing binding code''' to generate Python binding modules quickly, and I provide boilerplate for that (the *_py.cc files in all examples).
** Some of the example boilerplate code: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/a-plus-b/use-msg-stru/apb_py.cc#L29 binding code for struct-based message interface in A-Plus-B example], [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/multi-bss/multi_bss_py.cc#L33 binding code for vector-based message interface in Multi-BSS example]
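
The idea of two one-way regions plus an end-of-simulation flag in a single shared segment can be sketched with Python's standard library alone. This is only an illustration under stated assumptions: the byte layout, the offsets and the function name <code>run_demo</code> are hypothetical and are not ns3-ai's actual layout or API, and both "sides" run in one process for brevity.

```python
import struct
from multiprocessing import shared_memory

# Hypothetical layout (NOT ns3-ai's real layout):
#   byte 0       : "finished" flag set by the producer side
#   bytes 8..23  : producer-to-consumer region (two doubles)
#   bytes 24..31 : consumer-to-producer region (one double)
FLAG_OFFSET = 0
TO_CONSUMER_OFFSET = 8
TO_PRODUCER_OFFSET = 24
SEGMENT_SIZE = 32


def run_demo():
    # The producer creates the segment; the consumer attaches to it by name.
    producer = shared_memory.SharedMemory(create=True, size=SEGMENT_SIZE)
    try:
        consumer = shared_memory.SharedMemory(name=producer.name)
        try:
            # Producer writes two values into its outgoing region.
            struct.pack_into("dd", producer.buf, TO_CONSUMER_OFFSET, 3.0, 4.0)
            # Consumer reads them and answers in the other region.
            a, b = struct.unpack_from("dd", consumer.buf, TO_CONSUMER_OFFSET)
            struct.pack_into("d", consumer.buf, TO_PRODUCER_OFFSET, a + b)
            (total,) = struct.unpack_from("d", producer.buf, TO_PRODUCER_OFFSET)
            # Producer announces that the simulation is over; consumer sees it.
            producer.buf[FLAG_OFFSET] = 1
            finished = consumer.buf[FLAG_OFFSET] == 1
        finally:
            consumer.close()
    finally:
        producer.close()
        producer.unlink()
    return total, finished
```

In a real setup the two sides would be separate processes and every read or write would be guarded by synchronization, which this sketch leaves out.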

=== Gymnasium API ===

The [https://gymnasium.farama.org/index.html Gymnasium API] for ns3-ai is built on shared memory rather than socket communication, which can provide faster data exchange than [https://github.com/tkn-tub/ns3-gym ns3-gym] does. While much of the [https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface Gym interface] code comes from ns3-gym's repository, I made substantial changes to give it a shared memory backend. My work includes:
* Modified OpenGymInterface to '''use Ns3AiMsgInterface for IPC'''. OpenGymInterface was created by the ns3-gym developers and provides code to create Gym-compatible environments in ns-3. It contains functions to get the state and action spaces, observe the environment in ns-3 and execute actions (for example, changing parameters in the simulation). Those functions use callbacks registered by OpenGymEnv at runtime. For the callbacks to work, a custom environment must inherit from OpenGymEnv and implement class methods such as GetActionSpace, GetObservationSpace, GetObservation and ExecuteActions. All states and actions are serialized with Google's Protocol Buffers, transmitted, and then deserialized by the peer. I replaced the ZeroMQ socket's send and receive functions with those of Ns3AiMsgInterface and ensured that Ns3AiMsgInterface is properly initialized. The underlying message interface for transmitting serialized messages is struct-based. The struct contains a buffer (uint8_t array) and its capacity.
** Example of my changed part: [https://github.com/tkn-tub/ns3-gym/blob/6007f4b3811af0cffcacf9a6151e5b9d2f4ef3ae/model/opengym_interface.cc#L190 before (in ns3-gym's repo)], [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/cpp/ns3-ai-gym-interface.cc#L104 after (in my ns3-ai repo)]
** Initialization of Ns3AiMsgInterface: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/cpp/ns3-ai-gym-interface.cc#L56 code]
** Note: in the above configuration, handling finish is set to false because the protocol of notifying Python side that C++ side has finished is unnecessary for Gym. Gym interface has its own protocol for handling finish, which is [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/cpp/ns3-ai-gym-env.cc#L78 NotifySimulationEnd on C++ side] and then [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L270 'done' becoming true] when [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L294 Python steps].
* Created a '''Python binding''' for accessing the shared structure containing the '''serialized message string'''. Binding a structure that contains an array is similar to binding an ordinary structure, except that the array gets special treatment: its contents are exposed as a Python '''memoryview'''. With memoryview, the Python side can read and write the array seamlessly, much as C++ code can with std::array.
** Obtaining the memoryview in binding: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/msg_py_binding.cc#L33 code]
** Note: arrays of different lengths need different memoryview objects on the Python side. In the above code, get_buffer returns the part of the buffer that is actually used (for reading), while get_buffer_full returns the buffer at its full length (for writing). Example usage in Ns3Env (the Python side Gym environment created with gym.make): [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L115 array read] and [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L130 array write]
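
The difference between a "used" view and a "full" view of one fixed-capacity buffer can be illustrated in plain Python. The class <code>SharedMessage</code>, its 64-byte capacity and the helper functions are hypothetical stand-ins for the bound C++ struct; only the method names get_buffer and get_buffer_full are taken from the note above.

```python
class SharedMessage:
    """Hypothetical stand-in for the shared struct: fixed buffer plus used size."""

    CAPACITY = 64

    def __init__(self):
        self._storage = bytearray(self.CAPACITY)
        self.size = 0  # number of valid bytes currently in the buffer

    def get_buffer(self):
        # View of only the bytes in use: suited to reading a received message.
        return memoryview(self._storage)[: self.size]

    def get_buffer_full(self):
        # View of the whole capacity: suited to writing a new message.
        return memoryview(self._storage)


def write_message(msg, payload):
    if len(payload) > msg.CAPACITY:
        raise ValueError("payload exceeds buffer capacity")
    # Slice assignment copies the bytes into the underlying storage in place.
    msg.get_buffer_full()[: len(payload)] = payload
    msg.size = len(payload)


def read_message(msg):
    return bytes(msg.get_buffer())
```

Because both views share the same storage, a write through the full view is immediately visible through the used view once the size field is updated.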

== Phase 2 ==

=== Examples and documentation update ===

To demonstrate the usage of the message interface and Gym interface, all existing examples were updated to use the new interfaces. Also, a new example, "Multi-BSS", was created to benchmark the performance of the vector interface. All of them can be built using the "./ns3 build" command with the updated CMake files, without copying the examples to the scratch folder. The examples and the interfaces they support are listed below:
* A-Plus-B example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/a-plus-b directory]): In this example, C++ side starts by setting 2 random numbers between 0 and 10 in shared memory. Then, Python side gets the numbers and sets the sum of the numbers in shared memory (in another region). Finally, C++ gets the sum that Python set. The procedure is analogous to C++ passing RL states to Python and Python passing RL actions back to C++, and is repeated many times. Supported interfaces:
** Struct-based message interface
** Vector-based message interface
** Gym interface
* LTE-CQI example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/lte-cqi directory]): CQI prediction example. The original work is done based on 5G NR branch in ns-3, and previous developers have made some changes to make it also run in LTE codebase in ns-3 mainline. Supported interfaces:
** Struct-based message interface
* Multi-BSS example ('''new example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/multi-bss directory]): The example is based on and modified from [https://gitlab.com/juanvleonr/ns-3-dev/-/tree/clean-tgax?ref_type=heads juanvleonr's clean-tgax branch]. The C++ side simulates a VR gaming scenario shown below, in which 4 BSSs operate in separate apartments in a 2 by 2 grid. Each BSS contains 1 AP and 4 STAs. One of the STAs in the first BSS is a VR device generating burst UL traffic, while the other devices have normal UL traffic. Supported interfaces:
** Struct-based message interface (available at [https://github.com/ShenMuyuan/ns3-ai/tree/dd8dd3a489f8faf8a380841b73c250d23c1a3710/examples/multi-bss the benchmarking branch])
** Vector-based message interface
* Rate-Control example ('''updated example''') (including constant rate & Thompson Sampling) ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/rate-control directory]): There are existing models of constant rate and Thompson sampling algorithms in Wi-Fi module. Here they are implemented in Python to show how to develop a new rate control algorithm for the Wi-Fi module using ns3-ai. Supported interfaces:
** Struct-based message interface
* RL-TCP example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/rl-tcp directory]): This example applies Q-learning algorithms (Q-learning and deep Q-learning) to TCP congestion control so that it can adapt to real-time changes in the network environment. By using reinforcement learning to manage the sliding window and threshold size, the network can achieve better throughput and smaller delay. Supported interfaces:
** Struct-based message interface
** Gym interface
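
For readers unfamiliar with the Thompson sampling idea mentioned in the Rate-Control example, here is a generic Beta-Bernoulli sketch. It is a simplified textbook version, not the code from ns-3's Wi-Fi module or from the example directory; the function names, the rate table and the success probabilities used in any call are assumptions for illustration.

```python
import random


def thompson_select(successes, failures, rates_mbps, rng):
    """Pick the rate index with the highest sampled expected throughput."""
    best_index, best_value = 0, -1.0
    for i, rate in enumerate(rates_mbps):
        # Beta(1 + s, 1 + f) is the posterior of the success probability
        # under a uniform prior, given s successes and f failures.
        p = rng.betavariate(1 + successes[i], 1 + failures[i])
        if p * rate > best_value:
            best_index, best_value = i, p * rate
    return best_index


def simulate(true_success_prob, rates_mbps, rounds=2000, seed=1):
    """Run the selection loop against made-up per-rate success probabilities."""
    rng = random.Random(seed)
    n = len(rates_mbps)
    successes, failures, picks = [0] * n, [0] * n, [0] * n
    for _ in range(rounds):
        i = thompson_select(successes, failures, rates_mbps, rng)
        picks[i] += 1
        if rng.random() < true_success_prob[i]:
            successes[i] += 1
        else:
            failures[i] += 1
    return picks
```

Calling <code>simulate([0.95, 0.8, 0.1], [6, 24, 54])</code> returns how often each rate was chosen; with these made-up numbers the middle rate has the highest expected throughput, so it should end up selected most often.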

Documents are updated along with the examples. Apart from all the README.md in example directories, I added [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/docs/install.md instruction for installation], [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/README.md message interface tutorial] and [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/README.md Gym interface tutorial] as separate documents linked to the [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/README.md updated root README.md].

=== Pure C++ example ===

In developing a pure C++-based ML framework example, I tried to rewrite the LTE-CQI example (originally using TensorFlow as the Python-based ML framework) to use the [https://www.tensorflow.org/install/lang_c TensorFlow C API], and the RL-TCP example (originally using PyTorch as the Python-based ML framework) to use the [https://pytorch.org/cppdocs/ PyTorch C++ API]. Unfortunately, only the latter succeeded. The pure C++ version of LTE-CQI failed because TensorFlow's C API has limited support for gradients and neural networks. So, for the TensorFlow C API I only provide [https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/lte-cqi/pure-cpp an example that checks libtensorflow version]. Although I succeeded in converting the Python code to C++ in the [https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/rl-tcp/pure-cpp RL-TCP example], the process was slow and difficult due to the lack of official documentation and examples. For instance, the C++ API doesn't provide the useful load_state_dict function for copying policy net parameters to the target net. It took me a while to find the equivalent C++ approach (torch::save and torch::load, with the module defined using the TORCH_MODULE macro).

I also wrote a guide on how to use C++-based ML frameworks in ns-3 (by installing in ns3-ai): [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/docs/using-pure-cpp.md here]

=== Benchmarking ===

I benchmarked three items:

# '''Gym interface vs. ns3-gym''' in terms of '''transmission time''': This benchmark is based on the RL-TCP example. It measures the CPU cycle count during C++-to-Python and Python-to-C++ data transmissions and compares the mean and standard deviation of the cycle counts. Results show that in both directions, the transmission time of ns3-ai's Gym interface is '''more than 15 times shorter''' than that of ns3-gym ('''shorter is better''').
# '''Vector-based vs. struct-based''' message interface in terms of '''transmission time''': The benchmark is based on the Multi-BSS example, on the benchmark_vector branch. Unfortunately, in terms of action transmission time (from the beginning of the C++ write to the completion of the Python read), the vector-based interface is '''1.2 times longer''' than the struct-based one ('''shorter is better'''). The extra time is caused by Python's slow reading of vectors. Measurements show that when reading rxPower (received power at nodes in the first BSS) on the Python side, the vector interface spent 20% to 50% more time than the struct interface.
#* To deal with the slow vector access on Python side in the future, '''one possible solution is to integrate Eigen''' on C++ side and use existing Eigen-Python bindings like pybind11's Eigen support or eigenpy to convert linear algebra types into numpy or scipy types.
# '''Pure C++ vs. struct-based message interface''' in terms of '''processing time''': The benchmark is based on the pure C++ (libtorch) and message interface (PyTorch) versions of the RL-TCP example. We compare the processing time (i.e. transmission time + DRL algorithm time for the message interface, DRL algorithm time for pure C++) of the two implementations, including the mean and the standard deviation. Results show that the processing time of the pure C++ implementation is '''less than half''' that of the message interface implementation ('''shorter is better''').
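
The comparison method used in the three benchmarks (collect per-transfer samples, then compare mean and standard deviation) can be sketched as follows. This is an assumption-laden illustration: the actual benchmarks count CPU cycles, whereas this sketch uses <code>time.perf_counter_ns</code> as a stand-in, and all function names are hypothetical.

```python
import statistics
import time


def time_calls(func, repeats):
    """Return per-call durations in nanoseconds for `repeats` calls of func."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter_ns()
        func()
        samples.append(time.perf_counter_ns() - start)
    return samples


def summarize(samples):
    """Mean and sample standard deviation of a list of measurements."""
    return statistics.mean(samples), statistics.stdev(samples)


def speedup(baseline_samples, candidate_samples):
    """How many times shorter the candidate's mean is than the baseline's."""
    return statistics.mean(baseline_samples) / statistics.mean(candidate_samples)
```

A ratio such as "more than 15 times shorter" corresponds to <code>speedup(baseline, candidate) > 15</code> in this notation.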

See [https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/docs/benchmarking ns3-ai benchmarking documentation] for more detailed information.

Overall, the Gym interface is much faster than ns3-gym, and the pure C++ implementation is more efficient than the message interface. The vector interface needs further work, especially to speed up access on the Python side.

= Build and Run the Code =

A detailed guide on how to set up the ns3-ai module is [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/docs/install.md here]. You must install ns-3 before installing ns3-ai. To test ns3-ai, you can build and run the provided examples (listed in the 'Phase 2' section above) according to their documentation.

= Proposal vs. Actual Work =

A few things were mentioned in the proposal but were not completed in my actual work:

# The LTE-handover example, which was intended to be an example using the vector interface, similar to Multi-BSS. This example was not started because of limited time.
# Support for std::string in shared memory. Development of this support was postponed because the vector interface had the highest priority. Also, it was not considered a 'must do' in the project.
# Pure C++ ML example using the TensorFlow C API. This failed because the C API's support for gradients and neural networks is inadequate, as mentioned in the 'Pure C++ example' section above.

= Future Works =

# Add more examples, such as LTE-handover, to ns3-ai for better demonstration of the tool.
# Optimize the vector-based message interface to reach its full potential on transmitting vectors or matrices of data.

= Acknowledgments =

I extend my heartfelt gratitude to my mentors Hao and Collin for their invaluable suggestions and comments that have guided me through the challenges during the GSoC 2023. Collaborating with the ns-3 community has been an enriching experience, expanding not only my technical knowledge but also fostering my skills in communication and oral presentation. At the same time, I am also very grateful to my teachers Prof. Yayu Gao and Prof. Xiaojun Hei at HUST, who have provided me with a lot of encouragement when I encountered difficulties. Additionally, I would like to express my appreciation to Google for offering this remarkable opportunity.

GSOC2023ns3-aiFinalReport

2023-09-12T08:55:44Z

Muyuan: update slides v2

{{TOC}}

Back to [[GSOC2023ns3-ai]] (page containing my weekly updates, not the final report)

Slides: [https://drive.google.com/file/d/1s3kFlvSefTCQtve8AEq5yxGVTn75abTS/view?usp=share_link final report slides] (for reference)

= Project Overview =

* '''Project Name:''' ns3-ai enhancements
* '''Student:''' Muyuan Shen
* '''Mentors:''' Collin Brady and Hao Yin

== Project Goals ==

The main focus of this project is to '''optimize performance''' and '''improve usability''' of the '''ns3-ai module''', which facilitates the connection between ns-3 and Python-based ML frameworks using shared memory.

To accomplish this goal, the project will '''introduce additional APIs that support data structures such as vector''' in shared memory IPC. This will effectively reduce the required interaction between C++ and Python, resulting in improved performance.
Also, the project will '''integrate Gymnasium API''' like ns3-gym's but has a shared-memory-based backend, to turn ns-3 into a environment that agents can efficiently and seamlessly interact with.
In addition, the project will '''enhance the existing examples, documentation and tutorials''', while also integrating new examples that cover scenarios like Multi-BSS in VR. By doing so, users will have more comprehensive resources at their disposal.
Furthermore, the project aims to '''provide examples utilizing pure C++-based ML frameworks'''. This will offer researchers more options for integrating with ML.

The overall aim of the project is to expand and accelerate the capabilities of the ns3-ai module, enabling users to simulate and analyze network related algorithms with enhanced efficiency and flexibility.

= Merge Requests and Commits =

Throughout the project, my development was based on the [https://github.com/ShenMuyuan/ns3-ai/tree/cmake cmake branch] of [https://github.com/hust-diangroup/ns3-ai ns3-ai].
I created a single MR that contains all my work, to be merged into the [https://github.com/hust-diangroup/ns3-ai/tree/cmake upstream cmake branch]. In this MR, there are 110+ commits by me, with author name 'ShenMuyuan', 'Mu-YuanShen' or 'eicsmy'. '''The MR is being reviewed by my mentor (as of Sep. 11, 2023).'''
* Why the branch is named "cmake": one of my early tasks was to add CMake support for ns3-ai (to be compatible with ns-3.36+). During GSoC I worked on another branch named "improvements", which was eventually merged into the cmake branch.

{| class="wikitable"
|+ Merge Requests
|-
! No. !! Name !! Status
|-
| [1] || [https://github.com/hust-diangroup/ns3-ai/pull/97 merge to cmake branch] || Open (as of Sep. 11, 2023)
|}

= Project Details =

'''Note: Each URL shown below, if it is for my source code, points to the contents as of my last commit during the GSoC period.'''

== Community Bonding Period ==

During the community bonding period, I started bi-weekly meetings with my mentors and we decided on the project plan, which was to
prioritize the development of new interfaces, then develop more examples and enhance the documentation.

The two new interfaces are the vector interface (later called the vector-based message interface, as it shares some fundamentals with
the struct-based message interface) and the Gym interface. We also discussed some details of new examples such as LTE-handover and Multi-BSS.

I also read the ns3-ai code thoroughly to understand its IPC principles and learned some reinforcement learning basics.

== Phase 1 ==

=== std::vector support ===

Adding std::vector to shared memory is not easy with ns3-ai's original design, because Python's ctypes library does not support STL templates
(it only supports C structures and functions). To support vectors, I refactored the original model completely, replacing ctypes with the Boost C++ libraries, which are more flexible for interprocess communication. My work includes:
* Utilized Boost's '''boost::interprocess::managed_shared_memory''' to store data (as well as synchronization variables) in shared memory. This shared segment is used for '''data transmission between C++ and Python'''. The two directions, C++-to-Python and Python-to-C++, occupy two different regions in shared memory. It also supports a '''custom memory allocator for STL''', an instance of boost::interprocess::allocator, which ensures that when an STL container allocates new memory, that memory comes from the shared segment rather than from the process heap.
** The shared memory creation can be found in the constructor of Ns3AiMsgInterfaceImpl: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-msg-interface.h#L60 code]
* Developed a '''spinlock-based semaphore''' to synchronize read and write operations in shared memory. The original synchronization method works, but the "version number" concept and the "control block" data structures may confuse and distract beginners. Also, the "version number" is just a complex implementation of the well-known semaphore. To improve ease of use and code readability, I created a semaphore, based on Boost's semaphore, that '''only spins but does not sleep while waiting'''. It has performance comparable to the original with '''better readability and usability'''.
** The semaphore operations and their implementation can be found in structure Ns3AiSemaphore: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-semaphore.h#L28 code]
** The usage of the semaphore in Ns3AiMsgInterfaceImpl: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-msg-interface.h#L194 code] and more usage below
* Built the '''vector-based''' interface with '''multiple configurable options'''. The vector interface is in parallel with the struct interface in terms of creation and usage, and there is an attribute that users can set in early code in order to '''choose one of the interfaces'''. If the vector interface is chosen, the C++-to-Python and Python-to-C++ vectors are created in shared memory and will contain no elements. It requires users to call resize or push_back to adjust their length before use. Another attribute is whether the interface '''handles simulation end'''. If that attribute is set, the interface will perform a simple protocol to notify Python side when C++ side simulation finishes. Other configurable attributes include memory segment size and names of objects constructed in shared memory.
** Note: the attributes are not part of ns-3 attribute system, because Ns3AiMsgInterface is a Singleton rather than Object.
** Attributes setting in Ns3AiMsgInterface: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-msg-interface.h#L306 code] and more setting below
** How the protocol works when the interface is destroyed: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-msg-interface.h#L126 code]
* Provided '''Python binding boilerplate code''' in examples. The Python side accesses the shared memory and the objects in it (vectors or structs) via C++ functions exposed to Python. The '''exposure of C++ class functions and members''' is achieved with '''Pybind11''', a lightweight Python binding library. The C++ binding code, linked with Pybind11, is compiled into a dynamically linked library that Python can import as a module. Because the C++ side interface is template-based and Python has no native support for templates, the Python binding module must be generated separately for every program (this is done through a CMake target dependency, so it is seamless). Although the binding contains many lines of C++ code and is difficult to write from scratch, users can '''adapt existing binding code''' to generate Python binding modules quickly, and I provide boilerplate for this purpose (the *_py.cc files in all examples).
** Some of the example boilerplate code: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/a-plus-b/use-msg-stru/apb_py.cc#L29 binding code for struct-based message interface in A-Plus-B example], [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/multi-bss/multi_bss_py.cc#L33 binding code for vector-based message interface in Multi-BSS example]
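
The "spin but never sleep" semaphore described in the list above can be illustrated with a small thread-based sketch. It is not the Boost-based Ns3AiSemaphore: the class <code>SpinSemaphore</code> and the <code>ping_pong</code> helper are hypothetical, the real semaphore lives in shared memory between two processes, and Python's global interpreter lock makes busy-waiting far less efficient here than in C++.

```python
import threading


class SpinSemaphore:
    """Counting semaphore whose wait() busy-waits instead of sleeping."""

    def __init__(self, initial=0):
        self._count = initial
        # Protects _count only; it is never held while spinning.
        self._lock = threading.Lock()

    def post(self):
        with self._lock:
            self._count += 1

    def wait(self):
        while True:
            with self._lock:
                if self._count > 0:
                    self._count -= 1
                    return
            # Count was zero: loop and check again without sleeping.


def ping_pong(rounds):
    """Exchange `rounds` values between two threads in strict alternation."""
    data_ready, reply_ready = SpinSemaphore(0), SpinSemaphore(0)
    box, log = {}, []

    def consumer():
        for _ in range(rounds):
            data_ready.wait()          # wait for the producer's write
            box["reply"] = box["value"] * 2
            reply_ready.post()         # signal that the reply is in place

    worker = threading.Thread(target=consumer)
    worker.start()
    for i in range(rounds):
        box["value"] = i
        data_ready.post()
        reply_ready.wait()
        log.append(box["reply"])
    worker.join()
    return log
```

Two semaphores, one per direction, are enough to enforce the write-then-read ordering that the state and action exchange needs.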

=== Gymnasium API ===

The [https://gymnasium.farama.org/index.html Gymnasium API] for ns3-ai is built on shared memory rather than socket communication, which can provide faster data exchange than [https://github.com/tkn-tub/ns3-gym ns3-gym] does. While much of the [https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface Gym interface] code comes from ns3-gym's repository, I made substantial changes to give it a shared memory backend. My work includes:
* Modified OpenGymInterface to '''use Ns3AiMsgInterface for IPC'''. OpenGymInterface was created by the ns3-gym developers and provides code to create Gym-compatible environments in ns-3. It contains functions to get the state and action spaces, observe the environment in ns-3 and execute actions (for example, changing parameters in the simulation). Those functions use callbacks registered by OpenGymEnv at runtime. For the callbacks to work, a custom environment must inherit from OpenGymEnv and implement class methods such as GetActionSpace, GetObservationSpace, GetObservation and ExecuteActions. All states and actions are serialized with Google's Protocol Buffers, transmitted, and then deserialized by the peer. I replaced the ZeroMQ socket's send and receive functions with those of Ns3AiMsgInterface and ensured that Ns3AiMsgInterface is properly initialized. The underlying message interface for transmitting serialized messages is struct-based. The struct contains a buffer (uint8_t array) and its capacity.
** Example of my changed part: [https://github.com/tkn-tub/ns3-gym/blob/6007f4b3811af0cffcacf9a6151e5b9d2f4ef3ae/model/opengym_interface.cc#L190 before (in ns3-gym's repo)], [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/cpp/ns3-ai-gym-interface.cc#L104 after (in my ns3-ai repo)]
** Initialization of Ns3AiMsgInterface: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/cpp/ns3-ai-gym-interface.cc#L56 code]
** Note: in the above configuration, handling finish is set to false because the protocol of notifying Python side that C++ side has finished is unnecessary for Gym. Gym interface has its own protocol for handling finish, which is [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/cpp/ns3-ai-gym-env.cc#L78 NotifySimulationEnd on C++ side] and then [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L270 'done' becoming true] when [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L294 Python steps].
* Created '''Python binding''' for accessing the shared structure containing '''serialized message string'''. Binding that structure containing array is similar to binding a common structure, except that the array is specially treated to convert its contents to Python's '''memoryview'''. With memoryview, Python side can read and write to the array seamlessly, like what you can do in C++ with std::array.
** Obtaining the memoryview in binding: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/msg_py_binding.cc#L33 code]
** Note: arrays of different lengths need different memoryview objects on the Python side. In the above code, get_buffer returns the part of the buffer that is actually used (for reading), while get_buffer_full returns the buffer at its full length (for writing). Example usage in Ns3Env (the Python side Gym environment created with gym.make): [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L115 array read] and [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L130 array write]
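
How an agent loop ends when the simulation side reports completion can be sketched without any dependencies. The classes <code>FakeSimulation</code> and <code>SketchEnv</code> are hypothetical placeholders, not ns3-ai's Ns3Env; the sketch only mirrors the return shapes of the Gymnasium reset and step calls and assumes that the end-of-simulation notice maps to the terminated flag.

```python
class FakeSimulation:
    """Hypothetical stand-in for the ns-3 side: runs a fixed number of steps."""

    def __init__(self, total_steps):
        self._remaining = total_steps

    def exchange(self, action):
        # Returns (observation, reward, simulation_ended).
        self._remaining -= 1
        return float(action), 1.0, self._remaining <= 0


class SketchEnv:
    """Mirrors Gymnasium's reset/step return shapes without importing it."""

    def __init__(self, sim):
        self._sim = sim

    def reset(self):
        return 0.0, {}  # (observation, info)

    def step(self, action):
        obs, reward, ended = self._sim.exchange(action)
        # (observation, reward, terminated, truncated, info)
        return obs, reward, ended, False, {}


def run_episode(env):
    """Step until the environment reports the end; return (steps, total reward)."""
    obs, _ = env.reset()
    total, steps, done = 0.0, 0, False
    while not done:
        obs, reward, terminated, truncated, _ = env.step(steps)
        total += reward
        steps += 1
        done = terminated or truncated
    return steps, total
```

The agent never needs a separate finish message: the flag returned by the last step is what stops the loop.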

== Phase 2 ==

=== Examples and documentation update ===

To demonstrate the usage of the message interface and Gym interface, all existing examples were updated to use the new interfaces. Also, a new example, "Multi-BSS", was created to benchmark the performance of the vector interface. All of them can be built using the "./ns3 build" command with the updated CMake files, without copying the examples to the scratch folder. The examples and the interfaces they support are listed below:
* A-Plus-B example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/a-plus-b directory]): In this example, C++ side starts by setting 2 random numbers between 0 and 10 in shared memory. Then, Python side gets the numbers and sets the sum of the numbers in shared memory (in another region). Finally, C++ gets the sum that Python set. The procedure is analogous to C++ passing RL states to Python and Python passing RL actions back to C++, and is repeated many times. Supported interfaces:
** Struct-based message interface
** Vector-based message interface
** Gym interface
* LTE-CQI example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/lte-cqi directory]): CQI prediction example. The original work is done based on 5G NR branch in ns-3, and previous developers have made some changes to make it also run in LTE codebase in ns-3 mainline. Supported interfaces:
** Struct-based message interface
* Multi-BSS example ('''new example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/multi-bss directory]): The example is based on and modified from [https://gitlab.com/juanvleonr/ns-3-dev/-/tree/clean-tgax?ref_type=heads juanvleonr's clean-tgax branch]. The C++ side simulates a VR gaming scenario shown below, in which 4 BSSs operate in separate apartments in a 2 by 2 grid. Each BSS contains 1 AP and 4 STAs. One of the STAs in the first BSS is a VR device generating burst UL traffic, while the other devices have normal UL traffic. Supported interfaces:
** Struct-based message interface (available at [https://github.com/ShenMuyuan/ns3-ai/tree/dd8dd3a489f8faf8a380841b73c250d23c1a3710/examples/multi-bss the benchmarking branch])
** Vector-based message interface
* Rate-Control example ('''updated example''') (including constant rate & Thompson Sampling) ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/rate-control directory]): There are existing models of constant rate and Thompson sampling algorithms in Wi-Fi module. Here they are implemented in Python to show how to develop a new rate control algorithm for the Wi-Fi module using ns3-ai. Supported interfaces:
** Struct-based message interface
* RL-TCP example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/rl-tcp directory]): This example applies Q-learning algorithms (Q-learning and deep Q-learning) to TCP congestion control so that it can adapt to real-time changes in the network environment. By using reinforcement learning to manage the sliding window and threshold size, the network can achieve better throughput and smaller delay. Supported interfaces:
** Struct-based message interface
** Gym interface

Documents are updated along with the examples. Apart from all the README.md in example directories, I added [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/docs/install.md instruction for installation], [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/README.md message interface tutorial] and [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/README.md Gym interface tutorial] as separate documents linked to the [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/README.md updated root README.md].

=== Pure C++ example ===

In developing a pure C++-based ML framework example, I tried to rewrite the LTE-CQI example (originally using TensorFlow as the Python-based ML framework) to use the [https://www.tensorflow.org/install/lang_c TensorFlow C API], and the RL-TCP example (originally using PyTorch as the Python-based ML framework) to use the [https://pytorch.org/cppdocs/ PyTorch C++ API]. Unfortunately, only the latter succeeded. The pure C++ version of LTE-CQI failed because TensorFlow's C API has limited support for gradients and neural networks. So, for the TensorFlow C API I only provide [https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/lte-cqi/pure-cpp an example that checks libtensorflow version]. Although I succeeded in converting the Python code to C++ in the [https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/rl-tcp/pure-cpp RL-TCP example], the process was slow and difficult due to the lack of official documentation and examples. For instance, the C++ API doesn't provide the useful load_state_dict function for copying policy net parameters to the target net. It took me a while to find the equivalent C++ approach (torch::save and torch::load, with the module defined using the TORCH_MODULE macro).

I also wrote a guide on how to use C++-based ML frameworks in ns-3 (by installing in ns3-ai): [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/docs/using-pure-cpp.md here]

=== Benchmarking ===

I benchmarked three items:

# '''Gym interface vs. ns3-gym''' in terms of '''transmission time''': This benchmark is based on the RL-TCP example. It measures the CPU cycle count during C++-to-Python and Python-to-C++ data transmissions and compares the mean and standard deviation of the cycles. Results show that in both directions, the transmission time of ns3-ai's Gym interface is '''more than 15 times shorter''' than that of ns3-gym ('''shorter is better''').
# '''Vector-based vs. struct-based''' message interface in terms of '''transmission time''': The benchmark is based on the Multi-BSS example, on the benchmark_vector branch. Unfortunately, in terms of action transmission time (from the beginning of the write on the C++ side to the completion of the read on the Python side), the vector-based interface is '''1.2 times longer''' than the struct-based one ('''shorter is better'''). The extra time is caused by Python's slow reading of vectors. Measurements show that when reading rxPower (the received power at nodes in the first BSS) on the Python side, the vector interface spent 20% to 50% more time than the struct interface.
#* To deal with the slow vector access on the Python side in the future, '''one possible solution is to integrate Eigen''' on the C++ side and use existing Eigen-Python bindings, like pybind11's Eigen support or eigenpy, to convert linear algebra types into numpy or scipy types.
# '''Pure C++ vs. struct-based message interface''' in terms of '''processing time''': The benchmark is based on the pure C++ (libtorch) and message interface (PyTorch) versions of the RL-TCP example. We compare the processing time (i.e. transmission time + DRL algorithm time for the message interface, DRL algorithm time only for pure C++) of the two interfaces, including the mean and the standard deviation. Results show that the processing time of the pure C++ implementation is '''less than half''' that of the message interface implementation ('''shorter is better''').
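
The mean and standard deviation comparison used in these benchmarks can be sketched in a few lines. This is only an illustration, not the benchmarking code from the repository: it uses Python's wall-clock timer in nanoseconds rather than the CPU cycle count described above, and the names time_transfers and dummy_transfer are hypothetical.

```python
import statistics
import time


def time_transfers(transfer, repeats=1000):
    """Return (mean, sample standard deviation) of transfer() durations in ns."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter_ns()
        transfer()
        samples.append(time.perf_counter_ns() - start)
    return statistics.mean(samples), statistics.stdev(samples)


def dummy_transfer():
    # Hypothetical stand-in for one data transmission between C++ and Python.
    bytes(bytearray(1024))


mean_ns, std_ns = time_transfers(dummy_transfer)
print(f"mean = {mean_ns:.0f} ns, std = {std_ns:.0f} ns")
```

Running the same helper against two interfaces and comparing the two (mean, standard deviation) pairs gives the kind of comparison reported in the list above.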

See [https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/docs/benchmarking ns3-ai benchmarking documentation] for more detailed information.

Overall, the Gym interface is much faster than ns3-gym, and the pure C++ interface is more efficient than message interface. The vector interface needs to be enhanced in the future, especially in the optimization of Python side access.

= Build and Run the Code =

A detailed guide on how to set up the ns3-ai module is [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/docs/install.md here]. You must install ns-3 before installing ns3-ai. To test ns3-ai, you can build and run the provided examples (listed in the 'Phase 2' section above) according to their documentation.

= Proposal vs. Actual Work =

A few things that were mentioned in the proposal were not completed in my actual work:

# The LTE-handover example, which was intended to be an example using the vector interface, similar to Multi-BSS. This example was not started because of limited time.
# Support for std::string in shared memory. Development of this support was postponed because the vector interface had the highest priority. Also, it was not considered a 'must do' in the project.
# A pure C++ ML example using the TensorFlow C API. This failed because the C API is inadequate for gradients and neural networks, as mentioned above in the 'Pure C++ example' section.

= Future Works =

# Add more examples, such as LTE-handover, to ns3-ai for better demonstration of the tool.
# Optimize the vector-based message interface to reach its full potential on transmitting vectors or matrices of data.

= Acknowledgments =

I extend my heartfelt gratitude to my mentors Hao and Collin for their invaluable suggestions and comments that have guided me through the challenges during the GSoC 2023. Collaborating with the ns-3 community has been an enriching experience, expanding not only my technical knowledge but also fostering my skills in communication and oral presentation. At the same time, I am also very grateful to my teachers Prof. Yayu Gao and Prof. Xiaojun Hei at HUST, who have provided me with a lot of encouragement when I encountered difficulties. Additionally, I would like to express my appreciation to Google for offering this remarkable opportunity.

GSOC2023ns3-aiFinalReport

2023-09-11T15:31:06Z

Muyuan:

{{TOC}}

Back to [[GSOC2023ns3-ai]] (page containing my weekly updates, not the final report)

Slides: [https://drive.google.com/file/d/1RoyLcrhyMQTuCN_3nq8ORIT6jzqpH1ao/view?usp=sharing final report slides] (for reference)

= Project Overview =

* '''Project Name:''' ns3-ai enhancements
* '''Student:''' Muyuan Shen
* '''Mentors:''' Collin Brady and Hao Yin

== Project Goals ==

The main focus of this project is to '''optimize performance''' and '''improve usability''' of the '''ns3-ai module''', which facilitates the connection between ns-3 and Python-based ML frameworks using shared memory.

To accomplish this goal, the project will '''introduce additional APIs that support data structures such as vector''' in shared memory IPC. This will effectively reduce the required interaction between C++ and Python, resulting in improved performance.
Also, the project will '''integrate the Gymnasium API''', like ns3-gym's but with a shared-memory-based backend, to turn ns-3 into an environment that agents can efficiently and seamlessly interact with.
In addition, the project will '''enhance the existing examples, documentation and tutorials''', while also integrating new examples that cover scenarios like Multi-BSS in VR. By doing so, users will have more comprehensive resources at their disposal.
Furthermore, the project aims to '''provide examples utilizing pure C++-based ML frameworks'''. This will offer researchers more options for integrating with ML.

The overall aim of the project is to expand and accelerate the capabilities of the ns3-ai module, enabling users to simulate and analyze network related algorithms with enhanced efficiency and flexibility.

= Merge Requests and Commits =

Throughout the project, my development was based on the [https://github.com/ShenMuyuan/ns3-ai/tree/cmake cmake branch] of [https://github.com/hust-diangroup/ns3-ai ns3-ai].
I created a single MR that contains all my work, to be merged into the [https://github.com/hust-diangroup/ns3-ai/tree/cmake upstream cmake branch]. In this MR, there are 110+ commits by me, with author name 'ShenMuyuan' or 'Mu-YuanShen' or 'eicsmy'. '''The MR is being reviewed by my mentor (as of Sep. 11, 2023).'''
* Why the branch is named "cmake": one of my early tasks was to add CMake support for ns3-ai (to be compatible with ns-3.36+). During GSoC I worked on another branch named "improvements", which was eventually merged into the cmake branch.

{| class="wikitable"
|+ Merge Requests
|-
! No. !! Name !! Status
|-
| [1] || [https://github.com/hust-diangroup/ns3-ai/pull/97 merge to cmake branch] || Open (as of Sep. 11, 2023)
|}

= Project Details =

'''Note: Each URL shown below, if it points to my source code, refers to the contents as of my last commit during the GSoC period.'''

== Community Bonding Period ==

During the community bonding period, I started bi-weekly meetings with my mentors and we decided on the project plan: prioritize the development of new interfaces, then develop more examples and enhance the documentation.

The two new interfaces are the vector interface (later called the vector-based message interface, as it shares some fundamentals with the struct-based message interface) and the Gym interface. We also discussed some details of new examples like LTE-handover and Multi-BSS.

I also read the ns3-ai code thoroughly to understand its IPC principles and learned some reinforcement learning basics.

== Phase 1 ==

=== std::vector support ===

Adding std::vector to shared memory is not easy with ns3-ai's original design, because Python's ctypes library does not support STL templates (it only supports C structures and functions). To support vectors, I refactored the original model completely, replacing ctypes with the Boost C++ libraries, which are more flexible for interprocess communication. My work includes:
* Utilized Boost's '''boost::interprocess::managed_shared_memory''' to store data (as well as synchronization variables) in shared memory. This shared segment can be used for '''data transmission between C++ and Python'''. The two directions, C++-to-Python and Python-to-C++, occupy two different regions in shared memory. It also supports a '''custom memory allocator for STL''', an instance of boost::interprocess::allocator, which ensures that when an STL container allocates new memory, that memory comes from the shared memory rather than from the ordinary heap.
** The shared memory creation can be found in the constructor of Ns3AiMsgInterfaceImpl: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-msg-interface.h#L60 code]
* Developed a '''spinlock-based semaphore''' to synchronize read and write operations in shared memory. The original synchronization method works, but its "version number" concept and "control block" data structures may confuse and distract beginners. Also, the "version number" is just a complex implementation of the well-known semaphore. To improve ease of use and code readability, I created a semaphore, based on Boost's semaphore, that '''only spins but does not sleep while waiting'''. Its performance is comparable to the original, with '''better readability and usability'''.
** The semaphore operations and their implementation can be found in structure Ns3AiSemaphore: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-semaphore.h#L28 code]
** The usage of the semaphore in Ns3AiMsgInterfaceImpl: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-msg-interface.h#L194 code] and more usage below
* Built the '''vector-based''' interface with '''multiple configurable options'''. The vector interface is in parallel with the struct interface in terms of creation and usage, and there is an attribute that users can set in early code in order to '''choose one of the interfaces'''. If the vector interface is chosen, the C++-to-Python and Python-to-C++ vectors are created in shared memory and will contain no elements. It requires users to call resize or push_back to adjust their length before use. Another attribute is whether the interface '''handles simulation end'''. If that attribute is set, the interface will perform a simple protocol to notify Python side when C++ side simulation finishes. Other configurable attributes include memory segment size and names of objects constructed in shared memory.
** Note: the attributes are not part of ns-3 attribute system, because Ns3AiMsgInterface is a Singleton rather than Object.
** Attributes setting in Ns3AiMsgInterface: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-msg-interface.h#L306 code] and more setting below
** How the protocol works when the interface is destroyed: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-msg-interface.h#L126 code]
* Provided '''Python binding boilerplate code''' in examples. The Python side accesses the shared memory and the objects in it (vectors or structs) via C++ functions exposed to Python. The '''exposure of C++ class functions and members''' is achieved with '''Pybind11''', a lightweight Python binding library. The C++ binding code, linked with Pybind11, is compiled into a dynamically linked library that Python can import as a module. Because the C++ side interface is template-based and Python does not support templates natively, the Python binding module needs to be generated separately for every program (this is done through a CMake target dependency, so it's seamless). Although the binding contains many lines of C++ code and is difficult to write from scratch, users can '''adapt existing binding code''' to generate Python binding modules quickly, and I provide a lot of boilerplate for that (the *_py.cc files in all examples).
** Some of the example boilerplate code: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/a-plus-b/use-msg-stru/apb_py.cc#L29 binding code for struct-based message interface in A-Plus-B example], [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/multi-bss/multi_bss_py.cc#L33 binding code for vector-based message interface in Multi-BSS example]
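
The idea of a semaphore that spins instead of sleeping can be sketched in Python. This is a minimal illustration under stated assumptions, not the Ns3AiSemaphore implementation: two threads in one process stand in for the C++ and Python processes, a dictionary stands in for the shared memory segment, and the class and variable names are hypothetical.

```python
import threading


class SpinSemaphore:
    """Counting semaphore whose wait() busy-waits instead of sleeping."""

    def __init__(self, count=0):
        self._count = count
        self._lock = threading.Lock()

    def post(self):
        with self._lock:
            self._count += 1

    def wait(self):
        while True:
            with self._lock:
                if self._count > 0:
                    self._count -= 1
                    return
            # Count was zero: retry immediately (spin) rather than sleep.


# One semaphore per direction, mirroring the two shared memory regions.
cpp_to_py_ready = SpinSemaphore()
py_to_cpp_ready = SpinSemaphore()
shared = {"state": None, "action": None}  # stand-in for shared memory


def python_side(rounds):
    for _ in range(rounds):
        cpp_to_py_ready.wait()                 # wait until a state is written
        shared["action"] = shared["state"] * 2
        py_to_cpp_ready.post()                 # signal that the action is ready


results = []
worker = threading.Thread(target=python_side, args=(3,))
worker.start()
for state in (1, 2, 3):                        # plays the role of the C++ side
    shared["state"] = state
    cpp_to_py_ready.post()
    py_to_cpp_ready.wait()
    results.append(shared["action"])
worker.join()
print(results)  # [2, 4, 6]
```

Spinning avoids the cost of putting the waiting side to sleep and waking it up again, at the price of keeping a CPU busy while waiting.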

=== Gymnasium API ===

The [https://gymnasium.farama.org/index.html Gymnasium API] for ns3-ai is based on shared memory rather than socket communication, so it can provide faster data exchange than [https://github.com/tkn-tub/ns3-gym ns3-gym] does. While much of the [https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface Gym interface] code comes from ns3-gym's repository, I made some substantial changes to give it a shared memory backend. My work includes:
* Modified OpenGymInterface to '''use Ns3AiMsgInterface for IPC'''. OpenGymInterface was created by the ns3-gym developers and provides code to create Gym-compatible environments in ns-3. It contains functions to get the state or action spaces, observe the environment in ns-3 and execute the actions (possibly changing parameters in the simulation). Those functions use callbacks registered by OpenGymEnv at runtime. To make the callbacks work, a custom environment must inherit from OpenGymEnv and implement class methods such as GetActionSpace, GetObservationSpace, GetObservation and ExecuteActions. All states and actions are serialized with Google's Protocol Buffers, then transmitted and de-serialized by the peer. What I did was replace the ZeroMQ socket's send and receive functions with Ns3AiMsgInterface's send and receive functions, and ensure that Ns3AiMsgInterface is properly initialized. The underlying message interface for transmitting serialized messages is struct-based. The struct contains a buffer (uint8_t array) and its capacity.
** Example of my changed part: [https://github.com/tkn-tub/ns3-gym/blob/6007f4b3811af0cffcacf9a6151e5b9d2f4ef3ae/model/opengym_interface.cc#L190 before (in ns3-gym's repo)], [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/cpp/ns3-ai-gym-interface.cc#L104 after (in my ns3-ai repo)]
** Initialization of Ns3AiMsgInterface: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/cpp/ns3-ai-gym-interface.cc#L56 code]
** Note: in the above configuration, handling finish is set to false because the protocol of notifying Python side that C++ side has finished is unnecessary for Gym. Gym interface has its own protocol for handling finish, which is [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/cpp/ns3-ai-gym-env.cc#L78 NotifySimulationEnd on C++ side] and then [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L270 'done' becoming true] when [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L294 Python steps].
* Created '''Python binding''' for accessing the shared structure containing '''serialized message string'''. Binding that structure containing array is similar to binding a common structure, except that the array is specially treated to convert its contents to Python's '''memoryview'''. With memoryview, Python side can read and write to the array seamlessly, like what you can do in C++ with std::array.
** Obtaining the memoryview in binding: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/msg_py_binding.cc#L33 code]
** Note: arrays of different lengths need different memoryview objects on the Python side. In the above code, get_buffer returns the part of the buffer that is actually used (for reading), while get_buffer_full returns the buffer at its full length (for writing). Example usage in Ns3Env (the Python side Gym environment created with gym.make): [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L115 array read] and [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L130 array write]
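
The difference between the two views can be illustrated with a small Python-only sketch. It is an assumption-laden stand-in for the real binding: a ctypes structure plays the role of the shared struct, the capacity of 16 bytes and the field names are hypothetical, and only the function names get_buffer and get_buffer_full are taken from the text above.

```python
import ctypes

CAPACITY = 16  # hypothetical capacity of the message buffer


class MsgBuffer(ctypes.Structure):
    # Stand-in for the shared struct: a byte array plus the number of bytes used.
    _fields_ = [("buffer", ctypes.c_uint8 * CAPACITY),
                ("size", ctypes.c_uint32)]


def get_buffer_full(msg):
    """Full-length view, suitable for writing a new message."""
    return memoryview(msg.buffer).cast("B")


def get_buffer(msg):
    """View of only the bytes actually used, suitable for reading."""
    return get_buffer_full(msg)[:msg.size]


msg = MsgBuffer()
payload = b"hello"
get_buffer_full(msg)[:len(payload)] = payload  # write through the full view
msg.size = len(payload)
print(bytes(get_buffer(msg)))                  # b'hello'
```

Both views refer to the same underlying memory, so no copy is made until bytes() is called on the read view.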

== Phase 2 ==

=== Examples and documentation update ===

To demonstrate the usage of the message interface and the Gym interface, all existing examples were updated to use the new interfaces. Also, a new example, "Multi-BSS", was created to benchmark the performance of the vector interface. All of them can be built successfully using the "./ns3 build" command with the updated CMake files, without needing to copy the examples to the scratch folder. The updated examples and the interfaces they support are listed below:
* A-Plus-B example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/a-plus-b directory]): In this example, C++ side starts by setting 2 random numbers between 0 and 10 in shared memory. Then, Python side gets the numbers and sets the sum of the numbers in shared memory (in another region). Finally, C++ gets the sum that Python set. The procedure is analogous to C++ passing RL states to Python and Python passing RL actions back to C++, and is repeated many times. Supported interfaces:
** Struct-based message interface
** Vector-based message interface
** Gym interface
* LTE-CQI example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/lte-cqi directory]): CQI prediction example. The original work is done based on 5G NR branch in ns-3, and previous developers have made some changes to make it also run in LTE codebase in ns-3 mainline. Supported interfaces:
** Struct-based message interface
* Multi-BSS example ('''new example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/multi-bss directory]): The example is based on and modified from [https://gitlab.com/juanvleonr/ns-3-dev/-/tree/clean-tgax?ref_type=heads juanvleonr's clean-tgax branch]. The C++ side simulates the VR gaming scenario shown below, in which 4 BSSs operate in separate apartments in a 2 by 2 grid. Each BSS contains 1 AP and 4 STAs. One of the STAs in the first BSS is a VR device generating bursty UL traffic, while the other devices have normal UL traffic. Supported interfaces:
** Struct-based message interface (available at [https://github.com/ShenMuyuan/ns3-ai/tree/dd8dd3a489f8faf8a380841b73c250d23c1a3710/examples/multi-bss the benchmarking branch])
** Vector-based message interface
* Rate-Control example ('''updated example''') (including constant rate & Thompson Sampling) ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/rate-control directory]): There are existing models of constant rate and Thompson sampling algorithms in Wi-Fi module. Here they are implemented in Python to show how to develop a new rate control algorithm for the Wi-Fi module using ns3-ai. Supported interfaces:
** Struct-based message interface
* RL-TCP example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/rl-tcp directory]): This example applies Q-learning algorithms (Q-learning and deep Q-learning) to TCP congestion control so that it can adapt in real time to changes in the network environment. By using reinforcement learning to manage the sliding window and threshold size, the network can achieve better throughput and smaller delay. Supported interfaces:
** Struct-based message interface
** Gym interface

Documents are updated along with the examples. Apart from all the README.md in example directories, I added [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/docs/install.md instruction for installation], [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/README.md message interface tutorial] and [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/README.md Gym interface tutorial] as separate documents linked to the [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/README.md updated root README.md].

=== Pure C++ example ===

When developing a pure C++-based ML framework example, I tried to rewrite the LTE-CQI example (which originally used TensorFlow as its Python-based ML framework) with the [https://www.tensorflow.org/install/lang_c TensorFlow C API], and the RL-TCP example (which originally used PyTorch as its Python-based ML framework) with the [https://pytorch.org/cppdocs/ PyTorch C++ API]. Unfortunately, only the latter succeeded. The pure C++ version of LTE-CQI failed because TensorFlow's C API has limited support for gradients and neural networks. So, for TensorFlow C I only provide [https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/lte-cqi/pure-cpp an example that checks the libtensorflow version]. Although I succeeded in converting the Python code to C++ in the [https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/rl-tcp/pure-cpp RL-TCP example], the process was slow and difficult due to the lack of official documentation and examples. For instance, the C++ API doesn't provide the useful load_state_dict function for copying policy net parameters to the target net. It took me a while to find an equivalent way to do that in C++ (torch::save and torch::load, with the module defined using the TORCH_MODULE macro).

I also wrote a guide on how to use C++-based ML frameworks in ns-3 (by installing in ns3-ai): [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/docs/using-pure-cpp.md here]

=== Benchmarking ===

I benchmarked three items:

# '''Gym interface vs. ns3-gym''' in terms of '''transmission time''': This benchmark is based on the RL-TCP example. It measures the CPU cycle count during C++-to-Python and Python-to-C++ data transmissions and compares the mean and standard deviation of the cycles. Results show that in both directions, the transmission time of ns3-ai's Gym interface is '''more than 15 times shorter''' than that of ns3-gym ('''shorter is better''').
# '''Vector-based vs. struct-based''' message interface in terms of '''transmission time''': The benchmark is based on the Multi-BSS example, on the benchmark_vector branch. Unfortunately, in terms of action transmission time (from the beginning of the write on the C++ side to the completion of the read on the Python side), the vector-based interface is '''1.2 times longer''' than the struct-based one ('''shorter is better'''). The extra time is caused by Python's slow reading of vectors. Measurements show that when reading rxPower (the received power at nodes in the first BSS) on the Python side, the vector interface spent 20% to 50% more time than the struct interface.
#* To deal with the slow vector access on the Python side in the future, '''one possible solution is to integrate Eigen''' on the C++ side and use existing Eigen-Python bindings, like pybind11's Eigen support or eigenpy, to convert linear algebra types into numpy or scipy types.
# '''Pure C++ vs. struct-based message interface''' in terms of '''processing time''': The benchmark is based on the pure C++ (libtorch) and message interface (PyTorch) versions of the RL-TCP example. We compare the processing time (i.e. transmission time + DRL algorithm time for the message interface, DRL algorithm time only for pure C++) of the two interfaces, including the mean and the standard deviation. Results show that the processing time of the pure C++ implementation is '''less than half''' that of the message interface implementation ('''shorter is better''').

See [https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/docs/benchmarking ns3-ai benchmarking documentation] for more detailed information.

Overall, the Gym interface is much faster than ns3-gym, and the pure C++ interface is more efficient than message interface. The vector interface needs to be enhanced in the future, especially in the optimization of Python side access.

= Build and Run the Code =

A detailed guide on how to set up the ns3-ai module is [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/docs/install.md here]. You must install ns-3 before installing ns3-ai. To test ns3-ai, you can build and run the provided examples (listed in the 'Phase 2' section above) according to their documentation.

= Proposal vs. Actual Work =

A few things that were mentioned in the proposal were not completed in my actual work:

# The LTE-handover example, which was intended to be an example using the vector interface, similar to Multi-BSS. This example was not started because of limited time.
# Support for std::string in shared memory. Development of this support was postponed because the vector interface had the highest priority. Also, it was not considered a 'must do' in the project.
# A pure C++ ML example using the TensorFlow C API. This failed because the C API is inadequate for gradients and neural networks, as mentioned above in the 'Pure C++ example' section.

= Future Works =

# Add more examples, such as LTE-handover, to ns3-ai for better demonstration of the tool.
# Optimize the vector-based message interface to reach its full potential on transmitting vectors or matrices of data.

= Acknowledgments =

I extend my heartfelt gratitude to my mentors Hao and Collin for their invaluable suggestions and comments that have guided me through the challenges during the GSoC 2023. Collaborating with the ns-3 community has been an enriching experience, expanding not only my technical knowledge but also fostering my skills in communication and oral presentation. At the same time, I am also very grateful to my teachers Prof. Yayu Gao and Prof. Xiaojun Hei at HUST, who have provided me with a lot of encouragement when I encountered difficulties. Additionally, I would like to express my appreciation to Google for offering this remarkable opportunity.

GSOC2023ns3-aiFinalReport

2023-09-11T15:30:50Z

Muyuan: add slides

{{TOC}}

Back to [[GSOC2023ns3-ai]] (page containing my weekly updates, not the final report)
Slides: [https://drive.google.com/file/d/1RoyLcrhyMQTuCN_3nq8ORIT6jzqpH1ao/view?usp=sharing final report slides] (for reference)

= Project Overview =

* '''Project Name:''' ns3-ai enhancements
* '''Student:''' Muyuan Shen
* '''Mentors:''' Collin Brady and Hao Yin

== Project Goals ==

The main focus of this project is to '''optimize performance''' and '''improve usability''' of the '''ns3-ai module''', which facilitates the connection between ns-3 and Python-based ML frameworks using shared memory.

To accomplish this goal, the project will '''introduce additional APIs that support data structures such as vector''' in shared memory IPC. This will effectively reduce the required interaction between C++ and Python, resulting in improved performance.
Also, the project will '''integrate the Gymnasium API''', like ns3-gym's but with a shared-memory-based backend, to turn ns-3 into an environment that agents can efficiently and seamlessly interact with.
In addition, the project will '''enhance the existing examples, documentation and tutorials''', while also integrating new examples that cover scenarios like Multi-BSS in VR. By doing so, users will have more comprehensive resources at their disposal.
Furthermore, the project aims to '''provide examples utilizing pure C++-based ML frameworks'''. This will offer researchers more options for integrating with ML.

The overall aim of the project is to expand and accelerate the capabilities of the ns3-ai module, enabling users to simulate and analyze network related algorithms with enhanced efficiency and flexibility.

= Merge Requests and Commits =

Throughout the project, my development was based on the [https://github.com/ShenMuyuan/ns3-ai/tree/cmake cmake branch] of [https://github.com/hust-diangroup/ns3-ai ns3-ai].
I created a single MR that contains all my work, to be merged into the [https://github.com/hust-diangroup/ns3-ai/tree/cmake upstream cmake branch]. In this MR, there are 110+ commits by me, with author name 'ShenMuyuan' or 'Mu-YuanShen' or 'eicsmy'. '''The MR is being reviewed by my mentor (as of Sep. 11, 2023).'''
* Why the branch is named "cmake": one of my early tasks was to add CMake support for ns3-ai (to be compatible with ns-3.36+). During GSoC I worked on another branch named "improvements", which was eventually merged into the cmake branch.

{| class="wikitable"
|+ Merge Requests
|-
! No. !! Name !! Status
|-
| [1] || [https://github.com/hust-diangroup/ns3-ai/pull/97 merge to cmake branch] || Open (as of Sep. 11, 2023)
|}

= Project Details =

'''Note: Each URL shown below, if it points to my source code, refers to the contents as of my last commit during the GSoC period.'''

== Community Bonding Period ==

During the community bonding period, I started bi-weekly meetings with my mentors and we decided on the project plan: prioritize the development of new interfaces, then develop more examples and enhance the documentation.

The two new interfaces are the vector interface (later called the vector-based message interface, as it shares some fundamentals with the struct-based message interface) and the Gym interface. We also discussed some details of new examples like LTE-handover and Multi-BSS.

I also read the ns3-ai code thoroughly to understand its IPC principles and learned some reinforcement learning basics.

== Phase 1 ==

=== std::vector support ===

Adding std::vector to shared memory is not easy with ns3-ai's original design, because Python's ctypes library does not support STL templates (it only supports C structures and functions). To support vectors, I refactored the original model completely, replacing ctypes with the Boost C++ libraries, which are more flexible for interprocess communication. My work includes:
* Utilized Boost's '''boost::interprocess::managed_shared_memory''' to store data (as well as synchronization variables) in shared memory. This shared segment can be used for '''data transmission between C++ and Python'''. The two directions, C++-to-Python and Python-to-C++, occupy two different regions in shared memory. It also supports a '''custom memory allocator for STL''', an instance of boost::interprocess::allocator, which ensures that when an STL container allocates new memory, that memory comes from the shared memory rather than from the ordinary heap.
** The shared memory creation can be found in the constructor of Ns3AiMsgInterfaceImpl: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-msg-interface.h#L60 code]
* Developed a '''spinlock-based semaphore''' to synchronize read and write operations in shared memory. The original synchronization method works, but its "version number" concept and "control block" data structures may confuse and distract beginners. Also, the "version number" is just a complex implementation of the well-known semaphore. To improve ease of use and code readability, I created a semaphore, based on Boost's semaphore, that '''only spins but does not sleep while waiting'''. Its performance is comparable to the original, with '''better readability and usability'''.
** The semaphore operations and their implementation can be found in structure Ns3AiSemaphore: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-semaphore.h#L28 code]
** The usage of the semaphore in Ns3AiMsgInterfaceImpl: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-msg-interface.h#L194 code] and more usage below
* Built the '''vector-based''' interface with '''multiple configurable options'''. The vector interface parallels the struct interface in creation and usage, and users set an attribute early in their code to '''choose one of the interfaces'''. If the vector interface is chosen, the C++-to-Python and Python-to-C++ vectors are created in shared memory and initially contain no elements, so users must call resize or push_back to adjust their length before use. Another attribute controls whether the interface '''handles simulation end'''. If that attribute is set, the interface performs a simple protocol to notify the Python side when the C++ side simulation finishes. Other configurable attributes include the memory segment size and the names of objects constructed in shared memory.
** Note: the attributes are not part of the ns-3 attribute system, because Ns3AiMsgInterface is a Singleton rather than an Object.
** Attributes setting in Ns3AiMsgInterface: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-msg-interface.h#L306 code] and more setting below
** How the protocol works when the interface is destroyed: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-msg-interface.h#L126 code]
* Provided '''Python binding boilerplate code''' in examples. The Python side accesses the shared memory and the objects in it (vectors or structs) via C++ functions exposed to Python. The '''exposure of C++ class functions and members''' is achieved with '''Pybind11''', a lightweight Python binding library. The C++ binding code, linked with Pybind11, is compiled into a dynamically linked library that Python can import as a module. Because the C++ side interface is template-based and Python does not support templates natively, the Python binding module needs to be generated separately for every program (this is done through a CMake target dependency, so it is seamless). Although the binding contains many lines of C++ code and is difficult to write from scratch, users can '''adapt existing binding code''' to generate Python binding modules quickly, and I provide boilerplate for that purpose (the *_py.cc files in all examples).
** Some of the example boilerplate code: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/a-plus-b/use-msg-stru/apb_py.cc#L29 binding code for struct-based message interface in A-Plus-B example], [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/multi-bss/multi_bss_py.cc#L33 binding code for vector-based message interface in Multi-BSS example]
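
The handshake between the two directions can be pictured with a small sketch. It is only an illustration in pure Python, assuming two threads in place of the C++ and Python processes and plain lists in place of the two shared memory regions. The names SpinSemaphore and run_a_plus_b are hypothetical and are not part of ns3-ai; the real implementation is the Ns3AiSemaphore structure linked above.

```python
import threading


class SpinSemaphore:
    """Counting semaphore whose waiters busy-wait instead of sleeping."""

    def __init__(self, initial=0):
        self._count = initial
        self._lock = threading.Lock()  # protects _count

    def post(self):
        with self._lock:
            self._count += 1

    def wait(self):
        while True:
            with self._lock:
                if self._count > 0:
                    self._count -= 1
                    return
            # Nothing available yet: loop again without sleeping.


def run_a_plus_b(pairs):
    """Send each (a, b) one way and receive a + b the other way."""
    cpp2py = [0, 0]  # stand-in for the C++-to-Python region
    py2cpp = [0]     # stand-in for the Python-to-C++ region
    cpp2py_ready = SpinSemaphore()
    py2cpp_ready = SpinSemaphore()
    sums = []

    def simulation_side():
        for a, b in pairs:
            cpp2py[0], cpp2py[1] = a, b
            cpp2py_ready.post()   # data is ready for the agent
            py2cpp_ready.wait()   # spin until the reply is written
            sums.append(py2cpp[0])

    def agent_side():
        for _ in pairs:
            cpp2py_ready.wait()   # spin until new data arrives
            py2cpp[0] = cpp2py[0] + cpp2py[1]
            py2cpp_ready.post()

    threads = [threading.Thread(target=simulation_side),
               threading.Thread(target=agent_side)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sums
```

Calling run_a_plus_b([(1, 2), (3, 4)]) returns [3, 7]. Spinning avoids the wake-up latency of a sleeping wait, at the cost of keeping a CPU core busy while waiting.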

=== Gymnasium API ===

The [https://gymnasium.farama.org/index.html Gymnasium API] for ns3-ai is designed to be based on shared memory rather than socket communication, which allows faster data exchange than [https://github.com/tkn-tub/ns3-gym ns3-gym]. While much of the [https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface Gym interface] code comes from ns3-gym's repository, I made substantial changes to give it a shared memory backend. My work includes:
* Modified OpenGymInterface to '''use Ns3AiMsgInterface for IPC'''. OpenGymInterface was created by the ns3-gym developers and provides code to create Gym-compatible environments in ns-3. It contains functions to get the state or action spaces, observe the environment in ns-3 and execute actions (possibly changing parameters in the simulation). Those functions use callbacks registered by OpenGymEnv at runtime. For the callbacks to work, a custom environment must inherit from OpenGymEnv and implement class methods such as GetActionSpace, GetObservationSpace, GetObservation and ExecuteActions. All states and actions are serialized with Google's Protocol Buffers, transmitted, and then deserialized by the peer. My change replaces the ZeroMQ socket's send and receive functions with those of Ns3AiMsgInterface and ensures that Ns3AiMsgInterface is properly initialized. The underlying message interface for transmitting serialized messages is struct-based. The struct contains a buffer (uint8_t array) and its capacity.
** Example of my changed part: [https://github.com/tkn-tub/ns3-gym/blob/6007f4b3811af0cffcacf9a6151e5b9d2f4ef3ae/model/opengym_interface.cc#L190 before (in ns3-gym's repo)], [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/cpp/ns3-ai-gym-interface.cc#L104 after (in my ns3-ai repo)]
** Initialization of Ns3AiMsgInterface: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/cpp/ns3-ai-gym-interface.cc#L56 code]
** Note: in the above configuration, handling finish is set to false because the protocol of notifying Python side that C++ side has finished is unnecessary for Gym. Gym interface has its own protocol for handling finish, which is [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/cpp/ns3-ai-gym-env.cc#L78 NotifySimulationEnd on C++ side] and then [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L270 'done' becoming true] when [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L294 Python steps].
* Created a '''Python binding''' for accessing the shared structure containing the '''serialized message string'''. Binding a structure that contains an array is similar to binding an ordinary structure, except that the array is treated specially: its contents are exposed as a Python '''memoryview'''. With a memoryview, the Python side can read and write the array seamlessly, much as C++ code can with std::array.
** Obtaining the memoryview in binding: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/msg_py_binding.cc#L33 code]
** Note: arrays of different lengths need different memoryview objects on the Python side. In the above code, get_buffer returns the part of the buffer that is actually in use (for reading), while get_buffer_full returns the buffer at its full length (for writing). Example usage in Ns3Env (the Python side Gym environment created with gym.make): [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L115 array read] and [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L130 array write]
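
The two buffer views can be sketched in pure Python. SerializedMsg below is a hypothetical stand-in for the bound C++ struct (a byte buffer plus the number of bytes in use). Only the method names get_buffer and get_buffer_full are taken from the binding described above; their bodies and the helper functions are assumptions made for illustration.

```python
class SerializedMsg:
    """Fixed-capacity byte buffer plus the number of bytes in use."""

    def __init__(self, capacity=1024):
        self._storage = bytearray(capacity)  # stand-in for the uint8_t array
        self.size = 0                        # bytes currently holding a message

    def get_buffer(self):
        # View of the used part only, suitable for reading a message.
        return memoryview(self._storage)[:self.size]

    def get_buffer_full(self):
        # View of the whole array, suitable for writing a new message.
        return memoryview(self._storage)


def write_message(msg, payload):
    """Copy payload into the existing buffer without reallocating it."""
    full = msg.get_buffer_full()
    if len(payload) > len(full):
        raise ValueError("payload larger than buffer capacity")
    full[:len(payload)] = payload
    msg.size = len(payload)


def read_message(msg):
    """Return a copy of the bytes currently in use."""
    return bytes(msg.get_buffer())
```

Because a memoryview has a fixed length, the reader needs a view trimmed to the used size, while the writer needs a view of the full capacity; this is why two accessors are useful.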

== Phase 2 ==

=== Examples and documentation update ===

To demonstrate the usage of the message interface and Gym interface, all existing examples were updated to use the new interfaces. Also, a new example, "Multi-BSS", was created to benchmark the performance of the vector interface. All of them can be built with the "./ns3 build" command using the updated CMake files, without copying the examples to the scratch folder. The updated examples and the interfaces they support are listed below:
* A-Plus-B example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/a-plus-b directory]): In this example, C++ side starts by setting 2 random numbers between 0 and 10 in shared memory. Then, Python side gets the numbers and sets the sum of the numbers in shared memory (in another region). Finally, C++ gets the sum that Python set. The procedure is analogous to C++ passing RL states to Python and Python passing RL actions back to C++, and is repeated many times. Supported interfaces:
** Struct-based message interface
** Vector-based message interface
** Gym interface
* LTE-CQI example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/lte-cqi directory]): CQI prediction example. The original work is done based on 5G NR branch in ns-3, and previous developers have made some changes to make it also run in LTE codebase in ns-3 mainline. Supported interfaces:
** Struct-based message interface
* Multi-BSS example ('''new example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/multi-bss directory]): The example is based on and modified from [https://gitlab.com/juanvleonr/ns-3-dev/-/tree/clean-tgax?ref_type=heads juanvleonr's clean-tgax branch]. The C++ side simulates a VR gaming scenario shown below, in which 4 BSSs operate in separate apartments in a 2 by 2 grid. Each BSS contains 1 AP and 4 STAs. One of the STAs in the first BSS is a VR device generating burst UL traffic, while the other devices have normal UL traffic. Supported interfaces:
** Struct-based message interface (available at [https://github.com/ShenMuyuan/ns3-ai/tree/dd8dd3a489f8faf8a380841b73c250d23c1a3710/examples/multi-bss the benchmarking branch])
** Vector-based message interface
* Rate-Control example ('''updated example''') (including constant rate & Thompson Sampling) ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/rate-control directory]): There are existing models of constant rate and Thompson sampling algorithms in Wi-Fi module. Here they are implemented in Python to show how to develop a new rate control algorithm for the Wi-Fi module using ns3-ai. Supported interfaces:
** Struct-based message interface
* RL-TCP example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/rl-tcp directory]): This example applies Q-learning algorithms (Q-learning and deep Q-learning) to TCP congestion control so that it can react to real-time changes in the network transmission environment. By using reinforcement learning to manage the sliding window and threshold size, the network can achieve better throughput and smaller delay. Supported interfaces:
** Struct-based message interface
** Gym interface
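
Seen through a Gym-style interface, the A-Plus-B exchange listed above could look like the following sketch. APlusBEnv is a hypothetical pure-Python stand-in: it does not talk to ns-3 or use shared memory, and its reward rule and episode length are assumptions. Only the shapes of reset and step follow the Gymnasium convention (step returns observation, reward, terminated, truncated, info).

```python
import random


class APlusBEnv:
    """Toy environment: observe two numbers, act by returning their sum."""

    def __init__(self, episode_length=5, seed=0):
        self._rng = random.Random(seed)
        self._episode_length = episode_length
        self._steps = 0
        self._obs = (0, 0)

    def _new_obs(self):
        # Two random integers between 0 and 10, as in the A-Plus-B example.
        self._obs = (self._rng.randint(0, 10), self._rng.randint(0, 10))
        return self._obs

    def reset(self):
        self._steps = 0
        return self._new_obs(), {}

    def step(self, action):
        # Assumed reward rule: 1.0 for the correct sum, 0.0 otherwise.
        reward = 1.0 if action == sum(self._obs) else 0.0
        self._steps += 1
        terminated = self._steps >= self._episode_length
        return self._new_obs(), reward, terminated, False, {}


def run_episode(env):
    """Run one episode with an agent that always answers a + b."""
    obs, _ = env.reset()
    total_reward = 0.0
    done = False
    while not done:
        action = obs[0] + obs[1]
        obs, reward, terminated, truncated, _ = env.step(action)
        total_reward += reward
        done = terminated or truncated
    return total_reward
```

With the real Gym interface, the observation would come from the ns-3 simulation and the action would be passed back to it; the agent loop on the Python side keeps the same shape.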

Documents are updated along with the examples. Apart from all the README.md in example directories, I added [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/docs/install.md instruction for installation], [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/README.md message interface tutorial] and [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/README.md Gym interface tutorial] as separate documents linked to the [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/README.md updated root README.md].

=== Pure C++ example ===

In developing a pure C++-based ML framework example, I tried to rewrite the LTE-CQI example (originally using tensorflow as its Python-based ML framework) to use the [https://www.tensorflow.org/install/lang_c TensorFlow C API], and the RL-TCP example (originally using torch as its Python-based ML framework) to use the [https://pytorch.org/cppdocs/ PyTorch C++ API]. Unfortunately, only the latter succeeded. The pure C++ version of LTE-CQI failed because TensorFlow's C API has limited support for gradients and neural networks. So, for the TensorFlow C API I only provide [https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/lte-cqi/pure-cpp an example that checks the libtensorflow version]. Although I succeeded in converting the Python code to C++ in the [https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/rl-tcp/pure-cpp RL-TCP example], the process was slow and difficult due to the lack of official documentation and examples. For instance, the C++ API doesn't provide the useful load_state_dict function for copying policy net parameters to the target net. It took me a while to find the equivalent C++ approach (torch::save and torch::load, with the module defined using the TORCH_MODULE macro).

I also wrote a guide on how to use C++-based ML frameworks in ns-3 (by installing in ns3-ai): [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/docs/using-pure-cpp.md here]

=== Benchmarking ===

I benchmarked three items:

# '''Gym interface vs ns3-gym''' in terms of '''transmission time''': This benchmark is based on the RL-TCP example. It measures the CPU cycle count during C++-to-Python and Python-to-C++ data transmissions and compares the mean and standard deviation of the cycle counts. Results show that in both directions, the transmission time of ns3-ai's Gym interface is '''more than 15 times shorter''' than that of ns3-gym ('''shorter is better''').
# '''Vector-based vs. struct-based''' message interface in terms of '''transmission time''': This benchmark is based on the Multi-BSS example, on the benchmark_vector branch. Unfortunately, in terms of action transmission time (from the beginning of the C++ write to the completion of the Python read), the vector-based interface is '''1.2 times longer''' than the struct-based one ('''shorter is better'''). The extra time is caused by Python's slow reading of vectors. Measurements show that when reading rxPower (the received power at nodes in the first BSS) on the Python side, the vector interface spent 20% to 50% more time than the struct interface.
#* To address the slow vector access on the Python side in the future, '''one possible solution is to integrate Eigen''' on the C++ side and use existing Eigen-Python bindings, such as pybind11's Eigen support or eigenpy, to convert linear algebra types into numpy or scipy types.
# '''Pure C++ vs. struct-based message interface''' in terms of '''processing time''': This benchmark is based on the pure C++ (libtorch) and message interface (PyTorch) versions of the RL-TCP example. We compare the mean and standard deviation of the processing time for the two implementations (transmission time plus DRL algorithm time for the message interface, DRL algorithm time only for pure C++). Results show that the processing time of the pure C++ implementation is '''less than half''' that of the message interface implementation ('''shorter is better''').
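
The comparisons above rest on the mean and standard deviation of measured cycle counts. A minimal sketch of that summary step follows; the function names are hypothetical, the choice of the population standard deviation is an assumption, and any numbers passed in are the caller's own measurements (none of the project's results are reproduced here).

```python
import statistics


def summarize(cycles):
    """Return (mean, population standard deviation) of cycle counts."""
    return statistics.fmean(cycles), statistics.pstdev(cycles)


def times_shorter(baseline_cycles, candidate_cycles):
    """Ratio of the baseline's mean to the candidate's mean."""
    return statistics.fmean(baseline_cycles) / statistics.fmean(candidate_cycles)
```

A ratio above 1 from times_shorter means the candidate's mean transmission time is shorter than the baseline's.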

See [https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/docs/benchmarking ns3-ai benchmarking documentation] for more detailed information.

Overall, the Gym interface is much faster than ns3-gym, and the pure C++ interface is more efficient than message interface. The vector interface needs to be enhanced in the future, especially in the optimization of Python side access.

= Build and Run the Code =

A detailed guide on how to set up the ns3-ai module is [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/docs/install.md here]. You must install ns-3 before installing ns3-ai. To test ns3-ai, you can build and run the provided examples (listed in the 'Phase 2' section above) according to their documentation.

= Proposal vs. Actual Work =

A few things mentioned in the proposal were not completed in my actual work:

# The LTE-handover example, which was intended to be an example using the vector interface, similar to Multi-BSS. This example was not started because of limited time.
# Support for std::string in shared memory. Development of this support was postponed because the vector interface had the highest priority. Also, it was not considered a 'must do' in the project.
# Pure C++ ML example using the TensorFlow C API. This failed because the C API's support for gradients and neural networks is inadequate, as mentioned above in the 'Pure C++ example' section.

= Future Works =

# Add more examples, such as LTE-handover, to ns3-ai for better demonstration of the tool.
# Optimize the vector-based message interface to reach its full potential on transmitting vectors or matrices of data.

= Acknowledgments =

I extend my heartfelt gratitude to my mentors Hao and Collin for their invaluable suggestions and comments that have guided me through the challenges during the GSoC 2023. Collaborating with the ns-3 community has been an enriching experience, expanding not only my technical knowledge but also fostering my skills in communication and oral presentation. At the same time, I am also very grateful to my teachers Prof. Yayu Gao and Prof. Xiaojun Hei at HUST, who have provided me with a lot of encouragement when I encountered difficulties. Additionally, I would like to express my appreciation to Google for offering this remarkable opportunity.

GSOC2023ns3-aiFinalReport

2023-09-11T15:25:45Z

Muyuan:

{{TOC}}

Back to [[GSOC2023ns3-ai]] (page containing my weekly updates, not the final report)

= Project Overview =

* '''Project Name:''' ns3-ai enhancements
* '''Student:''' Muyuan Shen
* '''Mentors:''' Collin Brady and Hao Yin

== Project Goals ==

The main focus of this project is to '''optimize performance''' and '''improve usability''' of the '''ns3-ai module''', which facilitates the connection between ns-3 and Python-based ML frameworks using shared memory.

To accomplish this goal, the project will '''introduce additional APIs that support data structures such as vector''' in shared memory IPC. This will effectively reduce the required interaction between C++ and Python, resulting in improved performance.
Also, the project will '''integrate Gymnasium API''' like ns3-gym's but has a shared-memory-based backend, to turn ns-3 into a environment that agents can efficiently and seamlessly interact with.
In addition, the project will '''enhance the existing examples, documentation and tutorials''', while also integrating new examples that cover scenarios like Multi-BSS in VR. By doing so, users will have more comprehensive resources at their disposal.
Furthermore, the project aims to '''provide examples utilizing pure C++-based ML frameworks'''. This will offer researchers more options for integrating with ML.

The overall aim of the project is to expand and accelerate the capabilities of the ns3-ai module, enabling users to simulate and analyze network related algorithms with enhanced efficiency and flexibility.

= Merge Requests and Commits =

Throughout the project, my development was based on the [https://github.com/ShenMuyuan/ns3-ai/tree/cmake cmake branch] of [https://github.com/hust-diangroup/ns3-ai ns3-ai].
I created a single MR that contains all my work, to be merged into the [https://github.com/hust-diangroup/ns3-ai/tree/cmake upstream cmake branch]. In this MR, there are 110+ commits by me, with author name 'ShenMuyuan' or 'Mu-YuanShen' or 'eicsmy'. '''The MR is being reviewed by my mentor (as of Sep. 11, 2023).'''
* Why the branch is named "cmake": one of my early tasks was to add CMake support for ns3-ai (to be compatible with ns-3.36+). During GSoC I worked on another branch named "improvements", which was eventually merged into the cmake branch.

{| class="wikitable"
|+ Merge Requests
|-
! No. !! Name !! Status
|-
| [1] || [https://github.com/hust-diangroup/ns3-ai/pull/97 merge to cmake branch] || Open (as of Sep. 11, 2023)
|}

= Project Details =

'''Note: Each URL shown below, if it points to my source code, refers to the contents as of my last commit during the GSoC period.'''

== Community Bonding Period ==

During the community bonding period, I started bi-weekly meetings with my mentors and we decided on the project plan: prioritize the development of new interfaces, then develop more examples and enhance the documentation.

There are two new interfaces: the vector interface (later called the vector-based message interface, as it shares some fundamentals with the struct-based message interface) and the Gym interface. We also discussed some details of new examples such as LTE-handover and Multi-BSS.

I also read the ns3-ai code thoroughly to understand its IPC principles and learned some reinforcement learning basics.

== Phase 1 ==

=== std::vector support ===

Adding std::vector to shared memory is not easy with ns3-ai's original design, because Python's ctypes library does not support STL templates (it supports only C structures and functions). To support vectors, I refactored the original model completely, replacing ctypes with the Boost C++ libraries, which are more flexible for interprocess communication. My work includes:
* Utilized Boost's '''boost::interprocess::managed_shared_memory''' to store data (as well as synchronization variables) in shared memory. This shared segment can be used for '''data transmission between C++ and Python'''. The two directions, C++-to-Python and Python-to-C++, occupy two different regions in shared memory. It also supports a '''custom memory allocator for STL''', an instance of boost::interprocess::allocator, which ensures that when an STL container allocates new memory, that memory comes from the shared segment rather than from the process heap.
** The shared memory creation can be found in the constructor of Ns3AiMsgInterfaceImpl: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-msg-interface.h#L60 code]
* Developed a '''spinlock-based semaphore''' to synchronize read and write operations in shared memory. The original synchronization method works, but the "version number" concept and the "control block" data structures may confuse and distract beginners. Also, the "version number" is just a complex implementation of the well-known semaphore. To improve ease of use and code readability, I created a semaphore, based on Boost's semaphore, that '''only spins but does not sleep while waiting'''. Its performance is comparable to the original, with '''better readability and usability'''.
** The semaphore operations and their implementation can be found in structure Ns3AiSemaphore: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-semaphore.h#L28 code]
** The usage of the semaphore in Ns3AiMsgInterfaceImpl: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-msg-interface.h#L194 code] and more usage below
* Built the '''vector-based''' interface with '''multiple configurable options'''. The vector interface parallels the struct interface in creation and usage, and users set an attribute early in their code to '''choose one of the interfaces'''. If the vector interface is chosen, the C++-to-Python and Python-to-C++ vectors are created in shared memory and initially contain no elements, so users must call resize or push_back to adjust their length before use. Another attribute controls whether the interface '''handles simulation end'''. If that attribute is set, the interface performs a simple protocol to notify the Python side when the C++ side simulation finishes. Other configurable attributes include the memory segment size and the names of objects constructed in shared memory.
** Note: the attributes are not part of the ns-3 attribute system, because Ns3AiMsgInterface is a Singleton rather than an Object.
** Attributes setting in Ns3AiMsgInterface: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-msg-interface.h#L306 code] and more setting below
** How the protocol works when the interface is destroyed: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-msg-interface.h#L126 code]
* Provided '''Python binding boilerplate code''' in examples. The Python side accesses the shared memory and the objects in it (vectors or structs) via C++ functions exposed to Python. The '''exposure of C++ class functions and members''' is achieved with '''Pybind11''', a lightweight Python binding library. The C++ binding code, linked with Pybind11, is compiled into a dynamically linked library that Python can import as a module. Because the C++ side interface is template-based and Python does not support templates natively, the Python binding module needs to be generated separately for every program (this is done through a CMake target dependency, so it is seamless). Although the binding contains many lines of C++ code and is difficult to write from scratch, users can '''adapt existing binding code''' to generate Python binding modules quickly, and I provide boilerplate for that purpose (the *_py.cc files in all examples).
** Some of the example boilerplate code: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/a-plus-b/use-msg-stru/apb_py.cc#L29 binding code for struct-based message interface in A-Plus-B example], [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/multi-bss/multi_bss_py.cc#L33 binding code for vector-based message interface in Multi-BSS example]

=== Gymnasium API ===

The [https://gymnasium.farama.org/index.html Gymnasium API] for ns3-ai is designed to be based on shared memory rather than socket communication, which allows faster data exchange than [https://github.com/tkn-tub/ns3-gym ns3-gym]. While much of the [https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface Gym interface] code comes from ns3-gym's repository, I made substantial changes to give it a shared memory backend. My work includes:
* Modified OpenGymInterface to '''use Ns3AiMsgInterface for IPC'''. OpenGymInterface was created by the ns3-gym developers and provides code to create Gym-compatible environments in ns-3. It contains functions to get the state or action spaces, observe the environment in ns-3 and execute actions (possibly changing parameters in the simulation). Those functions use callbacks registered by OpenGymEnv at runtime. For the callbacks to work, a custom environment must inherit from OpenGymEnv and implement class methods such as GetActionSpace, GetObservationSpace, GetObservation and ExecuteActions. All states and actions are serialized with Google's Protocol Buffers, transmitted, and then deserialized by the peer. My change replaces the ZeroMQ socket's send and receive functions with those of Ns3AiMsgInterface and ensures that Ns3AiMsgInterface is properly initialized. The underlying message interface for transmitting serialized messages is struct-based. The struct contains a buffer (uint8_t array) and its capacity.
** Example of my changed part: [https://github.com/tkn-tub/ns3-gym/blob/6007f4b3811af0cffcacf9a6151e5b9d2f4ef3ae/model/opengym_interface.cc#L190 before (in ns3-gym's repo)], [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/cpp/ns3-ai-gym-interface.cc#L104 after (in my ns3-ai repo)]
** Initialization of Ns3AiMsgInterface: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/cpp/ns3-ai-gym-interface.cc#L56 code]
** Note: in the above configuration, handling finish is set to false because the protocol of notifying Python side that C++ side has finished is unnecessary for Gym. Gym interface has its own protocol for handling finish, which is [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/cpp/ns3-ai-gym-env.cc#L78 NotifySimulationEnd on C++ side] and then [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L270 'done' becoming true] when [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L294 Python steps].
* Created a '''Python binding''' for accessing the shared structure containing the '''serialized message string'''. Binding a structure that contains an array is similar to binding an ordinary structure, except that the array is treated specially: its contents are exposed as a Python '''memoryview'''. With a memoryview, the Python side can read and write the array seamlessly, much as C++ code can with std::array.
** Obtaining the memoryview in binding: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/msg_py_binding.cc#L33 code]
** Note: arrays of different lengths need different memoryview objects on the Python side. In the above code, get_buffer returns the part of the buffer that is actually in use (for reading), while get_buffer_full returns the buffer at its full length (for writing). Example usage in Ns3Env (the Python side Gym environment created with gym.make): [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L115 array read] and [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L130 array write]

== Phase 2 ==

=== Examples and documentation update ===

To demonstrate the usage of the message interface and Gym interface, all existing examples were updated to use the new interfaces. Also, a new example, "Multi-BSS", was created to benchmark the performance of the vector interface. All of them can be built with the "./ns3 build" command using the updated CMake files, without copying the examples to the scratch folder. The updated examples and the interfaces they support are listed below:
* A-Plus-B example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/a-plus-b directory]): In this example, C++ side starts by setting 2 random numbers between 0 and 10 in shared memory. Then, Python side gets the numbers and sets the sum of the numbers in shared memory (in another region). Finally, C++ gets the sum that Python set. The procedure is analogous to C++ passing RL states to Python and Python passing RL actions back to C++, and is repeated many times. Supported interfaces:
** Struct-based message interface
** Vector-based message interface
** Gym interface
* LTE-CQI example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/lte-cqi directory]): CQI prediction example. The original work is done based on 5G NR branch in ns-3, and previous developers have made some changes to make it also run in LTE codebase in ns-3 mainline. Supported interfaces:
** Struct-based message interface
* Multi-BSS example ('''new example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/multi-bss directory]): The example is based on and modified from [https://gitlab.com/juanvleonr/ns-3-dev/-/tree/clean-tgax?ref_type=heads juanvleonr's clean-tgax branch]. The C++ side simulates a VR gaming scenario shown below, in which 4 BSSs operate in separate apartments in a 2 by 2 grid. Each BSS contains 1 AP and 4 STAs. One of the STAs in the first BSS is a VR device generating burst UL traffic, while the other devices have normal UL traffic. Supported interfaces:
** Struct-based message interface (available at [https://github.com/ShenMuyuan/ns3-ai/tree/dd8dd3a489f8faf8a380841b73c250d23c1a3710/examples/multi-bss the benchmarking branch])
** Vector-based message interface
* Rate-Control example ('''updated example''') (including constant rate & Thompson Sampling) ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/rate-control directory]): There are existing models of constant rate and Thompson sampling algorithms in Wi-Fi module. Here they are implemented in Python to show how to develop a new rate control algorithm for the Wi-Fi module using ns3-ai. Supported interfaces:
** Struct-based message interface
* RL-TCP example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/rl-tcp directory]): This example applies Q-learning algorithms (Q-learning and deep Q-learning) to TCP congestion control for real-time changes in the environment of network transmission. By strengthening the learning management sliding window and threshold size, the network can get better throughput and smaller delay. Supported interfaces:
** Struct-based message interface
** Gym interface
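
The A-Plus-B round trip described above can be pictured with a minimal Python sketch. This is not ns3-ai code: two threads stand in for the C++ and Python processes, two plain dictionaries stand in for the two shared-memory regions, and <code>threading.Semaphore</code> stands in for the module's synchronization. All names in it (<code>cpp_to_py</code>, <code>py_to_cpp</code>, <code>run_rounds</code> and so on) are hypothetical.

```python
import random
import threading

# Hypothetical stand-ins for the two shared-memory regions.
cpp_to_py = {"a": 0, "b": 0}   # written by the "C++" side, read by the "Python" side
py_to_cpp = {"sum": 0}         # written by the "Python" side, read by the "C++" side

# One semaphore per direction signals that fresh data is available.
cpp_data_ready = threading.Semaphore(0)
py_data_ready = threading.Semaphore(0)

ROUNDS = 5
results = []  # (a, b, sum) tuples collected by the "C++" side


def cpp_side(rng):
    """Plays the role of the ns-3 simulation: sends two numbers, waits for the sum."""
    for _ in range(ROUNDS):
        cpp_to_py["a"] = rng.randint(0, 10)
        cpp_to_py["b"] = rng.randint(0, 10)
        cpp_data_ready.release()      # tell the peer that a and b are set
        py_data_ready.acquire()       # wait until the peer has written the sum
        results.append((cpp_to_py["a"], cpp_to_py["b"], py_to_cpp["sum"]))


def py_side():
    """Plays the role of the Python agent: reads two numbers, writes their sum."""
    for _ in range(ROUNDS):
        cpp_data_ready.acquire()      # wait for a and b
        py_to_cpp["sum"] = cpp_to_py["a"] + cpp_to_py["b"]
        py_data_ready.release()       # tell the peer that the sum is set


def run_rounds(seed=0):
    """Run the whole exchange and return the collected (a, b, sum) tuples."""
    results.clear()
    threads = [
        threading.Thread(target=cpp_side, args=(random.Random(seed),)),
        threading.Thread(target=py_side),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return list(results)
```

Each round mirrors one state/action exchange in an RL loop: the simulation side publishes its data, blocks until the agent side has answered, and only then continues.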

Documents are updated along with the examples. Apart from all the README.md in example directories, I added [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/docs/install.md instruction for installation], [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/README.md message interface tutorial] and [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/README.md Gym interface tutorial] as separate documents linked to the [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/README.md updated root README.md].

=== Pure C++ example ===

In the development of a pure C++-based ML framework example, I tried to rewrite the LTE-CQI example (originally using tensorflow as Python-based ML framework) to utilize [https://www.tensorflow.org/install/lang_c TensorFlow C API], and the RL-TCP example (originally using torch as Python-based ML framework) to employ [https://pytorch.org/cppdocs/ PyTorch C++ API]. Unfortunately, only the latter succeeded. The pure C++ version of LTE-CQI failed because there was limited support for gradients and neural networks in TensorFlow's C API. So, for TensorFlow C I only provide [https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/lte-cqi/pure-cpp an example that checks libtensorflow version]. Although I succeeded in converting Python code to C++ in [https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/rl-tcp/pure-cpp RL-TCP example], the process was slow and difficult due to the lack of official documents and examples. For instance, the C++ API doesn't provide the useful load_state_dict function for copying policy net parameters to the target net. It took me a while to find the equivalent C++ functions for that (torch::save and torch::load, and the module must be defined with the TORCH_MODULE macro).

I also wrote a guide on how to use C++-based ML frameworks in ns-3 (by installing in ns3-ai): [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/docs/using-pure-cpp.md here]

=== Benchmarking ===

I benchmarked three items:

# '''Gym interface vs ns3-gym''' in terms of '''transmission time''': This benchmark is based on the RL-TCP example. It measures the CPU cycle count during C++ to Python and Python to C++ data transmissions, and compares the mean and standard deviation of cycles. Results show that in both directions, the transmission time of ns3-ai's Gym interface is '''more than 15 times shorter''' than that of ns3-gym ('''shorter is better''').
# '''Vector-based vs. struct-based''' message interface in terms of '''transmission time''': The benchmark is based on the Multi-BSS example, on the benchmark_vector branch. Unfortunately, in terms of action transmission time (from C++'s beginning of write to Python's complete read), the vector-based interface is '''1.2 times longer''' than the struct-based one ('''shorter is better'''). The extra time is caused by Python's slow reading of vectors. Measurements show that in reading rxPower (received power in nodes in the first BSS) on the Python side, the vector interface spends 20% to 50% more time than the struct interface.
#* To deal with the slow vector access on Python side in the future, '''one possible solution is to integrate Eigen''' on C++ side and use existing Eigen-Python bindings like pybind11's Eigen support or eigenpy to convert linear algebra types into numpy or scipy types.
# '''Pure C++ vs. struct-based message interface''' in terms of '''processing time''': The benchmark is based on the pure C++ (libtorch) and message interface (PyTorch) versions of the RL-TCP example. We compare the processing time (i.e. transmission time + DRL algorithm time for the message interface, DRL algorithm time for pure C++) for the two interfaces, including the mean and the standard deviation. Results show that the processing time of the pure C++ implementation is '''less than half''' that of the message interface implementation ('''shorter is better''').

See [https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/docs/benchmarking ns3-ai benchmarking documentation] for more detailed information.

Overall, the Gym interface is much faster than ns3-gym, and the pure C++ interface is more efficient than message interface. The vector interface needs to be enhanced in the future, especially in the optimization of Python side access.
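
The comparison of means and standard deviations used in these benchmarks can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's wall-clock <code>time.perf_counter_ns</code> instead of the CPU cycle counter used in the actual measurements, and the helper names (<code>measure</code>, <code>summarize</code>, <code>mean_ratio</code>) are hypothetical and not part of ns3-ai.

```python
import statistics
import time


def measure(fn, repeats=100):
    """Time `fn` repeatedly and return the list of durations in nanoseconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter_ns()
        fn()
        samples.append(time.perf_counter_ns() - start)
    return samples


def summarize(samples):
    """Return (mean, sample standard deviation) of a list of durations."""
    return statistics.mean(samples), statistics.stdev(samples)


def mean_ratio(samples_a, samples_b):
    """How many times longer A takes than B on average (values above 1 mean A is slower)."""
    return statistics.mean(samples_a) / statistics.mean(samples_b)
```

With two sample lists collected by <code>measure</code>, <code>mean_ratio(vector_samples, struct_samples)</code> would express a statement such as "1.2 times longer" as a single number, and <code>summarize</code> gives the spread that should accompany it.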

= Build and Run the Code =

A detailed guide on how to set up the ns3-ai module is [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/docs/install.md here]. You must install ns-3 before installing ns3-ai. To test ns3-ai, you can build and run the provided examples (listed in the 'Phase 2' section above) according to their documentation.

= Proposal vs. Actual Work =

A few things were mentioned in the proposal but were not completed in my actual work:

# The LTE-handover example, which was intended to be an example using vector interface, similar to Multi-BSS. This example has not started because of limited time.
# Support for std::string in shared memory. Development for this support was postponed because the vector interface had the highest priority. Also, it's not considered a 'must do' in the project.
# Pure C++ ML example using TensorFlow C API. This has failed because of inadequate C API for gradients and neural networks, as mentioned above in 'Pure C++ example' section.

= Future Works =

# Add more examples, such as LTE-handover, to ns3-ai for better demonstration of the tool.
# Optimize the vector-based message interface to reach its full potential on transmitting vectors or matrices of data.

= Acknowledgments =

I extend my heartfelt gratitude to my mentors Hao and Collin for their invaluable suggestions and comments that have guided me through the challenges during the GSoC 2023. Collaborating with the ns-3 community has been an enriching experience, expanding not only my technical knowledge but also fostering my skills in communication and oral presentation. At the same time, I am also very grateful to my teachers Prof. Yayu Gao and Prof. Xiaojun Hei at HUST, who have provided me with a lot of encouragement when I encountered difficulties. Additionally, I would like to express my appreciation to Google for offering this remarkable opportunity.

GSOC2023ns3-aiFinalReport

2023-09-11T15:19:46Z

Muyuan:

{{TOC}}

Back to [[GSOC2023ns3-ai]] (page containing my weekly updates, not the final report)

= Project Overview =

* '''Project Name:''' ns3-ai enhancements
* '''Student:''' Muyuan Shen
* '''Mentors:''' Collin Brady and Hao Yin

== Project Goals ==

The main focus of this project is to '''optimize performance''' and '''improve usability''' of the '''ns3-ai module''', which facilitates the connection between ns-3 and Python-based ML frameworks using shared memory.

To accomplish this goal, the project will '''introduce additional APIs that support data structures such as vector''' in shared memory IPC. This will effectively reduce the required interaction between C++ and Python, resulting in improved performance.
Also, the project will '''integrate Gymnasium API''' support like ns3-gym's, but with a shared-memory-based backend, to turn ns-3 into an environment that agents can efficiently and seamlessly interact with.
In addition, the project will '''enhance the existing examples, documentation and tutorials''', while also integrating new examples that cover scenarios like Multi-BSS in VR. By doing so, users will have more comprehensive resources at their disposal.
Furthermore, the project aims to '''provide examples utilizing pure C++-based ML frameworks'''. This will offer researchers more options for integrating with ML.

The overall aim of the project is to expand and accelerate the capabilities of the ns3-ai module, enabling users to simulate and analyze network related algorithms with enhanced efficiency and flexibility.

= Merge Requests and Commits =

Throughout the project, my development was based on the [https://github.com/ShenMuyuan/ns3-ai/tree/cmake cmake branch] of [https://github.com/hust-diangroup/ns3-ai ns3-ai].
I created a single MR that contains all my work, to be merged into the [https://github.com/hust-diangroup/ns3-ai/tree/cmake upstream cmake branch]. In this MR, there are 110+ commits by me, with author name 'ShenMuyuan', 'Mu-YuanShen' or 'eicsmy'. The cmake branch will be merged to upstream by my mentor.
* Why the branch is named "cmake": because one of my early tasks was to add CMake support for ns3-ai (to be compatible with ns-3.36+). During GSoC I worked on another branch named "improvements", and it was eventually merged into the cmake branch.

{| class="wikitable"
|+ Merge Requests
|-
! No. !! Name !! Status
|-
| [1] || [https://github.com/hust-diangroup/ns3-ai/pull/97 merge to cmake branch] || Open
|}

= Project Details =

'''Note: Each URL shown below, if it is for my source code, points to the contents as of my last commit during the GSoC period.'''

== Community Bonding Period ==

During the community bonding period, I started bi-weekly meetings with my mentors and we decided on the project plan:
prioritize the development of new interfaces, then develop more examples and enhance the documentation.

There are two new interfaces: the vector interface (later called the vector-based message interface, as it shares some fundamentals with
the struct-based message interface) and the Gym interface. We also discussed some details of new examples such as LTE-handover and Multi-BSS.

I also read the ns3-ai code thoroughly to understand its IPC principles and learned some reinforcement learning basics.

== Phase 1 ==

=== std::vector support ===

Adding std::vector to shared memory is not easy with ns3-ai's original design, because Python's ctypes library does not support STL templates
(it only supports C structures and functions). To support vectors, I refactored the original model completely, replacing ctypes with the Boost C++ library, which is more flexible for interprocess communication. My work includes:
* Utilized Boost's '''boost::interprocess::managed_shared_memory''' to store data (as well as synchronization variables) in shared memory. This shared segment can be used for '''data transmission between C++ and Python'''. The two directions, C++-to-Python and Python-to-C++, occupy two different regions in shared memory. It also supports a '''custom memory allocator for STL''', an instance of boost::interprocess::allocator, which ensures that when an STL container allocates new memory, that memory comes from the shared memory rather than from the regular heap.
** The shared memory creation can be found in the constructor of Ns3AiMsgInterfaceImpl: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-msg-interface.h#L60 code]
* Developed a '''spinlock-based semaphore''' to synchronize read and write operations in shared memory. The original synchronization method works, but the "version number" concept and the "control block" data structures may cause confusion and distraction for beginners. Also, the "version number" is essentially a complex implementation of the well-known semaphore. To improve ease of use and enhance code readability, I created a semaphore, based on Boost's semaphore, that '''only spins but does not sleep while waiting'''. It has performance comparable to the original with '''better readability and usability'''.
** The semaphore operations and their implementation can be found in structure Ns3AiSemaphore: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-semaphore.h#L28 code]
** The usage of the semaphore in Ns3AiMsgInterfaceImpl: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-msg-interface.h#L194 code] and more usage below
* Built the '''vector-based''' interface with '''multiple configurable options'''. The vector interface parallels the struct interface in terms of creation and usage, and there is an attribute that users can set early in their code to '''choose one of the interfaces'''. If the vector interface is chosen, the C++-to-Python and Python-to-C++ vectors are created in shared memory and initially contain no elements, so users must call resize or push_back to adjust their length before use. Another attribute is whether the interface '''handles simulation end'''. If that attribute is set, the interface performs a simple protocol to notify the Python side when the C++ side simulation finishes. Other configurable attributes include the memory segment size and the names of objects constructed in shared memory.
** Note: the attributes are not part of ns-3 attribute system, because Ns3AiMsgInterface is a Singleton rather than Object.
** Attributes setting in Ns3AiMsgInterface: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-msg-interface.h#L306 code] and more setting below
** How the protocol works when the interface is destroyed: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-msg-interface.h#L126 code]
* Provided '''Python binding boilerplate code''' in examples. The Python side accesses the shared memory and the objects in it (vectors or structs) via C++ functions exposed to Python. The '''exposure of C++ class functions and members''' is achieved with '''Pybind11''', a lightweight Python binding library. The C++ binding code, linked with Pybind11, is compiled into a dynamically linked library that Python can import as a module. Because the C++ side interface is template-based and Python does not support templates natively, the Python binding module needs to be generated separately for every program (the creation is done by a CMake target dependency, so it is seamless). Although the binding contains many lines of C++ code and is difficult to write from scratch, users can '''modify from an existing binding code''' to generate Python binding modules quickly, and I provide boilerplate for that (the *_py.cc files in all examples).
** Some of the example boilerplate code: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/a-plus-b/use-msg-stru/apb_py.cc#L29 binding code for struct-based message interface in A-Plus-B example], [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/multi-bss/multi_bss_py.cc#L33 binding code for vector-based message interface in Multi-BSS example]
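
The idea of a semaphore that only spins and never sleeps can be illustrated with a small Python sketch. It is an analogy under stated assumptions, not the Ns3AiSemaphore implementation: the real semaphore is written in C++ and lives in shared memory between two processes, whereas this sketch uses two threads, and the names <code>SpinSemaphore</code> and <code>ping_pong</code> are hypothetical.

```python
import threading


class SpinSemaphore:
    """Counting semaphore whose waiters busy-wait instead of sleeping."""

    def __init__(self, initial=0):
        self._count = initial
        self._guard = threading.Lock()   # protects _count only; held very briefly

    def post(self):
        """Add one unit, allowing one waiter to proceed."""
        with self._guard:
            self._count += 1

    def try_wait(self):
        """Take one unit if available; return whether it succeeded."""
        with self._guard:
            if self._count > 0:
                self._count -= 1
                return True
            return False

    def wait(self):
        """Spin until a unit is available; never blocks on a condition variable."""
        while not self.try_wait():
            pass


def ping_pong(rounds=10):
    """Writer and reader alternate strictly, synchronized by two spin semaphores."""
    to_reader = SpinSemaphore(0)
    to_writer = SpinSemaphore(0)
    box = {"value": None}   # stands in for a region of shared memory
    seen = []

    def writer():
        for i in range(rounds):
            box["value"] = i
            to_reader.post()    # data is ready
            to_writer.wait()    # wait until the reader has consumed it

    def reader():
        for _ in range(rounds):
            to_reader.wait()    # wait for data
            seen.append(box["value"])
            to_writer.post()    # hand control back to the writer

    threads = [threading.Thread(target=writer), threading.Thread(target=reader)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen
```

Spinning trades CPU time for latency: the waiter notices new data as soon as it is scheduled, without the cost of being put to sleep and woken up by the operating system.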

=== Gymnasium API ===

The [https://gymnasium.farama.org/index.html Gymnasium API] for ns3-ai is designed to be based on shared memory rather than socket communication, which can provide faster data exchange than [https://github.com/tkn-tub/ns3-gym ns3-gym] does. While much of the [https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface Gym interface] code is from ns3-gym's repository, I made some substantial changes in order for it to have a shared memory backend. My work includes:
* Modified OpenGymInterface to '''use Ns3AiMsgInterface for IPC'''. OpenGymInterface was created by the ns3-gym developers and provides code to create Gym-compatible environments in ns-3. It contains functions to get the state or action spaces, observe the environment in ns-3 and execute actions (possibly changing parameters in the simulation). Those functions use callbacks registered by OpenGymEnv at runtime. For the callbacks to work, a custom environment must inherit from OpenGymEnv and implement class methods such as GetActionSpace, GetObservationSpace, GetObservation and ExecuteActions. All states and actions are serialized with Google's Protocol Buffers, transmitted, and then deserialized by the peer. My change was to replace the ZeroMQ socket's send and receive functions with Ns3AiMsgInterface's send and receive functions, and to ensure that Ns3AiMsgInterface is properly initialized. The underlying message interface for transmitting serialized messages is struct-based. The struct contains a buffer (uint8_t array) and its capacity.
** Example of my changed part: [https://github.com/tkn-tub/ns3-gym/blob/6007f4b3811af0cffcacf9a6151e5b9d2f4ef3ae/model/opengym_interface.cc#L190 before (in ns3-gym's repo)], [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/cpp/ns3-ai-gym-interface.cc#L104 after (in my ns3-ai repo)]
** Initialization of Ns3AiMsgInterface: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/cpp/ns3-ai-gym-interface.cc#L56 code]
** Note: in the above configuration, handling finish is set to false because the protocol of notifying Python side that C++ side has finished is unnecessary for Gym. Gym interface has its own protocol for handling finish, which is [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/cpp/ns3-ai-gym-env.cc#L78 NotifySimulationEnd on C++ side] and then [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L270 'done' becoming true] when [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L294 Python steps].
* Created '''Python binding''' for accessing the shared structure containing '''serialized message string'''. Binding that structure containing array is similar to binding a common structure, except that the array is specially treated to convert its contents to Python's '''memoryview'''. With memoryview, Python side can read and write to the array seamlessly, like what you can do in C++ with std::array.
** Obtaining the memoryview in binding: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/msg_py_binding.cc#L33 code]
** Note: arrays of different lengths need different memoryview objects on the Python side. In the above code, get_buffer returns the part of the buffer that is actually used (for reading), while get_buffer_full returns the buffer at its full length (for writing). Example usage in Ns3Env (the Python side Gym environment created with gym.make): [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L115 array read] and [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L130 array write]
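
The two buffer views can be illustrated in plain Python. The sketch below is not the Pybind11 binding itself: a <code>bytearray</code> stands in for the uint8_t array in shared memory, the <code>MsgBuffer</code> class and its <code>size</code> field are hypothetical, and only the names <code>get_buffer</code> and <code>get_buffer_full</code> are taken from the binding described above.

```python
class MsgBuffer:
    """Hypothetical stand-in for the shared struct: fixed-capacity bytes plus bytes in use."""

    def __init__(self, capacity):
        self._storage = bytearray(capacity)
        self.size = 0   # number of bytes currently holding a message

    def get_buffer(self):
        # View of the bytes actually in use (for reading a received message).
        return memoryview(self._storage)[: self.size]

    def get_buffer_full(self):
        # View of the whole buffer (for writing a new message).
        return memoryview(self._storage)


def write_message(buf, payload):
    """Copy a serialized message into the buffer through the full-length view."""
    full = buf.get_buffer_full()
    if len(payload) > len(full):
        raise ValueError("payload exceeds buffer capacity")
    full[: len(payload)] = payload   # writes through to the underlying storage
    buf.size = len(payload)


def read_message(buf):
    """Return a copy of the bytes currently in use."""
    return bytes(buf.get_buffer())
```

Because a memoryview refers to the underlying storage instead of copying it, writing through the full-length view changes the buffer in place, which is the property that lets the Python side fill the shared array directly.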

== Phase 2 ==

=== Examples and documentation update ===

To demonstrate the usage of the message interface and Gym interface, all existing examples are updated to use the new interfaces. Also, a new example "Multi-BSS" is created to benchmark the performance of vector interface. All of them can be successfully built using the "./ns3 build" command with the updated Cmake files, without needing to copy the examples to scratch folder. Updated examples and the interfaces supported by them are listed below:
* A-Plus-B example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/a-plus-b directory]): In this example, C++ side starts by setting 2 random numbers between 0 and 10 in shared memory. Then, Python side gets the numbers and sets the sum of the numbers in shared memory (in another region). Finally, C++ gets the sum that Python set. The procedure is analogous to C++ passing RL states to Python and Python passing RL actions back to C++, and is repeated many times. Supported interfaces:
** Struct-based message interface
** Vector-based message interface
** Gym interface
* LTE-CQI example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/lte-cqi directory]): CQI prediction example. The original work is done based on 5G NR branch in ns-3, and previous developers have made some changes to make it also run in LTE codebase in ns-3 mainline. Supported interfaces:
** Struct-based message interface
* Multi-BSS example ('''new example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/multi-bss directory]): The example is based on and modified from [https://gitlab.com/juanvleonr/ns-3-dev/-/tree/clean-tgax?ref_type=heads juanvleonr's clean-tgax branch]. The C++ side simulates a VR gaming scenario shown below, in which 4 BSSs operate in separate apartments in a 2 by 2 grid. Each BSS contains 1 AP and 4 STAs. One of the STAs in the first BSS is a VR device generating bursty UL traffic, while the other devices have normal UL traffic. Supported interfaces:
** Struct-based message interface (available at [https://github.com/ShenMuyuan/ns3-ai/tree/dd8dd3a489f8faf8a380841b73c250d23c1a3710/examples/multi-bss the benchmarking branch])
** Vector-based message interface
* Rate-Control example ('''updated example''') (including constant rate & Thompson Sampling) ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/rate-control directory]): There are existing models of constant rate and Thompson sampling algorithms in Wi-Fi module. Here they are implemented in Python to show how to develop a new rate control algorithm for the Wi-Fi module using ns3-ai. Supported interfaces:
** Struct-based message interface
* RL-TCP example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/rl-tcp directory]): This example applies Q-learning algorithms (Q-learning and deep Q-learning) to TCP congestion control for real-time changes in the environment of network transmission. By strengthening the learning management sliding window and threshold size, the network can get better throughput and smaller delay. Supported interfaces:
** Struct-based message interface
** Gym interface

Documents are updated along with the examples. Apart from all the README.md in example directories, I added [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/docs/install.md instruction for installation], [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/README.md message interface tutorial] and [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/README.md Gym interface tutorial] as separate documents linked to the [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/README.md updated root README.md].

=== Pure C++ example ===

In the development of a pure C++-based ML framework example, I tried to rewrite the LTE-CQI example (originally using tensorflow as Python-based ML framework) to utilize [https://www.tensorflow.org/install/lang_c TensorFlow C API], and the RL-TCP example (originally using torch as Python-based ML framework) to employ [https://pytorch.org/cppdocs/ PyTorch C++ API]. Unfortunately, only the latter succeeded. The pure C++ version of LTE-CQI failed because there was limited support for gradients and neural networks in TensorFlow's C API. So, for TensorFlow C I only provide [https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/lte-cqi/pure-cpp an example that checks libtensorflow version]. Although I succeeded in converting Python code to C++ in [https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/rl-tcp/pure-cpp RL-TCP example], the process was slow and difficult due to the lack of official documents and examples. For instance, the C++ API doesn't provide the useful load_state_dict function for copying policy net parameters to the target net. It took me a while to find the equivalent C++ functions for that (torch::save and torch::load, and the module must be defined with the TORCH_MODULE macro).

I also wrote a guide on how to use C++-based ML frameworks in ns-3 (by installing in ns3-ai): [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/docs/using-pure-cpp.md here]

=== Benchmarking ===

I benchmarked three items:

# '''Gym interface vs ns3-gym''' in terms of '''transmission time''': This benchmark is based on the RL-TCP example. It measures the CPU cycle count during C++ to Python and Python to C++ data transmissions, and compares the mean and standard deviation of cycles. Results show that in both directions, the transmission time of ns3-ai's Gym interface is '''more than 15 times shorter''' than that of ns3-gym ('''shorter is better''').
# '''Vector-based vs. struct-based''' message interface in terms of '''transmission time''': The benchmark is based on the Multi-BSS example, on the benchmark_vector branch. Unfortunately, in terms of action transmission time (from C++'s beginning of write to Python's complete read), the vector-based interface is '''1.2 times longer''' than the struct-based one ('''shorter is better'''). The extra time is caused by Python's slow reading of vectors. Measurements show that in reading rxPower (received power in nodes in the first BSS) on the Python side, the vector interface spends 20% to 50% more time than the struct interface.
#* To deal with the slow vector access on Python side in the future, '''one possible solution is to integrate Eigen''' on C++ side and use existing Eigen-Python bindings like pybind11's Eigen support or eigenpy to convert linear algebra types into numpy or scipy types.
# '''Pure C++ vs. struct-based message interface''' in terms of '''processing time''': The benchmark is based on the pure C++ (libtorch) and message interface (PyTorch) versions of the RL-TCP example. We compare the processing time (i.e. transmission time + DRL algorithm time for the message interface, DRL algorithm time for pure C++) for the two interfaces, including the mean and the standard deviation. Results show that the processing time of the pure C++ implementation is '''less than half''' that of the message interface implementation ('''shorter is better''').

See [https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/docs/benchmarking ns3-ai benchmarking documentation] for more detailed information.

Overall, the Gym interface is much faster than ns3-gym, and the pure C++ interface is more efficient than message interface. The vector interface needs to be enhanced in the future, especially in the optimization of Python side access.

= Build and Run the Code =

A detailed guide on how to set up the ns3-ai module is [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/docs/install.md here]. You must install ns-3 before installing ns3-ai. To test ns3-ai, you can build and run the provided examples (listed in the 'Phase 2' section above) according to their documentation.

= Proposal vs. Actual Work =

A few things were mentioned in the proposal but were not completed in my actual work:

# The LTE-handover example, which was intended to be an example using vector interface, similar to Multi-BSS. This example has not started because of limited time.
# Support for std::string in shared memory. Development for this support was postponed because the vector interface had the highest priority. Also, it's not considered a 'must do' in the project.
# Pure C++ ML example using TensorFlow C API. This has failed because of inadequate C API for gradients and neural networks, as mentioned above in 'Pure C++ example' section.

= Future Works =

# Add more examples, such as LTE-handover, to ns3-ai for better demonstration of the tool.
# Optimize the vector-based message interface to reach its full potential on transmitting vectors or matrices of data.

= Acknowledgments =

I extend my heartfelt gratitude to my mentors Hao and Collin for their invaluable suggestions and comments that have guided me through the challenges during the GSoC 2023. Collaborating with the ns-3 community has been an enriching experience, expanding not only my technical knowledge but also fostering my skills in communication and oral presentation. At the same time, I am also very grateful to my teachers Prof. Yayu Gao and Prof. Xiaojun Hei at HUST, who have provided me with a lot of encouragement when I encountered difficulties. Additionally, I would like to express my appreciation to Google for offering this remarkable opportunity.

GSOC2023ns3-aiFinalReport

2023-09-11T15:19:13Z

Muyuan:

{{TOC}}

Back to [[GSOC2023ns3-ai]] (page containing my weekly updates, not the final report)

= Project Overview =

* '''Project Name:''' ns3-ai enhancements
* '''Student:''' Muyuan Shen
* '''Mentors:''' Collin Brady and Hao Yin

== Project Goals ==

The main focus of this project is to '''optimize performance''' and '''improve usability''' of the '''ns3-ai module''', which facilitates the connection between ns-3 and Python-based ML frameworks using shared memory.

To accomplish this goal, the project will '''introduce additional APIs that support data structures such as vector''' in shared memory IPC. This will effectively reduce the required interaction between C++ and Python, resulting in improved performance.
Also, the project will '''integrate Gymnasium API''' support like ns3-gym's, but with a shared-memory-based backend, to turn ns-3 into an environment that agents can efficiently and seamlessly interact with.
In addition, the project will '''enhance the existing examples, documentation and tutorials''', while also integrating new examples that cover scenarios like Multi-BSS in VR. By doing so, users will have more comprehensive resources at their disposal.
Furthermore, the project aims to '''provide examples utilizing pure C++-based ML frameworks'''. This will offer researchers more options for integrating with ML.

The overall aim of the project is to expand and accelerate the capabilities of the ns3-ai module, enabling users to simulate and analyze network related algorithms with enhanced efficiency and flexibility.

= Merge Requests and Commits =

Throughout the project, my development was based on the [https://github.com/ShenMuyuan/ns3-ai/tree/cmake cmake branch] of [https://github.com/hust-diangroup/ns3-ai ns3-ai].
I created a single MR that contains all my work, to be merged into the [https://github.com/hust-diangroup/ns3-ai/tree/cmake upstream cmake branch]. In this MR, there are 110+ commits by me, with author name 'ShenMuyuan', 'Mu-YuanShen' or 'eicsmy'. The cmake branch will be merged to upstream by my mentor.
* Why the branch is named "cmake": because one of my early tasks was to add CMake support for ns3-ai (to be compatible with ns-3.36+). During GSoC I worked on another branch named "improvements", but it was eventually merged into the cmake branch.

{| class="wikitable"
|+ Merge Requests
|-
! No. !! Name !! Status
|-
| [1] || [https://github.com/hust-diangroup/ns3-ai/pull/97 merge to cmake branch] || Open
|}

= Project Details =

'''Note: Each URL shown below, if it is for my source code, points to the contents as of my last commit during the GSoC period.'''

== Community Bonding Period ==

During the community bonding period, I started bi-weekly meetings with my mentors and we decided on the project plan: prioritize the development of new interfaces, then develop more examples and enhance the documentation.

There are two new interfaces: the vector interface (later called the vector-based message interface, as it shares some fundamentals with the struct-based message interface) and the Gym interface. We also discussed some details of new examples such as LTE-handover and Multi-BSS.

I also read the ns3-ai code thoroughly to understand its IPC principles and learned some reinforcement learning basics.

== Phase 1 ==

=== std::vector support ===

Adding std::vector to shared memory is not easy with ns3-ai's original design, because Python's ctypes library does not support STL templates (it only supports C structures and functions). To support vectors, I refactored the original model completely, replacing ctypes with the Boost C++ libraries, which are more flexible for interprocess communication. My work includes:
* Utilized Boost's '''boost::interprocess::managed_shared_memory''' to store data (as well as synchronization variables) in shared memory. This shared segment can be used for '''data transmission between C++ and Python'''. The two directions, C++-to-Python and Python-to-C++, occupy two different regions in shared memory. It also supports a '''custom memory allocator for STL''', an instance of boost::interprocess::allocator, which ensures that when an STL container allocates new memory, that memory comes from the shared memory segment rather than from the process heap.
** The shared memory creation can be found in the constructor of Ns3AiMsgInterfaceImpl: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-msg-interface.h#L60 code]
* Developed a '''spinlock-based semaphore''' to synchronize read and write operations in shared memory. The original synchronization method works, but its "version number" concept and "control block" data structures may confuse and distract beginners. Also, the "version number" is just a complex implementation of the well-known semaphore. To improve ease of use and code readability, I created a semaphore, based on Boost's semaphore, that '''only spins but does not sleep while waiting'''. Its performance is comparable to the original, with '''better readability and usability'''.
** The semaphore operations and their implementation can be found in structure Ns3AiSemaphore: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-semaphore.h#L28 code]
** The usage of the semaphore in Ns3AiMsgInterfaceImpl: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-msg-interface.h#L194 code] and more usage below
* Built the '''vector-based''' interface with '''multiple configurable options'''. The vector interface parallels the struct interface in creation and usage, and users can set an attribute early in their code to '''choose one of the interfaces'''. If the vector interface is chosen, the C++-to-Python and Python-to-C++ vectors are created in shared memory and initially contain no elements, so users must call resize or push_back to adjust their length before use. Another attribute controls whether the interface '''handles simulation end'''. If that attribute is set, the interface performs a simple protocol to notify the Python side when the C++ side simulation finishes. Other configurable attributes include the memory segment size and the names of objects constructed in shared memory.
** Note: the attributes are not part of the ns-3 attribute system, because Ns3AiMsgInterface is a Singleton rather than an Object.
** Attributes setting in Ns3AiMsgInterface: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-msg-interface.h#L306 code] and more setting below
** How the protocol works when the interface is destroyed: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-msg-interface.h#L126 code]
* Provided '''Python binding boilerplate code''' in examples. The Python side accesses the shared memory and the objects in it (vectors or structs) via C++ functions exposed to Python. The '''exposure of C++ class functions and members''' is achieved with '''Pybind11''', a lightweight Python binding library. The C++ binding code, linked with Pybind11, is compiled into a dynamically linked library that Python can import as a module. Because the C++ side interface is template-based and Python does not support templates natively, the Python binding module needs to be generated separately for every program (the generation is handled by a CMake target dependency, so it is seamless). Although the binding contains many lines of C++ code and is difficult to write from scratch, users can '''adapt existing binding code''' to generate Python binding modules quickly, and I provide plenty of boilerplate for that (the *_py.cc files in all examples).
** Some of the example boilerplate code: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/a-plus-b/use-msg-stru/apb_py.cc#L29 binding code for struct-based message interface in A-Plus-B example], [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/multi-bss/multi_bss_py.cc#L33 binding code for vector-based message interface in Multi-BSS example]
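
The synchronization idea described above (two regions, one per direction, each guarded by a semaphore that spins instead of sleeping) can be illustrated with a small, self-contained Python sketch. This is not the ns3-ai implementation: threads and plain lists stand in for the two processes and the shared memory regions, and the names SpinSemaphore, TwoWayChannel and run_a_plus_b are hypothetical.

```python
import threading


class SpinSemaphore:
    """Counting semaphore whose wait() busy-waits instead of sleeping (illustrative only)."""

    def __init__(self, initial=0):
        self._count = initial
        self._guard = threading.Lock()  # protects _count; held only briefly

    def post(self):
        with self._guard:
            self._count += 1

    def wait(self):
        while True:  # spin until a unit becomes available
            if self._count > 0:
                with self._guard:
                    if self._count > 0:  # re-check under the lock
                        self._count -= 1
                        return


class TwoWayChannel:
    """Two separate regions, one per direction, each with its own semaphore."""

    def __init__(self):
        self.cpp2py = []  # stands in for the C++-to-Python region
        self.py2cpp = []  # stands in for the Python-to-C++ region
        self.cpp2py_ready = SpinSemaphore(0)
        self.py2cpp_ready = SpinSemaphore(0)


def run_a_plus_b(pairs):
    """Exchange (a, b) -> a + b over the channel, one round per pair."""
    chan = TwoWayChannel()
    sums = []

    def agent_side():  # plays the role of the Python process
        for _ in pairs:
            chan.cpp2py_ready.wait()
            a, b = chan.cpp2py
            chan.py2cpp[:] = [a + b]
            chan.py2cpp_ready.post()

    worker = threading.Thread(target=agent_side)
    worker.start()
    for a, b in pairs:  # plays the role of the C++ simulation
        chan.cpp2py[:] = [a, b]
        chan.cpp2py_ready.post()
        chan.py2cpp_ready.wait()
        sums.append(chan.py2cpp[0])
    worker.join()
    return sums


if __name__ == "__main__":
    print(run_a_plus_b([(1, 2), (3, 4), (5, 5)]))
```

Spinning trades CPU time for latency: the waiting side notices new data as soon as it is posted, which suits the tight request/response loop between a simulation and an agent.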

=== Gymnasium API ===

The [https://gymnasium.farama.org/index.html Gymnasium API] for ns3-ai is designed to use shared memory rather than socket communication, which can provide faster data exchange than [https://github.com/tkn-tub/ns3-gym ns3-gym] does. While much of the [https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface Gym interface] code comes from ns3-gym's repository, I made substantial changes to give it a shared memory backend. My work includes:
* Modified OpenGymInterface to '''use Ns3AiMsgInterface for IPC'''. OpenGymInterface was created by the ns3-gym developers and provides code to create Gym-compatible environments in ns-3. It contains functions to get the state and action spaces, observe the environment in ns-3 and execute actions (for example, changing parameters in the simulation). Those functions use callbacks registered by OpenGymEnv at runtime. For the callbacks to work, a custom environment must inherit from OpenGymEnv and implement class methods such as GetActionSpace, GetObservationSpace, GetObservation and ExecuteActions. All states and actions are serialized with Google's Protocol Buffers, then transmitted and de-serialized by the peer. My change replaces the ZeroMQ socket's send and receive functions with Ns3AiMsgInterface's send and receive functions, and ensures that Ns3AiMsgInterface is properly initialized. The underlying message interface for transmitting serialized messages is struct-based. The struct contains a buffer (uint8_t array) and its capacity.
** Example of my changed part: [https://github.com/tkn-tub/ns3-gym/blob/6007f4b3811af0cffcacf9a6151e5b9d2f4ef3ae/model/opengym_interface.cc#L190 before (in ns3-gym's repo)], [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/cpp/ns3-ai-gym-interface.cc#L104 after (in my ns3-ai repo)]
** Initialization of Ns3AiMsgInterface: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/cpp/ns3-ai-gym-interface.cc#L56 code]
** Note: in the above configuration, handling finish is set to false because the protocol of notifying Python side that C++ side has finished is unnecessary for Gym. Gym interface has its own protocol for handling finish, which is [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/cpp/ns3-ai-gym-env.cc#L78 NotifySimulationEnd on C++ side] and then [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L270 'done' becoming true] when [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L294 Python steps].
* Created a '''Python binding''' for accessing the shared structure containing the '''serialized message string'''. Binding a structure that contains an array is similar to binding an ordinary structure, except that the array is treated specially: its contents are exposed as a Python '''memoryview'''. With a memoryview, the Python side can read and write the array seamlessly, much as one would with std::array in C++.
** Obtaining the memoryview in binding: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/msg_py_binding.cc#L33 code]
** Note: arrays of different lengths need different memoryview objects on the Python side. In the above code, get_buffer returns the part of the buffer that is actually used (for reading), while get_buffer_full returns the buffer at its full length (for writing). Example usage in Ns3Env (the Python side Gym environment created with gym.make): [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L115 array read] and [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L130 array write]
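
The difference between the two views mentioned in the note can be sketched in pure Python. The method names get_buffer and get_buffer_full follow the text above, but everything else (the MsgBuffer class, its size field, and the helper functions) is a hypothetical stand-in: a bytearray plays the role of the uint8_t array that the real binding exposes from shared memory.

```python
class MsgBuffer:
    """Fixed-capacity byte buffer with a 'used length', mimicking the shared struct."""

    def __init__(self, capacity):
        self._storage = bytearray(capacity)  # stands in for the uint8_t array
        self.size = 0                        # number of bytes currently in use

    def get_buffer_full(self):
        # Full-length view: used by the writer, which may fill up to capacity.
        return memoryview(self._storage)

    def get_buffer(self):
        # View of the used part only: used by the reader.
        return memoryview(self._storage)[: self.size]


def write_message(msg, payload):
    """Copy a serialized payload into the buffer and record its length."""
    full = msg.get_buffer_full()
    if len(payload) > len(full):
        raise ValueError("payload exceeds buffer capacity")
    full[: len(payload)] = payload  # writes through to the underlying storage
    msg.size = len(payload)


def read_message(msg):
    """Return a copy of the bytes that are actually in use."""
    return bytes(msg.get_buffer())


if __name__ == "__main__":
    msg = MsgBuffer(16)
    write_message(msg, b"hello")
    print(read_message(msg), len(msg.get_buffer()), len(msg.get_buffer_full()))
```

Because a memoryview has a fixed length, the reader and the writer each need a view of the appropriate size, which is why two accessors exist.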

== Phase 2 ==

=== Examples and documentation update ===

To demonstrate the usage of the message interface and the Gym interface, all existing examples were updated to use the new interfaces. In addition, a new example, "Multi-BSS", was created to benchmark the performance of the vector interface. All of them can be built with the "./ns3 build" command using the updated CMake files, without copying the examples to the scratch folder. The updated examples and the interfaces they support are listed below:
* A-Plus-B example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/a-plus-b directory]): In this example, C++ side starts by setting 2 random numbers between 0 and 10 in shared memory. Then, Python side gets the numbers and sets the sum of the numbers in shared memory (in another region). Finally, C++ gets the sum that Python set. The procedure is analogous to C++ passing RL states to Python and Python passing RL actions back to C++, and is repeated many times. Supported interfaces:
** Struct-based message interface
** Vector-based message interface
** Gym interface
* LTE-CQI example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/lte-cqi directory]): CQI prediction example. The original work was based on the 5G NR branch of ns-3, and previous developers made some changes so that it also runs on the LTE codebase in ns-3 mainline. Supported interfaces:
** Struct-based message interface
* Multi-BSS example ('''new example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/multi-bss directory]): The example is based on and modified from [https://gitlab.com/juanvleonr/ns-3-dev/-/tree/clean-tgax?ref_type=heads juanvleonr's clean-tgax branch]. The C++ side simulates a VR gaming scenario shown below, in which 4 BSSs operate in separate apartments in a 2 by 2 grid. Each BSS contains 1 AP and 4 STAs. One of the STAs in the first BSS is a VR device generating burst UL traffic, while the other devices have normal UL traffic. Supported interfaces:
** Struct-based message interface (available at [https://github.com/ShenMuyuan/ns3-ai/tree/dd8dd3a489f8faf8a380841b73c250d23c1a3710/examples/multi-bss the benchmarking branch])
** Vector-based message interface
* Rate-Control example ('''updated example''') (including constant rate & Thompson Sampling) ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/rate-control directory]): The Wi-Fi module already contains models of the constant rate and Thompson sampling algorithms. Here they are implemented in Python to show how to develop a new rate control algorithm for the Wi-Fi module using ns3-ai. Supported interfaces:
** Struct-based message interface
* RL-TCP example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/rl-tcp directory]): This example applies Q-learning algorithms (Q-learning and deep Q-learning) to TCP congestion control so that it can react to real-time changes in the network transmission environment. By using reinforcement learning to manage the sliding window and threshold size, the network can achieve better throughput and smaller delay. Supported interfaces:
** Struct-based message interface
** Gym interface
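
As a rough illustration of how the A-Plus-B exchange in the list above maps onto a Gymnasium-style loop, the sketch below mimics the reset/step signatures of the Gymnasium API in plain Python, without importing gymnasium and without any shared memory. The class name APlusBEnv, the reward rule, the fixed episode length and the choice of integer observations are all assumptions made for illustration; they are not taken from the ns3-ai example.

```python
import random


class APlusBEnv:
    """Gymnasium-style stand-in: observe two numbers, act by returning their sum."""

    def __init__(self, episode_length=5):
        self.episode_length = episode_length  # hypothetical fixed episode length
        self._rng = random.Random()
        self._steps = 0
        self._obs = (0, 0)

    def _draw(self):
        # Two random integers between 0 and 10 (integers are an assumption).
        self._obs = (self._rng.randint(0, 10), self._rng.randint(0, 10))
        return self._obs

    def reset(self, seed=None, options=None):
        # Gymnasium convention: reset returns (observation, info).
        if seed is not None:
            self._rng.seed(seed)
        self._steps = 0
        return self._draw(), {}

    def step(self, action):
        # Gymnasium convention: step returns
        # (observation, reward, terminated, truncated, info).
        a, b = self._obs
        reward = 1.0 if action == a + b else 0.0  # hypothetical reward rule
        self._steps += 1
        terminated = self._steps >= self.episode_length
        return self._draw(), reward, terminated, False, {}


def run_episode(env, seed=0):
    """Run one episode with an agent that always answers a + b."""
    obs, _ = env.reset(seed=seed)
    total = 0.0
    terminated = truncated = False
    while not (terminated or truncated):
        obs, reward, terminated, truncated, _ = env.step(obs[0] + obs[1])
        total += reward
    return total


if __name__ == "__main__":
    print(run_episode(APlusBEnv(episode_length=5)))
```

In the Gym interface version of the example, the observation and action would travel through the shared memory message interface instead of being produced in the same process.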

The documentation was updated along with the examples. Apart from the README.md files in the example directories, I added [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/docs/install.md installation instructions], a [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/README.md message interface tutorial] and a [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/README.md Gym interface tutorial] as separate documents linked from the [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/README.md updated root README.md].

=== Pure C++ example ===

To develop a pure C++-based ML framework example, I tried to rewrite the LTE-CQI example (originally using TensorFlow as its Python-based ML framework) to use the [https://www.tensorflow.org/install/lang_c TensorFlow C API], and the RL-TCP example (originally using PyTorch as its Python-based ML framework) to use the [https://pytorch.org/cppdocs/ PyTorch C++ API]. Unfortunately, only the latter succeeded. The pure C++ version of LTE-CQI failed because TensorFlow's C API has limited support for gradients and neural networks. So, for TensorFlow C, I only provide [https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/lte-cqi/pure-cpp an example that checks the libtensorflow version]. Although I succeeded in converting the Python code to C++ in the [https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/rl-tcp/pure-cpp RL-TCP example], the process was slow and difficult due to the lack of official documentation and examples. For instance, the C++ API doesn't provide the useful load_state_dict function for copying the policy net parameters to the target net. It took me a while to find an equivalent approach in C++ (torch::save and torch::load, with the module defined using the TORCH_MODULE macro).

I also wrote a guide on how to use C++-based ML frameworks in ns-3 (by installing in ns3-ai): [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/docs/using-pure-cpp.md here]

=== Benchmarking ===

I benchmarked three items:

# '''Gym interface vs. ns3-gym''' in terms of '''transmission time''': This benchmark is based on the RL-TCP example. It measures the CPU cycle count during C++-to-Python and Python-to-C++ data transmissions and compares the mean and standard deviation of the cycle counts. Results show that in both directions, the transmission time of ns3-ai's Gym interface is '''more than 15 times shorter''' than that of ns3-gym ('''shorter is better''').
# '''Vector-based vs. struct-based''' message interface in terms of '''transmission time''': This benchmark is based on the Multi-BSS example, on the benchmark_vector branch. Unfortunately, the action transmission time (from the start of the C++ write to the completion of the Python read) of the vector-based interface is '''1.2 times as long''' as that of the struct-based interface ('''shorter is better'''). The extra time is caused by Python's slow reading of vectors. Measurements show that when reading rxPower (the received power at nodes in the first BSS) on the Python side, the vector interface spent 20% to 50% more time than the struct interface.
#* To address the slow vector access on the Python side in the future, '''one possible solution is to integrate Eigen''' on the C++ side and use existing Eigen-Python bindings, such as pybind11's Eigen support or eigenpy, to convert linear algebra types into numpy or scipy types.
# '''Pure C++ vs. struct-based message interface''' in terms of '''processing time''': This benchmark is based on the pure C++ (libtorch) and message interface (PyTorch) versions of the RL-TCP example. We compare the mean and standard deviation of the processing time for the two approaches (transmission time + DRL algorithm time for the message interface, DRL algorithm time only for pure C++). Results show that the processing time of the pure C++ implementation is '''less than half''' that of the message interface implementation ('''shorter is better''').

See [https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/docs/benchmarking ns3-ai benchmarking documentation] for more detailed information.
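
The comparisons above rest on the mean and standard deviation of repeated timing samples. The following is a minimal sketch of that kind of summary, not the benchmarking code itself: it uses Python's time.perf_counter_ns as a stand-in for the CPU cycle counter used in the actual measurements, and the helper names time_calls, summarize and speedup are hypothetical.

```python
import statistics
import time


def time_calls(func, repeats):
    """Time `func` `repeats` times and return the elapsed nanoseconds per call."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter_ns()
        func()
        samples.append(time.perf_counter_ns() - start)
    return samples


def summarize(samples):
    """Return (mean, sample standard deviation) of the timing samples."""
    return statistics.mean(samples), statistics.stdev(samples)


def speedup(baseline, candidate):
    """How many times shorter the candidate's mean time is than the baseline's."""
    return statistics.mean(baseline) / statistics.mean(candidate)


if __name__ == "__main__":
    slow = time_calls(lambda: sum(range(20000)), repeats=50)
    fast = time_calls(lambda: sum(range(1000)), repeats=50)
    print("slow mean/std:", summarize(slow))
    print("fast mean/std:", summarize(fast))
    print("ratio of means:", speedup(slow, fast))
```

Reporting the standard deviation alongside the mean shows whether a difference between two interfaces is large compared with the run-to-run variation.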

Overall, the Gym interface is much faster than ns3-gym, and the pure C++ approach is more efficient than the message interface. The vector interface needs further work, especially to optimize access on the Python side.

= Build and Run the Code =

A detailed guide on how to set up the ns3-ai module is [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/docs/install.md here]. You must install ns-3 before installing ns3-ai. To test ns3-ai, you can build and run the provided examples (listed in the 'Phase 2' section above) according to their documentation.

= Proposal vs. Actual Work =

A few things were mentioned in the proposal but were not completed in my actual work:

# The LTE-handover example, which was intended to be an example using the vector interface, similar to Multi-BSS. This example was not started because of limited time.
# Support for std::string in shared memory. Development of this feature was postponed because the vector interface had the highest priority. Also, it was not considered a 'must do' in the project.
# Pure C++ ML example using the TensorFlow C API. This failed because the C API offers inadequate support for gradients and neural networks, as mentioned above in the 'Pure C++ example' section.

= Future Works =

# Add more examples, such as LTE-handover, to ns3-ai for better demonstration of the tool.
# Optimize the vector-based message interface to reach its full potential on transmitting vectors or matrices of data.

= Acknowledgments =

I extend my heartfelt gratitude to my mentors Hao and Collin for their invaluable suggestions and comments, which guided me through the challenges of GSoC 2023. Collaborating with the ns-3 community has been an enriching experience, expanding my technical knowledge and improving my communication and presentation skills. I am also very grateful to my teachers Prof. Yayu Gao and Prof. Xiaojun Hei at HUST, who gave me a lot of encouragement when I encountered difficulties. Additionally, I would like to express my appreciation to Google for offering this remarkable opportunity.

GSOC2023ns3-aiFinalReport

2023-09-11T15:13:00Z

Muyuan: update links

{{TOC}}

Back to [[GSOC2023ns3-ai]] (page containing my weekly updates, not the final report)

= Project Overview =

* '''Project Name:''' ns3-ai enhancements
* '''Student:''' Muyuan Shen
* '''Mentors:''' Collin Brady and Hao Yin

== Project Goals ==

The main focus of this project is to '''optimize performance''' and '''improve usability''' of the '''ns3-ai module''', which facilitates the connection between ns-3 and Python-based ML frameworks using shared memory.

To accomplish this goal, the project will '''introduce additional APIs that support data structures such as vector''' in shared memory IPC. This will effectively reduce the required interaction between C++ and Python, resulting in improved performance.
Also, the project will '''integrate Gymnasium API''' like ns3-gym's but has a shared-memory-based backend, to turn ns-3 into a environment that agents can efficiently and seamlessly interact with.
In addition, the project will '''enhance the existing examples, documentation and tutorials''', while also integrating new examples that cover scenarios like Multi-BSS in VR. By doing so, users will have more comprehensive resources at their disposal.
Furthermore, the project aims to '''provide examples utilizing pure C++-based ML frameworks'''. This will offer researchers more options for integrating with ML.

The overall aim of the project is to expand and accelerate the capabilities of the ns3-ai module, enabling users to simulate and analyze network related algorithms with enhanced efficiency and flexibility.

= Merge Requests and Commits =

Throughout the project, my development is based [https://github.com/ShenMuyuan/ns3-ai/tree/cmake cmake branch] branch of [https://github.com/hust-diangroup/ns3-ai ns3-ai].
I created a single MR that contain all my works to be merged into the [https://github.com/hust-diangroup/ns3-ai/tree/cmake upstream cmake branch]. In this MR, there are 110+ commits by me, with author name 'ShenMuyuan' or 'Mu-YuanShen' or 'eicsmy'. The cmake branch will be merged to upstream by my mentor.

{| class="wikitable"
|+ Merge Requests
|-
! No. !! Name !! Status
|-
| [1] || [https://github.com/hust-diangroup/ns3-ai/pull/97 merge to cmake branch] || Open
|}

= Project Details =

'''Note: Each URL showed below, if it is for my source code, points to contents as of my last commit during GSoC period.'''

== Community Bonding Period ==

During community bonding period, I started bi-weekly meetings with my mentors and we decided on the project plan, which is
prioritizing the development of new interfaces, than develop more examples & enhance documentations.

There are two new interfaces, including vector interface (later, we called it vector-based message interface, as it shared some fundamentals with
the struct-based message interface) and Gym interface. Also, we talked about some details of new examples like LTE-handover and Multi-BSS.

I also read the ns3-ai code thoroughly to understand its IPC principles and learned some reinforcement learning basics.

== Phase 1 ==

=== std::vector support ===

To add std::vector into shared memory is not easy with ns3-ai's original design, because Python's ctypes library does not provide STL templates
support (it can only support C structures and functions). In order to support vector, I refactored the original model completely, replacing ctypes with Boost C++ library which is more flexible for interprocess communication. My works include:
* Utilized Boost's '''boost::interprocess::managed_shared_memory''' to store data (as well as synchronization variables) in shared memory. This shared segment can be used for '''data transmission between C++ and Python'''. The two directions, C++-to-Python and Python-to-C++, occupies two different regions in shared memory. It also supports '''custom memory allocator for STL''', a instance of boost::interprocess::allocator, which ensures that when STL allocates new memory, that memory is come from the shared memory rather than other heap memory.
** The shared memory creation can be found in the constructor of Ns3AiMsgInterfaceImpl: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-msg-interface.h#L60 code]
* Developed '''spinlock-based semaphore''' to synchronize reads & writes operations in shared memory. The original synchronization method works, but the "version number" concept and the "control block" data structures may cause confusion and distraction for beginners. Also, the "version number" is just a complex implementation of the well-known semaphore. To improve ease of use and enhance code readability, I created a semaphore that '''only spins but does not sleep while waiting''' based on Boost's semaphore. It has performance comparable to the original with '''better readability and usability'''.
** The semaphore operations and their implementation can be found in structure Ns3AiSemaphore: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-semaphore.h#L28 code]
** The usage of the semaphore in Ns3AiMsgInterfaceImpl: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-msg-interface.h#L194 code] and more usage below
* Built the '''vector-based''' interface with '''multiple configurable options'''. The vector interface is in parallel with the struct interface in terms of creation and usage, and there is an attribute that users can set in early code in order to '''choose one of the interfaces'''. If the vector interface is chosen, the C++-to-Python and Python-to-C++ vectors are created in shared memory and will contain no elements. It requires users to call resize or push_back to adjust their length before use. Another attribute is whether the interface '''handles simulation end'''. If that attribute is set, the interface will perform a simple protocol to notify Python side when C++ side simulation finishes. Other configurable attributes include memory segment size and names of objects constructed in shared memory.
** Note: the attributes are not part of ns-3 attribute system, because Ns3AiMsgInterface is a Singleton rather than Object.
** Attributes setting in Ns3AiMsgInterface: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-msg-interface.h#L306 code] and more setting below
** How the protocol works when the interface is destroyed: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/ns3-ai-msg-interface.h#L126 code]
* Provided '''Python binding boilerplate code''' in examples. Python side accesses the shared memory and the objects in it (vectors or structs) via C++ functions exposed to Python. The '''exposure of C++ class functions and members''' is achieved with '''Pybind11''', a lightweight python binding library. The C++ binding code, linked with Pybind11, is compiled into dynamically-linked library that Python can import as a module. Because the C++ side interface is template-based and Python does not support template natively, the Python binding module needs to be separately generated for every program (the creation is done by a cmake target dependency so it's seamless). Although the binding contains many lines of C++ code and is difficult to write from scratch, users can '''modify from an existing binding code''' to generate Python binding modules quickly, and I provide many boilerplate on that (the *_py.cc files in all examples).
** Some of the example boilerplate code: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/a-plus-b/use-msg-stru/apb_py.cc#L29 binding code for struct-based message interface in A-Plus-B example], [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/multi-bss/multi_bss_py.cc#L33 binding code for vector-based message interface in Multi-BSS example]

=== Gymnasium API ===

The [https://gymnasium.farama.org/index.html Gymnasium API] for ns3-ai is aimed to be based on shared memory rather than sockets communication, which can provide faster data exchange than [https://github.com/tkn-tub/ns3-gym ns3-gym] does. While many of the [https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface Gym interface] code is from ns3-gym's repository, I made some substantial changes in order for it to have a shared memory backend. My works include:
* Modified OpenGymInterface to '''use Ns3AiMsgInterface for IPC'''. OpenGymInterface is created by ns3-gym developers, providing code to create Gym-compatible environments in ns-3. It contains functions to get state or action spaces, observe the environment in ns-3 and execute the actions (maybe changing parameters in simulation). Those function use callbacks registered by OpenGymEnv at runtime. To make callbacks work well, custom environment must inherit from OpenGymEnv and implement the class methods such as GetActionSpace, GetObservationSpace, GetObservation and ExecuteActions. All states and actions are serialized by Google's Protocol Buffers and then transmitted and de-serialized by the peer. What I did is changing the ZeroMQ socket's send & receive functions to Ns3AiMsgInterface's send & receive functions, and ensuring that Ns3AiMsgInterface is properly initialized. The underlying message interface for transmitting serialized messages is struct-based. The struct contains a buffer (uint8_t array) and its capacity.
** Example of my changed part: [https://github.com/tkn-tub/ns3-gym/blob/6007f4b3811af0cffcacf9a6151e5b9d2f4ef3ae/model/opengym_interface.cc#L190 before (in ns3-gym's repo)], [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/cpp/ns3-ai-gym-interface.cc#L104 after (in my ns3-ai repo)]
** Initialization of Ns3AiMsgInterface: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/cpp/ns3-ai-gym-interface.cc#L56 code]
** Note: in the above configuration, handling finish is set to false because the protocol of notifying Python side that C++ side has finished is unnecessary for Gym. Gym interface has its own protocol for handling finish, which is [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/cpp/ns3-ai-gym-env.cc#L78 NotifySimulationEnd on C++ side] and then [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L270 'done' becoming true] when [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L294 Python steps].
* Created '''Python binding''' for accessing the shared structure containing '''serialized message string'''. Binding that structure containing array is similar to binding a common structure, except that the array is specially treated to convert its contents to Python's '''memoryview'''. With memoryview, Python side can read and write to the array seamlessly, like what you can do in C++ with std::array.
** Obtaining the memoryview in binding: [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/msg_py_binding.cc#L33 code]
** Note: different length of array must have different memoryview object for Python to deal with. In the above code, get_buffer returns the buffer that is actually used (for reading), while get_buffer_full returns the buffer that has the full length (for writing). Example usage in Ns3Env (the Python side Gym environment created with gym.make): [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L115 array read] and [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L130 array write]

== Phase 2 ==

=== Examples and documentation update ===

To demonstrate the usage of the message interface and Gym interface, all existing examples are updated to use the new interfaces. Also, a new example "Multi-BSS" is created to benchmark the performance of vector interface. All of them can be successfully built using the "./ns3 build" command with the updated Cmake files, without needing to copy the examples to scratch folder. Updated examples and the interfaces supported by them are listed below:
* A-Plus-B example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/a-plus-b directory]): In this example, C++ side starts by setting 2 random numbers between 0 and 10 in shared memory. Then, Python side gets the numbers and sets the sum of the numbers in shared memory (in another region). Finally, C++ gets the sum that Python set. The procedure is analogous to C++ passing RL states to Python and Python passing RL actions back to C++, and is repeated many times. Supported interfaces:
** Struct-based message interface
** Vector-based message interface
** Gym interface
* LTE-CQI example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/lte-cqi directory]): CQI prediction example. The original work is done based on 5G NR branch in ns-3, and previous developers have made some changes to make it also run in LTE codebase in ns-3 mainline. Supported interfaces:
** Struct-based message interface
* Multi-BSS example ('''new example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/multi-bss directory]): The example is based on and modified from [https://gitlab.com/juanvleonr/ns-3-dev/-/tree/clean-tgax?ref_type=heads juanvleonr's clean-tgax branch]. The C++ side simulates the VR gaming scenario shown below, in which 4 BSSs operate in separate apartments in a 2 by 2 grid. Each BSS contains 1 AP and 4 STAs. One of the STAs in the first BSS is a VR device generating bursty UL traffic, while the other devices have normal UL traffic. Supported interfaces:
** Struct-based message interface (available at [https://github.com/ShenMuyuan/ns3-ai/tree/dd8dd3a489f8faf8a380841b73c250d23c1a3710/examples/multi-bss the benchmarking branch])
** Vector-based message interface
* Rate-Control example ('''updated example''') (including constant rate & Thompson Sampling) ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/rate-control directory]): There are existing models of constant rate and Thompson sampling algorithms in Wi-Fi module. Here they are implemented in Python to show how to develop a new rate control algorithm for the Wi-Fi module using ns3-ai. Supported interfaces:
** Struct-based message interface
* RL-TCP example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/rl-tcp directory]): This example applies Q-learning algorithms (Q-learning and deep Q-learning) to TCP congestion control so that it can adapt to real-time changes in the network transmission environment. By using reinforcement learning to manage the sliding window and threshold size, the network can achieve higher throughput and lower delay. Supported interfaces:
** Struct-based message interface
** Gym interface
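
The A-Plus-B round trip described in the list above can be sketched in a single Python process. This is an illustration of the data flow only: the two dictionaries stand in for the two shared memory regions, and the function names are hypothetical, not part of ns3-ai.

```python
import random


def cpp_side_write(cpp2py, rng):
    # Plays the role of the C++ side: put two random numbers in the C++-to-Python region
    cpp2py["a"] = rng.randint(0, 10)
    cpp2py["b"] = rng.randint(0, 10)


def python_side_step(cpp2py, py2cpp):
    # Plays the role of the Python side: read the numbers, write the sum to the other region
    py2cpp["sum"] = cpp2py["a"] + cpp2py["b"]


def run(iterations, seed=0):
    rng = random.Random(seed)
    cpp2py, py2cpp = {}, {}  # stand-ins for the two shared memory regions
    results = []
    for _ in range(iterations):
        cpp_side_write(cpp2py, rng)
        python_side_step(cpp2py, py2cpp)
        # The C++ side finally reads back the sum that Python set
        results.append((cpp2py["a"], cpp2py["b"], py2cpp["sum"]))
    return results


results = run(100)
```

In an RL setting, the pair (a, b) corresponds to the state sent to the agent and the sum corresponds to the action sent back.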

Documents are updated along with the examples. Apart from all the README.md in example directories, I added [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/docs/install.md instruction for installation], [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/msg-interface/README.md message interface tutorial] and [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/model/gym-interface/README.md Gym interface tutorial] as separate documents linked to the [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/README.md updated root README.md].

=== Pure C++ example ===

In developing a pure C++-based ML framework example, I tried to rewrite the LTE-CQI example (originally using TensorFlow as its Python-based ML framework) to use the [https://www.tensorflow.org/install/lang_c TensorFlow C API], and the RL-TCP example (originally using PyTorch as its Python-based ML framework) to use the [https://pytorch.org/cppdocs/ PyTorch C++ API]. Unfortunately, only the latter succeeded. The pure C++ version of LTE-CQI failed because TensorFlow's C API has limited support for gradients and neural networks. So, for the TensorFlow C API I only provide [https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/lte-cqi/pure-cpp an example that checks libtensorflow version]. Although I succeeded in converting the Python code to C++ in the [https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/examples/rl-tcp/pure-cpp RL-TCP example], the process was slow and difficult due to the lack of official documentation and examples. For instance, the C++ API doesn't provide the useful load_state_dict function for copying policy net parameters to the target net. It took me a while to find the equivalent C++ functions (torch::save and torch::load, and the module must be defined with the TORCH_MODULE macro).

I also wrote a guide on how to use C++-based ML frameworks in ns-3 (by installing in ns3-ai): [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/docs/using-pure-cpp.md here]

=== Benchmarking ===

I benchmarked three items:

# '''Gym interface vs. ns3-gym''' in terms of '''transmission time''': This benchmark is based on the RL-TCP example. It measures the CPU cycle count during C++-to-Python and Python-to-C++ data transmissions and compares the mean and standard deviation of the cycles. Results show that in both directions, the transmission time of ns3-ai's Gym interface is '''more than 15 times shorter''' than that of ns3-gym ('''shorter is better''').
# '''Vector-based vs. struct-based''' message interface in terms of '''transmission time''': The benchmark is based on the Multi-BSS example, on the benchmark_vector branch. Unfortunately, in terms of action transmission time (from the beginning of the C++ write to the completion of the Python read), the vector-based interface is '''1.2 times longer''' than the struct-based one ('''shorter is better'''). The extra time is caused by Python's slow reading of vectors. Measurements show that when reading rxPower (received power at nodes in the first BSS) on the Python side, the vector interface spent 20% to 50% more time than the struct interface.
#* To deal with the slow vector access on Python side in the future, '''one possible solution is to integrate Eigen''' on C++ side and use existing Eigen-Python bindings like pybind11's Eigen support or eigenpy to convert linear algebra types into numpy or scipy types.
# '''Pure C++ vs. struct-based message interface''' in terms of '''processing time''': The benchmark is based on the pure C++ (libtorch) and message interface (PyTorch) versions of the RL-TCP example. We compare the processing time (i.e. transmission time plus DRL algorithm time for the message interface, and DRL algorithm time alone for pure C++) of the two implementations, including the mean and the standard deviation. Results show that the processing time of the pure C++ implementation is '''less than half''' that of the message interface implementation ('''shorter is better''').

See [https://github.com/ShenMuyuan/ns3-ai/tree/21c9d4a30a7b20e3532a6bfac289768bf42acedc/docs/benchmarking ns3-ai benchmarking documentation] for more detailed information.
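
The comparison of mean and standard deviation used in the benchmarks above can be sketched as follows. The sample values are invented purely for illustration and are not measurements from this project. The function names are hypothetical as well.

```python
import statistics


def summarize(cycles):
    # Mean and sample standard deviation of a list of cycle counts
    return statistics.mean(cycles), statistics.stdev(cycles)


def time_ratio(baseline_cycles, candidate_cycles):
    # How many times longer the baseline takes, on average, than the candidate
    return statistics.mean(baseline_cycles) / statistics.mean(candidate_cycles)


# Made-up cycle counts, for illustration only
baseline = [400, 420, 380, 400]
candidate = [100, 90, 110, 100]

baseline_mean, baseline_std = summarize(baseline)
candidate_mean, candidate_std = summarize(candidate)
ratio = time_ratio(baseline, candidate)
```

Reporting the standard deviation alongside the mean shows whether a difference between two interfaces is larger than the run-to-run variation.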

Overall, the Gym interface is much faster than ns3-gym, and the pure C++ interface is more efficient than message interface. The vector interface needs to be enhanced in the future, especially in the optimization of Python side access.

= Build and Run the Code =

A detailed guide on how to set up the ns3-ai module is [https://github.com/ShenMuyuan/ns3-ai/blob/21c9d4a30a7b20e3532a6bfac289768bf42acedc/docs/install.md here]. You must install ns-3 before installing ns3-ai. To test ns3-ai, you can build and run the provided examples (listed in the 'Phase 2' section above) according to their documentation.

= Proposal vs. Actual Work =

A few things were mentioned in the proposal but were not completed in my actual work:

# The LTE-handover example, which was intended to be an example using the vector interface, similar to Multi-BSS. This example was not started because of limited time.
# Support for std::string in shared memory. Development of this support was postponed because the vector interface had the highest priority. Also, it was not considered a 'must do' in the project.
# Pure C++ ML example using the TensorFlow C API. This failed because the C API is inadequate for gradients and neural networks, as mentioned above in the 'Pure C++ example' section.

= Future Works =

# Add more examples, such as LTE-handover, to ns3-ai for better demonstration of the tool.
# Optimize the vector-based message interface to reach its full potential on transmitting vectors or matrices of data.

= Acknowledgments =

I extend my heartfelt gratitude to my mentors Hao and Collin for their invaluable suggestions and comments that have guided me through the challenges during the GSoC 2023. Collaborating with the ns-3 community has been an enriching experience, expanding not only my technical knowledge but also fostering my skills in communication and oral presentation. At the same time, I am also very grateful to my teachers Prof. Yayu Gao and Prof. Xiaojun Hei at HUST, who have provided me with a lot of encouragement when I encountered difficulties. Additionally, I would like to express my appreciation to Google for offering this remarkable opportunity.

GSOC2023ns3-aiFinalReport

2023-09-09T03:55:55Z

Muyuan:

{{TOC}}

Back to [[GSOC2023ns3-ai]] (page containing my weekly updates, not the final report)

= Project Overview =

* '''Project Name:''' ns3-ai enhancements
* '''Student:''' Muyuan Shen
* '''Mentors:''' Collin Brady and Hao Yin

== Project Goals ==

The main focus of this project is to '''optimize performance''' and '''improve usability''' of the '''ns3-ai module''', which facilitates the connection between ns-3 and Python-based ML frameworks using shared memory.

To accomplish this goal, the project will '''introduce additional APIs that support data structures such as vector''' in shared memory IPC. This will effectively reduce the required interaction between C++ and Python, resulting in improved performance.
Also, the project will '''integrate the Gymnasium API''' in the manner of ns3-gym but with a shared-memory-based backend, turning ns-3 into an environment that agents can interact with efficiently and seamlessly.
In addition, the project will '''enhance the existing examples, documentation and tutorials''', while also integrating new examples that cover scenarios like Multi-BSS in VR. By doing so, users will have more comprehensive resources at their disposal.
Furthermore, the project aims to '''provide examples utilizing pure C++-based ML frameworks'''. This will offer researchers more options for integrating with ML.

The overall aim of the project is to expand and accelerate the capabilities of the ns3-ai module, enabling users to simulate and analyze network related algorithms with enhanced efficiency and flexibility.

= Merge Requests and Commits =

Throughout the project, my development was based on my [https://github.com/ShenMuyuan/ns3-ai/tree/improvements improvements] branch of [https://github.com/hust-diangroup/ns3-ai ns3-ai]. The improvements branch originates from the [https://github.com/ShenMuyuan/ns3-ai/tree/cmake cmake branch], because I was fixing CMake compatibility problems before GSoC. I therefore created a single MR that contains all my work, to be merged into the cmake branch. In this MR, there are 120+ commits by me, with author name 'ShenMuyuan', 'Mu-YuanShen' or 'eicsmy'. The cmake branch will be merged upstream by my mentor.

{| class="wikitable"
|+ Merge Requests
|-
! No. !! Name !! Status
|-
| [1] || [https://github.com/hust-diangroup/ns3-ai/pull/97 merge to cmake branch] || Open
|}

= Project Details =

'''Note: Each URL shown below, if it is for my source code, points to contents as of my last commit during the GSoC period.'''

== Community Bonding Period ==

During the community bonding period, I started bi-weekly meetings with my mentors, and we decided on the project plan: prioritize the development of new interfaces, then develop more examples and enhance the documentation.

The two new interfaces are the vector interface (later called the vector-based message interface, as it shares some fundamentals with the struct-based message interface) and the Gym interface. We also talked about some details of new examples such as LTE-handover and Multi-BSS.

I also read the ns3-ai code thoroughly to understand its IPC principles and learned some reinforcement learning basics.

== Phase 1 ==

=== std::vector support ===

Adding std::vector to shared memory is not easy with ns3-ai's original design, because Python's ctypes library does not support STL templates (it only supports C structures and functions). To support vectors, I refactored the original model completely, replacing ctypes with the Boost C++ library, which is more flexible for interprocess communication. My work includes:
* Utilized Boost's '''boost::interprocess::managed_shared_memory''' to store data (as well as synchronization variables) in shared memory. This shared segment is used for '''data transmission between C++ and Python'''. The two directions, C++-to-Python and Python-to-C++, occupy two different regions in shared memory. It also supports a '''custom memory allocator for STL''', an instance of boost::interprocess::allocator, which ensures that when an STL container allocates new memory, that memory comes from the shared segment rather than from the ordinary heap.
** The shared memory creation can be found in the constructor of Ns3AiMsgInterfaceImpl: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/msg-interface/ns3-ai-msg-interface.h#L60 code]
* Developed a '''spinlock-based semaphore''' to synchronize read and write operations in shared memory. The original synchronization method works, but its "version number" concept and "control block" data structures can confuse and distract beginners. Moreover, the "version number" is essentially a more complex implementation of the well-known semaphore. To improve ease of use and code readability, I created a semaphore, based on Boost's semaphore, that '''only spins and never sleeps while waiting'''. Its performance is comparable to the original, with '''better readability and usability'''.
** The semaphore operations and their implementation can be found in structure Ns3AiSemaphore: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/msg-interface/ns3-ai-semaphore.h#L28 code]
** The usage of the semaphore in Ns3AiMsgInterfaceImpl: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/msg-interface/ns3-ai-msg-interface.h#L194 code] and more usage below
* Built the '''vector-based''' interface with '''multiple configurable options'''. The vector interface parallels the struct interface in terms of creation and usage, and users set an attribute early in their code to '''choose one of the interfaces'''. If the vector interface is chosen, the C++-to-Python and Python-to-C++ vectors are created in shared memory with no elements, so users must call resize or push_back to adjust their length before use. Another attribute controls whether the interface '''handles simulation end'''. If that attribute is set, the interface performs a simple protocol to notify the Python side when the C++ side simulation finishes. Other configurable attributes include the memory segment size and the names of objects constructed in shared memory.
** Note: the attributes are not part of ns-3 attribute system, because Ns3AiMsgInterface is a Singleton rather than Object.
** Attributes setting in Ns3AiMsgInterface: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/msg-interface/ns3-ai-msg-interface.h#L306 code] and more setting below
** How the protocol works when the interface is destroyed: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/msg-interface/ns3-ai-msg-interface.h#L126 code]
* Provided '''Python binding boilerplate code''' in examples. The Python side accesses the shared memory and the objects in it (vectors or structs) via C++ functions exposed to Python. The '''exposure of C++ class functions and members''' is achieved with '''pybind11''', a lightweight Python binding library. The C++ binding code, linked with pybind11, is compiled into a dynamically linked library that Python can import as a module. Because the C++ side interface is template-based and Python does not support templates natively, the Python binding module needs to be generated separately for every program (this is handled by a CMake target dependency, so it is seamless). Although the binding contains many lines of C++ code and is difficult to write from scratch, users can '''adapt existing binding code''' to generate Python binding modules quickly, and I provide boilerplate for this purpose (the *_py.cc files in all examples).
** Some of the example boilerplate code: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/examples/a-plus-b/use-msg-stru/apb_py.cc#L29 binding code for struct-based message interface in A-Plus-B example], [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/examples/multi-bss/multi_bss_py.cc#L33 binding code for vector-based message interface in Multi-BSS example]
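
The synchronization idea described in the list above, two regions guarded by semaphores that spin instead of sleeping, can be sketched with Python threads. This is an analogy only: the class SpinSemaphore and the function run_exchange are hypothetical, the dictionaries stand in for the shared memory regions, and the real implementation is the Boost-based C++ code linked above.

```python
import threading


class SpinSemaphore:
    """Hypothetical counting semaphore whose wait() polls in a loop instead of blocking."""

    def __init__(self, count=0):
        self._count = count
        self._lock = threading.Lock()  # protects the counter only

    def post(self):
        with self._lock:
            self._count += 1

    def wait(self):
        while True:  # spin until a unit is available
            with self._lock:
                if self._count > 0:
                    self._count -= 1
                    return


def run_exchange(rounds):
    cpp2py = {"value": None}  # stand-in for the C++-to-Python region
    py2cpp = {"value": None}  # stand-in for the Python-to-C++ region
    cpp2py_ready = SpinSemaphore()
    py2cpp_ready = SpinSemaphore()
    received = []

    def python_side():
        for _ in range(rounds):
            cpp2py_ready.wait()                   # wait for data from the "C++ side"
            py2cpp["value"] = cpp2py["value"] * 2
            py2cpp_ready.post()                   # signal that the reply is ready

    worker = threading.Thread(target=python_side)
    worker.start()
    for i in range(rounds):
        cpp2py["value"] = i
        cpp2py_ready.post()
        py2cpp_ready.wait()
        received.append(py2cpp["value"])
    worker.join()
    return received


result = run_exchange(5)
```

Each direction has its own region and its own semaphore, so a writer never overwrites data that the peer has not yet consumed.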

=== Gymnasium API ===

The [https://gymnasium.farama.org/index.html Gymnasium API] for ns3-ai is based on shared memory rather than socket communication, which can provide faster data exchange than [https://github.com/tkn-tub/ns3-gym ns3-gym] does. While much of the [https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface Gym interface] code comes from ns3-gym's repository, I made substantial changes to give it a shared memory backend. My work includes:
* Modified OpenGymInterface to '''use Ns3AiMsgInterface for IPC'''. OpenGymInterface was created by the ns3-gym developers and provides code to create Gym-compatible environments in ns-3. It contains functions to get the state and action spaces, observe the environment in ns-3 and execute actions (possibly changing parameters in the simulation). Those functions use callbacks registered by OpenGymEnv at runtime. For the callbacks to work, a custom environment must inherit from OpenGymEnv and implement class methods such as GetActionSpace, GetObservationSpace, GetObservation and ExecuteActions. All states and actions are serialized with Google's Protocol Buffers, transmitted, and then deserialized by the peer. I replaced the ZeroMQ socket's send and receive functions with Ns3AiMsgInterface's send and receive functions, and ensured that Ns3AiMsgInterface is properly initialized. The underlying message interface for transmitting serialized messages is struct-based. The struct contains a buffer (a uint8_t array) and its capacity.
** Example of my changed part: [https://github.com/tkn-tub/ns3-gym/blob/6007f4b3811af0cffcacf9a6151e5b9d2f4ef3ae/model/opengym_interface.cc#L190 before (in ns3-gym's repo)], [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/cpp/ns3-ai-gym-interface.cc#L104 after (in my ns3-ai repo)]
** Initialization of Ns3AiMsgInterface: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/cpp/ns3-ai-gym-interface.cc#L56 code]
** Note: in the above configuration, handling finish is set to false because the protocol of notifying Python side that C++ side has finished is unnecessary for Gym. Gym interface has its own protocol for handling finish, which is [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/cpp/ns3-ai-gym-env.cc#L78 NotifySimulationEnd on C++ side] and then [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L270 'done' becoming true] when [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L294 Python steps].
* Created a '''Python binding''' for accessing the shared structure that contains the '''serialized message string'''. Binding a structure that contains an array is similar to binding an ordinary structure, except that the array is exposed to Python as a '''memoryview'''. With a memoryview, the Python side can read and write the array in place, much as C++ code does with std::array.
** Obtaining the memoryview in binding: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/py/msg_py_binding.cc#L33 code]
** Note: arrays of different lengths need different memoryview objects on the Python side. In the above code, get_buffer returns a view of the part of the buffer that is actually in use (for reading), while get_buffer_full returns a view of the full-length buffer (for writing). Example usage in Ns3Env (the Python-side Gym environment created with gym.make): [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L115 array read] and [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L130 array write]
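
The Gym-side finish protocol mentioned in the notes above (the simulation end is signalled on the C++ side, and 'done' then becomes true on a Python step) can be sketched without ns-3 or Gymnasium installed. Everything here is a hypothetical stand-in: FakeSimulation imitates the C++ side, SketchEnv imitates the Python environment, and only the five-element return value of step follows the Gymnasium convention (observation, reward, terminated, truncated, info).

```python
class FakeSimulation:
    """Hypothetical stand-in for the C++ side; ends after a fixed number of steps."""

    def __init__(self, total_steps):
        self.remaining = total_steps
        self.ended = False

    def advance(self, action):
        self.remaining -= 1
        if self.remaining == 0:
            self.ended = True  # plays the role of the simulation-end notification
        observation = {"remaining": self.remaining}
        reward = 0.0
        return observation, reward


class SketchEnv:
    """Hypothetical environment exposing a Gymnasium-style step method."""

    def __init__(self, sim):
        self.sim = sim

    def step(self, action):
        observation, reward = self.sim.advance(action)
        done = self.sim.ended  # becomes true on the step that sees the end signal
        return observation, reward, done, False, {}


def run_episode(env):
    steps = 0
    done = False
    while not done:
        _, _, done, _, _ = env.step(action=0)
        steps += 1
    return steps


steps_taken = run_episode(SketchEnv(FakeSimulation(3)))
```

The agent loop needs no separate end-of-simulation handshake, which is why the message interface's own finish handling is disabled for the Gym interface.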

== Phase 2 ==

=== Examples and documentation update ===

To demonstrate the usage of the message interface and the Gym interface, all existing examples were updated to use the new interfaces. In addition, a new example, "Multi-BSS", was created to benchmark the performance of the vector interface. All of them can be built with the "./ns3 build" command using the updated CMake files, without copying the examples to the scratch folder. The updated examples and the interfaces they support are listed below:
* A-Plus-B example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/examples/a-plus-b directory]): In this example, C++ side starts by setting 2 random numbers between 0 and 10 in shared memory. Then, Python side gets the numbers and sets the sum of the numbers in shared memory (in another region). Finally, C++ gets the sum that Python set. The procedure is analogous to C++ passing RL states to Python and Python passing RL actions back to C++, and is repeated many times. Supported interfaces:
** Struct-based message interface
** Vector-based message interface
** Gym interface
* LTE-CQI example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/examples/lte-cqi directory]): CQI prediction example. The original work is done based on 5G NR branch in ns-3, and previous developers have made some changes to make it also run in LTE codebase in ns-3 mainline. Supported interfaces:
** Struct-based message interface
* Multi-BSS example ('''new example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/examples/multi-bss directory]): The example is based on and modified from [https://gitlab.com/juanvleonr/ns-3-dev/-/tree/clean-tgax?ref_type=heads juanvleonr's clean-tgax branch]. The C++ side simulates the VR gaming scenario shown below, in which 4 BSSs operate in separate apartments in a 2 by 2 grid. Each BSS contains 1 AP and 4 STAs. One of the STAs in the first BSS is a VR device generating bursty UL traffic, while the other devices have normal UL traffic. Supported interfaces:
** Struct-based message interface (available at [https://github.com/ShenMuyuan/ns3-ai/tree/dd8dd3a489f8faf8a380841b73c250d23c1a3710/examples/multi-bss the benchmarking branch])
** Vector-based message interface
* Rate-Control example ('''updated example''') (including constant rate & Thompson Sampling) ([https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/examples/rate-control directory]): There are existing models of constant rate and Thompson sampling algorithms in Wi-Fi module. Here they are implemented in Python to show how to develop a new rate control algorithm for the Wi-Fi module using ns3-ai. Supported interfaces:
** Struct-based message interface
* RL-TCP example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/examples/rl-tcp directory]): This example applies Q-learning algorithms (Q-learning and deep Q-learning) to TCP congestion control so that it can adapt to real-time changes in the network transmission environment. By using reinforcement learning to manage the sliding window and threshold size, the network can achieve higher throughput and lower delay. Supported interfaces:
** Struct-based message interface
** Gym interface

Documents are updated along with the examples. Apart from all the README.md in example directories, I added [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/docs/install.md instruction for installation], [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/msg-interface/README.md message interface tutorial] and [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/README.md Gym interface tutorial] as separate documents linked to the [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/README.md updated root README.md].

=== Pure C++ example ===

In developing a pure C++-based ML framework example, I tried to rewrite the LTE-CQI example (originally using TensorFlow as its Python-based ML framework) to use the [https://www.tensorflow.org/install/lang_c TensorFlow C API], and the RL-TCP example (originally using PyTorch as its Python-based ML framework) to use the [https://pytorch.org/cppdocs/ PyTorch C++ API]. Unfortunately, only the latter succeeded. The pure C++ version of LTE-CQI failed because TensorFlow's C API has limited support for gradients and neural networks. So, for the TensorFlow C API I only provide [https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/examples/lte-cqi/pure-cpp an example that checks libtensorflow version]. Although I succeeded in converting the Python code to C++ in the [https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/examples/rl-tcp/pure-cpp RL-TCP example], the process was slow and difficult due to the lack of official documentation and examples. For instance, the C++ API doesn't provide the useful load_state_dict function for copying policy net parameters to the target net. It took me a while to find the equivalent C++ functions (torch::save and torch::load, and the module must be defined with the TORCH_MODULE macro).

I also wrote a guide on how to use C++-based ML frameworks in ns-3 (by installing in ns3-ai): [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/docs/using-pure-cpp.md here]

=== Benchmarking ===

I benchmarked three items:

# '''Gym interface vs. ns3-gym''' in terms of '''transmission time''': This benchmark is based on the RL-TCP example. It measures the CPU cycle count during C++-to-Python and Python-to-C++ data transmissions and compares the mean and standard deviation of the cycles. Results show that in both directions, the transmission time of ns3-ai's Gym interface is '''more than 15 times shorter''' than that of ns3-gym ('''shorter is better''').
# '''Vector-based vs. struct-based''' message interface in terms of '''transmission time''': The benchmark is based on the Multi-BSS example, on the benchmark_vector branch. Unfortunately, in terms of action transmission time (from the beginning of the C++ write to the completion of the Python read), the vector-based interface is '''1.2 times longer''' than the struct-based one ('''shorter is better'''). The extra time is caused by Python's slow reading of vectors. Measurements show that when reading rxPower (received power at nodes in the first BSS) on the Python side, the vector interface spent 20% to 50% more time than the struct interface.
#* To deal with the slow vector access on Python side in the future, '''one possible solution is to integrate Eigen''' on C++ side and use existing Eigen-Python bindings like pybind11's Eigen support or eigenpy to convert linear algebra types into numpy or scipy types.
# '''Pure C++ vs. struct-based message interface''' in terms of '''processing time''': The benchmark is based on the pure C++ (libtorch) and message interface (PyTorch) versions of the RL-TCP example. We compare the processing time (i.e. transmission time plus DRL algorithm time for the message interface, and DRL algorithm time alone for pure C++) of the two implementations, including the mean and the standard deviation. Results show that the processing time of the pure C++ implementation is '''less than half''' that of the message interface implementation ('''shorter is better''').

See [https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/docs/benchmarking ns3-ai benchmarking documentation] for more detailed information.

Overall, the Gym interface is much faster than ns3-gym, and the pure C++ interface is more efficient than message interface. The vector interface needs to be enhanced in the future, especially in the optimization of Python side access.

= Build and Run the Code =

A detailed guide on how to set up the ns3-ai module is [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/docs/install.md here]. You must install ns-3 before installing ns3-ai. To test ns3-ai, you can build and run the provided examples (listed in the 'Phase 2' section above) according to their documentation.

= Proposal vs. Actual Work =

A few things were mentioned in the proposal but were not completed in my actual work:

# The LTE-handover example, which was intended to be an example using the vector interface, similar to Multi-BSS. This example was not started because of limited time.
# Support for std::string in shared memory. Development of this support was postponed because the vector interface had the highest priority. Also, it was not considered a 'must do' in the project.
# Pure C++ ML example using the TensorFlow C API. This failed because the C API is inadequate for gradients and neural networks, as mentioned above in the 'Pure C++ example' section.

= Future Works =

# Add more examples, such as LTE-handover, to ns3-ai for better demonstration of the tool.
# Optimize the vector-based message interface to reach its full potential on transmitting vectors or matrices of data.

= Acknowledgments =

I extend my heartfelt gratitude to my mentors Hao and Collin for their invaluable suggestions and comments that have guided me through the challenges during the GSoC 2023. Collaborating with the ns-3 community has been an enriching experience, expanding not only my technical knowledge but also fostering my skills in communication and oral presentation. At the same time, I am also very grateful to my teachers Prof. Yayu Gao and Prof. Xiaojun Hei at HUST, who have provided me with a lot of encouragement when I encountered difficulties. Additionally, I would like to express my appreciation to Google for offering this remarkable opportunity.

GSOC2023ns3-aiFinalReport

2023-09-08T15:34:52Z

Muyuan:

{{TOC}}

Back to [[GSOC2023ns3-ai]] (page containing my weekly updates, not the final report)

= Project Overview =

* '''Project Name:''' ns3-ai enhancements
* '''Student:''' Muyuan Shen
* '''Mentors:''' Collin Brady and Hao Yin

== Project Goals ==

The main focus of this project is to '''optimize performance''' and '''improve usability''' of the '''ns3-ai module''', which facilitates the connection between ns-3 and Python-based ML frameworks using shared memory.

To accomplish this goal, the project will '''introduce additional APIs that support data structures such as vector''' in shared memory IPC. This will effectively reduce the required interaction between C++ and Python, resulting in improved performance.
Also, the project will '''integrate the Gymnasium API''' in the style of ns3-gym but with a shared-memory-based backend, turning ns-3 into an environment that agents can interact with efficiently and seamlessly.
In addition, the project will '''enhance the existing examples, documentation and tutorials''', while also integrating new examples that cover scenarios like Multi-BSS in VR. By doing so, users will have more comprehensive resources at their disposal.
Furthermore, the project aims to '''provide examples utilizing pure C++-based ML frameworks'''. This will offer researchers more options for integrating with ML.

The overall aim of the project is to expand and accelerate the capabilities of the ns3-ai module, enabling users to simulate and analyze network-related algorithms with enhanced efficiency and flexibility.

= Merge Requests and Commits =

Throughout the project, my development was based on my [https://github.com/ShenMuyuan/ns3-ai/tree/improvements improvements] branch of [https://github.com/hust-diangroup/ns3-ai ns3-ai]. The improvements branch originates from the [https://github.com/ShenMuyuan/ns3-ai/tree/cmake cmake branch], because I was fixing CMake compatibility problems before GSoC. I therefore created a single MR that contains all my work, to be merged into the cmake branch. This MR has 120+ commits by me, under the author names 'ShenMuyuan', 'Mu-YuanShen' or 'eicsmy'. The cmake branch will be merged upstream by my mentor.

{| class="wikitable"
|+ Merge Requests
|-
! No. !! Name !! Status
|-
| [1] || [https://github.com/hust-diangroup/ns3-ai/pull/97 merge to cmake branch] || Open
|}

= Project Details =

'''Note: Each URL shown below, if it points to my source code, refers to the contents as of my last commit during the GSoC period.'''

== Community Bonding Period ==

During the community bonding period, I started bi-weekly meetings with my mentors and we decided on the project plan:
prioritize the development of new interfaces, then develop more examples and enhance the documentation.

There are two new interfaces, including vector interface (later, we called it vector-based message interface, as it shared some fundamentals with
the struct-based message interface) and Gym interface. Also, we talked about some details of new examples like LTE-handover and Multi-BSS.

I also read the ns3-ai code thoroughly to understand its IPC principles and learned some reinforcement learning basics.

== Phase 1 ==

=== std::vector support ===

Adding std::vector to shared memory is not easy with ns3-ai's original design, because Python's ctypes library does not support STL templates
(it only supports C structures and functions). To support vectors, I refactored the original model completely, replacing ctypes with the Boost C++ libraries, which are more flexible for interprocess communication. My work includes:
* Utilized Boost's '''boost::interprocess::managed_shared_memory''' to store data (as well as synchronization variables) in shared memory. This shared segment is used for '''data transmission between C++ and Python'''. The two directions, C++-to-Python and Python-to-C++, occupy two different regions in shared memory. It also supports a '''custom memory allocator for STL''', an instance of boost::interprocess::allocator, which ensures that when an STL container allocates new memory, that memory comes from the shared segment rather than from the process heap.
** The shared memory creation can be found in the constructor of Ns3AiMsgInterfaceImpl: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/msg-interface/ns3-ai-msg-interface.h#L60 code]
* Developed a '''spinlock-based semaphore''' to synchronize read and write operations in shared memory. The original synchronization method works, but the "version number" concept and the "control block" data structures may confuse and distract beginners. Also, the "version number" is essentially a more complex implementation of the well-known semaphore. To improve ease of use and code readability, I created a semaphore, based on Boost's semaphore, that '''only spins and never sleeps while waiting'''. Its performance is comparable to the original, with '''better readability and usability'''.
** The semaphore operations and their implementation can be found in structure Ns3AiSemaphore: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/msg-interface/ns3-ai-semaphore.h#L28 code]
** The usage of the semaphore in Ns3AiMsgInterfaceImpl: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/msg-interface/ns3-ai-msg-interface.h#L194 code] and more usage below
* Built the '''vector-based''' interface with '''multiple configurable options'''. The vector interface parallels the struct interface in terms of creation and usage, and there is an attribute that users can set early in their code to '''choose one of the interfaces'''. If the vector interface is chosen, the C++-to-Python and Python-to-C++ vectors are created in shared memory and initially contain no elements, so users must call resize or push_back to adjust their length before use. Another attribute controls whether the interface '''handles simulation end'''. If that attribute is set, the interface performs a simple protocol to notify the Python side when the C++ side simulation finishes. Other configurable attributes include the memory segment size and the names of objects constructed in shared memory.
** Note: the attributes are not part of ns-3 attribute system, because Ns3AiMsgInterface is a Singleton rather than Object.
** Attributes setting in Ns3AiMsgInterface: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/msg-interface/ns3-ai-msg-interface.h#L306 code] and more setting below
** How the protocol works when the interface is destroyed: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/msg-interface/ns3-ai-msg-interface.h#L126 code]
* Provided '''Python binding boilerplate code''' in examples. The Python side accesses the shared memory and the objects in it (vectors or structs) via C++ functions exposed to Python. The '''exposure of C++ class functions and members''' is achieved with '''Pybind11''', a lightweight Python binding library. The C++ binding code, linked with Pybind11, is compiled into a dynamically linked library that Python can import as a module. Because the C++ side interface is template-based and Python does not support templates natively, the Python binding module needs to be generated separately for every program (this is done by a CMake target dependency, so it's seamless). Although the binding contains many lines of C++ code and is difficult to write from scratch, users can '''adapt existing binding code''' to generate Python binding modules quickly, and I provide plenty of boilerplate for that (the *_py.cc files in all examples).
** Some of the example boilerplate code: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/examples/a-plus-b/use-msg-stru/apb_py.cc#L29 binding code for struct-based message interface in A-Plus-B example], [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/examples/multi-bss/multi_bss_py.cc#L33 binding code for vector-based message interface in Multi-BSS example]
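
The spin-only waiting described above can be sketched in Python with threads. This is only an illustration of the idea, not the Ns3AiSemaphore implementation: the class SpinSemaphore, the function run_handshake and the squared-state "action" are hypothetical names chosen for this sketch, and two threads stand in for the C++ and Python processes.

```python
import threading


class SpinSemaphore:
    """Counting semaphore whose wait() busy-polls instead of sleeping (hypothetical sketch)."""

    def __init__(self, initial=0):
        self._count = initial
        self._lock = threading.Lock()

    def post(self):
        # Make one unit available to a waiter.
        with self._lock:
            self._count += 1

    def wait(self):
        # Spin on a plain read; take the lock only to claim a unit.
        while True:
            if self._count > 0:
                with self._lock:
                    if self._count > 0:
                        self._count -= 1
                        return


def run_handshake(rounds):
    """Ping-pong between a 'simulator' thread and an 'agent' thread."""
    state_ready = SpinSemaphore()
    action_ready = SpinSemaphore()
    shared = {"state": None, "action": None}
    received = []

    def simulator():  # stands in for the C++ side
        for i in range(rounds):
            shared["state"] = i
            state_ready.post()    # tell the agent a state is available
            action_ready.wait()   # spin until the agent has answered
            received.append(shared["action"])

    def agent():  # stands in for the Python side
        for _ in range(rounds):
            state_ready.wait()
            shared["action"] = shared["state"] ** 2  # placeholder "policy"
            action_ready.post()

    threads = [threading.Thread(target=simulator), threading.Thread(target=agent)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received


print(run_handshake(3))
```

Spinning avoids the wake-up latency of a sleeping wait at the cost of keeping a core busy, which suits a tight request/response loop between simulator and agent.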

=== Gymnasium API ===

The [https://gymnasium.farama.org/index.html Gymnasium API] for ns3-ai is designed to be based on shared memory rather than socket communication, which can provide faster data exchange than [https://github.com/tkn-tub/ns3-gym ns3-gym] does. While much of the [https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface Gym interface] code comes from ns3-gym's repository, I made substantial changes to give it a shared memory backend. My work includes:
* Modified OpenGymInterface to '''use Ns3AiMsgInterface for IPC'''. OpenGymInterface was created by the ns3-gym developers and provides code to create Gym-compatible environments in ns-3. It contains functions to get the state and action spaces, observe the environment in ns-3 and execute actions (possibly changing parameters in the simulation). Those functions use callbacks registered by OpenGymEnv at runtime. For the callbacks to work, a custom environment must inherit from OpenGymEnv and implement class methods such as GetActionSpace, GetObservationSpace, GetObservation and ExecuteActions. All states and actions are serialized with Google's Protocol Buffers, transmitted, and then de-serialized by the peer. My change was to replace the ZeroMQ socket's send and receive functions with those of Ns3AiMsgInterface, and to ensure that Ns3AiMsgInterface is properly initialized. The underlying message interface for transmitting serialized messages is struct-based. The struct contains a buffer (uint8_t array) and its capacity.
** Example of my changed part: [https://github.com/tkn-tub/ns3-gym/blob/6007f4b3811af0cffcacf9a6151e5b9d2f4ef3ae/model/opengym_interface.cc#L190 before (in ns3-gym's repo)], [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/cpp/ns3-ai-gym-interface.cc#L104 after (in my ns3-ai repo)]
** Initialization of Ns3AiMsgInterface: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/cpp/ns3-ai-gym-interface.cc#L56 code]
** Note: in the above configuration, handling finish is set to false because the protocol of notifying Python side that C++ side has finished is unnecessary for Gym. Gym interface has its own protocol for handling finish, which is [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/cpp/ns3-ai-gym-env.cc#L78 NotifySimulationEnd on C++ side] and then [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L270 'done' becoming true] when [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L294 Python steps].
* Created a '''Python binding''' for accessing the shared structure containing the '''serialized message string'''. Binding a structure that contains an array is similar to binding an ordinary structure, except that the array is treated specially: its contents are exposed as a Python '''memoryview'''. With memoryview, the Python side can read and write the array seamlessly, much as C++ code can with std::array.
** Obtaining the memoryview in binding: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/py/msg_py_binding.cc#L33 code]
** Note: arrays of different lengths require different memoryview objects on the Python side. In the above code, get_buffer returns the part of the buffer that is actually in use (for reading), while get_buffer_full returns the buffer at its full length (for writing). Example usage in Ns3Env (the Python side Gym environment created with gym.make): [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L115 array read] and [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L130 array write]
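
The difference between the two buffer views can be illustrated in plain Python. The sketch below does not use the real binding: MsgBuffer is a hypothetical stand-in backed by a bytearray, CAPACITY is an arbitrary value, and only the method names get_buffer and get_buffer_full are taken from the description above.

```python
CAPACITY = 16  # arbitrary capacity chosen for this sketch


class MsgBuffer:
    """Hypothetical stand-in for the shared struct: a fixed array plus the used size."""

    def __init__(self):
        self._data = bytearray(CAPACITY)
        self.size = 0  # number of bytes currently holding a message

    def get_buffer(self):
        # View over the used part only (for reading a message).
        return memoryview(self._data)[: self.size]

    def get_buffer_full(self):
        # View over the whole array (for writing a new message).
        return memoryview(self._data)


def write_message(msg, payload):
    if len(payload) > CAPACITY:
        raise ValueError("payload does not fit in the buffer")
    full = msg.get_buffer_full()
    full[: len(payload)] = payload  # writes straight into the backing array
    msg.size = len(payload)


def read_message(msg):
    return bytes(msg.get_buffer())


msg = MsgBuffer()
write_message(msg, b"hello")
print(read_message(msg), len(msg.get_buffer()), len(msg.get_buffer_full()))
```

Because a memoryview does not copy, writes through either view are visible to every other view of the same array.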

== Phase 2 ==

=== Examples and documentation update ===

To demonstrate the usage of the message interface and the Gym interface, all existing examples were updated to use the new interfaces. In addition, a new example, "Multi-BSS", was created to benchmark the performance of the vector interface. All of them can be built using the "./ns3 build" command with the updated CMake files, without copying the examples to the scratch folder. The updated examples and the interfaces they support are listed below:
* A-Plus-B example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/examples/a-plus-b directory]): In this example, C++ side starts by setting 2 random numbers between 0 and 10 in shared memory. Then, Python side gets the numbers and sets the sum of the numbers in shared memory (in another region). Finally, C++ gets the sum that Python set. The procedure is analogous to C++ passing RL states to Python and Python passing RL actions back to C++, and is repeated many times. Supported interfaces:
** Struct-based message interface
** Vector-based message interface
** Gym interface
* LTE-CQI example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/examples/lte-cqi directory]): CQI prediction example. The original work is done based on 5G NR branch in ns-3, and previous developers have made some changes to make it also run in LTE codebase in ns-3 mainline. Supported interfaces:
** Struct-based message interface
* Multi-BSS example ('''new example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/examples/multi-bss directory]): The example is based on and modified from [https://gitlab.com/juanvleonr/ns-3-dev/-/tree/clean-tgax?ref_type=heads juanvleonr's clean-tgax branch]. The C++ side simulates a VR gaming scenario in which 4 BSSs operate in separate apartments in a 2 by 2 grid. Each BSS contains 1 AP and 4 STAs. One of the STAs in the first BSS is a VR device generating burst UL traffic, while the other devices have normal UL traffic. Supported interfaces:
** Struct-based message interface (available at [https://github.com/ShenMuyuan/ns3-ai/tree/dd8dd3a489f8faf8a380841b73c250d23c1a3710/examples/multi-bss the benchmarking branch])
** Vector-based message interface
* Rate-Control example ('''updated example''') (including constant rate & Thompson Sampling) ([https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/examples/rate-control directory]): There are existing models of constant rate and Thompson sampling algorithms in Wi-Fi module. Here they are implemented in Python to show how to develop a new rate control algorithm for the Wi-Fi module using ns3-ai. Supported interfaces:
** Struct-based message interface
* RL-TCP example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/examples/rl-tcp directory]): This example applies reinforcement learning algorithms (Q-learning and deep Q-learning) to TCP congestion control so that it adapts in real time to changes in the network environment. By using reinforcement learning to manage the sliding window and threshold size, the network can achieve higher throughput and lower delay. Supported interfaces:
** Struct-based message interface
** Gym interface
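
The data flow of the A-Plus-B example can be mimicked with Python's standard library alone. The sketch below is an assumption-laden illustration, not ns3-ai code: it uses multiprocessing.shared_memory instead of Boost, runs both "sides" in a single process, and the layout constants (ENV_FMT, ACT_FMT and the offsets) are invented for this example.

```python
import struct
from multiprocessing import shared_memory

# Hypothetical layout: region 1 carries two uint32 operands (simulator -> agent),
# region 2 carries one uint32 result (agent -> simulator).
ENV_FMT = "<II"
ACT_FMT = "<I"
ENV_OFFSET = 0
ACT_OFFSET = struct.calcsize(ENV_FMT)
SEGMENT_SIZE = ACT_OFFSET + struct.calcsize(ACT_FMT)


def a_plus_b(a, b):
    shm = shared_memory.SharedMemory(create=True, size=SEGMENT_SIZE)
    try:
        # "C++ side": write the two numbers into the first region.
        struct.pack_into(ENV_FMT, shm.buf, ENV_OFFSET, a, b)

        # "Python side": read them and write the sum into the second region.
        x, y = struct.unpack_from(ENV_FMT, shm.buf, ENV_OFFSET)
        struct.pack_into(ACT_FMT, shm.buf, ACT_OFFSET, x + y)

        # "C++ side": read the sum back.
        (total,) = struct.unpack_from(ACT_FMT, shm.buf, ACT_OFFSET)
        return total
    finally:
        shm.close()
        shm.unlink()  # remove the segment so it does not leak


print(a_plus_b(3, 4))
```

In the real module the two sides are separate processes and each access is guarded by the semaphore; the single-process version here only shows the two-region layout.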

Documents are updated along with the examples. Apart from all the README.md in example directories, I added [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/docs/install.md instruction for installation], [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/msg-interface/README.md message interface tutorial] and [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/README.md Gym interface tutorial] as separate documents linked to the [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/README.md updated root README.md].

=== Pure C++ example ===

In developing a pure C++-based ML framework example, I tried to rewrite the LTE-CQI example (originally using tensorflow as its Python-based ML framework) to use the [https://www.tensorflow.org/install/lang_c TensorFlow C API], and the RL-TCP example (originally using torch as its Python-based ML framework) to use the [https://pytorch.org/cppdocs/ PyTorch C++ API]. Unfortunately, only the latter succeeded. The pure C++ version of LTE-CQI failed because TensorFlow's C API has limited support for gradients and neural networks. So, for TensorFlow C I only provide [https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/examples/lte-cqi/pure-cpp an example that checks the libtensorflow version]. Although I succeeded in converting the Python code to C++ in the [https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/examples/rl-tcp/pure-cpp RL-TCP example], the process was slow and difficult due to the lack of official documentation and examples. For instance, the C++ API doesn't provide the useful load_state_dict function for copying policy net parameters to the target net. It took me a while to find the equivalent C++ approach (torch::save and torch::load, with the module defined using the TORCH_MODULE macro).

I also wrote a guide on how to use C++-based ML frameworks in ns-3 (by installing in ns3-ai): [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/docs/using-pure-cpp.md here]

=== Benchmarking ===

I benchmarked three items:

# '''Gym interface vs ns3-gym''' in terms of '''transmission time''': This benchmark is based on the RL-TCP example. It measures the CPU cycle count during C++-to-Python and Python-to-C++ data transmissions and compares the mean and standard deviation of the cycle counts. Results show that in both directions, the transmission time of ns3-ai's Gym interface is '''more than 15 times shorter''' than that of ns3-gym ('''shorter is better''').
# '''Vector-based vs. struct-based''' message interface in terms of '''transmission time''': The benchmark is based on the Multi-BSS example, on the benchmark_vector branch. Unfortunately, in terms of action transmission time (from the start of the C++ write to the completion of the Python read), the vector-based interface is '''1.2 times longer''' than the struct-based one ('''shorter is better'''). The extra time is caused by Python's slow reading of vectors. Measurements show that when reading rxPower (received power at nodes in the first BSS) on the Python side, the vector interface spent 20% to 50% more time than the struct interface.
#* To deal with the slow vector access on Python side in the future, '''one possible solution is to integrate Eigen''' on C++ side and use existing Eigen-Python bindings like pybind11's Eigen support or eigenpy to convert linear algebra types into numpy or scipy types.
# '''Pure C++ vs. struct-based message interface''' in terms of '''processing time''': The benchmark is based on the pure C++ (libtorch) and message interface (PyTorch) versions of the RL-TCP example. We compare the processing time (i.e. transmission time + DRL algorithm time for the message interface, DRL algorithm time only for pure C++) of the two implementations, including the mean and the standard deviation. Results show that the processing time of the pure C++ implementation is '''less than half''' that of the message interface implementation ('''shorter is better''').
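
The kind of comparison used in these benchmarks (mean and standard deviation per implementation, then a ratio of means) can be sketched as follows. This is not the benchmarking code from the repository: the helper names are hypothetical, and wall-clock nanoseconds from time.perf_counter_ns are used as a stand-in for the CPU cycle counts measured in the real benchmarks.

```python
import statistics
import time


def time_calls(fn, repeats):
    """Run fn() repeatedly and return the duration of each call in nanoseconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter_ns()
        fn()
        samples.append(time.perf_counter_ns() - start)
    return samples


def summarize(samples):
    """Return (mean, sample standard deviation) of a list of durations."""
    return statistics.mean(samples), statistics.stdev(samples)


def ratio_of_means(baseline, candidate):
    """How many times longer the baseline takes on average than the candidate."""
    return statistics.mean(baseline) / statistics.mean(candidate)


# Usage with two placeholder workloads (not the real interfaces):
slow = time_calls(lambda: sum(range(20000)), repeats=50)
fast = time_calls(lambda: sum(range(1000)), repeats=50)
print(summarize(slow), summarize(fast), ratio_of_means(slow, fast))
```

Reporting the standard deviation next to the mean helps show whether a difference between two interfaces is larger than the run-to-run noise.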

See [https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/docs/benchmarking ns3-ai benchmarking documentation] for more detailed information.

Overall, the Gym interface is much faster than ns3-gym, and the pure C++ interface is more efficient than message interface. The vector interface needs to be enhanced in the future, especially in the optimization of Python side access.

= Build and Run the Code =

A detailed guide on how to set up the ns3-ai module is [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/docs/install.md here]. You must install ns-3 before installing ns3-ai. To test ns3-ai, you can build and run the provided examples (listed in the 'Phase 2' section above) according to their documentation.

= Proposal vs. Actual Work =

A few things were mentioned in the proposal but were not completed in my actual work:

# The LTE-handover example, which was intended to be an example using vector interface, similar to Multi-BSS. This example has not started because of limited time.
# Support for std::string in shared memory. Development for this support was postponed because the vector interface had the highest priority. Also, it's not considered a 'must do' in the project.
# Pure C++ ML example using TensorFlow C API. This has failed because of inadequate C API for gradients and neural networks, as mentioned above in 'Pure C++ example' section.

= Future Works =

# Add more examples, such as LTE-handover, to ns3-ai for better demonstration of the tool.
# Optimize the vector-based message interface to reach its full potential on transmitting vectors or matrices of data.

= Acknowledgments =

I extend my heartfelt gratitude to my mentors Hao and Collin for their invaluable suggestions and comments that have guided me through the challenges during the GSoC 2023. Collaborating with the ns-3 community has been an enriching experience, expanding not only my technical knowledge but also fostering my skills in communication and oral presentation. At the same time, I am also very grateful to my mentors Prof. Yayu Gao and Prof. Xiaojun Hei at HUST, who have provided me with a lot of encouragement when I encountered difficulties. Additionally, I would like to express my appreciation to Google for offering this remarkable opportunity.

GSOC2023ns3-aiFinalReport

2023-09-08T15:18:53Z

Muyuan: finish initial draft

{{TOC}}

Back to [[GSOC2023ns3-ai]] (page containing my weekly updates, not the final report)

= Project Overview =

* '''Project Name:''' ns3-ai enhancements
* '''Student:''' Muyuan Shen
* '''Mentors:''' Collin Brady and Hao Yin

== Project Goals ==

The main focus of this project is to '''optimize performance''' and '''improve usability''' of the '''ns3-ai module''', which facilitates the connection between ns-3 and Python-based ML frameworks using shared memory.

To accomplish this goal, the project will '''introduce additional APIs that support data structures such as vector''' in shared memory IPC. This will effectively reduce the required interaction between C++ and Python, resulting in improved performance.
Also, the project will '''integrate the Gymnasium API''' in the style of ns3-gym but with a shared-memory-based backend, turning ns-3 into an environment that agents can interact with efficiently and seamlessly.
In addition, the project will '''enhance the existing examples, documentation and tutorials''', while also integrating new examples that cover scenarios like Multi-BSS in VR. By doing so, users will have more comprehensive resources at their disposal.
Furthermore, the project aims to '''provide examples utilizing pure C++-based ML frameworks'''. This will offer researchers more options for integrating with ML.

The overall aim of the project is to expand and accelerate the capabilities of the ns3-ai module, enabling users to simulate and analyze network-related algorithms with enhanced efficiency and flexibility.

= Merge Requests and Commits =

Throughout the project, my development was based on my [https://github.com/ShenMuyuan/ns3-ai/tree/improvements improvements] branch of [https://github.com/hust-diangroup/ns3-ai ns3-ai]. The improvements branch originates from the [https://github.com/ShenMuyuan/ns3-ai/tree/cmake cmake branch], because I was fixing CMake compatibility problems before GSoC. I therefore created a single MR that contains all my work, to be merged into the cmake branch. This MR has 120+ commits by me, under the author names 'ShenMuyuan', 'Mu-YuanShen' or 'eicsmy'. The cmake branch will be merged upstream by my mentor.

{| class="wikitable"
|+ Merge Requests
|-
! No. !! Name !! Status
|-
| [1] || [https://github.com/hust-diangroup/ns3-ai/pull/97 merge to cmake branch] || Open
|}

= Project Details =

'''Note: Each URL shown below, if it points to my source code, refers to the contents as of my last commit during the GSoC period.'''

== Community Bonding Period ==

During the community bonding period, I started bi-weekly meetings with my mentors and we decided on the project plan:
prioritize the development of new interfaces, then develop more examples and enhance the documentation.

There are two new interfaces, including vector interface (later, we called it vector-based message interface, as it shared some fundamentals with
the struct-based message interface) and Gym interface. Also, we talked about some details of new examples like LTE-handover and Multi-BSS.

I also read the ns3-ai code thoroughly to understand its IPC principles and learned some reinforcement learning basics.

== Phase 1 ==

=== std::vector support ===

Adding std::vector to shared memory is not easy with ns3-ai's original design, because Python's ctypes library does not support STL templates
(it only supports C structures and functions). To support vectors, I refactored the original model completely, replacing ctypes with the Boost C++ libraries, which are more flexible for interprocess communication. My work includes:
* Utilized Boost's '''boost::interprocess::managed_shared_memory''' to store data (as well as synchronization variables) in shared memory. This shared segment is used for '''data transmission between C++ and Python'''. The two directions, C++-to-Python and Python-to-C++, occupy two different regions in shared memory. It also supports a '''custom memory allocator for STL''', an instance of boost::interprocess::allocator, which ensures that when an STL container allocates new memory, that memory comes from the shared segment rather than from the process heap.
** The shared memory creation can be found in the constructor of Ns3AiMsgInterfaceImpl: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/msg-interface/ns3-ai-msg-interface.h#L60 code]
* Developed a '''spinlock-based semaphore''' to synchronize read and write operations in shared memory. The original synchronization method works, but the "version number" concept and the "control block" data structures may confuse and distract beginners. Also, the "version number" is essentially a more complex implementation of the well-known semaphore. To improve ease of use and code readability, I created a semaphore, based on Boost's semaphore, that '''only spins and never sleeps while waiting'''. Its performance is comparable to the original, with '''better readability and usability'''.
** The semaphore operations and their implementation can be found in structure Ns3AiSemaphore: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/msg-interface/ns3-ai-semaphore.h#L28 code]
** The usage of the semaphore in Ns3AiMsgInterfaceImpl: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/msg-interface/ns3-ai-msg-interface.h#L194 code] and more usage below
* Built the '''vector-based''' interface with '''multiple configurable options'''. The vector interface parallels the struct interface in terms of creation and usage, and there is an attribute that users can set early in their code to '''choose one of the interfaces'''. If the vector interface is chosen, the C++-to-Python and Python-to-C++ vectors are created in shared memory and initially contain no elements, so users must call resize or push_back to adjust their length before use. Another attribute controls whether the interface '''handles simulation end'''. If that attribute is set, the interface performs a simple protocol to notify the Python side when the C++ side simulation finishes. Other configurable attributes include the memory segment size and the names of objects constructed in shared memory.
** Note: the attributes are not part of ns-3 attribute system, because Ns3AiMsgInterface is a Singleton rather than Object.
** Attributes setting in Ns3AiMsgInterface: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/msg-interface/ns3-ai-msg-interface.h#L306 code] and more setting below
** How the protocol works when the interface is destroyed: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/msg-interface/ns3-ai-msg-interface.h#L126 code]
* Provided '''Python binding boilerplate code''' in examples. The Python side accesses the shared memory and the objects in it (vectors or structs) via C++ functions exposed to Python. The '''exposure of C++ class functions and members''' is achieved with '''Pybind11''', a lightweight Python binding library. The C++ binding code, linked with Pybind11, is compiled into a dynamically linked library that Python can import as a module. Because the C++ side interface is template-based and Python does not support templates natively, the Python binding module needs to be generated separately for every program (this is done by a CMake target dependency, so it's seamless). Although the binding contains many lines of C++ code and is difficult to write from scratch, users can '''adapt existing binding code''' to generate Python binding modules quickly, and I provide plenty of boilerplate for that (the *_py.cc files in all examples).
** Some of the example boilerplate code: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/examples/a-plus-b/use-msg-stru/apb_py.cc#L29 binding code for struct-based message interface in A-Plus-B example], [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/examples/multi-bss/multi_bss_py.cc#L33 binding code for vector-based message interface in Multi-BSS example]

=== Gymnasium API ===

The [https://gymnasium.farama.org/index.html Gymnasium API] for ns3-ai is designed to be based on shared memory rather than socket communication, which can provide faster data exchange than [https://github.com/tkn-tub/ns3-gym ns3-gym] does. While much of the [https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface Gym interface] code comes from ns3-gym's repository, I made substantial changes to give it a shared memory backend. My work includes:
* Modified OpenGymInterface to '''use Ns3AiMsgInterface for IPC'''. OpenGymInterface was created by the ns3-gym developers and provides code to create Gym-compatible environments in ns-3. It contains functions to get the state and action spaces, observe the environment in ns-3 and execute actions (possibly changing parameters in the simulation). Those functions use callbacks registered by OpenGymEnv at runtime. For the callbacks to work, a custom environment must inherit from OpenGymEnv and implement class methods such as GetActionSpace, GetObservationSpace, GetObservation and ExecuteActions. All states and actions are serialized with Google's Protocol Buffers, transmitted, and then de-serialized by the peer. My change was to replace the ZeroMQ socket's send and receive functions with those of Ns3AiMsgInterface, and to ensure that Ns3AiMsgInterface is properly initialized. The underlying message interface for transmitting serialized messages is struct-based. The struct contains a buffer (uint8_t array) and its capacity.
** Example of my changed part: [https://github.com/tkn-tub/ns3-gym/blob/6007f4b3811af0cffcacf9a6151e5b9d2f4ef3ae/model/opengym_interface.cc#L190 before (in ns3-gym's repo)], [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/cpp/ns3-ai-gym-interface.cc#L104 after (in my ns3-ai repo)]
** Initialization of Ns3AiMsgInterface: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/cpp/ns3-ai-gym-interface.cc#L56 code]
** Note: in the above configuration, handling finish is set to false because the protocol of notifying Python side that C++ side has finished is unnecessary for Gym. Gym interface has its own protocol for handling finish, which is [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/cpp/ns3-ai-gym-env.cc#L78 NotifySimulationEnd on C++ side] and then [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L270 'done' becoming true] when [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L294 Python steps].
* Created a '''Python binding''' for accessing the shared structure that contains the '''serialized message string'''. Binding a structure that contains an array is similar to binding an ordinary structure, except that the array is exposed to Python as a '''memoryview'''. With memoryview, the Python side can read and write the array directly, much as C++ code can with std::array.
** Obtaining the memoryview in binding: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/py/msg_py_binding.cc#L33 code]
** Note: arrays of different lengths need separate memoryview objects on the Python side. In the above code, get_buffer returns the part of the buffer that is actually used (for reading), while get_buffer_full returns the buffer at its full length (for writing). Example usage in Ns3Env (the Python side Gym environment created with gym.make): [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L115 array read] and [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L130 array write]
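
The difference between the two views can be sketched in plain Python. This is only an illustration under stated assumptions: the MsgBuffer class and its size field are hypothetical stand-ins for the shared struct, and a bytearray plays the role of the shared memory region. Only the names get_buffer and get_buffer_full come from the binding described above.

```python
class MsgBuffer:
    """Hypothetical stand-in for the shared struct holding a serialized message."""

    def __init__(self, capacity):
        self._data = bytearray(capacity)  # plays the role of the uint8_t array
        self.size = 0                     # assumed field: bytes currently in use

    def get_buffer(self):
        # View of only the used part, for reading a received message.
        return memoryview(self._data)[: self.size]

    def get_buffer_full(self):
        # View of the whole array, for writing a new message.
        return memoryview(self._data)


buf = MsgBuffer(capacity=16)

# Writer: copy the message into the full-length view, then record its length.
payload = b"\x08\x01\x10\x02"  # arbitrary bytes standing in for a serialized message
buf.get_buffer_full()[: len(payload)] = payload
buf.size = len(payload)

# Reader: the used-length view exposes exactly the message bytes.
received = bytes(buf.get_buffer())
print(received == payload, len(buf.get_buffer()), len(buf.get_buffer_full()))
```

Both views refer to the same underlying memory, so writing through the full-length view is immediately visible through the used-length view without copying.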

== Phase 2 ==

=== Examples and documentation update ===

To demonstrate the usage of the message interface and Gym interface, all existing examples were updated to use the new interfaces. In addition, a new example, "Multi-BSS", was created to benchmark the performance of the vector interface. All of them can be built with the "./ns3 build" command using the updated CMake files, without copying the examples to the scratch folder. The updated examples and the interfaces they support are listed below:
* A-Plus-B example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/examples/a-plus-b directory]): In this example, C++ side starts by setting 2 random numbers between 0 and 10 in shared memory. Then, Python side gets the numbers and sets the sum of the numbers in shared memory (in another region). Finally, C++ gets the sum that Python set. The procedure is analogous to C++ passing RL states to Python and Python passing RL actions back to C++, and is repeated many times. Supported interfaces:
** Struct-based message interface
** Vector-based message interface
** Gym interface
* LTE-CQI example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/examples/lte-cqi directory]): CQI prediction example. The original work is done based on 5G NR branch in ns-3, and previous developers have made some changes to make it also run in LTE codebase in ns-3 mainline. Supported interfaces:
** Struct-based message interface
* Multi-BSS example ('''new example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/examples/multi-bss directory]): The example is based on and modified from [https://gitlab.com/juanvleonr/ns-3-dev/-/tree/clean-tgax?ref_type=heads juanvleonr's clean-tgax branch]. The C++ side simulates a VR gaming scenario in which 4 BSSs operate in separate apartments in a 2 by 2 grid. Each BSS contains 1 AP and 4 STAs. One of the STAs in the first BSS is a VR device generating burst UL traffic, while the other devices have normal UL traffic. Supported interfaces:
** Struct-based message interface (available at [https://github.com/ShenMuyuan/ns3-ai/tree/dd8dd3a489f8faf8a380841b73c250d23c1a3710/examples/multi-bss the benchmarking branch])
** Vector-based message interface
* Rate-Control example ('''updated example''') (including constant rate & Thompson Sampling) ([https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/examples/rate-control directory]): There are existing models of constant rate and Thompson sampling algorithms in Wi-Fi module. Here they are implemented in Python to show how to develop a new rate control algorithm for the Wi-Fi module using ns3-ai. Supported interfaces:
** Struct-based message interface
* RL-TCP example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/examples/rl-tcp directory]): This example applies Q-learning algorithms (Q-learning and deep Q-learning) to TCP congestion control so that it can react to real-time changes in the network environment. By using reinforcement learning to manage the sliding window and threshold size, the network can achieve better throughput and smaller delay. Supported interfaces:
** Struct-based message interface
** Gym interface
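
The round trip in the A-Plus-B example can be sketched as follows. This is a minimal illustration, not the example's actual code: the Channel class is hypothetical, two dictionaries stand in for the two shared-memory regions, standard threading semaphores stand in for the module's synchronization, and the random numbers are assumed to be integers.

```python
import random
import threading


class Channel:
    """Hypothetical stand-in for the two shared regions and their synchronization."""

    def __init__(self):
        self.cpp2py = {}  # region written by the "C++" side
        self.py2cpp = {}  # region written by the "Python" side
        self.cpp2py_ready = threading.Semaphore(0)
        self.py2cpp_ready = threading.Semaphore(0)


def python_side(ch, rounds):
    for _ in range(rounds):
        ch.cpp2py_ready.acquire()  # wait until a and b are available
        ch.py2cpp["c"] = ch.cpp2py["a"] + ch.cpp2py["b"]
        ch.py2cpp_ready.release()  # publish the sum


def cpp_side(ch, rounds, rng):
    results = []
    for _ in range(rounds):
        a, b = rng.randint(0, 10), rng.randint(0, 10)
        ch.cpp2py["a"], ch.cpp2py["b"] = a, b
        ch.cpp2py_ready.release()  # hand the numbers (the "state") to the other side
        ch.py2cpp_ready.acquire()  # wait for the sum (the "action")
        results.append((a, b, ch.py2cpp["c"]))
    return results


ch = Channel()
rounds = 5
worker = threading.Thread(target=python_side, args=(ch, rounds))
worker.start()
results = cpp_side(ch, rounds, random.Random(0))
worker.join()
print(all(a + b == c for a, b, c in results))
```

Each direction has its own region and its own semaphore, so one side never reads a region while the other side is still writing it.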

Documents are updated along with the examples. Apart from all the README.md in example directories, I added [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/docs/install.md instruction for installation], [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/msg-interface/README.md message interface tutorial] and [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/README.md Gym interface tutorial] as separate documents linked to the [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/README.md updated root README.md].

=== Pure C++ example ===

In the development of a pure C++-based ML framework example, I tried to rewrite the LTE-CQI example (originally using tensorflow as its Python-based ML framework) to use the [https://www.tensorflow.org/install/lang_c TensorFlow C API], and the RL-TCP example (originally using torch as its Python-based ML framework) to use the [https://pytorch.org/cppdocs/ PyTorch C++ API]. Unfortunately, only the latter succeeded. The pure C++ version of LTE-CQI failed because TensorFlow's C API has limited support for gradients and neural networks. So, for TensorFlow C I only provide [https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/examples/lte-cqi/pure-cpp an example that checks libtensorflow version]. Although I succeeded in converting the Python code to C++ in the [https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/examples/rl-tcp/pure-cpp RL-TCP example], the process was slow and difficult because official documentation and examples are scarce. For instance, the C++ API doesn't provide the useful load_state_dict function for copying policy net parameters to the target net. It took me a while to find the equivalent C++ functions (torch::save and torch::load, and the module must be defined with the TORCH_MODULE macro).

I also wrote a guide on how to use C++-based ML frameworks in ns-3 (by installing in ns3-ai): [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/docs/using-pure-cpp.md here]

=== Benchmarking ===

I benchmarked three items:

# '''Gym interface vs ns3-gym''' in terms of '''transmission time''': This benchmark is based on the RL-TCP example. It measures the CPU cycle count during C++ to Python and Python to C++ data transmissions, and compares the mean and standard deviation of the cycle counts. Results show that in both directions, the transmission time of ns3-ai's Gym interface is '''more than 15 times shorter''' than that of ns3-gym ('''shorter is better''').
# '''Vector-based vs. struct-based''' message interface in terms of '''transmission time''': The benchmark is based on the Multi-BSS example, on the benchmark_vector branch. Unfortunately, in terms of action transmission time (from the beginning of the C++ write to the completion of the Python read), the vector-based interface is '''1.2 times longer''' than the struct-based interface ('''shorter is better'''). The extra time is caused by Python's slow reading of vectors. Measurements show that when reading rxPower (received power at the nodes in the first BSS) on the Python side, the vector interface spent 20% to 50% more time than the struct interface.
#* To deal with the slow vector access on Python side in the future, '''one possible solution is to integrate Eigen''' on C++ side and use existing Eigen-Python bindings like pybind11's Eigen support or eigenpy to convert linear algebra types into numpy or scipy types.
# '''Pure C++ vs. struct-based message interface''' in terms of '''processing time''': The benchmark is based on the pure C++ (libtorch) and message interface (PyTorch) versions of the RL-TCP example. We compare the processing time of the two implementations (transmission time plus DRL algorithm time for the message interface, DRL algorithm time only for pure C++), including the mean and the standard deviation. Results show that the processing time of the pure C++ implementation is '''less than half''' that of the message interface implementation ('''shorter is better''').
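
The comparisons above report the mean and standard deviation of repeated measurements. A minimal sketch of that kind of measurement is given below, under stated assumptions: the benchmarks themselves count CPU cycles, whereas this sketch substitutes Python's time.perf_counter_ns, and the helper names time_calls and ratio and the workload function are hypothetical.

```python
import statistics
import time


def time_calls(fn, repeats):
    """Run fn repeatedly and return (mean, standard deviation) in nanoseconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter_ns()
        fn()
        samples.append(time.perf_counter_ns() - start)
    return statistics.mean(samples), statistics.stdev(samples)


def ratio(slow_mean, fast_mean):
    """How many times longer the slower measurement is than the faster one."""
    return slow_mean / fast_mean


def workload():
    # Placeholder for one transmission or one processing step.
    return sum(range(10_000))


mean_ns, stdev_ns = time_calls(workload, repeats=50)
print(f"mean = {mean_ns:.0f} ns, stdev = {stdev_ns:.0f} ns")
```

Reporting the standard deviation alongside the mean shows how much individual measurements fluctuate, which matters when the compared means are close, as in the vector-based vs. struct-based case.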

See [https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/docs/benchmarking ns3-ai benchmarking documentation] for more detailed information.

Overall, the Gym interface is much faster than ns3-gym, and the pure C++ implementation is more efficient than the message interface. The vector interface needs further work, especially to optimize access from the Python side.

= Build and Run the Code =

A detailed guide on how to set up the ns3-ai module is [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/docs/install.md here]. You must install ns-3 before installing ns3-ai. To test ns3-ai, you can build and run the provided examples (listed in the 'Phase 2' section above) according to their documentation.

= Proposal vs. Actual Work =

A few things were mentioned in the proposal but were not completed in my actual work:

# The LTE-handover example, which was intended to be an example using the vector interface, similar to Multi-BSS. This example was not started because of limited time.
# Support for std::string in shared memory. Development of this feature was postponed because the vector interface had the highest priority. Also, it was not considered a 'must do' in the project.
# Pure C++ ML example using the TensorFlow C API. This failed because the C API has inadequate support for gradients and neural networks, as mentioned above in the 'Pure C++ example' section.

= Future Works =

# Add more examples, such as LTE-handover, to ns3-ai for better demonstration of the tool.
# Optimize the vector-based message interface to reach its full potential on transmitting vectors or matrices of data.

GSOC2023ns3-aiFinalReport

2023-09-08T14:45:18Z

Muyuan:

{{TOC}}

Back to [[GSOC2023ns3-ai]] (page containing my weekly updates, not the final report)

= Project Overview =

* '''Project Name:''' ns3-ai enhancements
* '''Student:''' Muyuan Shen
* '''Mentors:''' Collin Brady and Hao Yin

== Project Goals ==

The main focus of this project is to '''optimize performance''' and '''improve usability''' of the '''ns3-ai module''', which facilitates the connection between ns-3 and Python-based ML frameworks using shared memory.

To accomplish this goal, the project will '''introduce additional APIs that support data structures such as vector''' in shared memory IPC. This will effectively reduce the required interaction between C++ and Python, resulting in improved performance.
Also, the project will '''integrate Gymnasium API''' like ns3-gym's but has a shared-memory-based backend, to turn ns-3 into a environment that agents can efficiently and seamlessly interact with.
In addition, the project will '''enhance the existing examples, documentation and tutorials''', while also integrating new examples that cover scenarios like Multi-BSS in VR. By doing so, users will have more comprehensive resources at their disposal.
Furthermore, the project aims to '''provide examples utilizing pure C++-based ML frameworks'''. This will offer researchers more options for integrating with ML.

The overall aim of the project is to expand and accelerate the capabilities of the ns3-ai module, enabling users to simulate and analyze network related algorithms with enhanced efficiency and flexibility.

= Merge Requests and Commits =

Throughout the project, my development was based on my [https://github.com/ShenMuyuan/ns3-ai/tree/improvements improvements] branch of [https://github.com/hust-diangroup/ns3-ai ns3-ai]. The improvements branch originates from the [https://github.com/ShenMuyuan/ns3-ai/tree/cmake cmake branch], because I was fixing CMake compatibility problems before GSoC. I therefore created a single MR containing all my work, to be merged into the cmake branch. In this MR, there are 120+ commits by me, with author name 'ShenMuyuan', 'Mu-YuanShen' or 'eicsmy'. The cmake branch will be merged to upstream by my mentor.

{| class="wikitable"
|+ Merge Requests
|-
! No. !! Name !! Status
|-
| [1] || [https://github.com/hust-diangroup/ns3-ai/pull/97 merge to cmake branch] || Open
|}

= Project Details =

All links to my repository below point to my [https://github.com/ShenMuyuan/ns3-ai/commit/51e0c34a90de88db02b1016db626bb0cb605c166 last commit to improvements branch].

== Community Bonding Period ==

During the community bonding period, I started bi-weekly meetings with my mentors and we decided on the project plan: prioritize the development of new interfaces, then develop more examples and enhance the documentation.

The two new interfaces are the vector interface (later called the vector-based message interface, as it shares some fundamentals with the struct-based message interface) and the Gym interface. We also discussed some details of new examples such as LTE-handover and Multi-BSS.

I also read the ns3-ai code thoroughly to understand its IPC principles and learned some reinforcement learning basics.

== Phase 1 ==

=== std::vector support ===

Adding std::vector to shared memory is not easy with ns3-ai's original design, because Python's ctypes library does not support STL templates (it only supports C structures and functions). To support vectors, I refactored the original model completely, replacing ctypes with the Boost C++ libraries, which are more flexible for interprocess communication. My work includes:
* Used Boost's '''boost::interprocess::managed_shared_memory''' to store data (as well as synchronization variables) in shared memory. This shared segment is used for '''data transmission between C++ and Python'''. The two directions, C++-to-Python and Python-to-C++, occupy two different regions in shared memory. It also supports a '''custom memory allocator for STL''', an instance of boost::interprocess::allocator, which ensures that when an STL container allocates new memory, that memory comes from the shared segment rather than from the process heap.
** The shared memory creation can be found in the constructor of Ns3AiMsgInterfaceImpl: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/msg-interface/ns3-ai-msg-interface.h#L60 code]
* Developed a '''spinlock-based semaphore''' to synchronize read and write operations in shared memory. The original synchronization method works, but its "version number" concept and "control block" data structures may confuse and distract beginners. Also, the "version number" is just a complex implementation of the well-known semaphore. To improve ease of use and code readability, I created a semaphore, based on Boost's semaphore, that '''only spins but does not sleep while waiting'''. Its performance is comparable to the original, with '''better readability and usability'''.
** The semaphore operations and their implementation can be found in structure Ns3AiSemaphore: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/msg-interface/ns3-ai-semaphore.h#L28 code]
** The usage of the semaphore in Ns3AiMsgInterfaceImpl: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/msg-interface/ns3-ai-msg-interface.h#L194 code] and more usage below
* Built the '''vector-based''' interface with '''multiple configurable options'''. The vector interface parallels the struct interface in terms of creation and usage, and there is an attribute that users set early in their code to '''choose one of the interfaces'''. If the vector interface is chosen, the C++-to-Python and Python-to-C++ vectors are created in shared memory and initially contain no elements, so users must call resize or push_back to adjust their length before use. Another attribute is whether the interface '''handles simulation end'''. If that attribute is set, the interface performs a simple protocol to notify the Python side when the C++ side simulation finishes. Other configurable attributes include the memory segment size and the names of objects constructed in shared memory.
** Note: the attributes are not part of ns-3 attribute system, because Ns3AiMsgInterface is a Singleton rather than Object.
** Attributes setting in Ns3AiMsgInterface: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/msg-interface/ns3-ai-msg-interface.h#L306 code] and more setting below
** How the protocol works when the interface is destroyed: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/msg-interface/ns3-ai-msg-interface.h#L126 code]
* Provided '''Python binding boilerplate code''' in examples. The Python side accesses the shared memory and the objects in it (vectors or structs) via C++ functions exposed to Python. The '''exposure of C++ class functions and members''' is achieved with '''pybind11''', a lightweight Python binding library. The C++ binding code, linked with pybind11, is compiled into a dynamically linked library that Python can import as a module. Because the C++ side interface is template-based and Python does not support templates natively, the Python binding module needs to be generated separately for every program (this is done by a CMake target dependency, so it is seamless). Although the binding contains many lines of C++ code and is difficult to write from scratch, users can '''start from existing binding code and modify it''' to generate Python binding modules quickly, and I provide plenty of boilerplate for that (the *_py.cc files in all examples).
** Some of the example boilerplate code: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/examples/a-plus-b/use-msg-stru/apb_py.cc#L29 binding code for struct-based message interface in A-Plus-B example], [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/examples/multi-bss/multi_bss_py.cc#L33 binding code for vector-based message interface in Multi-BSS example]
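
The idea of a semaphore that spins instead of sleeping can be sketched in Python. This is an illustration only and does not reproduce Ns3AiSemaphore: the SpinSemaphore class is hypothetical, and it guards its counter with a threading.Lock as an assumption made for simplicity.

```python
import threading


class SpinSemaphore:
    """Hypothetical counting semaphore whose acquire() busy-waits instead of sleeping."""

    def __init__(self, initial=0):
        self._count = initial
        self._lock = threading.Lock()  # protects the counter only

    def try_acquire(self):
        with self._lock:
            if self._count > 0:
                self._count -= 1
                return True
            return False

    def acquire(self):
        # Spin until a unit is available; the waiter never blocks on the semaphore.
        while not self.try_acquire():
            pass

    def release(self):
        with self._lock:
            self._count += 1


# One-slot exchange: the writer waits for can_write, the reader for can_read.
slot = [None]
can_write = SpinSemaphore(1)
can_read = SpinSemaphore(0)
received = []


def reader(n):
    for _ in range(n):
        can_read.acquire()
        received.append(slot[0])
        can_write.release()


t = threading.Thread(target=reader, args=(3,))
t.start()
for value in (10, 20, 30):
    can_write.acquire()
    slot[0] = value
    can_read.release()
t.join()
print(received)  # [10, 20, 30]
```

Spinning trades CPU time for latency: the waiting side reacts as soon as the other side releases, which suits the short, frequent exchanges between the simulation and the ML process.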

=== Gymnasium API ===

The [https://gymnasium.farama.org/index.html Gymnasium API] for ns3-ai is based on shared memory rather than socket communication, which provides faster data exchange than [https://github.com/tkn-tub/ns3-gym ns3-gym] does. While much of the [https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface Gym interface] code is from ns3-gym's repository, I made substantial changes to give it a shared memory backend. My work includes:
* Modified OpenGymInterface to '''use Ns3AiMsgInterface for IPC'''. OpenGymInterface was created by the ns3-gym developers and provides code to create Gym-compatible environments in ns-3. It contains functions to get the state and action spaces, observe the environment in ns-3 and execute actions (for example, changing simulation parameters). Those functions use callbacks registered by OpenGymEnv at runtime. For the callbacks to work, a custom environment must inherit from OpenGymEnv and implement class methods such as GetActionSpace, GetObservationSpace, GetObservation and ExecuteActions. All states and actions are serialized with Google's Protocol Buffers, transmitted, and then de-serialized by the peer. I replaced the ZeroMQ socket's send and receive functions with Ns3AiMsgInterface's send and receive functions, and ensured that Ns3AiMsgInterface is properly initialized. The underlying message interface for transmitting serialized messages is struct-based. The struct contains a buffer (uint8_t array) and its capacity.
** Example of my changed part: [https://github.com/tkn-tub/ns3-gym/blob/6007f4b3811af0cffcacf9a6151e5b9d2f4ef3ae/model/opengym_interface.cc#L190 before (in ns3-gym's repo)], [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/cpp/ns3-ai-gym-interface.cc#L104 after (in my ns3-ai repo)]
** Initialization of Ns3AiMsgInterface: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/cpp/ns3-ai-gym-interface.cc#L56 code]
** Note: in the above configuration, handling finish is set to false because the protocol of notifying Python side that C++ side has finished is unnecessary for Gym. Gym interface has its own protocol for handling finish, which is [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/cpp/ns3-ai-gym-env.cc#L78 NotifySimulationEnd on C++ side] and then [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L270 'done' becoming true] when [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L294 Python steps].
* Created a '''Python binding''' for accessing the shared structure that contains the '''serialized message string'''. Binding a structure that contains an array is similar to binding an ordinary structure, except that the array is exposed to Python as a '''memoryview'''. With memoryview, the Python side can read and write the array directly, much as C++ code can with std::array.
** Obtaining the memoryview in binding: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/py/msg_py_binding.cc#L33 code]
** Note: arrays of different lengths need separate memoryview objects on the Python side. In the above code, get_buffer returns the part of the buffer that is actually used (for reading), while get_buffer_full returns the buffer at its full length (for writing). Example usage in Ns3Env (the Python side Gym environment created with gym.make): [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L115 array read] and [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L130 array write]

== Phase 2 ==

=== Examples and documentation update ===

To demonstrate the usage of the message interface and Gym interface, all existing examples were updated to use the new interfaces. In addition, a new example, "Multi-BSS", was created to benchmark the performance of the vector interface. All of them can be built with the "./ns3 build" command using the updated CMake files, without copying the examples to the scratch folder. The updated examples and the interfaces they support are listed below:
* A-Plus-B example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/examples/a-plus-b directory]): In this example, C++ side starts by setting 2 random numbers between 0 and 10 in shared memory. Then, Python side gets the numbers and sets the sum of the numbers in shared memory (in another region). Finally, C++ gets the sum that Python set. The procedure is analogous to C++ passing RL states to Python and Python passing RL actions back to C++, and is repeated many times. Supported interfaces:
** Struct-based message interface
** Vector-based message interface
** Gym interface
* LTE-CQI example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/examples/lte-cqi directory]): CQI prediction example. The original work is done based on 5G NR branch in ns-3, and previous developers have made some changes to make it also run in LTE codebase in ns-3 mainline. Supported interfaces:
** Struct-based message interface
* Multi-BSS example ('''new example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/examples/multi-bss directory]): The example is based on and modified from [https://gitlab.com/juanvleonr/ns-3-dev/-/tree/clean-tgax?ref_type=heads juanvleonr's clean-tgax branch]. The C++ side simulates a VR gaming scenario in which 4 BSSs operate in separate apartments in a 2 by 2 grid. Each BSS contains 1 AP and 4 STAs. One of the STAs in the first BSS is a VR device generating burst UL traffic, while the other devices have normal UL traffic. Supported interfaces:
** Struct-based message interface (available at [https://github.com/ShenMuyuan/ns3-ai/tree/dd8dd3a489f8faf8a380841b73c250d23c1a3710/examples/multi-bss the benchmarking branch])
** Vector-based message interface
* Rate-Control example ('''updated example''') (including constant rate & Thompson Sampling) ([https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/examples/rate-control directory]): There are existing models of constant rate and Thompson sampling algorithms in Wi-Fi module. Here they are implemented in Python to show how to develop a new rate control algorithm for the Wi-Fi module using ns3-ai. Supported interfaces:
** Struct-based message interface
* RL-TCP example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/examples/rl-tcp directory]): This example applies Q-learning algorithms (Q-learning and deep Q-learning) to TCP congestion control so that it can react to real-time changes in the network environment. By using reinforcement learning to manage the sliding window and threshold size, the network can achieve better throughput and smaller delay. Supported interfaces:
** Struct-based message interface
** Gym interface

Documents are updated along with the examples. Apart from all the README.md in example directories, I added [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/docs/install.md instruction for installation], [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/msg-interface/README.md message interface tutorial] and [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/README.md Gym interface tutorial] as separate documents linked to the [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/README.md updated root README.md].

=== Pure C++ example ===

In the development of a pure C++-based ML framework example, I tried to rewrite the LTE-CQI example (originally using tensorflow as its Python-based ML framework) to use the [https://www.tensorflow.org/install/lang_c TensorFlow C API], and the RL-TCP example (originally using torch as its Python-based ML framework) to use the [https://pytorch.org/cppdocs/ PyTorch C++ API]. Unfortunately, only the latter succeeded. The pure C++ version of LTE-CQI failed because TensorFlow's C API has limited support for gradients and neural networks. So, for TensorFlow C I only provide [https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/examples/lte-cqi/pure-cpp an example that checks libtensorflow version]. Although I succeeded in converting the Python code to C++ in the [https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/examples/rl-tcp/pure-cpp RL-TCP example], the process was slow and difficult because official documentation and examples are scarce. For instance, the C++ API doesn't provide the useful load_state_dict function for copying policy net parameters to the target net. It took me a while to find the equivalent C++ functions (torch::save and torch::load, and the module must be defined with the TORCH_MODULE macro).

I also wrote a guide on how to use C++-based ML frameworks in ns-3 (by installing in ns3-ai): [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/docs/using-pure-cpp.md here]

=== Benchmarking ===

I benchmarked three items:

# '''Gym interface vs ns3-gym''' in terms of '''transmission time''': This benchmark is based on the RL-TCP example. It measures the CPU cycle count during C++ to Python and Python to C++ data transmissions, and compares the mean and standard deviation of the cycle counts. Results show that in both directions, the transmission time of ns3-ai's Gym interface is '''more than 15 times shorter''' than that of ns3-gym ('''shorter is better''').
# '''Vector-based vs. struct-based''' message interface in terms of '''transmission time''': The benchmark is based on the Multi-BSS example, on the benchmark_vector branch. Unfortunately, in terms of action transmission time (from the beginning of the C++ write to the completion of the Python read), the vector-based interface is '''1.2 times longer''' than the struct-based interface ('''shorter is better'''). The extra time is caused by Python's slow reading of vectors. Measurements show that when reading rxPower (received power at the nodes in the first BSS) on the Python side, the vector interface spent 20% to 50% more time than the struct interface.
#* To deal with the slow vector access on Python side in the future, '''one possible solution is to integrate Eigen''' on C++ side and use existing Eigen-Python bindings like pybind11's Eigen support or eigenpy to convert linear algebra types into numpy or scipy types.
# '''Pure C++ vs. struct-based message interface''' in terms of '''processing time''': The benchmark is based on the pure C++ (libtorch) and message interface (PyTorch) versions of the RL-TCP example. We compare the processing time of the two implementations (transmission time plus DRL algorithm time for the message interface, DRL algorithm time only for pure C++), including the mean and the standard deviation. Results show that the processing time of the pure C++ implementation is '''less than half''' that of the message interface implementation ('''shorter is better''').

See [https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/docs/benchmarking ns3-ai benchmarking documentation] for more detailed information.

Overall, the Gym interface is much faster than ns3-gym, and the pure C++ implementation is more efficient than the message interface. The vector interface needs further work, especially to optimize access from the Python side.

= Build and Run the Code =

A detailed guide on how to set up the ns3-ai module is [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/docs/install.md here]. You must install ns-3 before installing ns3-ai. To test ns3-ai, you can build and run the provided examples (listed in the 'Phase 2' section above) according to their documentation.

= Proposal vs. Actual Work =

GSOC2023ns3-aiFinalReport

2023-09-08T14:29:52Z

Muyuan:

{{TOC}}

Back to [[GSOC2023ns3-ai]] (page containing my weekly updates, not the final report)

= Project Overview =

* '''Project Name:''' ns3-ai enhancements
* '''Student:''' Muyuan Shen
* '''Mentors:''' Collin Brady and Hao Yin

== Project Goals ==

The main focus of this project is to '''optimize performance''' and '''improve usability''' of the '''ns3-ai module''', which facilitates the connection between ns-3 and Python-based ML frameworks using shared memory.

To accomplish this goal, the project will '''introduce additional APIs that support data structures such as vector''' in shared memory IPC. This will effectively reduce the required interaction between C++ and Python, resulting in improved performance.
Also, the project will '''integrate Gymnasium API''' like ns3-gym's but has a shared-memory-based backend, to turn ns-3 into a environment that agents can efficiently and seamlessly interact with.
In addition, the project will '''enhance the existing examples, documentation and tutorials''', while also integrating new examples that cover scenarios like Multi-BSS in VR. By doing so, users will have more comprehensive resources at their disposal.
Furthermore, the project aims to '''provide examples utilizing pure C++-based ML frameworks'''. This will offer researchers more options for integrating with ML.

The overall aim of the project is to expand and accelerate the capabilities of the ns3-ai module, enabling users to simulate and analyze network related algorithms with enhanced efficiency and flexibility.

= Merge Requests and Commits =

Throughout the project, my development is based on my [https://github.com/ShenMuyuan/ns3-ai/tree/improvements improvements] branch of [https://github.com/hust-diangroup/ns3-ai ns3-ai]. The improvements branch originates from the [https://github.com/ShenMuyuan/ns3-ai/tree/cmake cmake branch], because I was fixing CMake compatibility problems before GSoC. So, I created a single MR that contains all my work, to be merged into the cmake branch. In this MR, there are 120+ commits by me, with author name 'ShenMuyuan', 'Mu-YuanShen' or 'eicsmy'. The cmake branch will be merged upstream by my mentor.

{| class="wikitable"
|+ Merge Requests
|-
! No. !! Name !! Status
|-
| [1] || [https://github.com/hust-diangroup/ns3-ai/pull/97 merge to cmake branch] || Open
|}

= Project Details =

All links to my repository below point to my [https://github.com/ShenMuyuan/ns3-ai/commit/51e0c34a90de88db02b1016db626bb0cb605c166 last commit to improvements branch].

== Community Bonding Period ==

During the community bonding period, I started bi-weekly meetings with my mentors and we decided on the project plan: prioritize the development of new interfaces, then develop more examples and enhance the documentation.

The two new interfaces are the vector interface (later called the vector-based message interface, as it shares some fundamentals with the struct-based message interface) and the Gym interface. We also discussed some details of new examples like LTE-handover and Multi-BSS.

I also read the ns3-ai code thoroughly to understand its IPC principles and learned some reinforcement learning basics.

== Phase 1 ==

=== std::vector support ===

Adding std::vector to shared memory is not easy with ns3-ai's original design, because Python's ctypes library does not support STL templates (it only supports C structures and functions). To support vectors, I refactored the original model completely, replacing ctypes with the Boost C++ libraries, which are more flexible for interprocess communication. My work includes:
* Utilized Boost's '''boost::interprocess::managed_shared_memory''' to store data (as well as synchronization variables) in shared memory. This shared segment can be used for '''data transmission between C++ and Python'''. The two directions, C++-to-Python and Python-to-C++, occupy two different regions in shared memory. It also supports a '''custom memory allocator for STL''', an instance of boost::interprocess::allocator, which ensures that when an STL container allocates new memory, that memory comes from the shared memory rather than the ordinary heap.
** The shared memory creation can be found in the constructor of Ns3AiMsgInterfaceImpl: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/msg-interface/ns3-ai-msg-interface.h#L60 code]
* Developed a '''spinlock-based semaphore''' to synchronize read and write operations in shared memory. The original synchronization method works, but its "version number" concept and "control block" data structures may confuse and distract beginners. Also, the "version number" is just a complex implementation of the well-known semaphore. To improve ease of use and code readability, I created a semaphore, based on Boost's semaphore, that '''only spins but does not sleep while waiting'''. It has performance comparable to the original with '''better readability and usability'''.
** The semaphore operations and their implementation can be found in structure Ns3AiSemaphore: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/msg-interface/ns3-ai-semaphore.h#L28 code]
** The usage of the semaphore in Ns3AiMsgInterfaceImpl: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/msg-interface/ns3-ai-msg-interface.h#L194 code] and more usage below
* Built the '''vector-based''' interface with '''multiple configurable options'''. The vector interface is in parallel with the struct interface in terms of creation and usage, and there is an attribute that users can set in early code in order to '''choose one of the interfaces'''. If the vector interface is chosen, the C++-to-Python and Python-to-C++ vectors are created in shared memory and will contain no elements. It requires users to call resize or push_back to adjust their length before use. Another attribute is whether the interface '''handles simulation end'''. If that attribute is set, the interface will perform a simple protocol to notify Python side when C++ side simulation finishes. Other configurable attributes include memory segment size and names of objects constructed in shared memory.
** Note: the attributes are not part of ns-3 attribute system, because Ns3AiMsgInterface is a Singleton rather than Object.
** Attributes setting in Ns3AiMsgInterface: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/msg-interface/ns3-ai-msg-interface.h#L306 code] and more setting below
** How the protocol works when the interface is destroyed: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/msg-interface/ns3-ai-msg-interface.h#L126 code]
* Provided '''Python binding boilerplate code''' in examples. The Python side accesses the shared memory and the objects in it (vectors or structs) via C++ functions exposed to Python. The '''exposure of C++ class functions and members''' is achieved with '''Pybind11''', a lightweight Python binding library. The C++ binding code, linked with Pybind11, is compiled into a dynamically linked library that Python can import as a module. Because the C++ side interface is template-based and Python does not support templates natively, the Python binding module needs to be generated separately for every program (the creation is done by a CMake target dependency, so it is seamless). Although the binding contains many lines of C++ code and is difficult to write from scratch, users can '''modify existing binding code''' to generate Python binding modules quickly, and I provide boilerplate for this (the *_py.cc files in all examples).
** Some of the example boilerplate code: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/examples/a-plus-b/use-msg-stru/apb_py.cc#L29 binding code for struct-based message interface in A-Plus-B example], [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/examples/multi-bss/multi_bss_py.cc#L33 binding code for vector-based message interface in Multi-BSS example]
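To make the synchronization idea above more concrete, here is a minimal Python sketch of a semaphore that spins instead of sleeping, used to hand data back and forth between two sides. It is only an illustration and not the ns3-ai implementation: the real code is C++ on top of Boost and works across processes, whereas this sketch uses two threads and plain dictionaries as stand-ins for the two shared memory regions. The names SpinSemaphore, run_exchange, cpp2py, py2cpp and the "finished" flag are hypothetical, and the end-of-simulation notice is only an assumed shape of the protocol.

```python
import threading


class SpinSemaphore:
    """Counting semaphore whose wait() busy-waits instead of sleeping."""

    def __init__(self, count=0):
        self._count = count
        self._lock = threading.Lock()  # guards _count; never held while waiting

    def post(self):
        with self._lock:
            self._count += 1

    def wait(self):
        while True:  # spin until a unit is available
            with self._lock:
                if self._count > 0:
                    self._count -= 1
                    return


def run_exchange(pairs):
    """Send each (a, b) pair to the agent side and collect the returned sums."""
    cpp2py = {"a": 0, "b": 0, "finished": False}  # stand-in for the C++-to-Python region
    py2cpp = {"sum": 0}                           # stand-in for the Python-to-C++ region
    cpp2py_ready = SpinSemaphore(0)
    py2cpp_ready = SpinSemaphore(0)
    sums = []

    def simulator_side():
        for a, b in pairs:
            cpp2py["a"], cpp2py["b"] = a, b
            cpp2py_ready.post()    # the state is ready for the agent
            py2cpp_ready.wait()    # spin until the agent has written its answer
            sums.append(py2cpp["sum"])
        cpp2py["finished"] = True  # hypothetical end-of-simulation notice
        cpp2py_ready.post()

    def agent_side():
        while True:
            cpp2py_ready.wait()
            if cpp2py["finished"]:
                return
            py2cpp["sum"] = cpp2py["a"] + cpp2py["b"]
            py2cpp_ready.post()

    threads = [threading.Thread(target=simulator_side),
               threading.Thread(target=agent_side)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sums


print(run_exchange([(1, 2), (3, 4), (5, 5)]))
```

Each direction has its own region and its own semaphore, so a side only reads a region after the other side has posted that it is ready.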

=== Gymnasium API ===

The [https://gymnasium.farama.org/index.html Gymnasium API] for ns3-ai is designed to be based on shared memory rather than socket communication, which can provide faster data exchange than [https://github.com/tkn-tub/ns3-gym ns3-gym] does. While much of the [https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface Gym interface] code is from ns3-gym's repository, I made substantial changes to give it a shared memory backend. My work includes:
* Modified OpenGymInterface to '''use Ns3AiMsgInterface for IPC'''. OpenGymInterface was created by the ns3-gym developers and provides code to create Gym-compatible environments in ns-3. It contains functions to get state or action spaces, observe the environment in ns-3 and execute the actions (possibly changing parameters in the simulation). Those functions use callbacks registered by OpenGymEnv at runtime. For the callbacks to work, a custom environment must inherit from OpenGymEnv and implement class methods such as GetActionSpace, GetObservationSpace, GetObservation and ExecuteActions. All states and actions are serialized by Google's Protocol Buffers and then transmitted and de-serialized by the peer. What I did was replace the ZeroMQ socket's send & receive functions with Ns3AiMsgInterface's send & receive functions, and ensure that Ns3AiMsgInterface is properly initialized. The underlying message interface for transmitting serialized messages is struct-based. The struct contains a buffer (uint8_t array) and its capacity.
** Example of my changed part: [https://github.com/tkn-tub/ns3-gym/blob/6007f4b3811af0cffcacf9a6151e5b9d2f4ef3ae/model/opengym_interface.cc#L190 before (in ns3-gym's repo)], [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/cpp/ns3-ai-gym-interface.cc#L104 after (in my ns3-ai repo)]
** Initialization of Ns3AiMsgInterface: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/cpp/ns3-ai-gym-interface.cc#L56 code]
** Note: in the above configuration, handling finish is set to false because the protocol of notifying Python side that C++ side has finished is unnecessary for Gym. Gym interface has its own protocol for handling finish, which is [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/cpp/ns3-ai-gym-env.cc#L78 NotifySimulationEnd on C++ side] and then [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L270 'done' becoming true] when [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L294 Python steps].
* Created a '''Python binding''' for accessing the shared structure containing the '''serialized message string'''. Binding a structure that contains an array is similar to binding an ordinary structure, except that the array is treated specially: its contents are exposed as Python's '''memoryview'''. With memoryview, the Python side can read and write the array seamlessly, much as you can in C++ with std::array.
** Obtaining the memoryview in binding: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/py/msg_py_binding.cc#L33 code]
** Note: arrays of different lengths need different memoryview objects on the Python side. In the above code, get_buffer returns the part of the buffer that is actually used (for reading), while get_buffer_full returns the buffer at its full length (for writing). Example usage in Ns3Env (the Python side Gym environment created with gym.make): [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L115 array read] and [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L130 array write]
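The difference between the two buffer views can be sketched in plain Python. This is an illustration under assumptions, not the actual binding: the real structure lives in shared memory and is exposed through Pybind11, while here a bytearray plays the role of the uint8_t array. The class GymMsgBuffer, its size field and the helpers write_message and read_message are hypothetical; only the names get_buffer and get_buffer_full come from the binding described above.

```python
class GymMsgBuffer:
    """Hypothetical stand-in for the shared struct: a fixed buffer plus its used size."""

    def __init__(self, capacity):
        self._data = bytearray(capacity)
        self.size = 0  # number of bytes currently holding a message

    def get_buffer(self):
        """View of the bytes actually in use (for reading)."""
        return memoryview(self._data)[: self.size]

    def get_buffer_full(self):
        """View of the whole buffer (for writing)."""
        return memoryview(self._data)


def write_message(msg, payload):
    """Copy a serialized message into the buffer and record its length."""
    full = msg.get_buffer_full()
    if len(payload) > len(full):
        raise ValueError("payload larger than buffer capacity")
    full[: len(payload)] = payload  # writes through the view, no new buffer is made
    msg.size = len(payload)


def read_message(msg):
    """Return a copy of the bytes that currently hold a message."""
    return bytes(msg.get_buffer())


msg = GymMsgBuffer(capacity=64)
write_message(msg, b"serialized-observation")
print(read_message(msg))
```

Writing goes through the full-length view because the message length is not known in advance, while reading uses the shorter view so that stale bytes past the message are never handed to the deserializer.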

== Phase 2 ==

=== Examples and documentation update ===

To demonstrate the usage of the message interface and Gym interface, all existing examples are updated to use the new interfaces. Also, a new example "Multi-BSS" is created to benchmark the performance of the vector interface. All of them can be successfully built using the "./ns3 build" command with the updated CMake files, without needing to copy the examples to the scratch folder. Updated examples and the interfaces supported by them are listed below:
* A-Plus-B example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/examples/a-plus-b directory]): In this example, C++ side starts by setting 2 random numbers between 0 and 10 in shared memory. Then, Python side gets the numbers and sets the sum of the numbers in shared memory (in another region). Finally, C++ gets the sum that Python set. The procedure is analogous to C++ passing RL states to Python and Python passing RL actions back to C++, and is repeated many times. Supported interfaces:
** Struct-based message interface
** Vector-based message interface
** Gym interface
* LTE-CQI example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/examples/lte-cqi directory]): CQI prediction example. The original work is done based on 5G NR branch in ns-3, and previous developers have made some changes to make it also run in LTE codebase in ns-3 mainline. Supported interfaces:
** Struct-based message interface
* Multi-BSS example ('''new example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/examples/multi-bss directory]): The example is based on and modified from [https://gitlab.com/juanvleonr/ns-3-dev/-/tree/clean-tgax?ref_type=heads juanvleonr's clean-tgax branch]. The C++ side simulates a VR gaming scenario shown below, in which 4 BSSs operate in separate apartments in a 2 by 2 grid. Each BSS contains 1 AP and 4 STAs. One of the STAs in the first BSS is a VR device generating burst UL traffic, while the other devices have normal UL traffic. Supported interfaces:
** Struct-based message interface (available at [https://github.com/ShenMuyuan/ns3-ai/tree/dd8dd3a489f8faf8a380841b73c250d23c1a3710/examples/multi-bss the benchmarking branch])
** Vector-based message interface
* Rate-Control example ('''updated example''') (including constant rate & Thompson Sampling) ([https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/examples/rate-control directory]): There are existing models of constant rate and Thompson sampling algorithms in Wi-Fi module. Here they are implemented in Python to show how to develop a new rate control algorithm for the Wi-Fi module using ns3-ai. Supported interfaces:
** Struct-based message interface
* RL-TCP example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/examples/rl-tcp directory]): This example applies Q-learning algorithms (Q-learning and deep Q-learning) to TCP congestion control so that it can react to real-time changes in the network transmission environment. By using reinforcement learning to manage the sliding window and threshold size, the network can get better throughput and smaller delay. Supported interfaces:
** Struct-based message interface
** Gym interface

Documents are updated along with the examples. Apart from the README.md files in the example directories, I added [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/docs/install.md installation instructions], a [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/msg-interface/README.md message interface tutorial] and a [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/README.md Gym interface tutorial] as separate documents linked from the [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/README.md updated root README.md].

=== Pure C++ example ===

In the development of a pure C++-based ML framework example, I tried to rewrite the LTE-CQI example (originally using tensorflow as its Python-based ML framework) to utilize the [https://www.tensorflow.org/install/lang_c TensorFlow C API], and the RL-TCP example (originally using torch as its Python-based ML framework) to employ the [https://pytorch.org/cppdocs/ PyTorch C++ API]. Unfortunately, only the latter succeeded. The pure C++ version of LTE-CQI failed because there was limited support for gradients and neural networks in TensorFlow's C API. So, for TensorFlow C I only provide [https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/examples/lte-cqi/pure-cpp an example that checks libtensorflow version]. Although I succeeded in converting Python code to C++ in the [https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/examples/rl-tcp/pure-cpp RL-TCP example], the process was slow and difficult due to the lack of official documents and examples. For instance, the C++ API doesn't provide the useful load_state_dict function for copying policy net parameters to the target net. It took me a while to find the equivalent C++ functions (torch::save and torch::load; the module must be defined with the TORCH_MODULE macro).

I also wrote a guide on how to use C++-based ML frameworks in ns-3 (by installing in ns3-ai): [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/docs/using-pure-cpp.md here]

=== Benchmarking ===

I benchmarked three items:

# '''Gym interface vs ns3-gym''' in terms of '''transmission time''': This benchmark is based on the RL-TCP example. It measures the CPU cycle count during C++ to Python and Python to C++ data transmissions, and compares the mean and standard deviation of the cycles. Results show that in both directions, the transmission time of ns3-ai's Gym interface is '''more than 15 times shorter''' than that of ns3-gym ('''shorter is better''').
# '''Vector-based vs. struct-based''' message interface in terms of '''transmission time''': The benchmark is based on the Multi-BSS example, on the benchmark_vector branch. Unfortunately, in terms of action transmission time (from C++'s beginning of write to Python's complete read), the vector-based interface takes '''1.2 times longer''' than the struct-based one ('''shorter is better'''). The extra time is caused by Python's slow reading of vectors. Measurements show that in reading rxPower (received power in nodes in the first BSS) on the Python side, the vector interface spent 20% to 50% more time than the struct interface.
#* To deal with the slow vector access on Python side in the future, '''one possible solution is to integrate Eigen''' on C++ side and use existing Eigen-Python bindings like pybind11's Eigen support or eigenpy to convert linear algebra types into numpy or scipy types.
# '''Pure C++ vs. struct-based message interface''' in terms of '''processing time''': The benchmark is based on the pure C++ (libtorch) and message interface (PyTorch) versions of the RL-TCP example. We compare the processing time (i.e. transmission time + DRL algorithm time for the message interface, DRL algorithm time for pure C++) for the two interfaces, including the mean and the standard deviation. Results show that the processing time of the pure C++ implementation is '''less than half''' that of the message interface implementation ('''shorter is better''').
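The comparisons above all follow the same pattern: collect per-transfer timings for two interfaces, then compare their means and standard deviations. A minimal Python sketch of that bookkeeping is given below. It is not the benchmarking code used for the results above: it times calls with time.perf_counter_ns rather than counting CPU cycles, and the helper names time_calls, summarize and times_shorter are hypothetical.

```python
import statistics
import time


def time_calls(transfer, repeats):
    """Return the elapsed nanoseconds of each of `repeats` calls to `transfer`."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter_ns()
        transfer()
        samples.append(time.perf_counter_ns() - start)
    return samples


def summarize(samples):
    """Mean and sample standard deviation of a list of timings."""
    return statistics.mean(samples), statistics.stdev(samples)


def times_shorter(baseline_mean, candidate_mean):
    """How many times shorter the candidate's mean time is than the baseline's."""
    return baseline_mean / candidate_mean


# Time a placeholder workload; a real benchmark would call the interface under test.
samples = time_calls(lambda: sum(range(1000)), repeats=20)
mean_ns, std_ns = summarize(samples)
print(len(samples), mean_ns >= 0, std_ns >= 0)
```

With this convention, a result such as "more than 15 times shorter" corresponds to times_shorter returning a value above 15 when the baseline is ns3-gym and the candidate is the Gym interface.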

See [https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/docs/benchmarking ns3-ai benchmarking documentation] for more detailed information.

Overall, the Gym interface is much faster than ns3-gym, and the pure C++ interface is more efficient than the message interface. The vector interface needs to be enhanced in the future, especially by optimizing Python-side access.

GSOC2023ns3-aiFinalReport

2023-09-08T14:27:50Z

Muyuan: finish phase 2

{{TOC}}

Back to [[GSOC2023ns3-ai]] (page containing my weekly updates, not the final report)

= Project Overview =

* '''Project Name:''' ns3-ai enhancements
* '''Student:''' Muyuan Shen
* '''Mentors:''' Collin Brady and Hao Yin

== Project Goals ==

The main focus of this project is to '''optimize performance''' and '''improve usability''' of the '''ns3-ai module''', which facilitates the connection between ns-3 and Python-based ML frameworks using shared memory.

To accomplish this goal, the project will '''introduce additional APIs that support data structures such as vector''' in shared memory IPC. This will effectively reduce the required interaction between C++ and Python, resulting in improved performance.
Also, the project will '''integrate Gymnasium API''' like ns3-gym's, but with a shared-memory-based backend, to turn ns-3 into an environment that agents can efficiently and seamlessly interact with.
In addition, the project will '''enhance the existing examples, documentation and tutorials''', while also integrating new examples that cover scenarios like Multi-BSS in VR. By doing so, users will have more comprehensive resources at their disposal.
Furthermore, the project aims to '''provide examples utilizing pure C++-based ML frameworks'''. This will offer researchers more options for integrating with ML.

The overall aim of the project is to expand and accelerate the capabilities of the ns3-ai module, enabling users to simulate and analyze network related algorithms with enhanced efficiency and flexibility.

= Merge Requests and Commits =

Throughout the project, my development is based on my [https://github.com/ShenMuyuan/ns3-ai/tree/improvements improvements] branch of [https://github.com/hust-diangroup/ns3-ai ns3-ai]. The improvements branch originates from the [https://github.com/ShenMuyuan/ns3-ai/tree/cmake cmake branch], because I was fixing CMake compatibility problems before GSoC. So, I created a single MR that contains all my work, to be merged into the cmake branch. In this MR, there are 120+ commits by me, with author name 'ShenMuyuan', 'Mu-YuanShen' or 'eicsmy'. The cmake branch will be merged upstream by my mentor.

{| class="wikitable"
|+ Merge Requests
|-
! No. !! Name !! Status
|-
| [1] || [https://github.com/hust-diangroup/ns3-ai/pull/97 merge to cmake branch] || Open
|}

= Project Details =

All links to my repository below point to my [https://github.com/ShenMuyuan/ns3-ai/commit/51e0c34a90de88db02b1016db626bb0cb605c166 last commit to improvements branch].

== Community Bonding Period ==

During the community bonding period, I started bi-weekly meetings with my mentors and we decided on the project plan: prioritize the development of new interfaces, then develop more examples and enhance the documentation.

The two new interfaces are the vector interface (later called the vector-based message interface, as it shares some fundamentals with the struct-based message interface) and the Gym interface. We also discussed some details of new examples like LTE-handover and Multi-BSS.

I also read the ns3-ai code thoroughly to understand its IPC principles and learned some reinforcement learning basics.

== Phase 1 ==

=== std::vector support ===

Adding std::vector to shared memory is not easy with ns3-ai's original design, because Python's ctypes library does not support STL templates (it only supports C structures and functions). To support vectors, I refactored the original model completely, replacing ctypes with the Boost C++ libraries, which are more flexible for interprocess communication. My work includes:
* Utilized Boost's '''boost::interprocess::managed_shared_memory''' to store data (as well as synchronization variables) in shared memory. This shared segment can be used for '''data transmission between C++ and Python'''. The two directions, C++-to-Python and Python-to-C++, occupy two different regions in shared memory. It also supports a '''custom memory allocator for STL''', an instance of boost::interprocess::allocator, which ensures that when an STL container allocates new memory, that memory comes from the shared memory rather than the ordinary heap.
** The shared memory creation can be found in the constructor of Ns3AiMsgInterfaceImpl: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/msg-interface/ns3-ai-msg-interface.h#L60 code]
* Developed a '''spinlock-based semaphore''' to synchronize read and write operations in shared memory. The original synchronization method works, but its "version number" concept and "control block" data structures may confuse and distract beginners. Also, the "version number" is just a complex implementation of the well-known semaphore. To improve ease of use and code readability, I created a semaphore, based on Boost's semaphore, that '''only spins but does not sleep while waiting'''. It has performance comparable to the original with '''better readability and usability'''.
** The semaphore operations and their implementation can be found in structure Ns3AiSemaphore: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/msg-interface/ns3-ai-semaphore.h#L28 code]
** The usage of the semaphore in Ns3AiMsgInterfaceImpl: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/msg-interface/ns3-ai-msg-interface.h#L194 code] and more usage below
* Built the '''vector-based''' interface with '''multiple configurable options'''. The vector interface is in parallel with the struct interface in terms of creation and usage, and there is an attribute that users can set in early code in order to '''choose one of the interfaces'''. If the vector interface is chosen, the C++-to-Python and Python-to-C++ vectors are created in shared memory and will contain no elements. It requires users to call resize or push_back to adjust their length before use. Another attribute is whether the interface '''handles simulation end'''. If that attribute is set, the interface will perform a simple protocol to notify Python side when C++ side simulation finishes. Other configurable attributes include memory segment size and names of objects constructed in shared memory.
** Note: the attributes are not part of ns-3 attribute system, because Ns3AiMsgInterface is a Singleton rather than Object.
** Attributes setting in Ns3AiMsgInterface: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/msg-interface/ns3-ai-msg-interface.h#L306 code] and more setting below
** How the protocol works when the interface is destroyed: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/msg-interface/ns3-ai-msg-interface.h#L126 code]
* Provided '''Python binding boilerplate code''' in examples. The Python side accesses the shared memory and the objects in it (vectors or structs) via C++ functions exposed to Python. The '''exposure of C++ class functions and members''' is achieved with '''Pybind11''', a lightweight Python binding library. The C++ binding code, linked with Pybind11, is compiled into a dynamically linked library that Python can import as a module. Because the C++ side interface is template-based and Python does not support templates natively, the Python binding module needs to be generated separately for every program (the creation is done by a CMake target dependency, so it is seamless). Although the binding contains many lines of C++ code and is difficult to write from scratch, users can '''modify existing binding code''' to generate Python binding modules quickly, and I provide boilerplate for this (the *_py.cc files in all examples).
** Some of the example boilerplate code: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/examples/a-plus-b/use-msg-stru/apb_py.cc#L29 binding code for struct-based message interface in A-Plus-B example], [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/examples/multi-bss/multi_bss_py.cc#L33 binding code for vector-based message interface in Multi-BSS example]

=== Gymnasium API ===

The [https://gymnasium.farama.org/index.html Gymnasium API] for ns3-ai is designed to be based on shared memory rather than socket communication, which can provide faster data exchange than [https://github.com/tkn-tub/ns3-gym ns3-gym] does. While much of the [https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface Gym interface] code is from ns3-gym's repository, I made substantial changes to give it a shared memory backend. My work includes:
* Modified OpenGymInterface to '''use Ns3AiMsgInterface for IPC'''. OpenGymInterface was created by the ns3-gym developers and provides code to create Gym-compatible environments in ns-3. It contains functions to get state or action spaces, observe the environment in ns-3 and execute the actions (possibly changing parameters in the simulation). Those functions use callbacks registered by OpenGymEnv at runtime. For the callbacks to work, a custom environment must inherit from OpenGymEnv and implement class methods such as GetActionSpace, GetObservationSpace, GetObservation and ExecuteActions. All states and actions are serialized by Google's Protocol Buffers and then transmitted and de-serialized by the peer. What I did was replace the ZeroMQ socket's send & receive functions with Ns3AiMsgInterface's send & receive functions, and ensure that Ns3AiMsgInterface is properly initialized. The underlying message interface for transmitting serialized messages is struct-based. The struct contains a buffer (uint8_t array) and its capacity.
** Example of my changed part: [https://github.com/tkn-tub/ns3-gym/blob/6007f4b3811af0cffcacf9a6151e5b9d2f4ef3ae/model/opengym_interface.cc#L190 before (in ns3-gym's repo)], [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/cpp/ns3-ai-gym-interface.cc#L104 after (in my ns3-ai repo)]
** Initialization of Ns3AiMsgInterface: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/cpp/ns3-ai-gym-interface.cc#L56 code]
** Note: in the above configuration, handling finish is set to false because the protocol of notifying Python side that C++ side has finished is unnecessary for Gym. Gym interface has its own protocol for handling finish, which is [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/cpp/ns3-ai-gym-env.cc#L78 NotifySimulationEnd on C++ side] and then [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L270 'done' becoming true] when [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L294 Python steps].
* Created a '''Python binding''' for accessing the shared structure containing the '''serialized message string'''. Binding a structure that contains an array is similar to binding an ordinary structure, except that the array is treated specially: its contents are exposed as Python's '''memoryview'''. With memoryview, the Python side can read and write the array seamlessly, much as you can in C++ with std::array.
** Obtaining the memoryview in binding: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/py/msg_py_binding.cc#L33 code]
** Note: arrays of different lengths need different memoryview objects on the Python side. In the above code, get_buffer returns the part of the buffer that is actually used (for reading), while get_buffer_full returns the buffer at its full length (for writing). Example usage in Ns3Env (the Python side Gym environment created with gym.make): [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L115 array read] and [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L130 array write]

== Phase 2 ==

=== Examples and documentation update ===

To demonstrate the usage of the message interface and Gym interface, all existing examples are updated to use the new interfaces. Also, a new example "Multi-BSS" is created to benchmark the performance of the vector interface. All of them can be successfully built using the "./ns3 build" command with the updated CMake files, without needing to copy the examples to the scratch folder. Updated examples and the interfaces supported by them are listed below:
* A-Plus-B example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/examples/a-plus-b directory]): In this example, C++ side starts by setting 2 random numbers between 0 and 10 in shared memory. Then, Python side gets the numbers and sets the sum of the numbers in shared memory (in another region). Finally, C++ gets the sum that Python set. The procedure is analogous to C++ passing RL states to Python and Python passing RL actions back to C++, and is repeated many times. Supported interfaces:
** Struct-based message interface
** Vector-based message interface
** Gym interface
* LTE-CQI example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/examples/lte-cqi directory]): CQI prediction example. The original work was based on the 5G NR branch of ns-3, and previous developers adapted it to also run on the LTE codebase in ns-3 mainline. Supported interfaces:
** Struct-based message interface
* Multi-BSS example ('''new example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/examples/multi-bss directory]): The example is based on and modified from [https://gitlab.com/juanvleonr/ns-3-dev/-/tree/clean-tgax?ref_type=heads juanvleonr's clean-tgax branch]. The C++ side simulates a VR gaming scenario in which 4 BSSs operate in separate apartments arranged in a 2 by 2 grid. Each BSS contains 1 AP and 4 STAs. One of the STAs in the first BSS is a VR device generating burst UL traffic, while the other devices have normal UL traffic. Supported interfaces:
** Struct-based message interface (available at [https://github.com/ShenMuyuan/ns3-ai/tree/dd8dd3a489f8faf8a380841b73c250d23c1a3710/examples/multi-bss the benchmarking branch])
** Vector-based message interface
* Rate-Control example ('''updated example''') (including constant rate & Thompson Sampling) ([https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/examples/rate-control directory]): The Wi-Fi module already contains models of the constant rate and Thompson sampling algorithms. Here they are implemented in Python to show how to develop a new rate control algorithm for the Wi-Fi module using ns3-ai. Supported interfaces:
** Struct-based message interface
* RL-TCP example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/examples/rl-tcp directory]): This example applies Q-learning algorithms (Q-learning and deep Q-learning) to TCP congestion control so that it can react in real time to changes in the network environment. By using reinforcement learning to manage the sliding window and threshold size, the network can achieve higher throughput and lower delay. Supported interfaces:
** Struct-based message interface
** Gym interface

Documentation was updated along with the examples. Apart from the README.md files in the example directories, I added [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/docs/install.md installation instructions], a [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/msg-interface/README.md message interface tutorial] and a [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/README.md Gym interface tutorial] as separate documents linked from the [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/README.md updated root README.md].

=== Pure C++ example ===

To develop an example based on a pure C++ ML framework, I tried to rewrite the LTE-CQI example (originally using TensorFlow as its Python-based ML framework) with the [https://www.tensorflow.org/install/lang_c TensorFlow C API], and the RL-TCP example (originally using PyTorch as its Python-based ML framework) with the [https://pytorch.org/cppdocs/ PyTorch C++ API]. Unfortunately, only the latter succeeded. The pure C++ version of LTE-CQI failed because TensorFlow's C API has limited support for gradients and neural networks, so for TensorFlow C I only provide [https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/examples/lte-cqi/pure-cpp an example that checks the libtensorflow version]. Although I succeeded in converting the Python code to C++ in the [https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/examples/rl-tcp/pure-cpp RL-TCP example], the process was slow and difficult because of the lack of official documentation and examples. For instance, the C++ API does not provide the useful load_state_dict function for copying the policy net parameters to the target net. It took me a while to find the equivalent C++ functions (torch::save and torch::load, with the module defined using the TORCH_MODULE macro).

I also wrote a guide on how to use C++-based ML frameworks in ns-3 (by installing in ns3-ai): [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/docs/using-pure-cpp.md here]

=== Benchmarking ===

I benchmarked three items:

# '''Gym interface vs ns3-gym''' in terms of '''transmission time''': This benchmark is based on the RL-TCP example. It measures the CPU cycle count of C++-to-Python and Python-to-C++ data transmissions and compares the mean and standard deviation of the cycle counts. Results show that in both directions, the transmission time of ns3-ai's Gym interface is '''more than 15 times shorter''' than that of ns3-gym ('''shorter is better''').
# '''Vector-based vs. struct-based''' message interface in terms of '''transmission time''': This benchmark is based on the Multi-BSS example, on the benchmark_vector branch. Unfortunately, in terms of action transmission time (from the start of the C++ write to the completion of the Python read), the vector-based interface is '''1.2 times longer''' than the struct-based one ('''shorter is better'''). The extra time is caused by Python's slow reading of vectors. Measurements show that when reading rxPower (the received power at nodes in the first BSS) on the Python side, the vector interface spent 20% to 50% more time than the struct interface.
#* To deal with the slow vector access on the Python side in the future, '''one possible solution is to integrate Eigen''' on the C++ side and use existing Eigen-Python bindings, such as pybind11's Eigen support or eigenpy, to convert linear algebra types into NumPy or SciPy types.
# '''Pure C++ vs. struct-based message interface''' in terms of '''processing time''': This benchmark is based on the pure C++ (libtorch) and message interface (PyTorch) versions of the RL-TCP example. We compare the mean and standard deviation of the processing time of the two approaches, where processing time means transmission time plus DRL algorithm time for the message interface, and DRL algorithm time alone for pure C++. Results show that the processing time of the pure C++ implementation is '''less than half''' that of the message interface implementation ('''shorter is better''').

Overall, the Gym interface is much faster than ns3-gym, and the pure C++ approach is more efficient than the message interface. The vector interface needs further work, especially to optimize access on the Python side.
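
As a rough illustration of how such comparisons can be summarized, the sketch below computes the mean and sample standard deviation of two sets of cycle counts and the ratio of their means. The helper names are hypothetical and the sample numbers are invented for illustration only; they are not the measurements from these benchmarks, and the actual benchmark scripts may compute their statistics differently.

```python
from statistics import mean, stdev


def summarize(cycles):
    """Return (mean, sample standard deviation) of a list of cycle counts."""
    return mean(cycles), stdev(cycles)


def times_shorter(baseline, candidate):
    """How many times shorter the candidate's mean is than the baseline's."""
    return mean(baseline) / mean(candidate)


# Hypothetical cycle counts, invented for illustration only;
# they are NOT the measurements reported in this section.
baseline_cycles = [30000, 32000, 31000, 29000]
candidate_cycles = [2000, 1900, 2100, 2000]

if __name__ == "__main__":
    for name, data in (("baseline", baseline_cycles),
                       ("candidate", candidate_cycles)):
        m, s = summarize(data)
        print(f"{name}: mean={m:.1f} cycles, std={s:.1f} cycles")
    ratio = times_shorter(baseline_cycles, candidate_cycles)
    print(f"candidate is {ratio:.2f} times shorter")
```

Reporting the standard deviation next to the mean matters here because cycle counts for IPC tend to vary from one transmission to the next, so a ratio of means alone can hide large jitter.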

GSOC2023ns3-aiFinalReport

2023-09-08T13:45:15Z

Muyuan: finish phase 2 part 2


= Merge Requests and Commits =

Throughout the project, my development was based on my [https://github.com/ShenMuyuan/ns3-ai/tree/improvements improvements] branch of [https://github.com/hust-diangroup/ns3-ai ns3-ai]. The improvements branch originates from the [https://github.com/ShenMuyuan/ns3-ai/tree/cmake cmake branch], because I was fixing CMake compatibility problems before GSoC. I therefore created a single MR that contains all my work, to be merged into the cmake branch. This MR contains 120+ commits by me, under the author names 'ShenMuyuan', 'Mu-YuanShen' or 'eicsmy'. The cmake branch will be merged to upstream by my mentor.

{| class="wikitable"
|+ Merge Requests
|-
! No. !! Name !! Status
|-
| [1] || [https://github.com/hust-diangroup/ns3-ai/pull/97 merge to cmake branch] || Open
|}

= Project Details =

All links to my repository below point to my [https://github.com/ShenMuyuan/ns3-ai/commit/51e0c34a90de88db02b1016db626bb0cb605c166 last commit to the improvements branch].

== Community Bonding Period ==

During the community bonding period, I started bi-weekly meetings with my mentors, and we decided on the project plan: prioritize the development of the new interfaces, then develop more examples and enhance the documentation.

There are two new interfaces: the vector interface (later called the vector-based message interface, as it shares some fundamentals with the struct-based message interface) and the Gym interface. We also discussed some details of new examples such as LTE-handover and Multi-BSS.

I also read the ns3-ai code thoroughly to understand its IPC principles and learned some reinforcement learning basics.

== Phase 1 ==

=== std::vector support ===

Adding std::vector to shared memory is not easy with ns3-ai's original design, because Python's ctypes library does not support STL templates (it only supports C structures and functions). To support vectors, I refactored the original model completely, replacing ctypes with the Boost C++ libraries, which are more flexible for interprocess communication. My work includes:
* Utilized Boost's '''boost::interprocess::managed_shared_memory''' to store data (as well as synchronization variables) in shared memory. This shared segment is used for '''data transmission between C++ and Python'''. The two directions, C++-to-Python and Python-to-C++, occupy two different regions of shared memory. The segment also supports a '''custom memory allocator for STL''', an instance of boost::interprocess::allocator, which ensures that when an STL container allocates new memory, that memory comes from the shared segment rather than from the ordinary heap.
** The shared memory creation can be found in the constructor of Ns3AiMsgInterfaceImpl: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/msg-interface/ns3-ai-msg-interface.h#L60 code]
* Developed a '''spinlock-based semaphore''' to synchronize read and write operations in shared memory. The original synchronization method works, but its "version number" concept and "control block" data structures can confuse and distract beginners. Moreover, the "version number" is essentially a complicated implementation of the well-known semaphore. To improve ease of use and code readability, I created a semaphore, based on Boost's semaphore, that '''only spins and never sleeps while waiting'''. Its performance is comparable to the original, with '''better readability and usability'''.
** The semaphore operations and their implementation can be found in structure Ns3AiSemaphore: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/msg-interface/ns3-ai-semaphore.h#L28 code]
** The usage of the semaphore in Ns3AiMsgInterfaceImpl: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/msg-interface/ns3-ai-msg-interface.h#L194 code] and more usage below
* Built the '''vector-based''' interface with '''multiple configurable options'''. The vector interface parallels the struct interface in creation and usage, and users '''choose one of the interfaces''' by setting an attribute early in their code. If the vector interface is chosen, the C++-to-Python and Python-to-C++ vectors are created in shared memory and initially contain no elements; users must call resize or push_back to adjust their length before use. Another attribute controls whether the interface '''handles simulation end'''. If it is set, the interface performs a simple protocol to notify the Python side when the C++ side simulation finishes. Other configurable attributes include the memory segment size and the names of the objects constructed in shared memory.
** Note: the attributes are not part of the ns-3 attribute system, because Ns3AiMsgInterface is a Singleton rather than an Object.
** Attribute settings in Ns3AiMsgInterface: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/msg-interface/ns3-ai-msg-interface.h#L306 code] and more settings below
** How the protocol works when the interface is destroyed: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/msg-interface/ns3-ai-msg-interface.h#L126 code]
* Provided '''Python binding boilerplate code''' in the examples. The Python side accesses the shared memory and the objects in it (vectors or structs) via C++ functions exposed to Python. The '''exposure of C++ class functions and members''' is achieved with '''pybind11''', a lightweight Python binding library. The C++ binding code, linked with pybind11, is compiled into a dynamically linked library that Python can import as a module. Because the C++ side interface is template-based and Python does not support templates natively, the Python binding module must be generated separately for every program (this is done through a CMake target dependency, so it is seamless). Although the binding contains many lines of C++ code and is difficult to write from scratch, users can '''adapt existing binding code''' to generate Python binding modules quickly, and I provide boilerplate for this purpose (the *_py.cc files in all examples).
** Some of the example boilerplate code: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/examples/a-plus-b/use-msg-stru/apb_py.cc#L29 binding code for struct-based message interface in A-Plus-B example], [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/examples/multi-bss/multi_bss_py.cc#L33 binding code for vector-based message interface in Multi-BSS example]
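
The idea of a semaphore that spins instead of sleeping, guarding two one-directional regions, can be sketched in plain Python with threads. This is only a conceptual model and not the Ns3AiSemaphore implementation: the class SpinSemaphore, the function exchange and the dictionaries standing in for the two shared memory regions are all hypothetical, and the real interface works across processes rather than threads.

```python
import threading


class SpinSemaphore:
    """Counting semaphore whose wait() busy-waits instead of sleeping."""

    def __init__(self, count: int = 0) -> None:
        self._count = count
        self._lock = threading.Lock()  # protects the counter only

    def post(self) -> None:
        with self._lock:
            self._count += 1

    def wait(self) -> None:
        while True:
            with self._lock:
                if self._count > 0:
                    self._count -= 1
                    return
            # Counter was zero: retry immediately (spin), never sleep.


def exchange(a: int, b: int) -> int:
    """One round trip: the 'C++' side writes (a, b), 'Python' answers a + b."""
    cpp_to_py = {}  # stand-in for the C++-to-Python region
    py_to_cpp = {}  # stand-in for the Python-to-C++ region
    state_ready = SpinSemaphore()
    action_ready = SpinSemaphore()

    def python_side() -> None:
        state_ready.wait()  # spin until the state has been written
        py_to_cpp["sum"] = cpp_to_py["a"] + cpp_to_py["b"]
        action_ready.post()

    worker = threading.Thread(target=python_side)
    worker.start()
    cpp_to_py["a"], cpp_to_py["b"] = a, b
    state_ready.post()   # tell the other side the state is ready
    action_ready.wait()  # spin until the action has been written
    worker.join()
    return py_to_cpp["sum"]
```

Spinning trades CPU time for latency: the waiting side notices new data without being rescheduled by the operating system, which suits the tight request-response loop between the simulator and the agent.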

=== Gymnasium API ===

The [https://gymnasium.farama.org/index.html Gymnasium API] for ns3-ai is designed to be based on shared memory rather than socket communication, which provides faster data exchange than [https://github.com/tkn-tub/ns3-gym ns3-gym]. While much of the [https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface Gym interface] code comes from ns3-gym's repository, I made substantial changes to give it a shared memory backend. My work includes:
* Modified OpenGymInterface to '''use Ns3AiMsgInterface for IPC'''. OpenGymInterface was created by the ns3-gym developers and provides code for creating Gym-compatible environments in ns-3. It contains functions to get the state and action spaces, observe the environment in ns-3 and execute actions (possibly changing parameters in the simulation). Those functions use callbacks registered by OpenGymEnv at runtime. For the callbacks to work, a custom environment must inherit from OpenGymEnv and implement class methods such as GetActionSpace, GetObservationSpace, GetObservation and ExecuteActions. All states and actions are serialized with Google's Protocol Buffers, transmitted, and then deserialized by the peer. My change replaces the ZeroMQ socket's send and receive functions with those of Ns3AiMsgInterface and ensures that Ns3AiMsgInterface is properly initialized. The underlying message interface for transmitting serialized messages is struct-based; the struct contains a buffer (a uint8_t array) and its capacity.
** Example of my changed part: [https://github.com/tkn-tub/ns3-gym/blob/6007f4b3811af0cffcacf9a6151e5b9d2f4ef3ae/model/opengym_interface.cc#L190 before (in ns3-gym's repo)], [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/cpp/ns3-ai-gym-interface.cc#L104 after (in my ns3-ai repo)]
** Initialization of Ns3AiMsgInterface: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/cpp/ns3-ai-gym-interface.cc#L56 code]
** Note: in the above configuration, handling finish is set to false because the protocol of notifying Python side that C++ side has finished is unnecessary for Gym. Gym interface has its own protocol for handling finish, which is [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/cpp/ns3-ai-gym-env.cc#L78 NotifySimulationEnd on C++ side] and then [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L270 'done' becoming true] when [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L294 Python steps].

GSOC2023ns3-aiFinalReport

2023-09-08T13:20:11Z

Muyuan: finish phase 2 part 1



= Project Details =

All links to my repository below point to my [https://github.com/ShenMuyuan/ns3-ai/commit/51e0c34a90de88db02b1016db626bb0cb605c166 last commit to the improvements branch].


== Phase 1 ==

=== Gymnasium API ===

The [https://gymnasium.farama.org/index.html Gymnasium API] for ns3-ai is built on shared memory rather than socket communication, which can provide faster data exchange than [https://github.com/tkn-tub/ns3-gym ns3-gym] does. While much of the [https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface Gym interface] code comes from ns3-gym's repository, I made substantial changes to give it a shared memory backend. My work includes:
* Modified OpenGymInterface to '''use Ns3AiMsgInterface for IPC'''. OpenGymInterface was created by the ns3-gym developers and provides code to create Gym-compatible environments in ns-3. It contains functions to get the state and action spaces, observe the environment in ns-3 and execute actions (for example, changing parameters in the simulation). Those functions use callbacks registered by OpenGymEnv at runtime. For the callbacks to work, a custom environment must inherit from OpenGymEnv and implement class methods such as GetActionSpace, GetObservationSpace, GetObservation and ExecuteActions. All states and actions are serialized with Google's Protocol Buffers, transmitted, and then de-serialized by the peer. I replaced the ZeroMQ socket's send and receive functions with Ns3AiMsgInterface's send and receive functions, and ensured that Ns3AiMsgInterface is properly initialized. The underlying message interface for transmitting serialized messages is struct-based. The struct contains a buffer (uint8_t array) and its capacity.
** Example of my changed part: [https://github.com/tkn-tub/ns3-gym/blob/6007f4b3811af0cffcacf9a6151e5b9d2f4ef3ae/model/opengym_interface.cc#L190 before (in ns3-gym's repo)], [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/cpp/ns3-ai-gym-interface.cc#L104 after (in my ns3-ai repo)]
** Initialization of Ns3AiMsgInterface: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/cpp/ns3-ai-gym-interface.cc#L56 code]
** Note: in the above configuration, handling finish is set to false because the protocol of notifying Python side that C++ side has finished is unnecessary for Gym. Gym interface has its own protocol for handling finish, which is [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/cpp/ns3-ai-gym-env.cc#L78 NotifySimulationEnd on C++ side] and then [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L270 'done' becoming true] when [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L294 Python steps].
* Created a '''Python binding''' for accessing the shared structure containing the '''serialized message string'''. Binding a structure that contains an array is similar to binding an ordinary structure, except that the array is treated specially: its contents are exposed as a Python '''memoryview'''. With a memoryview, the Python side can read and write the array in place, much as C++ code can with std::array.
** Obtaining the memoryview in binding: [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/py/msg_py_binding.cc#L33 code]
** Note: arrays of different lengths need separate memoryview objects on the Python side. In the above code, get_buffer returns the part of the buffer that is actually used (for reading), while get_buffer_full returns the buffer at its full length (for writing). Example usage in Ns3Env (the Python side Gym environment created with gym.make): [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L115 array read] and [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/py/ns3ai_gym_env/envs/ns3_environment.py#L130 array write]
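
The two views described in the note above can be illustrated with a minimal pure-Python sketch. It is not the binding code itself: the <code>MsgBuffer</code> class, the <code>CAPACITY</code> value and the <code>write_message</code>/<code>read_message</code> helpers are hypothetical, and a <code>bytearray</code> stands in for the uint8_t array that lives in shared memory. Only the names get_buffer and get_buffer_full mirror the ones mentioned above.

```python
CAPACITY = 16  # hypothetical capacity, chosen only for this sketch


class MsgBuffer:
    """Toy stand-in for the shared struct: a fixed-size byte array plus the bytes in use."""

    def __init__(self, capacity=CAPACITY):
        self._data = bytearray(capacity)  # stands in for the uint8_t array
        self.size = 0                     # number of bytes currently in use

    def get_buffer_full(self):
        """Writable view over the whole array (used when writing a message)."""
        return memoryview(self._data)

    def get_buffer(self):
        """View over only the bytes in use (used when reading a message)."""
        return memoryview(self._data)[:self.size]


def write_message(msg, payload):
    """Copy a serialized payload into the buffer and record its length."""
    full = msg.get_buffer_full()
    if len(payload) > len(full):
        raise ValueError("payload larger than buffer capacity")
    full[:len(payload)] = payload
    msg.size = len(payload)


def read_message(msg):
    """Return a copy of the bytes currently in use."""
    return bytes(msg.get_buffer())


msg = MsgBuffer()
write_message(msg, b"hello")
print(read_message(msg))
```

The reader sees only the used prefix, while the writer always has the full capacity available, which is why two differently sized views are needed.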

== Phase 2 ==

=== Examples and documentation update ===

To demonstrate the usage of the message interface and Gym interface, all existing examples are updated to use the new interfaces. Also, a new example "Multi-BSS" is created to benchmark the performance of the vector interface. All of them can be built using the "./ns3 build" command with the updated CMake files, without needing to copy the examples to the scratch folder. Updated examples and the interfaces supported by them are listed below:
* A-Plus-B example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/examples/a-plus-b directory]): In this example, C++ side starts by setting 2 random numbers between 0 and 10 in shared memory. Then, Python side gets the numbers and sets the sum of the numbers in shared memory (in another region). Finally, C++ gets the sum that Python set. The procedure is analogous to C++ passing RL states to Python and Python passing RL actions back to C++, and is repeated many times. Supported interfaces:
** Struct-based message interface
** Vector-based message interface
** Gym interface
* LTE-CQI example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/examples/lte-cqi directory]): CQI prediction example. The original work is done based on 5G NR branch in ns-3, and previous developers have made some changes to make it also run in LTE codebase in ns-3 mainline. Supported interfaces:
** Struct-based message interface
* Multi-BSS example ('''new example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/examples/multi-bss directory]): The example is based on and modified from [https://gitlab.com/juanvleonr/ns-3-dev/-/tree/clean-tgax?ref_type=heads juanvleonr's clean-tgax branch]. The C++ side simulates a VR gaming scenario in which 4 BSSs operate in separate apartments in a 2 by 2 grid. Each BSS contains 1 AP and 4 STAs. One of the STAs in the first BSS is a VR device generating bursty UL traffic, while the other devices have normal UL traffic. Supported interfaces:
** Struct-based message interface (available at [https://github.com/ShenMuyuan/ns3-ai/tree/dd8dd3a489f8faf8a380841b73c250d23c1a3710/examples/multi-bss the benchmarking branch])
** Vector-based message interface
* Rate-Control example ('''updated example''') (including constant rate & Thompson Sampling) ([https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/examples/rate-control directory]): There are existing models of constant rate and Thompson sampling algorithms in Wi-Fi module. Here they are implemented in Python to show how to develop a new rate control algorithm for the Wi-Fi module using ns3-ai. Supported interfaces:
** Struct-based message interface
* RL-TCP example ('''updated example''') ([https://github.com/ShenMuyuan/ns3-ai/tree/51e0c34a90de88db02b1016db626bb0cb605c166/examples/rl-tcp directory]): This example applies Q-learning algorithms (Q-learning and deep Q-learning) to TCP congestion control so that it can react to real-time changes in the network transmission environment. By using reinforcement learning to manage the sliding window and threshold size, the network can achieve better throughput and smaller delay. Supported interfaces:
** Struct-based message interface
** Gym interface
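
The A-Plus-B exchange described in the list above can be sketched as a toy model. This is only an illustration under stated assumptions: two threads and <code>threading.Semaphore</code> objects stand in for the two processes and the shared-memory synchronization, and <code>Channel</code>, <code>simulation_side</code>, <code>agent_side</code> and <code>run</code> are hypothetical names, not ns3-ai APIs. The <code>finished</code> flag is loosely inspired by the idea of notifying the Python side when the simulation ends and does not reproduce the actual protocol.

```python
import random
import threading


class Channel:
    """Toy stand-in for a shared-memory segment with two regions."""

    def __init__(self):
        self.cpp2py = None    # region written by the simulation ("C++") side
        self.py2cpp = None    # region written by the agent ("Python") side
        self.finished = False
        self.cpp_ready = threading.Semaphore(0)  # signals that cpp2py is valid
        self.py_ready = threading.Semaphore(0)   # signals that py2cpp is valid


def simulation_side(ch, rounds, results, rng):
    """Plays the role of the C++ side: publish two numbers, wait for their sum."""
    for _ in range(rounds):
        a, b = rng.randint(0, 10), rng.randint(0, 10)
        ch.cpp2py = (a, b)
        ch.cpp_ready.release()
        ch.py_ready.acquire()
        results.append((a, b, ch.py2cpp))
    ch.finished = True        # tell the agent that no more data will arrive
    ch.cpp_ready.release()


def agent_side(ch):
    """Plays the role of the Python side: read the numbers, write back the sum."""
    while True:
        ch.cpp_ready.acquire()
        if ch.finished:
            break
        a, b = ch.cpp2py
        ch.py2cpp = a + b
        ch.py_ready.release()


def run(rounds=5, seed=0):
    ch = Channel()
    results = []
    rng = random.Random(seed)
    sim = threading.Thread(target=simulation_side, args=(ch, rounds, results, rng))
    agent = threading.Thread(target=agent_side, args=(ch,))
    sim.start()
    agent.start()
    sim.join()
    agent.join()
    return results


for a, b, total in run(rounds=3):
    print(f"{a} + {b} = {total}")
```

Each round is one state/action style round trip, which is why the example is a convenient smoke test for any of the interfaces.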

Documents are updated along with the examples. Apart from the README.md files in all example directories, I added [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/docs/install.md installation instructions], a [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/msg-interface/README.md message interface tutorial] and a [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/model/gym-interface/README.md Gym interface tutorial] as separate documents linked from the [https://github.com/ShenMuyuan/ns3-ai/blob/51e0c34a90de88db02b1016db626bb0cb605c166/README.md updated root README.md].

=== Pure C++ example ===

=== Benchmarking ===

GSOC2023ns3-aiFinalReport

2023-09-08T12:14:17Z

Muyuan: finish phase 1

GSOC2023ns3-aiFinalReport

2023-09-08T11:01:09Z

Muyuan: gym api

GSOC2023ns3-aiFinalReport

2023-09-08T09:52:27Z

Muyuan: add links

GSOC2023ns3-aiFinalReport

2023-09-08T09:23:49Z

Muyuan: vector part of phase 1

GSOC2023ns3-aiFinalReport

2023-09-07T15:29:27Z

Muyuan:

GSOC2023ns3-aiFinalReport

2023-09-07T14:48:33Z

Muyuan: add MR part

GSOC2023ns3-aiFinalReport

2023-09-07T14:30:58Z

Muyuan: add project goals

GSOC2023ns3-ai

2023-09-07T13:49:00Z

Muyuan:

{{TOC}}

Back to [[Summer_Projects#Google_Summer_of_Code_2023 | GSoC 2023 projects]]

= Project Overview =

* '''Project Name:''' ns3-ai enhancements
* '''Student:''' Muyuan Shen
* '''Mentors:''' Collin Brady and Hao Yin
* '''Google page:''' https://summerofcode.withgoogle.com/programs/2023/projects/A4KZ7dxo
* '''Repository:''' https://github.com/ShenMuyuan/ns3-ai/tree/improvements
* '''Final report:''' [[GSOC2023ns3-aiFinalReport]]

== About the Project ==

The objective of this proposed project is to enhance the ns3-ai module, which facilitates the connection between ns-3 and Python-based ML frameworks using shared memory. The main focus of this enhancement is to optimize performance and improve usability.

To accomplish this goal, the project will introduce additional APIs that support data structures such as vector in shared memory IPC. This will effectively reduce the required interaction between C++ and Python, resulting in improved performance. Also, the project will integrate a Gymnasium API like ns3-gym's, but with a shared-memory-based backend, to turn ns-3 into an environment that agents can efficiently and seamlessly interact with.

In addition, the project will enhance the existing examples, documentation and tutorials, while also integrating new examples that cover scenarios like Multi-BSS in VR. By doing so, users will have more comprehensive resources at their disposal.

Furthermore, the project aims to provide examples utilizing pure C++-based ML frameworks. This will offer researchers more options for integrating with ML. The overall aim of the project is to expand and accelerate the capabilities of the ns3-ai module, enabling users to simulate and analyze network related algorithms with enhanced efficiency and flexibility.

== About Me ==

As a junior at Huazhong University of Science and Technology, I am majoring in electronic engineering. I am proud to be a member of the Undergraduate Program for Advanced Project-based Information Science Education, also known as the Seed Class, and currently serve as the class monitor. Additionally, I am a project leader in the Dian group, where I engage in extracurricular technical projects.
In terms of relevant coursework, I have excelled in network programming through courses such as C programming language and computer network, both of which I achieved a perfect grade point of 4.0. These courses have equipped me with a strong foundation in network programming, which I believe will enable me to contribute effectively to relevant projects.
I am a motivated and skilled undergraduate student with a passion for network programming and a track record of academic excellence.

During my academic journey, I have had the opportunity to explore computer networking through labs and projects. In particular, in the labs for the computer networking course, I gained valuable insights into how different parameters, such as the number of STAs, CW range, and packet arrival rate, can impact network throughput in the WiFi DCF protocol.
In addition, I have worked on a project that leverages ns-3 as a simulation platform with Prof. Yayu Gao. Through this project, I have gained practical experience in simulating WiFi MAC rate control algorithms, which has further solidified my understanding of the ns-3's usage and its object-oriented programming approach.
Overall, my hands-on experience in both labs and projects has allowed me to apply theoretical concepts to practical scenarios and enhanced my network simulation and analysis skills.

= Milestones =

Based on my proposal, I divide my project into two phases, listed below.

== Phase one (before midterm evaluation) ==

=== Enhancements for the interface ===

==== std::vector support ====

Introduce APIs for storing data structures like std::vector in shared memory, to reduce the interaction between C++ and Python.

==== gym-like interface ====

Introduce a gym-like interface to enable users to train RL models directly in ns-3 with OpenAI Gym.

=== Enhancements for existing examples ===

Make all previous examples up to date with the CMake build system introduced in ns-3.36, and provide a new example to benchmark the running time of vectors.

== Phase two (after midterm evaluation) ==

=== Integration of ns-3 and C++-based ML frameworks ===

Apply [https://www.tensorflow.org/api_docs/cc Tensorflow C++ APIs] and [https://pytorch.org/cppdocs/ PyTorch C++ APIs] to examples
using Python-based ML frameworks. Also, provide CMake configurations that work on both Linux and macOS, and documentation on building
and running.

=== Finishing new examples and benchmarking test ===

Finish some new examples using Gym interface and vector-based message interface. Compare Gym interface's performance with ns3-gym,
and compare vector-based message interface's performance with struct-based message interface.

= Weekly Report =

== Week 1 (May 29 - June 4) ==

=== Achievements ===

# Got familiar with the usage of Boost library, and the syntax of Cython pyx files. I am using Boost to support dynamic allocation and synchronization in shared memory and Cython to wrap C++ code for Python.
# Created the interface to support std::vector in shared memory. Also wrote a new a-plus-b example to demonstrate the usage. It is still in development and currently supports macOS.
# (Update on June 3) Now I am using pybind11 instead of Cython for Python binding, because pybind11 has similar performance but cleaner code. It is also easier to install the Python module with CMake.

=== Problems ===

# The code is quite naive and possibly includes some extra interactions that lower performance.
# I have not tested the new interface on Linux.
# (Update on June 3) {{strike|The new interface has hardcoded parts in the setup.py. Users need to explicitly specify their Boost library include and library paths.}}
# (Update on June 3) {{strike|Although I have only one example currently, if there is more, users need to repeatedly call the setup.py to install modules which lacks efficiency.}}

=== Todo next week ===

# Use the new interface in an existing example such as rl-tcp, compare running time with old interface, to know its performance better.
# Switch to a new branch called "improvements" instead of "cmake", which better shows the project goal.
# (Update on June 3) {{strike|Modify CMakeLists.txt to pass the result of find_package(Boost...) to setup.py, and remove the hardcoded part.}}
# (Update on June 3) {{strike|Make "pip install . --user" a target in Cmake, so that users can install Python modules more easily, like "./ns3 build ns3ai_interfaces".}}
# If I have time, I will test my code on Linux.

== Week 2 (June 5 - June 11) ==

=== Achievements ===

# Updated the [https://github.com/ShenMuyuan/ns3-ai/tree/improvements/examples/rate-control Thompson Sampling example] to use the new interface. Previously, it used a simple packed structure for information sharing. Now it uses the first element of a shared std::vector (which is basically the same structure as before).
# Measured running time of the Thompson Sampling example, old interface vs new interface. Results: old about 5 seconds, new about 12 minutes.

=== Problems ===

# The benchmarking result above shows that, when passing a small amount of data in each interaction, the new interface is roughly 150 times slower than the old interface.
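
The slowdown figure follows from the two running times reported above (about 5 seconds versus about 12 minutes). A short check of the arithmetic, where the <code>slowdown</code> helper is a hypothetical name used only here:

```python
def slowdown(old_seconds, new_seconds):
    """Ratio of the new running time to the old running time."""
    return new_seconds / old_seconds


# Reported running times: old interface about 5 s, new interface about 12 min.
ratio = slowdown(old_seconds=5, new_seconds=12 * 60)
print(f"new interface is about {ratio:.0f}x slower")
```

Since both running times are approximate, a ratio of 144 is consistent with the rounded figure quoted in the list.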

=== Todo next week ===

# Measure the running time of another example (the new multi-bss example) that passes a large amount of data in each interaction, to check whether the new interface improves performance in that case. If the new interface outperforms the old one, the two can coexist for different cases. Otherwise, I will consider modifying the implementation.
# Or, try to optimize the code to make small data interaction faster.

== Week 3 (June 12 - June 18) ==

=== Achievements ===

# Accelerated data interaction by using a spinlock-based semaphore as the synchronization method. The running time of the [https://github.com/ShenMuyuan/ns3-ai/tree/improvements/examples/rate-control Thompson Sampling example] dropped to 6 seconds on my machine, which means that the performance of small data interaction is close to that of the previous interface.
#* I tried eliminating data copying operations and using references instead, but this hardly improved the running time.
#* I guessed that semaphores would spin instead of sleep, which can save time (although it wastes CPU). So in the synchronization code I replaced the Boost.Interprocess condition variable with a Boost.Interprocess semaphore, but there was no improvement. Investigation using CLion's built-in profiler showed that sleeping took a large portion of the running time. I then read the Boost source code and found that a waiting semaphore does not purely spin: it puts the process to sleep once the spinning time reaches a small threshold. I commented out the spin [https://github.com/boostorg/interprocess/blob/a0c5a8ff176434c9024d4540ce092a2eebb8c5c3/include/boost/interprocess/sync/spin/wait.hpp#LL128C13-L128C13 counting code] to force it to always spin, and the running time dropped considerably.
#* To avoid modifying library code, I created my own version of the semaphore. My implementation is similar to Boost's, but while waiting it only spins and never goes to sleep. This significantly accelerates interaction between Python and C++, reducing the running time to 6 seconds.
# Updated the A Plus B and Constant Rate examples. Currently available examples that use the new interface: A Plus B, Thompson Sampling, Constant Rate.
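
The spin-only waiting idea described above can be sketched in Python. This is a minimal illustration of the logic, not the C++ semaphore used in ns3-ai: <code>SpinSemaphore</code> and <code>ping_pong</code> are hypothetical names, a <code>threading.Lock</code> stands in for the atomic operations a real implementation would use, and in CPython the global interpreter lock makes busy-waiting slow, so the sketch shows the control flow only.

```python
import threading


class SpinSemaphore:
    """Counting semaphore whose wait() busy-waits and never sleeps."""

    def __init__(self, initial=0):
        self._count = initial
        self._lock = threading.Lock()  # protects _count; stands in for an atomic

    def post(self):
        """Increment the count, allowing one waiter to proceed."""
        with self._lock:
            self._count += 1

    def try_wait(self):
        """Decrement the count if it is positive; report whether that succeeded."""
        with self._lock:
            if self._count > 0:
                self._count -= 1
                return True
            return False

    def wait(self):
        """Spin until the count can be decremented."""
        while not self.try_wait():
            pass  # keep spinning instead of putting the thread to sleep


def ping_pong(rounds):
    """Alternate between two threads, as the two sides of the interface do."""
    ping, pong = SpinSemaphore(0), SpinSemaphore(0)
    log = []

    def responder():
        for i in range(rounds):
            ping.wait()
            log.append(("pong", i))
            pong.post()

    worker = threading.Thread(target=responder)
    worker.start()
    for i in range(rounds):
        log.append(("ping", i))
        ping.post()
        pong.wait()
    worker.join()
    return log


print(ping_pong(2))
```

Spinning trades CPU usage for latency: the waiter reacts as soon as the count changes, at the cost of keeping a core busy while it waits.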

=== Problems ===

# The examples have not been tested on Linux yet; this will take place next week.

=== Todo next week ===

# Start working on ns3-gym-like interface, which is one of the milestones.
# Work with Hao to release the previous version of ns3ai.
# Test the three currently available examples on Linux system.

== Week 4 (June 19 - June 25) ==

=== Achievements ===

# Following my mentors' suggestions, I added an interface for sharing a single structure, to reduce complexity when a vector is unnecessary. Previously, sharing a single structure (as in the Thompson Sampling or Constant Rate examples) required a vector of which only the first element was used.
# Read [https://bits.informatik.hu-berlin.de/~zubow/gawlowicz19_mswim.pdf the paper of ns3-gym] and tried running [https://github.com/tkn-tub/ns3-gym the code] to be more familiar with the OpenAI Gym interface. Now I am developing the Gym interface.
# Linux usage is tested.

=== Problems ===

# The ns3-gym README says it has some issues with the new OpenAI Gym framework, so that the <code>gym.make()</code> API is unavailable. Is there any way to solve that? Or perhaps it's only an issue with ns3-gym and not a problem for ns3-ai?

=== Todo next week ===

# Continue developing Gym interface.

== Week 5 (June 26 - July 2) ==

=== Achievements ===

# Completed the a-plus-b example of Gym interface.

=== Problems ===

=== Todo next week ===

# Continue developing other examples using Gym interface.

== Week 6 - 7 (July 3 - July 16) ==

About interface naming: for clarity, I call the interface that uses Boost shared memory directly (in which users need to define the shared structures or vectors) "msg interface", and the interface that is based on msg interface and provides Gym APIs "Gym interface". The former is low level, requires more coding and has stronger capabilities (such as std::vector sharing), while the latter is high level, easier to code but has limited functionality (RL with Gym).

=== Achievements ===

# Following my mentors' suggestions, I modified the Gym interface so that it provides a base class that users can derive from to make their own environment. Basically it is a fork of ns3-gym's interface, but at the low level it uses Boost instead of ZeroMQ for interprocess communication.
# Completed the RL-TCP example using the Gym interface and the msg interface, and the A plus B example using the Gym interface.
# Refactored existing code, including separating the interfaces into different directories and modifying the CMakeLists files, which gives a clearer project structure and easier usage.
# Updated all READMEs, which now contain step-by-step instructions for building and running the examples.

=== Problems ===

# Proper destruction of the msg interface. In RL-TCP example, I had reference counting issue (the reference count didn't go to zero so an object was not destroyed), and fixed reference count by replacing some Ptr<> with raw pointer. There may be other better ways to solve that.
# Because the msg interface must have only one instance that provides synchronized access to the shared memory segment, I use a local variable in a source file so that functions in different classes can access that single interface. I noticed that ns-3 provides a Singleton class. Is that a better way to define the msg interface?

=== Todo next week ===

# Provide some initial benchmark of Gym interface with ns3-gym.
# Do midterm evaluation.

== Week 8 (July 17 - July 23) ==

=== Achievements ===

# '''Successfully finished midterm evaluation. Thank you to my mentors Collin and Hao for your guidance!'''
# Benchmarked the running time of the RL-TCP example. In the scenario of 2 nodes with bottleneck_bandwidth=2Mbps, bottleneck_delay=0.01ms, access_bandwidth=10Mbps, access_delay=20ms, I simulated for 1000s and the results show that ns3-ai is slightly faster than ns3-gym: ns3-ai takes 26 seconds and ns3-gym takes 27 seconds.

=== Problems ===

# My mentor suggests that the benchmark doesn't show the advantage of ns3-ai because it uses the total running time rather than C++-Python interaction time. Interaction time is more likely to have a big difference between ns3-ai and ns3-gym because interaction is the place where ns3-ai and ns3-gym differ most. Also, after knowing the C++-Python interaction time and the portion it takes in total time, it's easier to design examples that emphasize the interaction time and better demonstrate the performance of ns3-ai.

=== Todo next week ===

# Conduct benchmarking of interaction time on RL-TCP example.

== Week 9 (July 24 - July 30) ==

=== Achievements ===

# Benchmarked the RL-TCP example (ns3-gym and ns3-ai's Gym interface version) based on C++-Python interaction time. Interaction time is the transmission time of the byte buffer containing serialized Gym environments or actions. To get accurate interaction time, I count CPU cycles (the x86 rdtsc instruction) rather than using clock time. Each sample is the CPU cycle count at the end of a transmission minus the count at its start. The mean and standard deviation of the samples are calculated. The result shows that in both the C++ to Python and Python to C++ directions, the interaction time of ns3-ai is approximately 15 times shorter than that of ns3-gym.
#* ns-3 configuration: ./ns3 configure --enable-examples --build-profile=debug
#* Simulation parameters:
#** bottleneck_bandwidth=2Mbps
#** bottleneck_delay=0.01ms
#** access_bandwidth=10Mbps
#** access_delay=20ms
#** duration=100s
#** time step = 0.1s
#* Benchmark results:
#** [https://github.com/ShenMuyuan/urban-pancake/blob/50ad463ee06377342ff83c9954a13cc66792b4d1/ns3ai_benchmark/cpp2py.png C++ to Python transmission time]
#** [https://github.com/ShenMuyuan/urban-pancake/blob/50ad463ee06377342ff83c9954a13cc66792b4d1/ns3ai_benchmark/py2cpp.png Python to C++ transmission time]
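
The measurement procedure described above (one sample per transmission, then mean and standard deviation) can be sketched as follows. This is an illustration under stated assumptions rather than the benchmark code: Python cannot read rdtsc directly, so <code>time.perf_counter_ns</code> is swapped in as the timer, and <code>measure</code>, <code>summarize</code> and the dummy <code>transfer</code> workload are hypothetical names.

```python
import statistics
import time


def measure(transfer, repetitions):
    """Collect one sample (end minus start) for each call of transfer()."""
    samples = []
    for _ in range(repetitions):
        start = time.perf_counter_ns()  # stand-in for reading the cycle counter
        transfer()
        end = time.perf_counter_ns()
        samples.append(end - start)
    return samples


def summarize(samples):
    """Return the mean and the sample standard deviation of the samples."""
    return statistics.mean(samples), statistics.stdev(samples)


def transfer():
    """Dummy workload standing in for one buffer transmission."""
    bytes(bytearray(1024))


mean_ns, std_ns = summarize(measure(transfer, repetitions=100))
print(f"mean = {mean_ns:.0f} ns, std = {std_ns:.0f} ns")
```

Timing only the transmission, rather than the whole run, isolates the part where the two frameworks actually differ.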

=== Problems ===

=== Todo next week ===

# Begin developing the Multi-BSS example, which can demonstrate the usage of vectors in the message interface.

== Week 10 (July 31 - Aug 6) ==

=== Achievements ===

# Updated the lte-cqi example to use the msg interface.
# Working on multi-bss example, based on [https://gitlab.com/juanvleonr/ns-3-dev/-/blob/clean-tgax/scratch/tgax-residential.cc?ref_type=heads Juan's branch of ns-3-dev].

=== Problems ===

=== Todo next week ===

# Finish multi-bss example. My work will include making it compile with the latest ns-3, porting it to ns3-ai's new interface and changing some directory structure (move the tgax code under src to contrib).

== Week 11 (Aug 7 - Aug 13) ==

=== Achievements ===

# Got familiar with the RL algorithm in the multi-bss example, based on [https://www.nsnam.org/tutorials/consortium23/MultiBSS-UW.pdf the slides] and [https://gitlab.com/juanvleonr/ns-3-dev/-/blob/clean-tgax/scratch/tgax-residential.cc?ref_type=heads the code].
# Finished the Python binding and cmake configuration of multi-bss example, with some problems.
# Wrote a quick start guide on the ns3-ai interface, and thoroughly updated the READMEs in the repository.

=== Problems ===

# The Python script of multi-BSS is incompatible with the C++ code because their data definitions differ slightly.
# The algorithm should automatically change the CCA, but the CCA is not changed (always -82).

=== Todo next week ===

# Complete the Multi-BSS example's code and do some benchmark.

== Week 12 (Aug 14 - Aug 20) ==

=== Achievements ===

# Finished the code for Multi-BSS (it runs now and the CCA value changes to approximately -70)
# Updating documents for future release by Hao
# Enhanced some cmake configurations for better usability
#* Add protobuf-generate function for protobuf installations that don’t provide it
#* Change the cmake target from ‘ns3-ai’ to ‘ai’. Now the ns3-ai module can be built with ./ns3 build ai. (Custom modules cannot be built directly with ./ns3 build if they have a ‘ns3-’ prefix, due to some settings in the ./ns3 script.)

=== Problems ===

# In Multi-BSS example, at 1 min of simulation, VR tpt ≈ 5Mbps, can’t meet requirement (50 Mbps). Occasionally VR delay ≈ 0 and tpt ≈ 1e8, possibly due to statistics error.

=== Todo next week ===

# Adjust the parameters in RL algorithm to meet VR requirements on throughput
# Benchmarking (compare running time with previous interface)

== Week 13 (Aug 21 - Aug 27) ==

=== Achievements ===

# Finished a simple example using C++-based ML framework TensorFlow.
# Continued modifying Multi-BSS parameters to meet VR requirements.

=== Problems ===

# The C++-based example is incomplete, due to the lack of TensorFlow C API as described in their [https://github.com/tensorflow/docs/blob/master/site/en/r1/guide/extend/bindings.md#current-status current-status]. The example will be available when TensorFlow C API provides Gradients and Neural Networks library.
# The Multi-BSS example still doesn't meet the VR requirements after training, although I changed some parameters. Maybe I have to leave its optimization to future work (after GSoC). The benchmarking is not affected.

=== Todo next week ===

# Finish the last C++-based ML example (RL-TCP using PyTorch C++ API)
# Update vector-based message interface's benchmarking result.

== Week 14 (Aug 28 - Sep 3) ==

=== Achievements ===

# Finished RL-TCP example using pure C++ interface
# Benchmarked pure C++ (pure C++ vs msg interface in terms of processing time) and vector-based interface (vector vs struct in terms of data transmission time)
#* Pure C++ is twice as fast as the msg interface, because interprocess communication is removed.
#* Unfortunately, vector-based message interface is slower (1.2x in CPU cycle count) than struct-based interface. Possibly due to slow access of vector on Python side.
# Updated many documents for PR.

=== Problems ===

# As mentioned above, the vector interface is slow. In the future, we may need to integrate C++-based linear algebra libraries like Eigen, for which open source projects such as eigenpy provide Python bindings that are faster than pybind11's std::vector binding.

GSOC2023ns3-ai

2023-08-29T02:20:57Z

Muyuan:

{{TOC}}

Back to [[Summer_Projects#Google_Summer_of_Code_2023 | GSoC 2023 projects]]

= Project Overview =

* '''Project Name:''' ns3-ai enhancements
* '''Student:''' Muyuan Shen
* '''Mentors:''' Collin Brady and Hao Yin
* '''Google page:''' https://summerofcode.withgoogle.com/programs/2023/projects/A4KZ7dxo

== Project Goals ==

The proposed project aims to enhance the ns3-ai module, which provides interfaces between ns-3 and Python-based ML frameworks using shared memory, with a focus on performance optimization and expanding the range of supported data structures.
To achieve this, the project will introduce APIs for additional data structures like vector and string in shared memory IPC to reduce the interaction between C++ and Python. Additionally, the project will provide examples demonstrating how to implement ML algorithms within ns-3 using C++ and open-source frameworks such as TensorFlow and PyTorch. The project will also improve the current examples and documentation and integrate new examples, such as LTE handover.
Overall, the project aims to expand and accelerate the capabilities of the ns3-ai module, enabling users to simulate and analyze large-scale networks with greater efficiency and flexibility.

== Repository ==

https://github.com/ShenMuyuan/ns3-ai/tree/improvements

== About Me ==

=== Education ===

As a junior at Huazhong University of Science and Technology, I am majoring in electronic engineering. I am proud to be a member of the Undergraduate Program for Advanced Project-based Information Science Education, also known as the Seed Class, and currently serve as the class monitor. Additionally, I am a project leader in the Dian group, where I engage in extracurricular technical projects.
In terms of relevant coursework, I have excelled in network programming through courses such as C programming language and computer network, both of which I achieved a perfect grade point of 4.0. These courses have equipped me with a strong foundation in network programming, which I believe will enable me to contribute effectively to relevant projects.
I am a motivated and skilled undergraduate student with a passion for network programming and a track record of academic excellence.

=== Experience with ns-3 ===

During my academic journey, I have had the opportunity to explore computer networking through labs and projects. In particular, in the labs for the computer networking course, I gained valuable insights into how different parameters, such as the number of STAs, CW range, and packet arrival rate, can impact network throughput in the WiFi DCF protocol.
In addition, I have worked on a project that leverages ns-3 as a simulation platform with Prof. Yayu Gao. Through this project, I have gained practical experience in simulating WiFi MAC rate control algorithms, which has further solidified my understanding of the ns-3's usage and its object-oriented programming approach.
Overall, my hands-on experience in both labs and projects has allowed me to apply theoretical concepts to practical scenarios and enhanced my network simulation and analysis skills.

= Milestones =

Based on my [https://diangroup.feishu.cn/docx/PSK9d18KLoLvXCxmGIkcbhXTn5b proposal], I divide my project into two phases, listed below.

== Phase one (before midterm evaluation) ==

=== Enhancements for the interface ===

==== std::vector support ====

Introduce APIs for storing data structures like std::vector in shared memory, to reduce the interaction between C++ and Python.

==== gym-like interface ====

Introduce a gym-like interface to enable users to train RL models directly in ns-3 with OpenAI Gym.

=== Enhancements for existing examples ===

Bring all previous examples up to date with the CMake build system introduced in ns-3.36, and provide a new example to benchmark the running time of vectors.

== Phase two (after midterm evaluation) ==

=== Integration of ns-3 and C++-based ML frameworks ===

Apply [https://www.tensorflow.org/api_docs/cc TensorFlow C++ APIs] and [https://pytorch.org/cppdocs/ PyTorch C++ APIs] to examples that currently use Python-based ML frameworks. Also, provide CMake configurations that work on both Linux and macOS, and documentation on building and running.

=== Finishing new examples and benchmarking test ===

Finish some new examples using Gym interface and vector-based message interface. Compare Gym interface's performance with ns3-gym,
and compare vector-based message interface's performance with struct-based message interface.

= Weekly Report =

== Week 1 (May 29 - June 4) ==

=== Achievements ===

# Got familiar with the usage of Boost library, and the syntax of Cython pyx files. I am using Boost to support dynamic allocation and synchronization in shared memory and Cython to wrap C++ code for Python.
# Created the interface to support std::vector in shared memory. Also wrote a new a-plus-b example to demonstrate the usage. It is still in development and currently supports macOS.
# (Update on June 3) Now I am using pybind11 instead of Cython for Python binding, because pybind11 has similar performance but cleaner code. And also it is easier to use cmake to install the python module.

=== Problems ===

# The code is quite naive and possibly includes some extra interactions that lower performance.
# I have not tested the new interface on Linux.
# (Update on June 3) {{strike|The new interface has hardcoded parts in the setup.py. Users need to explicitly specify their Boost library include and library paths.}}
# (Update on June 3) {{strike|Although I have only one example currently, if there are more, users need to call setup.py repeatedly to install the modules, which is inefficient.}}

=== Todo next week ===

# Use the new interface in an existing example such as rl-tcp, compare running time with old interface, to know its performance better.
# Switch to a new branch called "improvements" instead of "cmake", which better shows the project goal.
# (Update on June 3) {{strike|Modify CMakeLists.txt to pass the result of find_package(Boost...) to setup.py, and remove the hardcoded part.}}
# (Update on June 3) {{strike|Make "pip install . --user" a target in Cmake, so that users can install Python modules more easily, like "./ns3 build ns3ai_interfaces".}}
# If I have time, I will test my code on Linux.

== Week 2 (June 5 - June 11) ==

=== Achievements ===

# Updated the [https://github.com/ShenMuyuan/ns3-ai/tree/improvements/examples/rate-control Thompson Sampling example] to use the new interface. Previously, it used a simple packed structure for information sharing. Now it uses the first element of a shared std::vector (which is basically the same structure as before).
# Measured running time of the Thompson Sampling example, old interface vs new interface. Results: old about 5 seconds, new about 12 minutes.

=== Problems ===

# The benchmarking result above shows that, when passing a small amount of data in each interaction, the new interface is roughly 150 times slower than the old interface (about 12 minutes versus about 5 seconds).
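As a quick check of the ratio quoted above, the following sketch recomputes it from the two approximate timings reported this week. Both figures are rounded, so the result is only indicative; the variable names are made up for illustration.

```python
# Approximate timings reported for the Thompson Sampling example.
old_interface_s = 5          # old interface: about 5 seconds
new_interface_s = 12 * 60    # new interface: about 12 minutes

# Ratio of the two running times.
slowdown = new_interface_s / old_interface_s
print(f"new interface is about {slowdown:.0f}x slower")
```

With these rounded inputs the ratio comes out at 144, which is consistent with the "roughly 150 times" estimate.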

=== Todo next week ===

# Measure the running time of another example (the new multi-bss example) which passes a large amount of data in each interaction, to check whether the new interface improves performance in that case. If the new interface outperforms the old one, then the two can coexist for different cases. Otherwise, I will consider modifying the implementation.
# Or, try to optimize the code to make small data interaction faster.

== Week 3 (June 12 - June 18) ==

=== Achievements ===

# Accelerated data interaction using spinlock-based semaphore as synchronization method. The running time of [https://github.com/ShenMuyuan/ns3-ai/tree/improvements/examples/rate-control Thompson Sampling example] shortened to 6 seconds on my machine, which means that the performance of small data interaction is close to the previous interface.
#* I tried eliminating data copying operations and using references instead, but this hardly improved the running time.
#* I guessed that semaphores would spin instead of sleeping, which can save time (although it wastes CPU). So in the synchronization code I replaced the Boost.Interprocess condition variable with the Boost.Interprocess semaphore, but there was no improvement. Investigation using CLion's built-in profiler showed that sleeping took a large portion of the running time. Then I read the Boost source code and found that a waiting semaphore does not purely spin: it puts the process to sleep after the spinning time reaches a small threshold. I commented out the spin [https://github.com/boostorg/interprocess/blob/a0c5a8ff176434c9024d4540ce092a2eebb8c5c3/include/boost/interprocess/sync/spin/wait.hpp#LL128C13-L128C13 counting code] to force it to always spin, and the running time dropped significantly.
#* To avoid modifying library code, I created my own version of the semaphore. My implementation is similar to Boost's, but while waiting it only spins and never goes to sleep. This significantly accelerates interaction between Python and C++, reducing the running time to 6s.
# Updated the A Plus B and Constant Rate examples. Currently available examples that use the new interface: A Plus B, Thompson Sampling, Constant Rate.
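The idea of a semaphore that only spins while waiting can be illustrated with a small sketch. This is not the ns3-ai implementation, which is written in C++ and lives in shared memory between two processes. It is a Python illustration using two threads, and the names <code>SpinSemaphore</code> and <code>ping_pong</code> are hypothetical.

```python
import threading


class SpinSemaphore:
    """Counting semaphore whose wait() busy-waits and never sleeps."""

    def __init__(self, initial=0):
        self._count = initial
        self._lock = threading.Lock()  # guards _count only, held very briefly

    def post(self):
        """Release one permit."""
        with self._lock:
            self._count += 1

    def wait(self):
        """Poll until a permit is available, then take it."""
        while True:
            with self._lock:
                if self._count > 0:
                    self._count -= 1
                    return
            # No sleep or yield here: keep polling until a post() arrives.


def ping_pong(rounds):
    """Hand control back and forth between two threads, logging each turn."""
    to_worker, to_main = SpinSemaphore(), SpinSemaphore()
    log = []

    def worker():
        for i in range(rounds):
            to_worker.wait()           # wait for the main thread's turn to end
            log.append(("worker", i))
            to_main.post()             # hand control back

    thread = threading.Thread(target=worker)
    thread.start()
    for i in range(rounds):
        log.append(("main", i))
        to_worker.post()               # let the worker run
        to_main.wait()                 # spin until the worker is done
    thread.join()
    return log


print(ping_pong(3))
```

The two semaphores play the role of the "C++ has written" and "Python has written" signals: each side posts one and spins on the other, so the turns alternate strictly. The trade-off is the one noted above: spinning keeps a CPU core busy while waiting.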

=== Problems ===

# The examples have not been tested on Linux yet; testing will take place next week.

=== Todo next week ===

# Start working on ns3-gym-like interface, which is one of the milestones.
# Work with Hao to release the previous version of ns3ai.
# Test the three currently available examples on Linux system.

== Week 4 (June 19 - June 25) ==

=== Achievements ===

# Following my mentors' suggestions, I added an interface for sharing a single structure, to reduce complexity when a vector is unnecessary. Previously, sharing a single structure (as in the Thompson Sampling or Constant Rate examples) required a vector of which only the first element was used.
# Read [https://bits.informatik.hu-berlin.de/~zubow/gawlowicz19_mswim.pdf the paper of ns3-gym] and tried running [https://github.com/tkn-tub/ns3-gym the code] to be more familiar with the OpenAI Gym interface. Now I am developing the Gym interface.
# Linux usage is tested.

=== Problems ===

# The ns3-gym README says it has some issues with the new OpenAI Gym framework, so that the <code>gym.make()</code> API is unavailable. Is there any way to solve that? Or perhaps it's only an issue with ns3-gym and not a problem for ns3-ai?

=== Todo next week ===

# Continue developing Gym interface.

== Week 5 (June 26 - July 2) ==

=== Achievements ===

# Completed the a-plus-b example of Gym interface.

=== Problems ===

=== Todo next week ===

# Continue developing other examples using Gym interface.

== Week 6 - 7 (July 3 - July 16) ==

About interface naming: for clarity, I call the interface that uses Boost shared memory directly (in which users need to define the shared structures or vectors) "msg interface", and the interface that is based on msg interface and provides Gym APIs "Gym interface". The former is low level, requires more coding and has stronger capabilities (such as std::vector sharing), while the latter is high level, easier to code but has limited functionality (RL with Gym).

=== Achievements ===

# Following my mentors' suggestions, I modified the Gym interface so that it provides a base class that users can derive from to make their own environment. Basically it is a fork of ns3-gym's interface, but at the low level it uses Boost instead of ZeroMQ for interprocess communication.
# Completed the RL-TCP example using Gym interface & msg interface, and A plus B example using Gym interface.
# Refactored the existing code, including separating the different interfaces into different directories and modifying the CMakeLists files, to provide a clearer project structure and easier usage.
# Updated all READMEs, which contain step-by-step instructions for building and running the examples.
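The "base class that users derive from" pattern can be sketched as follows. This is only an illustration of the design in plain Python: the real base class is on the C++ side of ns3-ai, and the names <code>GymEnvBase</code>, <code>APlusBEnv</code> and their methods are hypothetical, not the module's API.

```python
from abc import ABC, abstractmethod


class GymEnvBase(ABC):
    """Hypothetical base class: subclasses fill in the environment hooks."""

    @abstractmethod
    def get_observation(self): ...

    @abstractmethod
    def get_reward(self): ...

    @abstractmethod
    def is_done(self): ...

    @abstractmethod
    def execute_action(self, action): ...

    def step(self, action):
        """Apply an action, then report (observation, reward, done)."""
        self.execute_action(action)
        return self.get_observation(), self.get_reward(), self.is_done()


class APlusBEnv(GymEnvBase):
    """Toy environment: the agent should answer a + b for each pair."""

    def __init__(self, pairs):
        self._pairs = list(pairs)
        self._index = 0
        self._last_correct = False

    def get_observation(self):
        # No observation is left once every pair has been answered.
        if self.is_done():
            return None
        return self._pairs[self._index]

    def get_reward(self):
        return 1.0 if self._last_correct else 0.0

    def is_done(self):
        return self._index >= len(self._pairs)

    def execute_action(self, action):
        a, b = self._pairs[self._index]
        self._last_correct = (action == a + b)
        self._index += 1


env = APlusBEnv([(1, 2), (3, 4)])
obs = env.get_observation()
total = 0.0
while obs is not None:
    # A trivial "agent" that always answers correctly.
    obs, reward, done = env.step(sum(obs))
    total += reward
print(total)
```

The base class owns the step logic, so a user only has to describe the environment itself. In the actual interface, the data exchanged at each step would travel through shared memory rather than stay in one process.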

=== Problems ===

# Proper destruction of the msg interface. In the RL-TCP example, I had a reference-counting issue (the reference count didn't go to zero, so an object was not destroyed), and fixed it by replacing some Ptr<> with raw pointers. There may be better ways to solve that.
# Because the msg interface must have only one instance that provides synchronized access to the shared memory segment, I use a local variable in a source file so that many functions in different classes can access that single interface. I noticed that ns-3 provides a Singleton class; is that a better way to define the msg interface?

=== Todo next week ===

# Provide some initial benchmark of Gym interface with ns3-gym.
# Do midterm evaluation.

== Week 8 (July 17 - July 23) ==

=== Achievements ===

# '''Successfully finished midterm evaluation. Thank you to my mentors Collin and Hao for your guidance!'''
# Benchmarked the running time of the RL-TCP example. In the scenario of 2 nodes with bottleneck_bandwidth=2Mbps, bottleneck_delay=0.01ms, access_bandwidth=10Mbps, access_delay=20ms, I simulated for 1000s, and the results show that ns3-ai is slightly faster than ns3-gym: ns3-ai takes 26 seconds and ns3-gym takes 27 seconds.

=== Problems ===

# My mentor suggests that the benchmark doesn't show the advantage of ns3-ai because it uses the total running time rather than C++-Python interaction time. Interaction time is more likely to have a big difference between ns3-ai and ns3-gym because interaction is the place where ns3-ai and ns3-gym differ most. Also, after knowing the C++-Python interaction time and the portion it takes in total time, it's easier to design examples that emphasize the interaction time and better demonstrate the performance of ns3-ai.

=== Todo next week ===

# Conduct benchmarking of interaction time on RL-TCP example.

== Week 9 (July 24 - July 30) ==

=== Achievements ===

# Benchmarked the RL-TCP example (ns3-gym and ns3-ai's Gym interface version) based on C++-Python interaction time. Interaction time is the transmission time of the byte buffer containing serialized Gym environments or actions. To get accurate interaction times, I use CPU cycle counts (rdtsc in x86 instructions) rather than clock time. Each sample is the CPU cycle count at the end of a transmission minus the count at the start of that transmission. The mean and standard deviation of the samples are calculated. The result shows that in both the C++ to Python and Python to C++ directions, the interaction time of ns3-ai is approximately 15 times shorter than that of ns3-gym.
#* ns-3 configuration: ./ns3 configure --enable-examples --build-profile=debug
#* Simulation parameters:
#** bottleneck_bandwidth=2Mbps
#** bottleneck_delay=0.01ms
#** access_bandwidth=10Mbps
#** access_delay=20ms
#** duration=100s
#** time step = 0.1s
#* Benchmark results:
#** [https://github.com/ShenMuyuan/urban-pancake/blob/50ad463ee06377342ff83c9954a13cc66792b4d1/ns3ai_benchmark/cpp2py.png C++ to Python transmission time]
#** [https://github.com/ShenMuyuan/urban-pancake/blob/50ad463ee06377342ff83c9954a13cc66792b4d1/ns3ai_benchmark/py2cpp.png Python to C++ transmission time]
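The measurement procedure described above (end tick minus start tick per transmission, then mean and standard deviation) can be sketched as below. This is a minimal illustration under stated assumptions: <code>time.perf_counter_ns()</code> stands in for reading the CPU cycle counter, <code>fake_transmit</code> is a placeholder for sending the serialized buffer, and the sample standard deviation is used because the report does not say which variant was computed.

```python
import statistics
import time


def measure(transmit, repeats):
    """Return per-call durations (end tick minus start tick) for transmit()."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter_ns()   # stand-in for reading rdtsc
        transmit()
        end = time.perf_counter_ns()
        samples.append(end - start)
    return samples


def summarize(samples):
    """Mean and sample standard deviation of the collected durations."""
    return statistics.mean(samples), statistics.stdev(samples)


def fake_transmit():
    # Placeholder work: allocate a 1 KiB buffer instead of a real transmission.
    bytes(1024)


samples = measure(fake_transmit, repeats=1000)
mean_ns, std_ns = summarize(samples)
print(f"mean = {mean_ns:.1f} ns, std = {std_ns:.1f} ns over {len(samples)} samples")
```

Running the same procedure for both interfaces and both directions would give the four mean and standard deviation pairs that the plots compare.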

=== Problems ===

=== Todo next week ===

# Begin developing the Multi-BSS example, which can demonstrate the usage of vectors in the message interface.

== Week 10 (July 31 - Aug 6) ==

=== Achievements ===

# Updated the lte-cqi example to use the msg interface.
# Working on multi-bss example, based on [https://gitlab.com/juanvleonr/ns-3-dev/-/blob/clean-tgax/scratch/tgax-residential.cc?ref_type=heads Juan's branch of ns-3-dev].

=== Problems ===

=== Todo next week ===

# Finish multi-bss example. My work will include making it compile with the latest ns-3, porting it to ns3-ai's new interface and changing some directory structure (move the tgax code under src to contrib).

== Week 11 (Aug 7 - Aug 13) ==

=== Achievements ===

# Got familiar with the RL algorithm in the multi-bss example, based on [https://www.nsnam.org/tutorials/consortium23/MultiBSS-UW.pdf the slides] and [https://gitlab.com/juanvleonr/ns-3-dev/-/blob/clean-tgax/scratch/tgax-residential.cc?ref_type=heads the code].
# Finished the Python binding and cmake configuration of multi-bss example, with some problems.
# Wrote a quick start guide on the ns3-ai interface, and thoroughly updated the READMEs in the repository.

=== Problems ===

# The Python script of multi-BSS is incompatible with the C++ code, having slightly different data definitions.
# The algorithm should automatically change the CCA, but the CCA is not changed (always -82)

=== Todo next week ===

# Complete the Multi-BSS example's code and do some benchmark.

== Week 12 (Aug 14 - Aug 20) ==

=== Achievements ===

# Finished the code for Multi-BSS (it can run now and the CCA value changes to approximately -70)
# Updating documents for future release by Hao
# Enhanced some cmake configurations for better usability
#* Add protobuf-generate function for protobuf installations that don’t provide it
#* Change the cmake target from ‘ns3-ai’ to ‘ai’. Now the ns3-ai module can be built with ./ns3 build ai. (custom modules cannot be built directly with ./ns3 build if they have a ‘ns3-’ prefix, due to some settings in ./ns3 script)

=== Problems ===

# In the Multi-BSS example, at 1 min of simulation, VR throughput ≈ 5 Mbps, which can’t meet the requirement (50 Mbps). Occasionally VR delay ≈ 0 and throughput ≈ 1e8, possibly due to a statistics error.

=== Todo next week ===

# Adjust the parameters in RL algorithm to meet VR requirements on throughput
# Benchmarking (compare running time with previous interface)

== Week 13 (Aug 21 - Aug 27) ==

=== Achievements ===

# Finished a simple example using C++-based ML framework TensorFlow.
# Continue modifying Multi-BSS parameters to meet VR requirements.

=== Problems ===

# The C++-based example is incomplete because the TensorFlow C API lacks the required functionality, as described in their [https://github.com/tensorflow/docs/blob/master/site/en/r1/guide/extend/bindings.md#current-status current-status]. The example will be available when the TensorFlow C API provides the Gradients and Neural Networks libraries.
# The Multi-BSS example still doesn't meet the VR requirements after training, although I changed some parameters. Maybe I have to leave its optimization to future work (after GSoC). The benchmarking is not affected.

=== Todo next week ===

# Finish the last C++-based ML example (RL-TCP using PyTorch C++ API)
# Update vector-based message interface's benchmarking result.

GSOC2023ns3-ai

2023-08-25T07:37:52Z

Muyuan:

{{TOC}}

Back to [[Summer_Projects#Google_Summer_of_Code_2023 | GSoC 2023 projects]]

= Project Overview =

* '''Project Name:''' ns3-ai enhancements
* '''Student:''' Muyuan Shen
* '''Mentors:''' Collin Brady and Hao Yin
* '''Google page:''' https://summerofcode.withgoogle.com/programs/2023/projects/A4KZ7dxo

== Project Goals ==

The proposed project aims to enhance the ns3-ai module, which provides interfaces between ns-3 and Python-based ML frameworks using shared memory, with a focus on performance optimization and expanding the range of supported data structures.
To achieve this, the project will introduce APIs for additional data structures like vector and string in shared memory IPC to reduce the interaction between C++ and Python. Additionally, the project will provide examples demonstrating how to implement ML algorithms within ns-3 using C++ and open-source frameworks such as TensorFlow and PyTorch. The project will also improve the current examples and documentation and integrate new examples, such as LTE handover.
Overall, the project aims to expand and accelerate the capabilities of the ns3-ai module, enabling users to simulate and analyze large-scale networks with greater efficiency and flexibility.

== Repository ==

https://github.com/ShenMuyuan/ns3-ai/tree/improvements

== About Me ==

=== Education ===

As a junior at Huazhong University of Science and Technology, I am majoring in electronic engineering. I am proud to be a member of the Undergraduate Program for Advanced Project-based Information Science Education, also known as the Seed Class, and currently serve as the class monitor. Additionally, I am a project leader in the Dian group, where I engage in extracurricular technical projects.
In terms of relevant coursework, I have excelled in network programming through courses such as C programming language and computer network, in both of which I achieved a perfect grade point of 4.0. These courses have equipped me with a strong foundation in network programming, which I believe will enable me to contribute effectively to relevant projects.
I am a motivated and skilled undergraduate student with a passion for network programming and a track record of academic excellence.

=== Experience with ns-3 ===

During my academic journey, I have had the opportunity to explore computer networking through labs and projects. In particular, in the labs for the computer networking course, I gained valuable insights into how different parameters, such as the number of STAs, CW range, and packet arrival rate, can impact network throughput in the WiFi DCF protocol.
In addition, I have worked on a project that leverages ns-3 as a simulation platform with Prof. Yayu Gao. Through this project, I have gained practical experience in simulating WiFi MAC rate control algorithms, which has further solidified my understanding of ns-3's usage and its object-oriented programming approach.
Overall, my hands-on experience in both labs and projects has allowed me to apply theoretical concepts to practical scenarios and enhanced my network simulation and analysis skills.

= Milestones =

Based on my [https://diangroup.feishu.cn/docx/PSK9d18KLoLvXCxmGIkcbhXTn5b proposal], I divide my project into two phases, listed below.

== Phase one (before midterm evaluation) ==

=== Enhancements for the interface ===

==== std::vector support ====

Introduce APIs for storing data structures like std::vector in shared memory, to reduce the interaction between C++ and Python.

==== gym-like interface ====

Introduce a gym-like interface to enable users to train RL models directly in ns-3 with OpenAI Gym.

=== Enhancements for existing examples ===

Bring all previous examples up to date with the CMake build system introduced in ns-3.36, and provide a new example to benchmark the running time of vectors.

== Phase two (after midterm evaluation) ==

=== Integration of ns-3 and C++-based ML frameworks ===

Apply [https://www.tensorflow.org/api_docs/cc TensorFlow C++ APIs] and [https://pytorch.org/cppdocs/ PyTorch C++ APIs] to examples that currently use Python-based ML frameworks. Also, provide CMake configurations that work on both Linux and macOS, and documentation on building and running.

=== Finishing new examples and benchmarking test ===

Finish some new examples using Gym interface and vector-based message interface. Compare Gym interface's performance with ns3-gym,
and compare vector-based message interface's performance with struct-based message interface.

= Weekly Report =

== Week 1 (May 29 - June 4) ==

=== Achievements ===

# Got familiar with the usage of Boost library, and the syntax of Cython pyx files. I am using Boost to support dynamic allocation and synchronization in shared memory and Cython to wrap C++ code for Python.
# Created the interface to support std::vector in shared memory. Also wrote a new a-plus-b example to demonstrate the usage. It is still in development and currently supports macOS.
# (Update on June 3) Now I am using pybind11 instead of Cython for Python binding, because pybind11 has similar performance but cleaner code. And also it is easier to use cmake to install the python module.

=== Problems ===

# The code is quite naive and possibly includes some extra interactions that lower performance.
# I have not tested the new interface on Linux.
# (Update on June 3) {{strike|The new interface has hardcoded parts in the setup.py. Users need to explicitly specify their Boost library include and library paths.}}
# (Update on June 3) {{strike|Although I have only one example currently, if there are more, users need to call setup.py repeatedly to install the modules, which is inefficient.}}

=== Todo next week ===

# Use the new interface in an existing example such as rl-tcp, compare running time with old interface, to know its performance better.
# Switch to a new branch called "improvements" instead of "cmake", which better shows the project goal.
# (Update on June 3) {{strike|Modify CMakeLists.txt to pass the result of find_package(Boost...) to setup.py, and remove the hardcoded part.}}
# (Update on June 3) {{strike|Make "pip install . --user" a target in Cmake, so that users can install Python modules more easily, like "./ns3 build ns3ai_interfaces".}}
# If I have time, I will test my code on Linux.

== Week 2 (June 5 - June 11) ==

=== Achievements ===

# Updated the [https://github.com/ShenMuyuan/ns3-ai/tree/improvements/examples/rate-control Thompson Sampling example] to use the new interface. Previously, it used a simple packed structure for information sharing. Now it uses the first element of a shared std::vector (which is basically the same structure as before).
# Measured running time of the Thompson Sampling example, old interface vs new interface. Results: old about 5 seconds, new about 12 minutes.

=== Problems ===

# The benchmarking result above shows that, when passing a small amount of data in each interaction, the new interface is roughly 150 times slower than the old interface (about 12 minutes versus about 5 seconds).

=== Todo next week ===

# Measure the running time of another example (the new multi-bss example) which passes a large amount of data in each interaction, to check whether the new interface improves performance in that case. If the new interface outperforms the old one, then the two can coexist for different cases. Otherwise, I will consider modifying the implementation.
# Or, try to optimize the code to make small data interaction faster.

== Week 3 (June 12 - June 18) ==

=== Achievements ===

# Accelerated data interaction using spinlock-based semaphore as synchronization method. The running time of [https://github.com/ShenMuyuan/ns3-ai/tree/improvements/examples/rate-control Thompson Sampling example] shortened to 6 seconds on my machine, which means that the performance of small data interaction is close to the previous interface.
#* I tried eliminating data copying operations and using references instead, but this hardly improved the running time.
#* I guessed that semaphores would spin instead of sleeping, which can save time (although it wastes CPU). So in the synchronization code I replaced the Boost.Interprocess condition variable with the Boost.Interprocess semaphore, but there was no improvement. Investigation using CLion's built-in profiler showed that sleeping took a large portion of the running time. Then I read the Boost source code and found that a waiting semaphore does not purely spin: it puts the process to sleep after the spinning time reaches a small threshold. I commented out the spin [https://github.com/boostorg/interprocess/blob/a0c5a8ff176434c9024d4540ce092a2eebb8c5c3/include/boost/interprocess/sync/spin/wait.hpp#LL128C13-L128C13 counting code] to force it to always spin, and the running time dropped significantly.
#* To avoid modifying library code, I created my own version of the semaphore. My implementation is similar to Boost's, but while waiting it only spins and never goes to sleep. This significantly accelerates interaction between Python and C++, reducing the running time to 6s.
# Updated the A Plus B and Constant Rate examples. Currently available examples that use the new interface: A Plus B, Thompson Sampling, Constant Rate.

=== Problems ===

# The examples have not been tested on Linux yet; testing will take place next week.

=== Todo next week ===

# Start working on ns3-gym-like interface, which is one of the milestones.
# Work with Hao to release the previous version of ns3ai.
# Test the three currently available examples on Linux system.

== Week 4 (June 19 - June 25) ==

=== Achievements ===

# Following my mentors' suggestions, I added an interface for sharing a single structure, to reduce complexity when a vector is unnecessary. Previously, sharing a single structure (as in the Thompson Sampling or Constant Rate examples) required a vector of which only the first element was used.
# Read [https://bits.informatik.hu-berlin.de/~zubow/gawlowicz19_mswim.pdf the paper of ns3-gym] and tried running [https://github.com/tkn-tub/ns3-gym the code] to be more familiar with the OpenAI Gym interface. Now I am developing the Gym interface.
# Linux usage is tested.

=== Problems ===

# The ns3-gym README says it has some issues with the new OpenAI Gym framework, so that the <code>gym.make()</code> API is unavailable. Is there any way to solve that? Or perhaps it's only an issue with ns3-gym and not a problem for ns3-ai?

=== Todo next week ===

# Continue developing Gym interface.

== Week 5 (June 26 - July 2) ==

=== Achievements ===

# Completed the a-plus-b example of Gym interface.

=== Problems ===

=== Todo next week ===

# Continue developing other examples using Gym interface.

== Week 6 - 7 (July 3 - July 16) ==

About interface naming: for clarity, I call the interface that uses Boost shared memory directly (in which users need to define the shared structures or vectors) "msg interface", and the interface that is based on msg interface and provides Gym APIs "Gym interface". The former is low level, requires more coding and has stronger capabilities (such as std::vector sharing), while the latter is high level, easier to code but has limited functionality (RL with Gym).

=== Achievements ===

# Following my mentors' suggestions, I modified the Gym interface so that it provides a base class that users can derive from to make their own environment. Basically it is a fork of ns3-gym's interface, but at the low level it uses Boost instead of ZeroMQ for interprocess communication.
# Completed the RL-TCP example using Gym interface & msg interface, and A plus B example using Gym interface.
# Refactored the existing code, including separating the different interfaces into different directories and modifying the CMakeLists files, to provide a clearer project structure and easier usage.
# Updated all READMEs, which contain step-by-step instructions for building and running the examples.

=== Problems ===

# Proper destruction of the msg interface. In the RL-TCP example, I had a reference-counting issue (the reference count didn't go to zero, so an object was not destroyed), and fixed it by replacing some Ptr<> with raw pointers. There may be better ways to solve that.
# Because the msg interface must have only one instance that provides synchronized access to the shared memory segment, I use a local variable in a source file so that many functions in different classes can access that single interface. I noticed that ns-3 provides a Singleton class; is that a better way to define the msg interface?

=== Todo next week ===

# Provide some initial benchmark of Gym interface with ns3-gym.
# Do midterm evaluation.

== Week 8 (July 17 - July 23) ==

=== Achievements ===

# '''Successfully finished midterm evaluation. Thank you to my mentors Collin and Hao for your guidance!'''
# Benchmarked the running time of the RL-TCP example. In the scenario of 2 nodes with bottleneck_bandwidth=2Mbps, bottleneck_delay=0.01ms, access_bandwidth=10Mbps, access_delay=20ms, I simulated for 1000s, and the results show that ns3-ai is slightly faster than ns3-gym: ns3-ai takes 26 seconds and ns3-gym takes 27 seconds.

=== Problems ===

# My mentor suggests that the benchmark doesn't show the advantage of ns3-ai because it uses the total running time rather than C++-Python interaction time. Interaction time is more likely to have a big difference between ns3-ai and ns3-gym because interaction is the place where ns3-ai and ns3-gym differ most. Also, after knowing the C++-Python interaction time and the portion it takes in total time, it's easier to design examples that emphasize the interaction time and better demonstrate the performance of ns3-ai.

=== Todo next week ===

# Conduct benchmarking of interaction time on RL-TCP example.

== Week 9 (July 24 - July 30) ==

=== Achievements ===

# Benchmarked the RL-TCP example (ns3-gym and ns3-ai's Gym interface version) based on C++-Python interaction time. Interaction time is the transmission time of the byte buffer containing serialized Gym environments or actions. To get accurate interaction times, I use CPU cycle counts (rdtsc in x86 instructions) rather than clock time. Each sample is the CPU cycle count at the end of a transmission minus the count at the start of that transmission. The mean and standard deviation of the samples are calculated. The result shows that in both the C++ to Python and Python to C++ directions, the interaction time of ns3-ai is approximately 15 times shorter than that of ns3-gym.
#* ns-3 configuration: ./ns3 configure --enable-examples --build-profile=debug
#* Simulation parameters:
#** bottleneck_bandwidth=2Mbps
#** bottleneck_delay=0.01ms
#** access_bandwidth=10Mbps
#** access_delay=20ms
#** duration=100s
#** time step = 0.1s
#* Benchmark results:
#** [https://github.com/ShenMuyuan/urban-pancake/blob/50ad463ee06377342ff83c9954a13cc66792b4d1/ns3ai_benchmark/cpp2py.png C++ to Python transmission time]
#** [https://github.com/ShenMuyuan/urban-pancake/blob/50ad463ee06377342ff83c9954a13cc66792b4d1/ns3ai_benchmark/py2cpp.png Python to C++ transmission time]

=== Problems ===

=== Todo next week ===

# Begin developing the Multi-BSS example, which can demonstrate the usage of vectors in the message interface.

== Week 10 (July 31 - Aug 6) ==

=== Achievements ===

# Updated the lte-cqi example to use the msg interface.
# Working on multi-bss example, based on [https://gitlab.com/juanvleonr/ns-3-dev/-/blob/clean-tgax/scratch/tgax-residential.cc?ref_type=heads Juan's branch of ns-3-dev].

=== Problems ===

=== Todo next week ===

# Finish multi-bss example. My work will include making it compile with the latest ns-3, porting it to ns3-ai's new interface and changing some directory structure (move the tgax code under src to contrib).

== Week 11 (Aug 7 - Aug 13) ==

=== Achievements ===

# Got familiar with the RL algorithm in the multi-bss example, based on [https://www.nsnam.org/tutorials/consortium23/MultiBSS-UW.pdf the slides] and [https://gitlab.com/juanvleonr/ns-3-dev/-/blob/clean-tgax/scratch/tgax-residential.cc?ref_type=heads the code].
# Finished the Python binding and cmake configuration of multi-bss example, with some problems.
# Wrote a quick start guide on the ns3-ai interface, and thoroughly updated the READMEs in the repository.

=== Problems ===

# The Python script of multi-BSS is incompatible with the C++ code, having slightly different data definitions.
# The algorithm should automatically change the CCA, but the CCA is not changed (always -82)

=== Todo next week ===

# Complete the Multi-BSS example's code and do some benchmark.

== Week 12 (Aug 14 - Aug 20) ==

=== Achievements ===

# Finished the code for Multi-BSS (it can run now and the CCA value changes to approximately -70)
# Updating documents for future release by Hao
# Enhanced some cmake configurations for better usability
#* Add protobuf-generate function for protobuf installations that don’t provide it
#* Change the cmake target from ‘ns3-ai’ to ‘ai’. Now the ns3-ai module can be built with ./ns3 build ai. (Custom modules cannot be built directly with ./ns3 build if they have a ‘ns3-’ prefix, due to some settings in the ./ns3 script.)

=== Problems ===

# In Multi-BSS example, at 1 min of simulation, VR tpt ≈ 5Mbps, can’t meet requirement (50 Mbps). Occasionally VR delay ≈ 0 and tpt ≈ 1e8, possibly due to statistics error.

=== Todo next week ===

# Adjust the parameters in RL algorithm to meet VR requirements on throughput
# Benchmarking (compare running time with previous interface)

GSOC2023ns3-ai

2023-08-25T03:16:56Z

Muyuan: /* Finishing new examples and benchmarking test */

{{TOC}}

Back to [[Summer_Projects#Google_Summer_of_Code_2023 | GSoC 2023 projects]]

= Project Overview =

* '''Project Name:''' ns3-ai enhancements
* '''Student:''' Muyuan Shen
* '''Mentors:''' Collin Brady and Hao Yin
* '''Google page:''' https://summerofcode.withgoogle.com/programs/2023/projects/A4KZ7dxo

== Project Goals ==

The proposed project aims to enhance the ns3-ai module, which provides interfaces between ns-3 and Python-based ML frameworks using shared memory, with a focus on performance optimization and expanding the range of supported data structures.
To achieve this, the project will introduce APIs for additional data structures like vector and string in shared memory IPC to reduce the interaction between C++ and Python. Additionally, the project will provide examples demonstrating how to implement ML algorithms within ns-3 using C++ and open-source frameworks such as TensorFlow and PyTorch. The project will also improve the current examples and documentation and integrate new examples, such as LTE handover.
Overall, the project aims to expand and accelerate the capabilities of the ns3-ai module, enabling users to simulate and analyze large-scale networks with greater efficiency and flexibility.

== Repository ==

https://github.com/ShenMuyuan/ns3-ai/tree/improvements

== About Me ==

=== Education ===

As a junior at Huazhong University of Science and Technology, I am majoring in electronic engineering. I am proud to be a member of the Undergraduate Program for Advanced Project-based Information Science Education, also known as the Seed Class, and currently serve as the class monitor. Additionally, I am a project leader in the Dian group, where I engage in extracurricular technical projects.
In terms of relevant coursework, I have excelled in network programming through courses such as C programming language and computer network, in both of which I achieved a perfect grade point of 4.0. These courses have equipped me with a strong foundation in network programming, which I believe will enable me to contribute effectively to relevant projects.
I am a motivated and skilled undergraduate student with a passion for network programming and a track record of academic excellence.

=== Experience with ns-3 ===

During my academic journey, I have had the opportunity to explore computer networking through labs and projects. In particular, in the labs for the computer networking course, I gained valuable insights into how different parameters, such as the number of STAs, CW range, and packet arrival rate, can impact network throughput in the WiFi DCF protocol.
In addition, I have worked with Prof. Yayu Gao on a project that uses ns-3 as a simulation platform. Through this project, I have gained practical experience in simulating WiFi MAC rate control algorithms, which has further solidified my understanding of ns-3's usage and its object-oriented programming approach.
Overall, my hands-on experience in both labs and projects has allowed me to apply theoretical concepts to practical scenarios and enhanced my network simulation and analysis skills.

= Milestones =

Based on my [https://diangroup.feishu.cn/docx/PSK9d18KLoLvXCxmGIkcbhXTn5b proposal], I divide my project into two phases, listed below.

== Phase one (before midterm evaluation) ==

=== Enhancements for the interface ===

==== std::vector support ====

Introduce APIs for storing data structures like std::vector in shared memory, to reduce the interaction between C++ and Python.
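
As a rough illustration of why a vector in shared memory reduces interaction, the sketch below packs several records into one shared-memory segment so that a single exchange carries all of them. It uses only Python's standard <code>multiprocessing.shared_memory</code> and <code>struct</code> modules; the record layout (<code>id</code>, <code>a</code>, <code>b</code>) and the helper names are hypothetical and are not the ns3-ai API, which is built on Boost in C++.

```python
import struct
from multiprocessing import shared_memory

# Hypothetical layout: a 4-byte element count followed by fixed-size records.
HEADER = struct.Struct("<I")    # number of records stored in the segment
RECORD = struct.Struct("<Idd")  # assumed record fields: id, a, b


def write_vector(shm, records):
    """Store all records in one pass, so one interaction carries the whole batch."""
    HEADER.pack_into(shm.buf, 0, len(records))
    for i, rec in enumerate(records):
        RECORD.pack_into(shm.buf, HEADER.size + i * RECORD.size, *rec)


def read_vector(shm):
    """Read the element count, then every record, from the same segment."""
    (count,) = HEADER.unpack_from(shm.buf, 0)
    return [
        RECORD.unpack_from(shm.buf, HEADER.size + i * RECORD.size)
        for i in range(count)
    ]


def round_trip(records):
    """Write and read back within one process, purely to demonstrate the layout."""
    size = HEADER.size + len(records) * RECORD.size
    shm = shared_memory.SharedMemory(create=True, size=size)
    try:
        write_vector(shm, records)
        return read_vector(shm)
    finally:
        shm.close()
        shm.unlink()
```

In a real setup the writer and the reader would be different processes attached to the same named segment, with synchronization between them; the round trip here only shows that one segment can hold a variable number of elements.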

==== gym-like interface ====

Introduce a gym-like interface to enable users to train RL models directly in ns-3 with OpenAI Gym.

=== Enhancements for existing examples ===

Bring all previous examples up to date with the CMake build system introduced in ns-3.36, and provide a new example to benchmark the running time of vectors.

== Phase two (after midterm evaluation) ==

=== Integration of ns-3 and C++-based ML frameworks ===

Apply [https://www.tensorflow.org/api_docs/cc TensorFlow C++ APIs] and [https://pytorch.org/cppdocs/ PyTorch C++ APIs] to examples
that currently use Python-based ML frameworks. Also, provide CMake configurations that work on both Linux and macOS, and documentation on building
and running.

=== Finishing new examples and benchmarking test ===

Finish some new examples using Gym interface and vector-based message interface. Compare Gym interface's performance with ns3-gym,
and compare vector-based message interface's performance with struct-based message interface.

= Weekly Report =

== Week 1 (May 29 - June 4) ==

=== Achievements ===

# Got familiar with the usage of Boost library, and the syntax of Cython pyx files. I am using Boost to support dynamic allocation and synchronization in shared memory and Cython to wrap C++ code for Python.
# Created the interface to support std::vector in shared memory. Also wrote a new a-plus-b example to demonstrate the usage. It is still in development and currently supports macOS.
# (Update on June 3) Now I am using pybind11 instead of Cython for Python binding, because pybind11 has similar performance but cleaner code. And also it is easier to use cmake to install the python module.

=== Problems ===

# The code is quite naive and possibly includes some extra interactions that lower performance.
# I have not tested the new interface on Linux.
# (Update on June 3) {{strike|The new interface has hardcoded parts in the setup.py. Users need to explicitly specify their Boost library include and library paths.}}
# (Update on June 3) {{strike|Although I have only one example currently, if there is more, users need to repeatedly call the setup.py to install modules which lacks efficiency.}}

=== Todo next week ===

# Use the new interface in an existing example such as rl-tcp, compare running time with old interface, to know its performance better.
# Switch to a new branch called "improvements" instead of "cmake", which better shows the project goal.
# (Update on June 3) {{strike|Modify CMakeLists.txt to pass the result of find_package(Boost...) to setup.py, and remove the hardcoded part.}}
# (Update on June 3) {{strike|Make "pip install . --user" a target in Cmake, so that users can install Python modules more easily, like "./ns3 build ns3ai_interfaces".}}
# If I have time, I will test my code on Linux.

== Week 2 (June 5 - June 11) ==

=== Achievements ===

# Updated the [https://github.com/ShenMuyuan/ns3-ai/tree/improvements/examples/rate-control Thompson Sampling example] to use the new interface. Previously, it uses simple packed structure for information sharing. Now it uses the first element of shared std::vector (which is basically the same structure as before).
# Measured running time of the Thompson Sampling example, old interface vs new interface. Results: old about 5 seconds, new about 12 minutes.

=== Problems ===

# The benchmarking result above shows that, when passing a small amount of data in each interaction, the new interface is roughly 150 times slower than the old interface.
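
The slowdown factor follows directly from the two running times reported above (about 5 seconds and about 12 minutes). A short check of that arithmetic, with a hypothetical helper name:

```python
def slowdown(old_seconds, new_seconds):
    """Return how many times slower the new measurement is than the old one."""
    return new_seconds / old_seconds


old_interface = 5        # seconds, as reported above
new_interface = 12 * 60  # 12 minutes expressed in seconds

print(slowdown(old_interface, new_interface))  # 144.0
```

Since both running times are approximate, the factor is best read as an order-of-magnitude figure.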

=== Todo next week ===

# Measure running time of another example (the new multi-bss example) which passes large amount of data in each interaction, to check whether the new interface improves performance in that case. If the new outperforms the old, then the old and new interface can coexist for different cases. Else, I will consider modifying the implementation.
# Or, try to optimize the code to make small data interaction faster.

== Week 3 (June 12 - June 18) ==

=== Achievements ===

# Accelerated data interaction using spinlock-based semaphore as synchronization method. The running time of [https://github.com/ShenMuyuan/ns3-ai/tree/improvements/examples/rate-control Thompson Sampling example] shortened to 6 seconds on my machine, which means that the performance of small data interaction is close to the previous interface.
#* I tried eliminating data copying operations and using references instead, but it hardly improved the running time.
#* I guessed that semaphores would spin instead of sleep, which can save time (although it wastes CPU). So in the synchronization code I replaced the Boost.Interprocess condition variable with a Boost.Interprocess semaphore, but there was no improvement. Investigation using CLion's built-in profiler showed that sleeping took a large portion of the running time. Then I read the Boost source code and found that a waiting semaphore does not purely spin: it puts the process to sleep after the spinning time reaches a small threshold. I commented out the spin [https://github.com/boostorg/interprocess/blob/a0c5a8ff176434c9024d4540ce092a2eebb8c5c3/include/boost/interprocess/sync/spin/wait.hpp#LL128C13-L128C13 counting code] to force it to always spin, and the running time dropped considerably.
#* To avoid modifying library code, I created my own version of the semaphore. My implementation is similar to Boost's, but while waiting it only spins and never goes to sleep. This significantly accelerates interaction between Python and C++, reducing the running time to 6 seconds.
# Updated the A Plus B and Constant Rate examples. Currently available examples that use the new interface: A Plus B, Thompson Sampling, Constant Rate.
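
The spin-only waiting idea described above can be sketched in a few lines. This is a thread-based Python illustration only, not the C++ shared-memory semaphore used in ns3-ai; the class name <code>SpinSemaphore</code> and the <code>ping_pong</code> helper are hypothetical. In CPython the global interpreter lock limits any real speed benefit, so the sketch shows the control flow rather than the performance.

```python
import threading


class SpinSemaphore:
    """Counting semaphore whose wait() keeps polling instead of sleeping."""

    def __init__(self, initial=0):
        self._count = initial
        self._guard = threading.Lock()  # protects the counter only

    def post(self):
        with self._guard:
            self._count += 1

    def wait(self):
        while True:
            with self._guard:
                if self._count > 0:
                    self._count -= 1
                    return
            # No sleep or yield here: the waiter polls again immediately.


def ping_pong(rounds):
    """Alternate strictly between two threads, like a simulator and an agent."""
    to_worker = SpinSemaphore()
    to_main = SpinSemaphore()
    log = []

    def worker():
        for i in range(rounds):
            to_worker.wait()           # wait for the main thread's turn to end
            log.append(("worker", i))
            to_main.post()             # hand control back

    thread = threading.Thread(target=worker)
    thread.start()
    for i in range(rounds):
        log.append(("main", i))
        to_worker.post()
        to_main.wait()
    thread.join()
    return log
```

The trade-off is the one noted above: the waiting side burns CPU while it polls, in exchange for reacting as soon as the other side posts.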

=== Problems ===

# The examples have not been tested on Linux yet; testing will take place next week.

=== Todo next week ===

# Start working on ns3-gym-like interface, which is one of the milestones.
# Work with Hao to release the previous version of ns3ai.
# Test the three currently available examples on Linux system.

== Week 4 (June 19 - June 25) ==

=== Achievements ===

# Following my mentors' suggestions, I added an interface for sharing a single structure, to reduce complexity when a vector is unnecessary. Previously, sharing a single structure (as in the Thompson Sampling or Constant Rate examples) required a vector of which only the first element was used.
# Read [https://bits.informatik.hu-berlin.de/~zubow/gawlowicz19_mswim.pdf the paper of ns3-gym] and tried running [https://github.com/tkn-tub/ns3-gym the code] to be more familiar with the OpenAI Gym interface. Now I am developing the Gym interface.
# Linux usage is tested.

=== Problems ===

# The ns3-gym README says it has some issues with the new OpenAI Gym framework, so the <code>gym.make()</code> API is unavailable. Is there any way to solve that? Or perhaps it's only an issue with ns3-gym and not a problem for ns3-ai?

=== Todo next week ===

# Continue developing Gym interface.

== Week 5 (June 26 - July 2) ==

=== Achievements ===

# Completed the a-plus-b example of Gym interface.

=== Problems ===

=== Todo next week ===

# Continue developing other examples using Gym interface.

== Week 6 - 7 (July 3 - July 16) ==

About interface naming: for clarity, I call the interface that uses Boost shared memory directly (in which users need to define the shared structures or vectors) "msg interface", and the interface that is based on msg interface and provides Gym APIs "Gym interface". The former is low level, requires more coding and has stronger capabilities (such as std::vector sharing), while the latter is high level, easier to code but has limited functionality (RL with Gym).

=== Achievements ===

# Following my mentors' suggestions, I modified the Gym interface so that it provides a base class that users can derive from to make their own environment. Basically it is a fork of ns3-gym's interface, but at the low level it uses Boost instead of ZeroMQ for interprocess communication.
# Completed the RL-TCP example using both the Gym interface and the msg interface, and the A Plus B example using the Gym interface.
# Refactored the existing code, including separating the different interfaces into different directories and modifying the CMakeLists files, providing a clearer project structure and easier usage.
# Updated all READMEs, which now contain step-by-step instructions for building and running the examples.

=== Problems ===

# Proper destruction of the msg interface. In the RL-TCP example, I had a reference counting issue (the reference count didn't go to zero, so an object was not destroyed), and fixed it by replacing some Ptr<> with raw pointers. There may be better ways to solve that.
# Because the msg interface must have only one instance that provides synchronized access to the shared memory segment, I use a local variable in a source file so that many functions in different classes can access the single interface. I noticed that ns-3 provides a Singleton class; is that a better way to define the msg interface?

=== Todo next week ===

# Provide some initial benchmark of Gym interface with ns3-gym.
# Do midterm evaluation.

== Week 8 (July 17 - July 23) ==

=== Achievements ===

# '''Successfully finished midterm evaluation. Thank you my mentors Collin and Hao for your guidance!'''
# Benchmarked the running time of the RL-TCP example. In the scenario of 2 nodes with bottleneck_bandwidth=2Mbps, bottleneck_delay=0.01ms, access_bandwidth=10Mbps, access_delay=20ms, I simulated for 1000s and the results show that ns3-ai is slightly faster than ns3-gym: ns3-ai takes 26 seconds and ns3-gym takes 27 seconds.

=== Problems ===

# My mentor suggests that the benchmark doesn't show the advantage of ns3-ai because it uses the total running time rather than C++-Python interaction time. Interaction time is more likely to have a big difference between ns3-ai and ns3-gym because interaction is the place where ns3-ai and ns3-gym differ most. Also, after knowing the C++-Python interaction time and the portion it takes in total time, it's easier to design examples that emphasize the interaction time and better demonstrate the performance of ns3-ai.

=== Todo next week ===

# Conduct benchmarking of interaction time on RL-TCP example.

== Week 9 (July 24 - July 30) ==

=== Achievements ===

# Benchmarked the RL-TCP example (ns3-gym and ns3-ai's Gym interface version) based on C++-Python interaction time. Interaction time is the transmission time of the byte buffer containing serialized Gym environments or actions. To get accurate interaction times, I count CPU cycles (the rdtsc instruction on x86) rather than clock time. Each sample is the end CPU cycle of a transmission minus the start CPU cycle of that transmission. The mean and standard deviation of the samples are calculated. The results show that in both the C++ to Python and Python to C++ directions, the interaction time of ns3-ai is approximately 15 times shorter than that of ns3-gym.
#* ns-3 configuration: ./ns3 configure --enable-examples --build-profile=debug
#* Simulation parameters:
#** bottleneck_bandwidth=2Mbps
#** bottleneck_delay=0.01ms
#** access_bandwidth=10Mbps
#** access_delay=20ms
#** duration=100s
#** time step = 0.1s
#* Benchmark results:
#** [https://github.com/ShenMuyuan/urban-pancake/blob/50ad463ee06377342ff83c9954a13cc66792b4d1/ns3ai_benchmark/cpp2py.png C++ to Python transmission time]
#** [https://github.com/ShenMuyuan/urban-pancake/blob/50ad463ee06377342ff83c9954a13cc66792b4d1/ns3ai_benchmark/py2cpp.png Python to C++ transmission time]
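
The measurement procedure above (end stamp minus start stamp per transmission, then mean and standard deviation) can be sketched as follows. Python has no direct access to rdtsc, so <code>time.perf_counter_ns</code> is used here as a stand-in, which is an assumption and not what the benchmark itself used; <code>transfer</code> is a hypothetical callable representing one C++-Python transmission.

```python
import statistics
import time


def measure(transfer, payload, repeats=1000):
    """Time each call to transfer(payload) and summarize the samples.

    Each sample is the end stamp minus the start stamp of one transmission.
    perf_counter_ns stands in for a CPU cycle counter such as rdtsc.
    """
    samples = []
    for _ in range(repeats):
        start = time.perf_counter_ns()
        transfer(payload)
        samples.append(time.perf_counter_ns() - start)
    return statistics.mean(samples), statistics.stdev(samples)


def speedup(baseline_mean, candidate_mean):
    """Return how many times shorter the candidate's mean interaction time is."""
    return baseline_mean / candidate_mean
```

A comparison would call <code>measure</code> once per interface with the same payload and pass the two means to <code>speedup</code>. <code>statistics.stdev</code> needs at least two samples, so <code>repeats</code> must be 2 or more.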

=== Problems ===

=== Todo next week ===

# Begin developing the Multi-BSS example, which can demonstrate the usage of vectors in the message interface.

== Week 10 (July 31 - Aug 6) ==

=== Achievements ===

# Update the lte-cqi example to use the msg interface.
# Working on multi-bss example, based on [https://gitlab.com/juanvleonr/ns-3-dev/-/blob/clean-tgax/scratch/tgax-residential.cc?ref_type=heads Juan's branch of ns-3-dev].

=== Problems ===

=== Todo next week ===

# Finish multi-bss example. My work will include making it compile with the latest ns-3, porting it to ns3-ai's new interface and changing some directory structure (move the tgax code under src to contrib).

== Week 11 (Aug 7 - Aug 13) ==

=== Achievements ===

# Get familiar with the RL algorithm in multi-bss example, based on [https://www.nsnam.org/tutorials/consortium23/MultiBSS-UW.pdf the slide] and [https://gitlab.com/juanvleonr/ns-3-dev/-/blob/clean-tgax/scratch/tgax-residential.cc?ref_type=heads the code].
# Finished the Python binding and cmake configuration of multi-bss example, with some problems.
# Wrote a quick start guide on the ns3-ai interface, and thoroughly updated the READMEs in the repository.

=== Problems ===

# The Python script of the Multi-BSS example is incompatible with the C++ code because their data definitions differ slightly.
# The algorithm should automatically change the CCA, but the CCA is not changed (always -82).

=== Todo next week ===

# Complete the Multi-BSS example's code and do some benchmark.

== Week 12 (Aug 14 - Aug 20) ==

=== Achievements ===

# Finish the code for Multi-BSS (can run now and the CCA value changes to approximately -70)
# Updating documents for future release by Hao
# Enhanced some cmake configurations for better usability
#* Add protobuf-generate function for protobuf installations that don’t provide it
#* Change the cmake target from ‘ns3-ai’ to ‘ai’. Now the ns3-ai module can be built with ./ns3 build ai. (Custom modules cannot be built directly with ./ns3 build if they have a ‘ns3-’ prefix, due to some settings in the ./ns3 script.)

=== Problems ===

# In Multi-BSS example, at 1 min of simulation, VR tpt ≈ 5Mbps, can’t meet requirement (50 Mbps). Occasionally VR delay ≈ 0 and tpt ≈ 1e8, possibly due to statistics error.

=== Todo next week ===

# Adjust the parameters in RL algorithm to meet VR requirements on throughput
# Benchmarking (compare running time with previous interface)

GSOC2023ns3-ai

2023-08-25T03:13:50Z

Muyuan: /* Integration of ns-3 and C++-based ML frameworks */

{{TOC}}

Back to [[Summer_Projects#Google_Summer_of_Code_2023 | GSoC 2023 projects]]

= Project Overview =

* '''Project Name:''' ns3-ai enhancements
* '''Student:''' Muyuan Shen
* '''Mentors:''' Collin Brady and Hao Yin
* '''Google page:''' https://summerofcode.withgoogle.com/programs/2023/projects/A4KZ7dxo

== Project Goals ==

The proposed project aims to enhance the ns3-ai module, which provides interfaces between ns-3 and Python-based ML frameworks using shared memory, with a focus on performance optimization and expanding the range of supported data structures.
To achieve this, the project will introduce APIs for additional data structures like vector and string in shared memory IPC to reduce the interaction between C++ and Python. Additionally, the project will provide examples demonstrating how to implement ML algorithms within ns-3 using C++ and open-source frameworks such as TensorFlow and PyTorch. The project will also improve the current examples and documentation and integrate new examples, such as LTE handover.
Overall, the project aims to expand and accelerate the capabilities of the ns3-ai module, enabling users to simulate and analyze large-scale networks with greater efficiency and flexibility.

== Repository ==

https://github.com/ShenMuyuan/ns3-ai/tree/improvements

== About Me ==

=== Education ===

As a junior at Huazhong University of Science and Technology, I am majoring in electronic engineering. I am proud to be a member of the Undergraduate Program for Advanced Project-based Information Science Education, also known as the Seed Class, and currently serve as the class monitor. Additionally, I am a project leader in the Dian group, where I engage in extracurricular technical projects.
In terms of relevant coursework, I have excelled in network programming through courses such as C programming language and computer network, in both of which I achieved a perfect grade point of 4.0. These courses have equipped me with a strong foundation in network programming, which I believe will enable me to contribute effectively to relevant projects.
I am a motivated and skilled undergraduate student with a passion for network programming and a track record of academic excellence.

=== Experience with ns-3 ===

During my academic journey, I have had the opportunity to explore computer networking through labs and projects. In particular, in the labs for the computer networking course, I gained valuable insights into how different parameters, such as the number of STAs, CW range, and packet arrival rate, can impact network throughput in the WiFi DCF protocol.
In addition, I have worked with Prof. Yayu Gao on a project that uses ns-3 as a simulation platform. Through this project, I have gained practical experience in simulating WiFi MAC rate control algorithms, which has further solidified my understanding of ns-3's usage and its object-oriented programming approach.
Overall, my hands-on experience in both labs and projects has allowed me to apply theoretical concepts to practical scenarios and enhanced my network simulation and analysis skills.

= Milestones =

Based on my [https://diangroup.feishu.cn/docx/PSK9d18KLoLvXCxmGIkcbhXTn5b proposal], I divide my project into two phases, listed below.

== Phase one (before midterm evaluation) ==

=== Enhancements for the interface ===

==== std::vector support ====

Introduce APIs for storing data structures like std::vector in shared memory, to reduce the interaction between C++ and Python.

==== gym-like interface ====

Introduce a gym-like interface to enable users to train RL models directly in ns-3 with OpenAI Gym.

=== Enhancements for existing examples ===

Bring all previous examples up to date with the CMake build system introduced in ns-3.36, and provide a new example to benchmark the running time of vectors.

== Phase two (after midterm evaluation) ==

=== Integration of ns-3 and C++-based ML frameworks ===

Apply [https://www.tensorflow.org/api_docs/cc TensorFlow C++ APIs] and [https://pytorch.org/cppdocs/ PyTorch C++ APIs] to examples
that currently use Python-based ML frameworks. Also, provide CMake configurations that work on both Linux and macOS, and documentation on building
and running.

=== Finishing new examples and benchmarking test ===

TODO

= Weekly Report =

== Week 1 (May 29 - June 4) ==

=== Achievements ===

# Got familiar with the usage of Boost library, and the syntax of Cython pyx files. I am using Boost to support dynamic allocation and synchronization in shared memory and Cython to wrap C++ code for Python.
# Created the interface to support std::vector in shared memory. Also wrote a new a-plus-b example to demonstrate the usage. It is still in development and currently supports macOS.
# (Update on June 3) Now I am using pybind11 instead of Cython for Python binding, because pybind11 has similar performance but cleaner code. And also it is easier to use cmake to install the python module.

=== Problems ===

# The code is quite naive and possibly includes some extra interactions that lower performance.
# I have not tested the new interface on Linux.
# (Update on June 3) {{strike|The new interface has hardcoded parts in the setup.py. Users need to explicitly specify their Boost library include and library paths.}}
# (Update on June 3) {{strike|Although I have only one example currently, if there is more, users need to repeatedly call the setup.py to install modules which lacks efficiency.}}

=== Todo next week ===

# Use the new interface in an existing example such as rl-tcp, compare running time with old interface, to know its performance better.
# Switch to a new branch called "improvements" instead of "cmake", which better shows the project goal.
# (Update on June 3) {{strike|Modify CMakeLists.txt to pass the result of find_package(Boost...) to setup.py, and remove the hardcoded part.}}
# (Update on June 3) {{strike|Make "pip install . --user" a target in Cmake, so that users can install Python modules more easily, like "./ns3 build ns3ai_interfaces".}}
# If I have time, I will test my code on Linux.

== Week 2 (June 5 - June 11) ==

=== Achievements ===

# Updated the [https://github.com/ShenMuyuan/ns3-ai/tree/improvements/examples/rate-control Thompson Sampling example] to use the new interface. Previously, it uses simple packed structure for information sharing. Now it uses the first element of shared std::vector (which is basically the same structure as before).
# Measured running time of the Thompson Sampling example, old interface vs new interface. Results: old about 5 seconds, new about 12 minutes.

=== Problems ===

# The benchmarking result above shows that, when passing a small amount of data in each interaction, the new interface is roughly 150 times slower than the old interface.

=== Todo next week ===

# Measure running time of another example (the new multi-bss example) which passes large amount of data in each interaction, to check whether the new interface improves performance in that case. If the new outperforms the old, then the old and new interface can coexist for different cases. Else, I will consider modifying the implementation.
# Or, try to optimize the code to make small data interaction faster.

== Week 3 (June 12 - June 18) ==

=== Achievements ===

# Accelerated data interaction using spinlock-based semaphore as synchronization method. The running time of [https://github.com/ShenMuyuan/ns3-ai/tree/improvements/examples/rate-control Thompson Sampling example] shortened to 6 seconds on my machine, which means that the performance of small data interaction is close to the previous interface.
#* I tried eliminating data copying operations and using references instead, but it hardly improved the running time.
#* I guessed that semaphores would spin instead of sleep, which can save time (although it wastes CPU). So in the synchronization code I replaced the Boost.Interprocess condition variable with a Boost.Interprocess semaphore, but there was no improvement. Investigation using CLion's built-in profiler showed that sleeping took a large portion of the running time. Then I read the Boost source code and found that a waiting semaphore does not purely spin: it puts the process to sleep after the spinning time reaches a small threshold. I commented out the spin [https://github.com/boostorg/interprocess/blob/a0c5a8ff176434c9024d4540ce092a2eebb8c5c3/include/boost/interprocess/sync/spin/wait.hpp#LL128C13-L128C13 counting code] to force it to always spin, and the running time dropped considerably.
#* To avoid modifying library code, I created my own version of the semaphore. My implementation is similar to Boost's, but while waiting it only spins and never goes to sleep. This significantly accelerates interaction between Python and C++, reducing the running time to 6 seconds.
# Updated the A Plus B and Constant Rate examples. Currently available examples that use the new interface: A Plus B, Thompson Sampling, Constant Rate.

=== Problems ===

# The examples have not been tested on Linux yet; testing will take place next week.

=== Todo next week ===

# Start working on ns3-gym-like interface, which is one of the milestones.
# Work with Hao to release the previous version of ns3ai.
# Test the three currently available examples on Linux system.

== Week 4 (June 19 - June 25) ==

=== Achievements ===

# Following my mentors' suggestions, I added an interface for sharing a single structure, to reduce complexity when a vector is unnecessary. Previously, sharing a single structure (as in the Thompson Sampling or Constant Rate examples) required a vector of which only the first element was used.
# Read [https://bits.informatik.hu-berlin.de/~zubow/gawlowicz19_mswim.pdf the paper of ns3-gym] and tried running [https://github.com/tkn-tub/ns3-gym the code] to be more familiar with the OpenAI Gym interface. Now I am developing the Gym interface.
# Linux usage is tested.

=== Problems ===

# The ns3-gym README says it has some issues with the new OpenAI Gym framework, so the <code>gym.make()</code> API is unavailable. Is there any way to solve that? Or perhaps it's only an issue with ns3-gym and not a problem for ns3-ai?

=== Todo next week ===

# Continue developing Gym interface.

== Week 5 (June 26 - July 2) ==

=== Achievements ===

# Completed the a-plus-b example of Gym interface.

=== Problems ===

=== Todo next week ===

# Continue developing other examples using Gym interface.

== Week 6 - 7 (July 3 - July 16) ==

About interface naming: for clarity, I call the interface that uses Boost shared memory directly (in which users need to define the shared structures or vectors) "msg interface", and the interface that is based on msg interface and provides Gym APIs "Gym interface". The former is low level, requires more coding and has stronger capabilities (such as std::vector sharing), while the latter is high level, easier to code but has limited functionality (RL with Gym).

=== Achievements ===

# Following my mentors' suggestions, I modified the Gym interface so that it provides a base class that users can derive from to make their own environment. Basically it is a fork of ns3-gym's interface, but at the low level it uses Boost instead of ZeroMQ for interprocess communication.
# Completed the RL-TCP example using both the Gym interface and the msg interface, and the A Plus B example using the Gym interface.
# Refactored the existing code, including separating the different interfaces into different directories and modifying the CMakeLists files, providing a clearer project structure and easier usage.
# Updated all READMEs, which now contain step-by-step instructions for building and running the examples.

=== Problems ===

# Proper destruction of the msg interface. In the RL-TCP example, I had a reference counting issue (the reference count didn't go to zero, so an object was not destroyed), and fixed it by replacing some Ptr<> with raw pointers. There may be better ways to solve that.
# Because the msg interface must have only one instance that provides synchronized access to the shared memory segment, I use a local variable in a source file so that many functions in different classes can access the single interface. I noticed that ns-3 provides a Singleton class; is that a better way to define the msg interface?

=== Todo next week ===

# Provide some initial benchmark of Gym interface with ns3-gym.
# Do midterm evaluation.

== Week 8 (July 17 - July 23) ==

=== Achievements ===

# '''Successfully finished midterm evaluation. Thank you my mentors Collin and Hao for your guidance!'''
# Benchmarked the running time of the RL-TCP example. In the scenario of 2 nodes with bottleneck_bandwidth=2Mbps, bottleneck_delay=0.01ms, access_bandwidth=10Mbps, access_delay=20ms, I simulated for 1000s and the results show that ns3-ai is slightly faster than ns3-gym: ns3-ai takes 26 seconds and ns3-gym takes 27 seconds.

=== Problems ===

# My mentor suggests that the benchmark doesn't show the advantage of ns3-ai because it uses the total running time rather than C++-Python interaction time. Interaction time is more likely to have a big difference between ns3-ai and ns3-gym because interaction is the place where ns3-ai and ns3-gym differ most. Also, after knowing the C++-Python interaction time and the portion it takes in total time, it's easier to design examples that emphasize the interaction time and better demonstrate the performance of ns3-ai.

=== Todo next week ===

# Conduct benchmarking of interaction time on RL-TCP example.

== Week 9 (July 24 - July 30) ==

=== Achievements ===

# Benchmarked the RL-TCP example (ns3-gym and ns3-ai's Gym interface version) based on C++-Python interaction time. Interaction time is the transmission time of the byte buffer containing serialized Gym environments or actions. To get accurate interaction times, I count CPU cycles (the rdtsc instruction on x86) rather than clock time. Each sample is the end CPU cycle of a transmission minus the start CPU cycle of that transmission. The mean and standard deviation of the samples are calculated. The results show that in both the C++ to Python and Python to C++ directions, the interaction time of ns3-ai is approximately 15 times shorter than that of ns3-gym.
#* ns-3 configuration: ./ns3 configure --enable-examples --build-profile=debug
#* Simulation parameters:
#** bottleneck_bandwidth=2Mbps
#** bottleneck_delay=0.01ms
#** access_bandwidth=10Mbps
#** access_delay=20ms
#** duration=100s
#** time step = 0.1s
#* Benchmark results:
#** [https://github.com/ShenMuyuan/urban-pancake/blob/50ad463ee06377342ff83c9954a13cc66792b4d1/ns3ai_benchmark/cpp2py.png C++ to Python transmission time]
#** [https://github.com/ShenMuyuan/urban-pancake/blob/50ad463ee06377342ff83c9954a13cc66792b4d1/ns3ai_benchmark/py2cpp.png Python to C++ transmission time]
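The per-transmission statistics described above can be sketched as follows. This is only an illustration: the cycle counts are made-up placeholder values, not measurements from the benchmark, the function name <code>summarize_cycles</code> is hypothetical, and the sample standard deviation is an assumption, since the report does not say which variant was used.

```python
import statistics


def summarize_cycles(start_cycles, end_cycles):
    """Return (mean, sample standard deviation) of per-transmission cycle counts."""
    # One sample per transmission: end cycle count minus start cycle count.
    samples = [end - start for start, end in zip(start_cycles, end_cycles)]
    return statistics.mean(samples), statistics.stdev(samples)


# Placeholder values for illustration only (not benchmark data).
starts = [1000, 5000, 9000, 13000]
ends = [1300, 5500, 9400, 13400]

mean_cycles, std_cycles = summarize_cycles(starts, ends)
print(mean_cycles, std_cycles)
```

Running the same summary over the ns3-ai and ns3-gym samples and dividing the two means gives the kind of ratio quoted in the report.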

=== Problems ===

=== Todo next week ===

# Begin developing the Multi-BSS example, which can demonstrate the usage of vectors in the msg interface.

== Week 10 (July 31 - Aug 6) ==

=== Achievements ===

# Updated the lte-cqi example to use the msg interface.
# Working on the multi-bss example, based on [https://gitlab.com/juanvleonr/ns-3-dev/-/blob/clean-tgax/scratch/tgax-residential.cc?ref_type=heads Juan's branch of ns-3-dev].

=== Problems ===

=== Todo next week ===

# Finish the multi-bss example. My work will include making it compile with the latest ns-3, porting it to ns3-ai's new interface and changing the directory structure (moving the tgax code under src to contrib).

== Week 11 (Aug 7 - Aug 13) ==

=== Achievements ===

# Got familiar with the RL algorithm in the multi-bss example, based on [https://www.nsnam.org/tutorials/consortium23/MultiBSS-UW.pdf the slides] and [https://gitlab.com/juanvleonr/ns-3-dev/-/blob/clean-tgax/scratch/tgax-residential.cc?ref_type=heads the code].
# Finished the Python binding and cmake configuration of multi-bss example, with some problems.
# Wrote a quick start guide on the ns3-ai interface, and thoroughly updated the READMEs in the repository.

=== Problems ===

# The Python script of the Multi-BSS example is incompatible with the C++ code because their data definitions differ slightly.
# The algorithm should automatically change the CCA, but the CCA is not changed (always -82).

=== Todo next week ===

# Complete the Multi-BSS example's code and do some benchmark.

== Week 12 (Aug 14 - Aug 20) ==

=== Achievements ===

# Finished the code for Multi-BSS (it runs now, and the CCA value changes to approximately -70)
# Updating documents for future release by Hao
# Enhanced some cmake configurations for better usability
#* Added a protobuf-generate function for protobuf installations that don’t provide it
#* Changed the cmake target from ‘ns3-ai’ to ‘ai’. Now the ns3-ai module can be built with ./ns3 build ai. (Custom modules cannot be built directly with ./ns3 build if they have a ‘ns3-’ prefix, due to some settings in the ./ns3 script.)

=== Problems ===

# In the Multi-BSS example, at 1 min of simulation, VR throughput ≈ 5 Mbps, which does not meet the requirement (50 Mbps). Occasionally VR delay ≈ 0 and throughput ≈ 1e8, possibly due to a statistics error.

=== Todo next week ===

# Adjust the parameters in RL algorithm to meet VR requirements on throughput
# Benchmarking (compare running time with previous interface)

GSOC2023ns3-ai

2023-08-18T00:39:33Z

Muyuan:

{{TOC}}

Back to [[Summer_Projects#Google_Summer_of_Code_2023 | GSoC 2023 projects]]

= Project Overview =

* '''Project Name:''' ns3-ai enhancements
* '''Student:''' Muyuan Shen
* '''Mentors:''' Collin Brady and Hao Yin
* '''Google page:''' https://summerofcode.withgoogle.com/programs/2023/projects/A4KZ7dxo

== Project Goals ==

The proposed project aims to enhance the ns3-ai module, which provides interfaces between ns-3 and Python-based ML frameworks using shared memory, with a focus on performance optimization and expanding the range of supported data structures.
To achieve this, the project will introduce APIs for additional data structures like vector and string in shared memory IPC to reduce the interaction between C++ and Python. Additionally, the project will provide examples demonstrating how to implement ML algorithms within ns-3 using C++ and open-source frameworks such as TensorFlow and PyTorch. The project will also improve the current examples and documentation and integrate new examples, such as LTE handover.
Overall, the project aims to expand and accelerate the capabilities of the ns3-ai module, enabling users to simulate and analyze large-scale networks with greater efficiency and flexibility.

== Repository ==

https://github.com/ShenMuyuan/ns3-ai/tree/improvements

== About Me ==

=== Education ===

As a junior at Huazhong University of Science and Technology, I am majoring in electronic engineering. I am proud to be a member of the Undergraduate Program for Advanced Project-based Information Science Education, also known as the Seed Class, and currently serve as the class monitor. Additionally, I am a project leader in the Dian group, where I engage in extracurricular technical projects.
In terms of relevant coursework, I have excelled in network programming through courses such as C programming language and computer network, in both of which I achieved a perfect grade point of 4.0. These courses have equipped me with a strong foundation in network programming, which I believe will enable me to contribute effectively to relevant projects.
I am a motivated and skilled undergraduate student with a passion for network programming and a track record of academic excellence.

=== Experience with ns-3 ===

During my academic journey, I have had the opportunity to explore computer networking through labs and projects. In particular, in the labs for the computer networking course, I gained valuable insights into how different parameters, such as the number of STAs, CW range, and packet arrival rate, can impact network throughput in the WiFi DCF protocol.
In addition, I have worked on a project that leverages ns-3 as a simulation platform with Prof. Yayu Gao. Through this project, I have gained practical experience in simulating WiFi MAC rate control algorithms, which has further solidified my understanding of ns-3's usage and its object-oriented programming approach.
Overall, my hands-on experience in both labs and projects has allowed me to apply theoretical concepts to practical scenarios and enhanced my network simulation and analysis skills.

= Milestones =

Based on my [https://diangroup.feishu.cn/docx/PSK9d18KLoLvXCxmGIkcbhXTn5b proposal], I divide my project into two phases, listed below.

== Phase one (before midterm evaluation) ==

=== Enhancements for the interface ===

==== std::vector support ====

Introduce APIs for storing data structures like std::vector in shared memory, to reduce the interaction between C++ and Python.

==== gym-like interface ====

Introduce a gym-like interface to enable users to train RL models directly in ns-3 with OpenAI Gym.

=== Enhancements for existing examples ===

Make all previous examples up to date with the CMake build system introduced in ns-3.36, and provide a new example to benchmark the running time of vectors.

== Phase two (after midterm evaluation) ==

=== Integration of ns-3 and C++-based ML frameworks ===

TODO

=== Finishing new examples and benchmarking test ===

TODO

= Weekly Report =

== Week 1 (May 29 - June 4) ==

=== Achievements ===

# Got familiar with the usage of the Boost library and the syntax of Cython pyx files. I am using Boost to support dynamic allocation and synchronization in shared memory, and Cython to wrap C++ code for Python.
# Created the interface to support std::vector in shared memory. Also wrote a new a-plus-b example to demonstrate the usage. It is still in development and currently supports macOS.
# (Update on June 3) Now I am using pybind11 instead of Cython for Python binding, because pybind11 has similar performance but cleaner code. It also makes it easier to install the Python module with cmake.

=== Problems ===

# The code is quite naive and possibly includes some extra interactions that lower performance.
# I have not tested the new interface on Linux.
# (Update on June 3) {{strike|The new interface has hardcoded parts in the setup.py. Users need to explicitly specify their Boost library include and library paths.}}
# (Update on June 3) {{strike|Although I have only one example currently, if there are more, users need to call setup.py repeatedly to install modules, which is inefficient.}}

=== Todo next week ===

# Use the new interface in an existing example such as rl-tcp, compare running time with old interface, to know its performance better.
# Switch to a new branch called "improvements" instead of "cmake", which better shows the project goal.
# (Update on June 3) {{strike|Modify CMakeLists.txt to pass the result of find_package(Boost...) to setup.py, and remove the hardcoded part.}}
# (Update on June 3) {{strike|Make "pip install . --user" a target in Cmake, so that users can install Python modules more easily, like "./ns3 build ns3ai_interfaces".}}
# If I have time, I will test my code on Linux.

== Week 2 (June 5 - June 11) ==

=== Achievements ===

# Updated the [https://github.com/ShenMuyuan/ns3-ai/tree/improvements/examples/rate-control Thompson Sampling example] to use the new interface. Previously, it used a simple packed structure for information sharing. Now it uses the first element of a shared std::vector (which is basically the same structure as before).
# Measured running time of the Thompson Sampling example, old interface vs new interface. Results: old about 5 seconds, new about 12 minutes.

=== Problems ===

# The benchmarking result above shows that, when passing a small amount of data in each interaction, the new interface is about 150 times slower than the old interface.
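The rough arithmetic behind that figure, using the approximate running times reported above (variable names are illustrative):

```python
old_seconds = 5          # old interface: about 5 seconds
new_seconds = 12 * 60    # new interface: about 12 minutes

slowdown = new_seconds / old_seconds
print(slowdown)  # 144.0
```

Since both running times are approximate, the ratio of 144 is consistent with the rounded "150 times" quoted in the report.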

=== Todo next week ===

# Measure the running time of another example (the new multi-bss example) which passes a large amount of data in each interaction, to check whether the new interface improves performance in that case. If the new interface outperforms the old one, the two can coexist for different cases. Otherwise, I will consider modifying the implementation.
# Or, try to optimize the code to make small data interaction faster.

== Week 3 (June 12 - June 18) ==

=== Achievements ===

# Accelerated data interaction by using a spinlock-based semaphore for synchronization. The running time of the [https://github.com/ShenMuyuan/ns3-ai/tree/improvements/examples/rate-control Thompson Sampling example] dropped to 6 seconds on my machine, which means that the performance of small data interaction is close to that of the previous interface.
#* I tried eliminating data copying operations and using references instead, but that hardly improved the running time.
#* I guessed that semaphores would spin instead of sleep, which saves time (although it wastes CPU). So in the synchronization code I replaced the Boost.Interprocess condition variable with a Boost.Interprocess semaphore, but there was no improvement. Investigation with CLion's built-in profiler showed that sleeping took a large portion of the running time. I then read the Boost source code and found that a waiting semaphore does not purely spin: it puts the process to sleep once the spinning time reaches a small threshold. I commented out the spin [https://github.com/boostorg/interprocess/blob/a0c5a8ff176434c9024d4540ce092a2eebb8c5c3/include/boost/interprocess/sync/spin/wait.hpp#LL128C13-L128C13 counting code] to force it to always spin, and the running time dropped considerably.
#* To avoid modifying library code, I created my own semaphore. My implementation is similar to Boost's, but while waiting it only spins and never goes to sleep. This significantly accelerates interaction between Python and C++, reducing the running time to 6s.
# Updated the A Plus B and Constant Rate examples. Examples that currently use the new interface: A Plus B, Thompson Sampling, Constant Rate.
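The spin-only waiting idea can be sketched as below. This is a simplified stand-in, not the ns3-ai code: the real semaphore is written in C++ and lives in Boost shared memory between two processes, whereas this sketch uses two Python threads, and the names <code>SpinSemaphore</code> and <code>ping_pong</code> are hypothetical.

```python
import threading


class SpinSemaphore:
    """Counting semaphore whose wait() busy-loops and never sleeps."""

    def __init__(self, initial=0):
        self._count = initial
        self._lock = threading.Lock()  # protects the counter only

    def post(self):
        with self._lock:
            self._count += 1

    def wait(self):
        while True:
            with self._lock:
                if self._count > 0:
                    self._count -= 1
                    return
            # No sleep here: try again immediately (burns CPU while waiting).


def ping_pong(rounds):
    """Alternate between a main thread and a worker, one step per round."""
    to_worker = SpinSemaphore(0)
    to_main = SpinSemaphore(0)
    log = []

    def worker():
        for i in range(rounds):
            to_worker.wait()          # wait for data from the main side
            log.append(("worker", i))
            to_main.post()            # hand control back

    thread = threading.Thread(target=worker)
    thread.start()
    for i in range(rounds):
        log.append(("main", i))
        to_worker.post()
        to_main.wait()
    thread.join()
    return log


result = ping_pong(3)
print(result)
```

The trade-off is the one noted in the report: a spinning waiter reacts as soon as the counter changes, at the cost of keeping a CPU core busy while it waits.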

=== Problems ===

# The examples have not been tested on Linux yet. Testing will take place next week.

=== Todo next week ===

# Start working on ns3-gym-like interface, which is one of the milestones.
# Work with Hao to release the previous version of ns3ai.
# Test the three currently available examples on Linux system.

== Week 4 (June 19 - June 25) ==

=== Achievements ===

# Following my mentors' suggestions, I added an interface for sharing a single structure, to reduce complexity when a vector is unnecessary. Previously, sharing a single structure (as in the Thompson Sampling or Constant Rate examples) required a vector of which only the first element was used.
# Read [https://bits.informatik.hu-berlin.de/~zubow/gawlowicz19_mswim.pdf the paper of ns3-gym] and tried running [https://github.com/tkn-tub/ns3-gym the code] to be more familiar with the OpenAI Gym interface. Now I am developing the Gym interface.
# Linux usage is tested.
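As a rough picture of sharing a single structure, the sketch below maps one fixed-layout structure onto a common buffer and reads it back through a second view. Everything here is an assumption made for illustration: the <code>RateInfo</code> fields and values are hypothetical, and a <code>bytearray</code> stands in for the shared memory segment that ns3-ai manages through Boost.

```python
import ctypes


class RateInfo(ctypes.Structure):
    # Hypothetical fields; the real examples define their own structures.
    _fields_ = [("mcs", ctypes.c_int32), ("rate_mbps", ctypes.c_double)]


# A bytearray stands in for the shared memory segment.
segment = bytearray(ctypes.sizeof(RateInfo))

# Two views over the same bytes, like the C++ side and the Python side.
writer_view = RateInfo.from_buffer(segment)
reader_view = RateInfo.from_buffer(segment)

writer_view.mcs = 7            # illustrative values only
writer_view.rate_mbps = 65.0
print(reader_view.mcs, reader_view.rate_mbps)
```

With a single structure there is no container to index into, which is the reduction in complexity that the single-structure interface is meant to provide.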

=== Problems ===

# The ns3-gym README says it has some issues with the new OpenAI Gym framework, so the <code>gym.make()</code> API is unavailable. Is there any way to solve that? Or is it perhaps only an issue with ns3-gym and not a problem for ns3-ai?

=== Todo next week ===

# Continue developing Gym interface.

== Week 5 (June 26 - July 2) ==

=== Achievements ===

# Completed the a-plus-b example of Gym interface.

=== Problems ===

=== Todo next week ===

# Continue developing other examples using Gym interface.

== Week 6 - 7 (July 3 - July 16) ==

About interface naming: for clarity, I call the interface that uses Boost shared memory directly (in which users need to define the shared structures or vectors) "msg interface", and the interface that is based on msg interface and provides Gym APIs "Gym interface". The former is low level, requires more coding and has stronger capabilities (such as std::vector sharing), while the latter is high level, easier to code but has limited functionality (RL with Gym).

=== Achievements ===

# Following my mentors' suggestions, I modified the Gym interface so that it provides a base class from which users can derive their own environment. It is basically a fork of ns3-gym's interface, but at the low level it uses Boost instead of ZeroMQ for interprocess communication.
# Completed the RL-TCP example using Gym interface & msg interface, and A plus B example using Gym interface.
# Refactored the existing code, including separating the interfaces into different directories and modifying the CMakeLists files, for a clearer project structure and easier usage.
# Updated all READMEs, which contain step-by-step instructions for building and running the examples.

=== Problems ===

# Proper destruction of the msg interface. In the RL-TCP example, I had a reference counting issue (the reference count didn't go to zero, so an object was not destroyed), and fixed it by replacing some Ptr<> with raw pointers. There may be better ways to solve that.
# Because the msg interface must have only one instance that provides synchronized access to the shared memory segment, I use a variable local to one source file so that functions in different classes can all access that single interface. I noticed that ns-3 provides a Singleton class. Is that a better way to define the msg interface?
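The single-instance requirement in the last item can be pictured with a minimal sketch. It is written in Python for brevity, while the question itself concerns C++ code in ns-3, and the class and method names are hypothetical rather than part of ns3-ai or ns-3.

```python
class MsgInterface:
    """Hypothetical stand-in for the one interface that owns the shared segment."""

    _instance = None  # the only instance, created on first use

    @classmethod
    def get(cls):
        # Every caller, from any class or module, receives the same object.
        if cls._instance is None:
            cls._instance = cls()
        return cls._instance


first = MsgInterface.get()
second = MsgInterface.get()
print(first is second)
```

Compared with a file-local variable, an accessor of this kind makes the sharing explicit at each call site, which is the usual argument for a singleton helper.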

=== Todo next week ===

# Provide some initial benchmark of Gym interface with ns3-gym.
# Do midterm evaluation.

== Week 8 (July 17 - July 23) ==

=== Achievements ===

# '''Successfully finished midterm evaluation. Thank you my mentors Collin and Hao for your guidance!'''
# Benchmarked the running time of the RL-TCP example. In a 2-node scenario with bottleneck_bandwidth=2Mbps, bottleneck_delay=0.01ms, access_bandwidth=10Mbps and access_delay=20ms, I simulated 1000s and the results show that ns3-ai is slightly faster than ns3-gym: ns3-ai takes 26 seconds and ns3-gym takes 27 seconds.

=== Problems ===

# My mentor suggests that the benchmark doesn't show the advantage of ns3-ai because it uses the total running time rather than C++-Python interaction time. Interaction time is more likely to have a big difference between ns3-ai and ns3-gym because interaction is the place where ns3-ai and ns3-gym differ most. Also, after knowing the C++-Python interaction time and the portion it takes in total time, it's easier to design examples that emphasize the interaction time and better demonstrate the performance of ns3-ai.

=== Todo next week ===

# Conduct benchmarking of interaction time on RL-TCP example.

== Week 9 (July 24 - July 30) ==

=== Achievements ===

# Benchmarked the RL-TCP example (ns3-gym and the Gym interface version of ns3-ai) by C++-Python interaction time. Interaction time is the transmission time of the byte buffer containing serialized Gym environments or actions. To measure it accurately, I count CPU cycles (the rdtsc instruction on x86) rather than clock time. Each sample is the CPU cycle count at the end of a transmission minus the count at its start, and the mean and standard deviation of the samples are calculated. The results show that in both the C++ to Python and the Python to C++ directions, the interaction time of ns3-ai is approximately 15 times shorter than that of ns3-gym.
#* ns-3 configuration: ./ns3 configure --enable-examples --build-profile=debug
#* Simulation parameters:
#** bottleneck_bandwidth=2Mbps
#** bottleneck_delay=0.01ms
#** access_bandwidth=10Mbps
#** access_delay=20ms
#** duration=100s
#** time step = 0.1s
#* Benchmark results:
#** [https://github.com/ShenMuyuan/urban-pancake/blob/50ad463ee06377342ff83c9954a13cc66792b4d1/ns3ai_benchmark/cpp2py.png C++ to Python transmission time]
#** [https://github.com/ShenMuyuan/urban-pancake/blob/50ad463ee06377342ff83c9954a13cc66792b4d1/ns3ai_benchmark/py2cpp.png Python to C++ transmission time]

=== Problems ===

=== Todo next week ===

# Begin developing the Multi-BSS example, which can demonstrate the usage of vectors in the msg interface.

== Week 10 (July 31 - Aug 6) ==

=== Achievements ===

# Updated the lte-cqi example to use the msg interface.
# Working on the multi-bss example, based on [https://gitlab.com/juanvleonr/ns-3-dev/-/blob/clean-tgax/scratch/tgax-residential.cc?ref_type=heads Juan's branch of ns-3-dev].

=== Problems ===

=== Todo next week ===

# Finish the multi-bss example. My work will include making it compile with the latest ns-3, porting it to ns3-ai's new interface and changing the directory structure (moving the tgax code under src to contrib).

== Week 11 (Aug 7 - Aug 13) ==

=== Achievements ===

# Got familiar with the RL algorithm in the multi-bss example, based on [https://www.nsnam.org/tutorials/consortium23/MultiBSS-UW.pdf the slides] and [https://gitlab.com/juanvleonr/ns-3-dev/-/blob/clean-tgax/scratch/tgax-residential.cc?ref_type=heads the code].
# Finished the Python binding and cmake configuration of multi-bss example, with some problems.
# Wrote a quick start guide on the ns3-ai interface, and thoroughly updated the READMEs in the repository.

=== Problems ===

# The Python script of the Multi-BSS example is incompatible with the C++ code because their data definitions differ slightly.
# The algorithm should automatically change the CCA, but the CCA is not changed (always -82).

=== Todo next week ===

# Complete the Multi-BSS example's code and do some benchmark.

== Week 12 (Aug 14 - Aug 20) ==

=== Achievements ===

# Finished the code for Multi-BSS (it runs now, and the CCA value changes to approximately -70)
# Updating documents for future release by Hao
# Enhanced some cmake configurations for better usability
#* Added a protobuf-generate function for protobuf installations that don’t provide it
#* Changed the cmake target from ‘ns3-ai’ to ‘ai’. Now the ns3-ai module can be built with ./ns3 build ai. (Custom modules cannot be built directly with ./ns3 build if they have a ‘ns3-’ prefix, due to some settings in the ./ns3 script.)

=== Problems ===

# In the Multi-BSS example, at 1 min of simulation, VR throughput ≈ 5 Mbps, which does not meet the requirement (50 Mbps). Occasionally VR delay ≈ 0 and throughput ≈ 1e8, possibly due to a statistics error.

=== Todo next week ===

# Adjust the parameters in RL algorithm to meet VR requirements on throughput
# Benchmarking (compare running time with previous interface)

GSOC2023ns3-ai

2023-08-10T03:23:46Z

Muyuan:

{{TOC}}

Back to [[Summer_Projects#Google_Summer_of_Code_2023 | GSoC 2023 projects]]

= Project Overview =

* '''Project Name:''' ns3-ai enhancements
* '''Student:''' Muyuan Shen
* '''Mentors:''' Collin Brady and Hao Yin
* '''Google page:''' https://summerofcode.withgoogle.com/programs/2023/projects/A4KZ7dxo

== Project Goals ==

The proposed project aims to enhance the ns3-ai module, which provides interfaces between ns-3 and Python-based ML frameworks using shared memory, with a focus on performance optimization and expanding the range of supported data structures.
To achieve this, the project will introduce APIs for additional data structures like vector and string in shared memory IPC to reduce the interaction between C++ and Python. Additionally, the project will provide examples demonstrating how to implement ML algorithms within ns-3 using C++ and open-source frameworks such as TensorFlow and PyTorch. The project will also improve the current examples and documentation and integrate new examples, such as LTE handover.
Overall, the project aims to expand and accelerate the capabilities of the ns3-ai module, enabling users to simulate and analyze large-scale networks with greater efficiency and flexibility.

== Repository ==

https://github.com/ShenMuyuan/ns3-ai/tree/improvements

== About Me ==

=== Education ===

As a junior at Huazhong University of Science and Technology, I am majoring in electronic engineering. I am proud to be a member of the Undergraduate Program for Advanced Project-based Information Science Education, also known as the Seed Class, and currently serve as the class monitor. Additionally, I am a project leader in the Dian group, where I engage in extracurricular technical projects.
In terms of relevant coursework, I have excelled in network programming through courses such as C programming language and computer network, in both of which I achieved a perfect grade point of 4.0. These courses have equipped me with a strong foundation in network programming, which I believe will enable me to contribute effectively to relevant projects.
I am a motivated and skilled undergraduate student with a passion for network programming and a track record of academic excellence.

=== Experience with ns-3 ===

During my academic journey, I have had the opportunity to explore computer networking through labs and projects. In particular, in the labs for the computer networking course, I gained valuable insights into how different parameters, such as the number of STAs, CW range, and packet arrival rate, can impact network throughput in the WiFi DCF protocol.
In addition, I have worked on a project that leverages ns-3 as a simulation platform with Prof. Yayu Gao. Through this project, I have gained practical experience in simulating WiFi MAC rate control algorithms, which has further solidified my understanding of ns-3's usage and its object-oriented programming approach.
Overall, my hands-on experience in both labs and projects has allowed me to apply theoretical concepts to practical scenarios and enhanced my network simulation and analysis skills.

= Milestones =

Based on my [https://diangroup.feishu.cn/docx/PSK9d18KLoLvXCxmGIkcbhXTn5b proposal], I divide my project into two phases, listed below.

== Phase one (before midterm evaluation) ==

=== Enhancements for the interface ===

==== std::vector support ====

Introduce APIs for storing data structures like std::vector in shared memory, to reduce the interaction between C++ and Python.

==== gym-like interface ====

Introduce a gym-like interface to enable users to train RL models directly in ns-3 with OpenAI Gym.

=== Enhancements for existing examples ===

Make all previous examples up to date with the CMake build system introduced in ns-3.36, and provide a new example to benchmark the running time of vectors.

== Phase two (after midterm evaluation) ==

=== Integration of ns-3 and C++-based ML frameworks ===

TODO

=== Finishing new examples and benchmarking test ===

TODO

= Weekly Report =

== Week 1 (May 29 - June 4) ==

=== Achievements ===

# Got familiar with the usage of the Boost library and the syntax of Cython pyx files. I am using Boost to support dynamic allocation and synchronization in shared memory, and Cython to wrap C++ code for Python.
# Created the interface to support std::vector in shared memory. Also wrote a new a-plus-b example to demonstrate the usage. It is still in development and currently supports macOS.
# (Update on June 3) Now I am using pybind11 instead of Cython for Python binding, because pybind11 has similar performance but cleaner code. It also makes it easier to install the Python module with cmake.

=== Problems ===

# The code is quite naive and possibly includes some extra interactions that lower performance.
# I have not tested the new interface on Linux.
# (Update on June 3) {{strike|The new interface has hardcoded parts in the setup.py. Users need to explicitly specify their Boost library include and library paths.}}
# (Update on June 3) {{strike|Although I have only one example currently, if there are more, users need to call setup.py repeatedly to install modules, which is inefficient.}}

=== Todo next week ===

# Use the new interface in an existing example such as rl-tcp, compare running time with old interface, to know its performance better.
# Switch to a new branch called "improvements" instead of "cmake", which better shows the project goal.
# (Update on June 3) {{strike|Modify CMakeLists.txt to pass the result of find_package(Boost...) to setup.py, and remove the hardcoded part.}}
# (Update on June 3) {{strike|Make "pip install . --user" a target in Cmake, so that users can install Python modules more easily, like "./ns3 build ns3ai_interfaces".}}
# If I have time, I will test my code on Linux.

== Week 2 (June 5 - June 11) ==

=== Achievements ===

# Updated the [https://github.com/ShenMuyuan/ns3-ai/tree/improvements/examples/rate-control Thompson Sampling example] to use the new interface. Previously, it used a simple packed structure for information sharing. Now it uses the first element of a shared std::vector (which is basically the same structure as before).
# Measured running time of the Thompson Sampling example, old interface vs new interface. Results: old about 5 seconds, new about 12 minutes.

=== Problems ===

# The benchmarking result above shows that, when passing a small amount of data in each interaction, the new interface is about 150 times slower than the old interface.

=== Todo next week ===

# Measure the running time of another example (the new multi-bss example) which passes a large amount of data in each interaction, to check whether the new interface improves performance in that case. If the new interface outperforms the old one, the two can coexist for different cases. Otherwise, I will consider modifying the implementation.
# Or, try to optimize the code to make small data interaction faster.

== Week 3 (June 12 - June 18) ==

=== Achievements ===

# Accelerated data interaction by using a spinlock-based semaphore for synchronization. The running time of the [https://github.com/ShenMuyuan/ns3-ai/tree/improvements/examples/rate-control Thompson Sampling example] dropped to 6 seconds on my machine, which means that the performance of small data interaction is close to that of the previous interface.
#* I tried eliminating data copying operations and using references instead, but that hardly improved the running time.
#* I guessed that semaphores would spin instead of sleep, which saves time (although it wastes CPU). So in the synchronization code I replaced the Boost.Interprocess condition variable with a Boost.Interprocess semaphore, but there was no improvement. Investigation with CLion's built-in profiler showed that sleeping took a large portion of the running time. I then read the Boost source code and found that a waiting semaphore does not purely spin: it puts the process to sleep once the spinning time reaches a small threshold. I commented out the spin [https://github.com/boostorg/interprocess/blob/a0c5a8ff176434c9024d4540ce092a2eebb8c5c3/include/boost/interprocess/sync/spin/wait.hpp#LL128C13-L128C13 counting code] to force it to always spin, and the running time dropped considerably.
#* To avoid modifying library code, I created my own semaphore. My implementation is similar to Boost's, but while waiting it only spins and never goes to sleep. This significantly accelerates interaction between Python and C++, reducing the running time to 6s.
# Updated the A Plus B and Constant Rate examples. Examples that currently use the new interface: A Plus B, Thompson Sampling, Constant Rate.

=== Problems ===

# The examples have not been tested on Linux yet. Testing will take place next week.

=== Todo next week ===

# Start working on ns3-gym-like interface, which is one of the milestones.
# Work with Hao to release the previous version of ns3ai.
# Test the three currently available examples on Linux system.

== Week 4 (June 19 - June 25) ==

=== Achievements ===

# Following my mentors' suggestions, I added an interface for sharing a single structure, to reduce complexity when a vector is unnecessary. Previously, sharing a single structure (as in the Thompson Sampling or Constant Rate examples) required a vector of which only the first element was used.
# Read [https://bits.informatik.hu-berlin.de/~zubow/gawlowicz19_mswim.pdf the paper of ns3-gym] and tried running [https://github.com/tkn-tub/ns3-gym the code] to be more familiar with the OpenAI Gym interface. Now I am developing the Gym interface.
# Linux usage is tested.

=== Problems ===

# The ns3-gym README says it has some issues with the new OpenAI Gym framework, so the <code>gym.make()</code> API is unavailable. Is there any way to solve that? Or is it perhaps only an issue with ns3-gym and not a problem for ns3-ai?

=== Todo next week ===

# Continue developing Gym interface.

== Week 5 (June 26 - July 2) ==

=== Achievements ===

# Completed the a-plus-b example of Gym interface.

=== Problems ===

=== Todo next week ===

# Continue developing other examples using Gym interface.

== Week 6 - 7 (July 3 - July 16) ==

About interface naming: for clarity, I call the interface that uses Boost shared memory directly (in which users need to define the shared structures or vectors) "msg interface", and the interface that is based on msg interface and provides Gym APIs "Gym interface". The former is low level, requires more coding and has stronger capabilities (such as std::vector sharing), while the latter is high level, easier to code but has limited functionality (RL with Gym).

=== Achievements ===

# Following my mentors' suggestions, I modified the Gym interface so that it provides a base class from which users can derive their own environment. It is basically a fork of ns3-gym's interface, but at the low level it uses Boost instead of ZeroMQ for interprocess communication.
# Completed the RL-TCP example using Gym interface & msg interface, and A plus B example using Gym interface.
# Refactored the existing code, including separating the interfaces into different directories and modifying the CMakeLists files, for a clearer project structure and easier usage.
# Updated all READMEs, which contain step-by-step instructions for building and running the examples.
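The "derive your own environment from a base class" pattern in the first item can be sketched as follows. This is not the ns3-ai API: the actual base class is C++ and exchanges data with Python over shared memory, which is omitted here. The class names are hypothetical, and the Gymnasium-style <code>reset</code>/<code>step</code> return values are an assumption made for the illustration.

```python
import random


class BaseEnv:
    """Hypothetical base class; subclasses supply the environment logic."""

    def reset(self, seed=None):
        raise NotImplementedError

    def step(self, action):
        raise NotImplementedError


class APlusBEnv(BaseEnv):
    """Toy environment: observe (a, b), get reward 1.0 for answering a + b."""

    def __init__(self):
        self._rng = random.Random()
        self._a = 0
        self._b = 0

    def _draw(self):
        self._a = self._rng.randint(0, 10)
        self._b = self._rng.randint(0, 10)
        return (self._a, self._b)

    def reset(self, seed=None):
        if seed is not None:
            self._rng.seed(seed)
        return self._draw(), {}  # (observation, info)

    def step(self, action):
        reward = 1.0 if action == self._a + self._b else 0.0
        observation = self._draw()
        # (observation, reward, terminated, truncated, info)
        return observation, reward, False, False, {}


env = APlusBEnv()
(a, b), info = env.reset(seed=0)
observation, reward, terminated, truncated, info = env.step(a + b)
print(reward)
```

The agent only ever sees <code>reset</code> and <code>step</code>, so the transport underneath (ZeroMQ in ns3-gym, shared memory in ns3-ai) can change without touching agent code.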

=== Problems ===

# Proper destruction of the msg interface. In the RL-TCP example, I had a reference counting issue (the reference count didn't go to zero, so an object was not destroyed), and fixed it by replacing some Ptr<> with raw pointers. There may be better ways to solve that.
# Because the msg interface must have only one instance that provides synchronized access to the shared memory segment, I use a variable local to one source file so that functions in different classes can all access that single interface. I noticed that ns-3 provides a Singleton class. Is that a better way to define the msg interface?

=== Todo next week ===

# Provide an initial benchmark comparing the Gym interface with ns3-gym.
# Do midterm evaluation.

== Week 8 (July 17 - July 23) ==

=== Achievements ===

# '''Successfully finished the midterm evaluation. Thank you to my mentors Collin and Hao for your guidance!'''
# Benchmarked the running time of the RL-TCP example. In a 2-node scenario with bottleneck_bandwidth=2Mbps, bottleneck_delay=0.01ms, access_bandwidth=10Mbps and access_delay=20ms, I simulated 1000s and the results show that ns3-ai is slightly faster than ns3-gym: ns3-ai takes 26 seconds and ns3-gym takes 27 seconds.

=== Problems ===

# My mentor pointed out that the benchmark doesn't show the advantage of ns3-ai because it measures the total running time rather than the C++-Python interaction time. The interaction time is more likely to differ substantially between ns3-ai and ns3-gym, because interaction is where the two differ most. Also, once the C++-Python interaction time and its share of the total time are known, it becomes easier to design examples that emphasize the interaction time and better demonstrate the performance of ns3-ai.
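
The mentor's point can be made concrete with Amdahl's law: if interaction takes only a small share of the total run, even a large speedup of the interaction barely changes the total time. The function below is a minimal sketch. The report does not state what share of the run is spent in interaction, so any value passed for it is an assumption.

```python
def overall_speedup(interaction_fraction, interaction_speedup):
    """Amdahl's law: whole-run speedup when only the interaction part gets faster.

    interaction_fraction: share of the original run time spent in C++-Python
        interaction, between 0 and 1 (an assumed input, not a measured one).
    interaction_speedup: how many times faster the interaction becomes.
    """
    if not 0.0 <= interaction_fraction <= 1.0:
        raise ValueError("interaction_fraction must be between 0 and 1")
    if interaction_speedup <= 0.0:
        raise ValueError("interaction_speedup must be positive")
    remaining = 1.0 - interaction_fraction
    return 1.0 / (remaining + interaction_fraction / interaction_speedup)
```

For instance, with a hypothetical interaction share of 4% and a hypothetical 10x faster interaction, <code>overall_speedup(0.04, 10)</code> is only about 1.04, which is why a benchmark of total running time can hide a large difference in interaction time.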

=== Todo next week ===

# Conduct benchmarking of interaction time on RL-TCP example.

== Week 9 (July 24 - July 30) ==

=== Achievements ===

# Benchmarked the RL-TCP example (ns3-gym version and ns3-ai Gym interface version) based on C++-Python interaction time. Interaction time is the transmission time of the byte buffer containing serialized Gym environments or actions. To get accurate interaction times, I count CPU cycles (using the x86 rdtsc instruction) rather than using clock time. Each sample is the CPU cycle count at the end of a transmission minus the count at the start of that transmission. The mean and standard deviation of the samples are then calculated. The results show that in both the C++ to Python and the Python to C++ directions, the interaction time of ns3-ai is approximately 15 times shorter than that of ns3-gym.
#* ns-3 configuration: ./ns3 configure --enable-examples --build-profile=debug
#* Simulation parameters:
#** bottleneck_bandwidth=2Mbps
#** bottleneck_delay=0.01ms
#** access_bandwidth=10Mbps
#** access_delay=20ms
#** duration=100s
#** time step = 0.1s
#* Benchmark results:
#** [https://github.com/ShenMuyuan/urban-pancake/blob/50ad463ee06377342ff83c9954a13cc66792b4d1/ns3ai_benchmark/cpp2py.png C++ to Python transmission time]
#** [https://github.com/ShenMuyuan/urban-pancake/blob/50ad463ee06377342ff83c9954a13cc66792b4d1/ns3ai_benchmark/py2cpp.png Python to C++ transmission time]
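
The measurement procedure above (end count minus start count per transmission, then mean and standard deviation) can be sketched as follows. This is not the benchmark code itself: Python has no direct access to rdtsc, so <code>time.perf_counter_ns()</code> is used here as a stand-in counter. The <code>transfer</code> callable is a hypothetical placeholder for one transmission, and the use of the sample standard deviation is an assumption.

```python
import statistics
import time


def measure(transfer, repetitions):
    """Time each call to transfer() and return the per-call durations."""
    samples = []
    for _ in range(repetitions):
        start = time.perf_counter_ns()  # stand-in for reading the cycle counter
        transfer()
        end = time.perf_counter_ns()
        samples.append(end - start)     # one sample = end minus start
    return samples


def summarize(samples):
    """Return (mean, sample standard deviation) of the collected durations."""
    return statistics.mean(samples), statistics.stdev(samples)
```

A typical use would be <code>summarize(measure(send_one_buffer, 1000))</code>, where <code>send_one_buffer</code> is whatever function performs a single C++ to Python or Python to C++ transfer.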

=== Problems ===

=== Todo next week ===

# Begin developing the Multi-BSS example, which can demonstrate the usage of vectors in the msg interface.

== Week 10 (July 31 - Aug 6) ==

=== Achievements ===

# Updated the lte-cqi example to use the msg interface.
# Started working on the multi-bss example, based on [https://gitlab.com/juanvleonr/ns-3-dev/-/blob/clean-tgax/scratch/tgax-residential.cc?ref_type=heads Juan's branch of ns-3-dev].

=== Problems ===

=== Todo next week ===

# Finish multi-bss example. My work will include making it compile with the latest ns-3, porting it to ns3-ai's new interface and changing some directory structure (move the tgax code under src to contrib).

GSOC2023ns3-ai

2023-07-30T07:12:53Z

Muyuan:

{{TOC}}

Back to [[Summer_Projects#Google_Summer_of_Code_2023 | GSoC 2023 projects]]

= Project Overview =

* '''Project Name:''' ns3-ai enhancements
* '''Student:''' Muyuan Shen
* '''Mentors:''' Collin Brady and Hao Yin
* '''Google page:''' https://summerofcode.withgoogle.com/programs/2023/projects/A4KZ7dxo

== Project Goals ==

The proposed project aims to enhance the ns3-ai module, which provides interfaces between ns-3 and Python-based ML frameworks using shared memory, with a focus on performance optimization and expanding the range of supported data structures.
To achieve this, the project will introduce APIs for additional data structures like vector and string in shared memory IPC to reduce the interaction between C++ and Python. Additionally, the project will provide examples demonstrating how to implement ML algorithms within ns-3 using C++ and open-source frameworks such as TensorFlow and PyTorch. The project will also improve the current examples and documentation and integrate new examples, such as LTE handover.
Overall, the project aims to expand and accelerate the capabilities of the ns3-ai module, enabling users to simulate and analyze large-scale networks with greater efficiency and flexibility.

== Repository ==

https://github.com/ShenMuyuan/ns3-ai/tree/improvements

== About Me ==

=== Education ===

As a junior at Huazhong University of Science and Technology, I am majoring in electronic engineering. I am proud to be a member of the Undergraduate Program for Advanced Project-based Information Science Education, also known as the Seed Class, and currently serve as the class monitor. Additionally, I am a project leader in the Dian group, where I engage in extracurricular technical projects.
In terms of relevant coursework, I have excelled in courses such as C Programming Language and Computer Networks, earning a perfect grade point of 4.0 in both. These courses have given me a strong foundation in network programming, which I believe will enable me to contribute effectively to relevant projects.
I am a motivated and skilled undergraduate student with a passion for network programming and a track record of academic excellence.

=== Experience with ns-3 ===

During my academic journey, I have had the opportunity to explore computer networking through labs and projects. In particular, in the labs for the computer networking course, I gained valuable insights into how different parameters, such as the number of STAs, CW range, and packet arrival rate, can impact network throughput in the WiFi DCF protocol.
In addition, I have worked with Prof. Yayu Gao on a project that uses ns-3 as its simulation platform. Through this project, I have gained practical experience in simulating WiFi MAC rate control algorithms, which has further solidified my understanding of how to use ns-3 and of its object-oriented programming approach.
Overall, my hands-on experience in both labs and projects has allowed me to apply theoretical concepts to practical scenarios and enhanced my network simulation and analysis skills.

= Milestones =

Based on my [https://diangroup.feishu.cn/docx/PSK9d18KLoLvXCxmGIkcbhXTn5b proposal], I divide my project into two phases, listed below.

== Phase one (before midterm evaluation) ==

=== Enhancements for the interface ===

==== std::vector support ====

Introduce APIs for storing data structures like std::vector in shared memory, to reduce the interaction between C++ and Python.

==== gym-like interface ====

Introduce a gym-like interface to enable users to train RL models directly in ns-3 with OpenAI Gym.

=== Enhancements for existing examples ===

Bring all previous examples up to date with the CMake build system introduced in ns-3.36, and provide a new example to benchmark the running time of vectors.

== Phase two (after midterm evaluation) ==

=== Integration of ns-3 and C++-based ML frameworks ===

TODO

=== Finishing new examples and benchmarking test ===

TODO

= Weekly Report =

== Week 1 (May 29 - June 4) ==

=== Achievements ===

# Got familiar with the Boost library and with the syntax of Cython pyx files. I am using Boost to support dynamic allocation and synchronization in shared memory, and Cython to wrap C++ code for Python.
# Created the interface to support std::vector in shared memory. I also wrote a new a-plus-b example to demonstrate its usage. It is still in development and currently supports macOS.
# (Update on June 3) I am now using pybind11 instead of Cython for the Python binding, because pybind11 has similar performance but cleaner code. It is also easier to install the Python module with CMake.
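
To illustrate what "a vector of structures in shared memory" means, here is a minimal sketch using only the Python standard library. It is not the ns3-ai interface, which uses Boost on the C++ side and pybind11 for the binding. The layout (a length header followed by packed entries) and the two-field element are assumptions made for the example, and both attachments are opened in one process only to keep the sketch self-contained.

```python
import struct
from multiprocessing import shared_memory

HEADER = struct.Struct("<I")   # number of entries stored in the segment
ENTRY = struct.Struct("<II")   # hypothetical element: two uint32 fields (a, b)


def write_vector(shm, entries):
    """Store a length header followed by the packed entries."""
    HEADER.pack_into(shm.buf, 0, len(entries))
    for i, (a, b) in enumerate(entries):
        ENTRY.pack_into(shm.buf, HEADER.size + i * ENTRY.size, a, b)


def read_vector(shm):
    """Read the length header, then unpack that many entries."""
    (count,) = HEADER.unpack_from(shm.buf, 0)
    return [ENTRY.unpack_from(shm.buf, HEADER.size + i * ENTRY.size)
            for i in range(count)]


def round_trip(entries):
    """Write through one attachment and read back through a second one."""
    size = HEADER.size + len(entries) * ENTRY.size
    writer = shared_memory.SharedMemory(create=True, size=size)
    try:
        write_vector(writer, entries)
        reader = shared_memory.SharedMemory(name=writer.name)
        try:
            return read_vector(reader)
        finally:
            reader.close()
    finally:
        writer.close()
        writer.unlink()  # remove the segment once both sides are done
```

Passing many entries in one segment like this is what lets a whole batch of data cross the C++-Python boundary in a single interaction instead of one interaction per element.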

=== Problems ===

# The code is quite naive and possibly includes some extra interactions that lower performance.
# I have not tested the new interface on Linux.
# (Update on June 3) {{strike|The new interface has hardcoded parts in the setup.py. Users need to explicitly specify their Boost library include and library paths.}}
# (Update on June 3) {{strike|Although I have only one example currently, if there are more, users will need to call setup.py repeatedly to install the modules, which is inefficient.}}

=== Todo next week ===

# Use the new interface in an existing example such as rl-tcp and compare its running time with the old interface, to better understand its performance.
# Switch to a new branch called "improvements" instead of "cmake", which better reflects the project goal.
# (Update on June 3) {{strike|Modify CMakeLists.txt to pass the result of find_package(Boost...) to setup.py, and remove the hardcoded part.}}
# (Update on June 3) {{strike|Make "pip install . --user" a target in CMake, so that users can install Python modules more easily, like "./ns3 build ns3ai_interfaces".}}
# If I have time, I will test my code on Linux.

== Week 2 (June 5 - June 11) ==

=== Achievements ===

# Updated the [https://github.com/ShenMuyuan/ns3-ai/tree/improvements/examples/rate-control Thompson Sampling example] to use the new interface. Previously, it used a simple packed structure for information sharing. Now it uses the first element of a shared std::vector (which is basically the same structure as before).
# Measured the running time of the Thompson Sampling example with the old interface and with the new interface. Results: about 5 seconds with the old interface, about 12 minutes with the new one.

=== Problems ===

# The benchmarking result above shows that, when a small amount of data is passed in each interaction, the new interface is roughly 150 times slower than the old interface (about 12 minutes versus about 5 seconds).

=== Todo next week ===

# Measure the running time of another example (the new multi-bss example) that passes a large amount of data in each interaction, to check whether the new interface improves performance in that case. If the new interface outperforms the old one, the two can coexist for different cases. Otherwise, I will consider modifying the implementation.
# Alternatively, try to optimize the code to make small-data interaction faster.

== Week 3 (June 12 - June 18) ==

=== Achievements ===

# Accelerated data interaction by using a spinlock-based semaphore for synchronization. The running time of the [https://github.com/ShenMuyuan/ns3-ai/tree/improvements/examples/rate-control Thompson Sampling example] dropped to 6 seconds on my machine, which means that the performance of small-data interaction is now close to that of the previous interface.
#* I tried eliminating data copying operations and using references instead, but this hardly improved the running time.
#* I guessed that semaphores would spin instead of sleep, which can save time (although it wastes CPU). So in the synchronization code I replaced the Boost.Interprocess condition variable with a Boost.Interprocess semaphore, but there was no improvement. Investigation with CLion's built-in profiler showed that sleeping took a large portion of the running time. I then read the Boost source code and found that a waiting semaphore does not purely spin: it puts the process to sleep once the spinning time reaches a small threshold. I commented out the spin [https://github.com/boostorg/interprocess/blob/a0c5a8ff176434c9024d4540ce092a2eebb8c5c3/include/boost/interprocess/sync/spin/wait.hpp#LL128C13-L128C13 counting code] to force it to always spin, and the running time dropped considerably.
#* To avoid modifying library code, I created my own version of the semaphore. My implementation is similar to Boost's, but while waiting it only spins and never goes to sleep. This significantly accelerates the interaction between Python and C++, reducing the running time to 6s.
# Updated the A Plus B and Constant Rate examples. Examples that currently use the new interface: A Plus B, Thompson Sampling, Constant Rate.
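
The spin-only waiting strategy described above can be sketched as follows. This is an illustration of the idea, not the ns3-ai implementation: the real semaphore is written in C++ and lives in shared memory between two processes, whereas this sketch uses two threads in one Python process, where the interpreter lock makes spinning slow. The class and function names are hypothetical.

```python
import threading


class SpinSemaphore:
    """Counting semaphore whose wait() busy-waits and never sleeps."""

    def __init__(self, initial=0):
        self._count = initial
        self._guard = threading.Lock()  # protects the counter only

    def post(self):
        """Increment the counter, releasing one waiter."""
        with self._guard:
            self._count += 1

    def try_wait(self):
        """Decrement the counter if it is positive; return whether it succeeded."""
        with self._guard:
            if self._count > 0:
                self._count -= 1
                return True
            return False

    def wait(self):
        """Spin until the counter can be decremented (wastes CPU, avoids sleeping)."""
        while not self.try_wait():
            pass


def ping_pong(rounds):
    """Alternate between two threads, as the C++ and Python sides would."""
    to_worker = SpinSemaphore(0)
    to_main = SpinSemaphore(0)
    log = []

    def worker():
        for i in range(rounds):
            to_worker.wait()
            log.append(("worker", i))
            to_main.post()

    thread = threading.Thread(target=worker)
    thread.start()
    for i in range(rounds):
        log.append(("main", i))
        to_worker.post()
        to_main.wait()
    thread.join()
    return log
```

The trade-off is the one noted in the list above: a spinning waiter reacts as soon as the other side posts, at the cost of keeping a CPU core busy for the whole wait.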

=== Problems ===

# The examples have not been tested on Linux yet. Testing will take place next week.

=== Todo next week ===

# Start working on ns3-gym-like interface, which is one of the milestones.
# Work with Hao to release the previous version of ns3ai.
# Test the three currently available examples on Linux system.

== Week 4 (June 19 - June 25) ==

=== Achievements ===

# Following my mentors' suggestions, I added an interface for sharing a single structure, to reduce complexity when a vector is unnecessary. Previously, sharing a single structure (as in the Thompson Sampling or Constant Rate examples) required a vector of which only the first element was used.
# Read [https://bits.informatik.hu-berlin.de/~zubow/gawlowicz19_mswim.pdf the paper of ns3-gym] and tried running [https://github.com/tkn-tub/ns3-gym the code] to be more familiar with the OpenAI Gym interface. Now I am developing the Gym interface.
# Linux usage is tested.

=== Problems ===

# The ns3-gym README says it has some issues with the new OpenAI Gym framework, so the <code>gym.make()</code> API is unavailable. Is there any way to solve that? Or is it perhaps only an issue with ns3-gym and not a problem for ns3-ai?

=== Todo next week ===

# Continue developing Gym interface.

GSOC2023ns3-ai

2023-07-29T11:48:57Z

Muyuan:

GSOC2023ns3-ai

2023-07-18T15:14:41Z

Muyuan:

GSOC2023ns3-ai

2023-07-04T11:44:16Z

Muyuan:

GSOC2023ns3-ai

2023-06-26T16:52:06Z

Muyuan:

GSOC2023ns3-ai

2023-06-18T09:44:41Z

Muyuan:

GSOC2023ns3-ai

2023-06-12T13:42:44Z

Muyuan:

GSOC2023ns3-ai

2023-06-04T02:37:29Z

Muyuan:

Template:Strike

2023-06-04T02:35:40Z

Muyuan: Created page with "<s {{#if: {{{color|}}}| style="color:{{{color|}}}"}}>{{{1}}}</s><noinclude>"

GSOC2023ns3-ai

2023-06-02T10:11:54Z

Muyuan:

GSOC2023ns3-ai

2023-05-11T01:17:21Z

Muyuan: Added Project Goals, Repository and About Me