ns-3 Model Library

This is the ns-3 Model Library documentation. Primary documentation for the ns-3 project is organized as follows:

  • Several guides that are version controlled for each release (the latest release) and development tree:

    • Tutorial

    • Installation Guide

    • Manual

    • Model Library (this document)

    • Contributing Guide

  • ns-3 Doxygen: Documentation of the public APIs of the simulator

  • ns-3 wiki

This document is written in reStructuredText for Sphinx and is maintained in the doc/models directory of ns-3’s source code (much of the source content is also pulled from the doc/ directory of each module). Source file column width is 100 columns.

1. Organization

This manual compiles documentation for ns-3 models and supporting software that enable users to construct network simulations. It is important to distinguish between modules and models:

  • ns-3 software is organized into separate modules that are each built as a separate software library. Individual ns-3 programs can link the modules (libraries) they need to conduct their simulation.

  • ns-3 models are abstract representations of real-world objects, protocols, devices, etc.

An ns-3 module may consist of more than one model (for instance, the internet module contains models for both TCP and UDP). In general, ns-3 models do not span multiple software modules, however.

This manual provides documentation about the models of ns-3. It complements two other sources of documentation concerning models:

  • the model APIs are documented, from a programming perspective, using Doxygen. Doxygen for ns-3 models is available on the project web server.

  • the ns-3 core is documented in the developer’s manual. ns-3 models make use of the facilities of the core, such as attributes, default values, random numbers, test frameworks, etc. Consult the main web site to find copies of the manual.

Finally, additional documentation about various aspects of ns-3 may exist on the project wiki.

A sample outline of how to write model library documentation can be found by executing the create-module.py program and looking at the template created in the file new-module/doc/new-module.rst.

$ cd src
$ ./create-module.py new-module

The remainder of this document is organized alphabetically by module name.

If you are new to ns-3, you might first want to read below about the network module, which contains some fundamental models for the simulator. The packet model, models for different address formats, and abstract base classes for objects such as nodes, net devices, channels, sockets, and applications are discussed there.

2. Animation

Animation is an important tool for network simulation. While ns-3 does not contain a default graphical animation tool, we currently have two ways to provide animation, namely using the PyViz method or the NetAnim method. The PyViz method is described in http://www.nsnam.org/wiki/PyViz.

We will describe the NetAnim method briefly here.

2.1. NetAnim

NetAnim is a standalone, Qt5-based software executable that uses a trace file generated during an ns-3 simulation to display the topology and animate the packet flow between nodes.

_images/NetAnim_3_105.png

An example of packet animation on wired-links

In addition, NetAnim provides other useful features, such as tables that display packet metadata, as in the image below:

_images/PacketStatistics.png

An example of tables for packet meta-data with protocol filters

A way to visualize the trajectory of a mobile node

_images/Trajectory.png

An example of the trajectory of a mobile node

A way to display the routing-tables of multiple nodes at various points in time

_images/RoutingTables.png

A way to display counters associated with multiple nodes as a chart or a table

_images/NodeCountersChart.png
_images/NodeCountersTable.png

A way to view the timeline of packet transmit and receive events

_images/PacketTimeline.png

2.1.1. Methodology

The class ns3::AnimationInterface is responsible for creating the trace XML file. AnimationInterface uses the tracing infrastructure to track packet flows between nodes. It registers itself as a trace hook for tx and rx events before the simulation begins. When a packet is scheduled for transmission or reception, the corresponding tx and rx trace hooks in AnimationInterface are called. When the rx hooks are called, AnimationInterface knows the two endpoints between which the packet has flowed, and adds this information to the trace file in XML format, along with the corresponding tx and rx timestamps. The XML format will be discussed in a later section. It is important to note that AnimationInterface records a packet only if the rx trace hooks are called: every tx event must be matched by an rx event.

2.1.2. Downloading NetAnim

If NetAnim is not already available in the ns-3 package you downloaded, the latest version can be downloaded using git with the following command:

$ git clone https://gitlab.com/nsnam/netanim.git

2.1.3. Building NetAnim

2.1.3.1. Prerequisites

Qt5 (version 5.4 or later) is required to build NetAnim. The ns-3 Installation Guide lists packages to install for some Linux systems, for macOS (https://www.nsnam.org/docs/installation/html/macos.html#optional), and for Windows.

The Qt site also provides download options.

2.1.3.2. Build steps

To build NetAnim use the following commands:

$ cd netanim
$ make clean
$ qmake NetAnim.pro
$ make

Note: qmake may be named “qmake-qt5” on some systems.

This should create an executable named “NetAnim” in the same directory:

$ ls -l NetAnim
-rwxr-xr-x 1 john john 390395 2012-05-22 08:32 NetAnim

2.1.4. Usage

Using NetAnim is a two-step process:

Step 1: Generate the animation XML trace file during simulation using “ns3::AnimationInterface” in the ns-3 code base.

Step 2: Load the XML trace file generated in Step 1 with the offline Qt5-based animator named NetAnim.

2.1.4.1. Step 1: Generate XML animation trace file

The class “AnimationInterface” under “src/netanim” uses underlying ns-3 trace sources to construct a timestamped ASCII file in XML format.

Examples are found under src/netanim/examples. For example:

$ ./ns3 configure -d debug --enable-examples
$ ./ns3 run "dumbbell-animation"

The above will create an XML file named dumbbell-animation.xml.

2.1.4.1.1. Mandatory
  1. Ensure that your program’s CMakeLists.txt includes the “netanim” module. An example of such a CMakeLists.txt is at src/netanim/examples/CMakeLists.txt.

  2. Include the header [#include "ns3/netanim-module.h"] in your test program.

  3. Add the statement

AnimationInterface anim("animation.xml");  // where "animation.xml" is any arbitrary filename

[For versions before ns-3.13, you also have to call anim.SetXMLOutput() to set the XML mode, and anim.StartAnimation().]

2.1.4.1.2. Optional

The following are optional but useful steps:

// Step 1
anim.SetMobilityPollInterval(Seconds(1));

AnimationInterface records the position of all nodes every 250 ms by default. The statement above sets the periodic interval at which AnimationInterface records the position of all nodes. If the nodes are expected to move very little, it is useful to set a high mobility poll interval to avoid large XML files.

// Step 2
anim.SetConstantPosition(node, 10.0, 20.0); // node is a Ptr<Node>; 10.0 and 20.0 are example x and y coordinates

AnimationInterface requires that the position of all nodes be set. In ns-3 this is done by setting an associated MobilityModel. “SetConstantPosition” is a quick way to set the x-y coordinates of a node which is stationary.

// Step 3
anim.SetStartTime(Seconds(150));
anim.SetStopTime(Seconds(200)); // example value; use any stop time later than the start time

AnimationInterface can generate large XML files. The above statements restrict the window within which AnimationInterface does tracing. Restricting the window serves to focus only on the relevant portions of the simulation and to create manageably small XML files.

// Step 4
AnimationInterface anim("animation.xml", 50000);

Using the above constructor ensures that each animation XML trace file holds at most 50000 packets. For example, if AnimationInterface captures 150000 packets, using the above constructor splits the capture into 3 files:

  • animation.xml - containing the packet range 1-50000

  • animation.xml-1 - containing the packet range 50001-100000

  • animation.xml-2 - containing the packet range 100001-150000
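The splitting rule listed above can be sketched as a small standalone function. This is only an illustration of the arithmetic, not the AnimationInterface implementation: the function name TraceFileForPacket is hypothetical, and packets are assumed to be numbered from 1, as in the ranges above.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper (not part of ns-3): returns the name of the trace file
// that would hold packet number `packetNumber` (1-based) when each file holds
// at most `maxPktsPerFile` packets, following the naming shown above.
std::string TraceFileForPacket(const std::string& baseName,
                               std::uint64_t packetNumber,
                               std::uint64_t maxPktsPerFile)
{
    assert(packetNumber >= 1 && maxPktsPerFile >= 1);
    const std::uint64_t fileIndex = (packetNumber - 1) / maxPktsPerFile;
    if (fileIndex == 0)
    {
        return baseName; // the first file keeps the plain name
    }
    return baseName + "-" + std::to_string(fileIndex);
}
```

With a limit of 50000, packet 50000 maps to animation.xml and packet 50001 to animation.xml-1.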

// Step 5
anim.EnablePacketMetadata(true);

With the above statement, AnimationInterface records the metadata of each packet in the XML trace file. NetAnim can use the metadata to provide better statistics and filters, and to show brief information about a packet, such as the TCP sequence number or the source and destination IP addresses, during packet animation.

CAUTION: Enabling this feature will result in larger XML trace files. Please do NOT enable this feature when using WiMAX links.

// Step 6
anim.UpdateNodeDescription(5, "Access-point");

With the above statement, AnimationInterface assigns the text “Access-point” to node 5.

// Step 7
anim.UpdateNodeSize(6, 1.5, 1.5);

With the above statement, AnimationInterface sets the size of node 6 to scale by 1.5. NetAnim automatically scales the graphics view to fit the boundaries of the topology. This means that NetAnim can scale a node’s size abnormally high or low. Using AnimationInterface::UpdateNodeSize allows you to override the default scaling in NetAnim and use your own custom scale.

// Step 8
anim.UpdateNodeCounter(89, 7, 3.4);

With the above statement, AnimationInterface sets the counter with Id 89, associated with node 7, to the value 3.4. The counter Id 89 is obtained using AnimationInterface::AddNodeCounter. An example usage is in src/netanim/examples/resource-counters.cc.

2.1.4.2. Step 2: Loading the XML in NetAnim
  1. Assuming NetAnim was built, use the command “./NetAnim” to launch NetAnim. Please review the section “Building NetAnim” if NetAnim is not available.

  2. When NetAnim is opened, click on the File open button at the top-left corner and select the XML file generated during Step 1.

  3. Hit the green play button to begin animation.

Here is a video illustrating this: http://www.youtube.com/watch?v=tz_hUuNwFDs

2.1.5. Wiki

For detailed instructions on installing “NetAnim”, FAQs, and loading the XML trace file (mentioned earlier) using NetAnim, please refer to: http://www.nsnam.org/wiki/NetAnim

3. Antenna Module

3.1. Design documentation

3.1.1. Overview

The Antenna module provides:

  1. a class (Angles) and utility functions to deal with angles;

  2. a base class (AntennaModel) that provides an interface for modeling the radiation pattern of an antenna;

  3. a set of classes derived from this base class, each of which models the radiation pattern of a different type of antenna;

  4. a base class (PhasedArrayModel) that provides a flexible interface for modeling a number of Phased Antenna Array (PAA) models;

  5. a class (UniformPlanarArray) derived from this base class, implementing a Uniform Planar Array (UPA) supporting both rectangular and linear lattices.

3.1.2. Angles

The Angles class holds information about an angle in 3D space using spherical coordinates in radian units. Specifically, it uses the azimuth-inclination convention, where

  • Inclination is the angle between the zenith direction (positive z-axis) and the desired direction. It lies in the range [0, pi] radians.

  • Azimuth is the signed angle measured from the positive x-axis, where a positive direction goes towards the positive y-axis. It lies in the range [-pi, pi) radians.

Multiple constructors are present, supporting the most common ways to encode information on a direction. A static boolean variable allows the user to decide whether angles should be printed in radian or degree units.

A number of angle-related utilities are offered, such as radians/degree conversions, for both scalars and vectors, and angle wrapping.
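The azimuth-inclination convention and the wrapping utility described above can be sketched in standalone C++. This is a minimal illustration under the stated ranges, not the Angles class itself: the names Direction, WrapAzimuth, FromCartesian and DegToRad are hypothetical.

```cpp
#include <cassert>
#include <cmath>

const double kPi = std::acos(-1.0);

// Hypothetical container for a direction in the azimuth-inclination convention.
struct Direction
{
    double azimuth;     // radians, in [-pi, pi)
    double inclination; // radians, in [0, pi]
};

// Wrap an arbitrary angle (radians) into [-pi, pi).
double WrapAzimuth(double a)
{
    a = std::fmod(a + kPi, 2.0 * kPi); // result lies in (-2pi, 2pi)
    if (a < 0.0)
    {
        a += 2.0 * kPi; // shift into [0, 2pi)
    }
    return a - kPi;
}

// Convert a non-zero Cartesian vector into (azimuth, inclination).
Direction FromCartesian(double x, double y, double z)
{
    const double r = std::sqrt(x * x + y * y + z * z);
    assert(r > 0.0); // the direction of the null vector is undefined
    Direction d;
    d.azimuth = WrapAzimuth(std::atan2(y, x)); // measured from +x towards +y
    d.inclination = std::acos(z / r);          // measured from +z (zenith)
    return d;
}

// Degrees-to-radians conversion for scalars.
double DegToRad(double degrees)
{
    return degrees * kPi / 180.0;
}
```

For instance, a vector along the positive y-axis maps to an azimuth and an inclination of pi/2 each, and an azimuth of exactly pi is wrapped to -pi so that the result stays inside the half-open range.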

3.1.3. AntennaModel

The AntennaModel uses the coordinate system adopted in [Balanis] and depicted in Figure Coordinate system of the AntennaModel. This system is obtained by translating the Cartesian coordinate system used by the ns-3 MobilityModel into the new origin o which is the location of the antenna, and then transforming the coordinates of every generic point p of the space from Cartesian coordinates (x,y,z) into spherical coordinates (r, \theta,\phi). The antenna model neglects the radial component r, and only considers the angle components (\theta, \phi). An antenna radiation pattern is then expressed as a mathematical function g(\theta, \phi) \longrightarrow \mathcal{R} that returns the gain (in dB) for each possible direction of transmission/reception. All angles are expressed in radians.

_images/antenna-coordinate-system.png

Coordinate system of the AntennaModel

3.1.4. Single antenna models

In this section we describe the antenna radiation pattern models that are included within the antenna module.

3.1.4.1. IsotropicAntennaModel

This antenna radiation pattern model provides a unitary gain (0 dB) for all directions.

3.1.4.2. CosineAntennaModel

This is the cosine model described in [Chunjian]: the antenna gain is determined as:

g(\phi, \theta) = \cos^{n} \left(\frac{\phi - \phi_{0}}{2}  \right)

where \phi_{0} is the azimuthal orientation of the antenna (i.e., its direction of maximum gain) and the exponent

n = -\frac{3}{20 \log_{10} \left( \cos \frac{\phi_{3dB}}{4} \right)}

determines the desired 3dB beamwidth \phi_{3dB}. Note that this radiation pattern is independent of the inclination angle \theta.

A major difference between the model of [Chunjian] and the one implemented in the class CosineAntennaModel is that only the element factor (i.e., what is described by the above formulas) is considered. In fact, [Chunjian] also considered an additional antenna array factor. The latter is excluded because we expect that the average user would want to specify a given beamwidth exactly, without adding an array factor at a later stage, which would in practice alter the effective beamwidth of the resulting radiation pattern.
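The two formulas for g and n can be evaluated numerically as in the sketch below. It is not the CosineAntennaModel code: the function names are hypothetical, and the conversion to dB uses 20 log10(g), an assumption that follows from the definition of n (it makes the gain exactly -3 dB at half the beamwidth away from \phi_{0}).

```cpp
#include <cassert>
#include <cmath>

// Exponent n that yields the requested 3 dB beamwidth (in radians).
double CosineExponent(double beamwidthRad)
{
    // cos(beamwidth / 4) must stay positive for the logarithm to exist.
    assert(beamwidthRad > 0.0 && beamwidthRad < 2.0 * std::acos(-1.0));
    return -3.0 / (20.0 * std::log10(std::cos(beamwidthRad / 4.0)));
}

// Gain in dB at azimuth phi for an antenna oriented at phi0 (radians).
// Assumes (phi - phi0) has already been wrapped into [-pi, pi].
double CosineGainDb(double phi, double phi0, double beamwidthRad)
{
    const double n = CosineExponent(beamwidthRad);
    const double g = std::pow(std::cos((phi - phi0) / 2.0), n);
    return 20.0 * std::log10(g); // assumption: g is an amplitude factor
}
```

As a check, for any valid beamwidth the function returns 0 dB at \phi = \phi_{0} and -3 dB at \phi = \phi_{0} + \phi_{3dB}/2.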

3.1.4.3. ParabolicAntennaModel

This model is based on the parabolic approximation of the main lobe radiation pattern. It is often used in the context of cellular systems to model the radiation pattern of a cell sector, see for instance [R4-092042a] and [Calcev]. The antenna gain in dB is determined as:

g_{dB}(\phi, \theta) = -\min \left( 12 \left(\frac{\phi  - \phi_{0}}{\phi_{3dB}} \right)^2, A_{max} \right)

where \phi_{0} is the azimuthal orientation of the antenna (i.e., its direction of maximum gain), \phi_{3dB} is its 3 dB beamwidth, and A_{max} is the maximum attenuation in dB of the antenna. Note that this radiation pattern is independent of the inclination angle \theta.
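The formula above translates directly into code. The following is a minimal sketch with a hypothetical function name, not the ParabolicAntennaModel implementation; angles are in radians and the difference \phi - \phi_{0} is assumed to be already wrapped.

```cpp
#include <algorithm>
#include <cassert>

// Gain in dB of the parabolic pattern: -min(12 * ((phi - phi0) / bw)^2, Amax).
double ParabolicGainDb(double phi, double phi0, double beamwidthRad, double maxAttenuationDb)
{
    assert(beamwidthRad > 0.0 && maxAttenuationDb >= 0.0);
    const double ratio = (phi - phi0) / beamwidthRad;
    return -std::min(12.0 * ratio * ratio, maxAttenuationDb);
}
```

The factor 12 makes the attenuation exactly 3 dB at half the beamwidth away from \phi_{0}, and the min() clamps the attenuation to A_{max} outside the main lobe.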

3.1.4.4. ThreeGppAntennaModel

This model implements the antenna element described in [38901]. Parameters are fixed from the technical report, so neither attributes nor setters are provided. The model is largely based on the ParabolicAntennaModel.

3.1.5. Phased Array Model

The class PhasedArrayModel has been created with flexibility in mind. It abstracts the basic idea of a Phased Antenna Array (PAA) by removing any constraint on the position of each element, and instead generalizes the concept of steering and beamforming vectors, solely based on the generalized location of the antenna elements. For details on Phased Array Antennas see for instance [Mailloux].

Derived classes must implement the following functions:

  • GetNumElems: returns the number of antenna elements

  • GetElementLocation: returns the location of the antenna element with the specified index, normalized with respect to the wavelength

  • GetElementFieldPattern: returns the horizontal and vertical components of the antenna element field pattern at the specified direction. Same polarization (configurable) for all antenna elements of the array is considered.

The class PhasedArrayModel also assumes that all antenna elements are equal, a typical key assumption which allows the PAA field pattern to be modeled as the sum of the array factor, given by the geometry of the location of the antenna elements, and the element field pattern. Any class derived from AntennaModel is a valid antenna element for the PhasedArrayModel, which makes the framework very flexible.

3.1.5.1. UniformPlanarArray

The class UniformPlanarArray is a generic implementation of Uniform Planar Arrays (UPAs), supporting rectangular and linear regular lattices. It closely follows the implementation described in 3GPP TR 38.901 [38901], considering only a single panel, i.e., N_{g} = M_{g} = 1.

By default, the antenna array is orthogonal to the x-axis, pointing towards the positive direction, but the orientation can be changed through the attributes “BearingAngle”, which adjusts the azimuth angle, and “DowntiltAngle”, which adjusts the elevation angle. The slant angle is instead fixed and assumed to be 0.

The number of antenna elements in the vertical and horizontal directions can be configured through the attributes “NumRows” and “NumColumns”, while the spacing between the horizontal and vertical elements can be configured through the attributes “AntennaHorizontalSpacing” and “AntennaVerticalSpacing”.

UniformPlanarArray supports the concept of antenna ports following the sub-array partition model for TXRU virtualization, as described in Section 5.2.2 of 3GPP TR 36.897 [3GPP_TR36897]. The number of antenna ports in the vertical and horizontal directions can be configured through the attributes “NumVerticalPorts” and “NumHorizontalPorts”, respectively. For example, if “NumRows” and “NumColumns” are set to 2 and 4, and “NumVerticalPorts” and “NumHorizontalPorts” to 1 and 2, then the antenna elements in the first two columns of the antenna array belong to the first antenna port, and those in the third and fourth columns belong to the second antenna port. Note that “NumRows” and “NumColumns” must be multiples of “NumVerticalPorts” and “NumHorizontalPorts”, respectively.
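The sub-array partition in the example above can be sketched as an index computation. This is an illustration only: the struct ArrayConfig, the function PortOfElement and the zero-based (vertical, horizontal) port indices are hypothetical, and the way UniformPlanarArray numbers ports internally may differ.

```cpp
#include <cassert>
#include <utility>

// Hypothetical mirror of the four attributes discussed above.
struct ArrayConfig
{
    unsigned numRows;
    unsigned numColumns;
    unsigned numVerticalPorts;
    unsigned numHorizontalPorts;
};

// Returns the zero-based (vertical, horizontal) port indices of the element
// at (row, col), assuming contiguous, equally sized sub-arrays per port.
std::pair<unsigned, unsigned> PortOfElement(const ArrayConfig& c, unsigned row, unsigned col)
{
    assert(c.numVerticalPorts > 0 && c.numHorizontalPorts > 0);
    assert(c.numRows % c.numVerticalPorts == 0);      // rows must divide evenly
    assert(c.numColumns % c.numHorizontalPorts == 0); // columns must divide evenly
    assert(row < c.numRows && col < c.numColumns);
    const unsigned rowsPerPort = c.numRows / c.numVerticalPorts;
    const unsigned colsPerPort = c.numColumns / c.numHorizontalPorts;
    return {row / rowsPerPort, col / colsPerPort};
}
```

With 2 rows, 4 columns, 1 vertical port and 2 horizontal ports, columns 0 and 1 map to horizontal port 0 and columns 2 and 3 to horizontal port 1, matching the example in the text.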

Whether the antenna is dual-polarized or not is configured through the attribute “IsDualPolarized”. If the antenna array is dual-polarized, the total number of antenna elements is doubled and the two polarizations are overlapped in space. The polarization slant angle of the antenna elements belonging to the first polarization is configured through the attribute “PolSlantAngle”, while the antenna elements of the second polarization have the polarization slant angle minus 90 degrees, as described in [38901] (i.e., {\zeta}).

3.1.5.2. CircularApertureAntennaModel

The class CircularApertureAntennaModel implements the radiation pattern described in [38811]. Specifically, the latter represents parabolic antennas, i.e., antennas which are typically used for achieving long-range communications such as earth-to-satellite links. The default boresight orientation is parallel to the positive z-axis, and it can be tuned by using the AntennaInclination and AntennaAzimuth parameters. This implementation provides an exact characterization of the antenna field pattern, by leveraging the standard library Bessel functions implementation introduced with C++17. Accordingly, the antenna gain G at an angle \theta from the boresight main beam is evaluated as:

\begin{cases}
G \cdot 4\left| \frac{J_{1}\left( k\cdot a\cdot \sin\theta \right)}{k\cdot a\cdot \sin\theta} \right|^{2} & \text{for } 0<\left| \theta \right|\leq 90^{\circ} \\
G \cdot 1 & \text{for } \theta=0
\end{cases}

where J_{1}() is the Bessel function of the first kind and first order, and a is the radius of the antenna’s circular aperture. The parameter k is equal to k=\frac{2\pi f}{c}, where f is the carrier frequency and c is the speed of light in vacuum. The parameters G (in logarithmic scale), a and f can be configured by using the attributes “AntennaMaxGainDb”, “AntennaCircularApertureRadius” and “OperatingFrequency”, respectively. This type of antenna features a symmetric radiation pattern, meaning that a single angle, measured from the boresight direction, is sufficient to characterize the radiation strength along a given direction.
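The gain expression can be evaluated with the C++17 special math function std::cyl_bessel_j, as in the sketch below. It is not the CircularApertureAntennaModel code: the function name and parameter names are hypothetical, k is taken as 2\pi f / c with c the speed of light in vacuum, and the maximum gain is handled in dB so that the pattern term is added as 10 log10 of the linear factor. Note that some standard library implementations do not ship the C++17 special math functions.

```cpp
#include <cassert>
#include <cmath>

// Gain in dB at an angle thetaRad (radians) from boresight, for a circular
// aperture of radius apertureRadiusM (metres) at frequencyHz (hertz).
double CircularApertureGainDb(double thetaRad,
                              double maxGainDb,
                              double apertureRadiusM,
                              double frequencyHz)
{
    const double pi = std::acos(-1.0);
    const double c = 299792458.0; // speed of light in vacuum, m/s
    assert(apertureRadiusM > 0.0 && frequencyHz > 0.0);
    assert(std::fabs(thetaRad) <= pi / 2.0); // pattern defined up to 90 degrees
    if (thetaRad == 0.0)
    {
        return maxGainDb; // boresight: the pattern factor equals 1
    }
    const double k = 2.0 * pi * frequencyHz / c;
    const double x = k * apertureRadiusM * std::sin(std::fabs(thetaRad));
    // 4 * |J1(x) / x|^2 written as (2 * J1(x) / x)^2
    const double field = 2.0 * std::cyl_bessel_j(1.0, x) / x;
    return maxGainDb + 10.0 * std::log10(field * field);
}
```

Because J_{1}(x) approaches x/2 for small x, the pattern factor tends to 1 near boresight, so the two branches of the formula join continuously. At the zeros of J_{1} the returned value is negative infinity, which corresponds to the nulls of the pattern.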

_images/circular-antenna-pattern.png

Circular aperture antenna radiation pattern with G = 38.5 dB and a = 10 \frac{c}{f}.

Balanis

C.A. Balanis, “Antenna Theory - Analysis and Design”, Wiley, 2nd Ed.

Chunjian

Li Chunjian, “Efficient Antenna Patterns for Three-Sector WCDMA Systems”, Master of Science Thesis, Chalmers University of Technology, Göteborg, Sweden, 2003

Calcev

George Calcev and Matt Dillon, “Antenna Tilt Control in CDMA Networks”, in Proc. of the 2nd Annual International Wireless Internet Conference (WICON), 2006

R4-092042a

3GPP TSG RAN WG4 (Radio) Meeting #51, R4-092042, Simulation assumptions and parameters for FDD HeNB RF requirements.

38901

3GPP. 2018. TR 38.901, Study on channel model for frequencies from 0.5 to 100 GHz, V15.0.0. (2018-06).

38811

3GPP. 2018. TR 38.811, Study on New Radio (NR) to support non-terrestrial networks, V15.4.0. (2020-09).

Mailloux

Robert J. Mailloux, “Phased Array Antenna Handbook”, Artech House, 2nd Ed.

3GPP_TR36897

3GPP. 2015. TR 36.897. Study on elevation beamforming / Full-Dimension (FD) Multiple Input Multiple Output (MIMO) for LTE. V13.0.0. (2015-06)

3.2. User Documentation

The antenna models can be used with all the wireless technologies and physical layer models that support them. Currently, this includes the physical layer models based on the SpectrumPhy. Please refer to the documentation of each of these models for details.

3.3. Testing Documentation

In this section we describe the test suites included with the antenna module that verify its correct functionality.

3.3.1. Angles

The unit test suite angles verifies that the Angles class is constructed properly by correct conversion from 3D Cartesian coordinates according to the available methods (construction from a single vector and from a pair of vectors). For each method, several test cases are provided that compare the values (\phi, \theta) determined by the constructor to known reference values. The test passes if for each case the values are equal to the reference up to a tolerance of 10^{-10} which accounts for numerical errors.

3.3.2. DegreesToRadians

The unit test suite degrees-radians verifies that the methods DegreesToRadians and RadiansToDegrees work properly by comparing with known reference values in a number of test cases. Each test case passes if the comparison is equal up to a tolerance of 10^{-10} which accounts for numerical errors.

3.3.3. IsotropicAntennaModel

The unit test suite isotropic-antenna-model checks that the IsotropicAntennaModel class works properly, i.e., that it always returns a gain of 0 dB regardless of the direction.

3.3.4. CosineAntennaModel

The unit test suite cosine-antenna-model checks that the CosineAntennaModel class works properly. Several test cases are provided that check for the antenna gain value calculated at different directions and for different values of the orientation, the reference gain and the beamwidth. The reference gain is calculated by hand. Each test case passes if the reference gain in dB is equal to the value returned by CosineAntennaModel within a tolerance of 0.001, which accounts for the approximation done for the calculation of the reference values.

3.3.5. ParabolicAntennaModel

The unit test suite parabolic-antenna-model checks that the ParabolicAntennaModel class works properly. Several test cases are provided that check for the antenna gain value calculated at different directions and for different values of the orientation, the maximum attenuation and the beamwidth. The reference gain is calculated by hand. Each test case passes if the reference gain in dB is equal to the value returned by ParabolicAntennaModel within a tolerance of 0.001, which accounts for the approximation done for the calculation of the reference values.

4. Ad Hoc On-Demand Distance Vector (AODV)

This model implements the base specification of the Ad Hoc On-Demand Distance Vector (AODV) protocol. The implementation is based on RFC 3561.

The model was written by Elena Buchatskaia and Pavel Boyko of ITTP RAS, and is based on the ns-2 AODV model developed by the CMU/MONARCH group and optimized and tuned by Samir Das and Mahesh Marina, University of Cincinnati, and also on the AODV-UU implementation by Erik Nordström of Uppsala University.

4.1. Model Description

The source code for the AODV model lives in the directory src/aodv.

4.1.1. Design

Class ns3::aodv::RoutingProtocol implements all functionality of service packet exchange and inherits from ns3::Ipv4RoutingProtocol. The base class defines two virtual functions for packet routing and forwarding. The first one, ns3::aodv::RoutingProtocol::RouteOutput, is used for locally originated packets, and the second one, ns3::aodv::RoutingProtocol::RouteInput, is used for forwarding and/or delivering received packets.

Protocol operation depends on many adjustable parameters. These parameters are attributes of ns3::aodv::RoutingProtocol. Their default values are drawn from the RFC, and they allow protocol features, such as broadcasting HELLO messages or broadcasting data packets, to be enabled or disabled.

AODV discovers routes on demand. Therefore, the AODV model buffers all packets while a route request packet (RREQ) is disseminated. A packet queue is implemented in aodv-rqueue.cc. A smart pointer to the packet, ns3::Ipv4RoutingProtocol::ErrorCallback, ns3::Ipv4RoutingProtocol::UnicastForwardCallback, and the IP header are stored in this queue. The packet queue implements garbage collection of old packets and a queue size limit.
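The buffering behaviour described above (a bounded queue whose old entries are garbage-collected) can be sketched as follows. This is a simplified standalone illustration, not the code in aodv-rqueue.cc: the names PendingPacket and PendingQueue are hypothetical, time is a plain double in seconds, the stored callbacks and IP header are omitted, and dropping the oldest entry when the queue is full is an assumed policy.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical stand-in for a buffered packet: an id, a destination and an
// absolute expiry time in seconds.
struct PendingPacket
{
    unsigned id;
    unsigned destination;
    double expireTime;
};

class PendingQueue
{
  public:
    PendingQueue(std::size_t maxLen, double timeout)
        : m_maxLen(maxLen),
          m_timeout(timeout)
    {
        assert(maxLen > 0 && timeout > 0.0);
    }

    // Buffer a packet at time `now`; when full, the oldest entry is dropped.
    void Enqueue(unsigned id, unsigned destination, double now)
    {
        Purge(now);
        if (m_queue.size() == m_maxLen)
        {
            m_queue.pop_front();
        }
        m_queue.push_back({id, destination, now + m_timeout});
    }

    // Remove the oldest live packet for `destination`, e.g. once a route is
    // found. Returns false if no such packet is buffered.
    bool Dequeue(unsigned destination, double now, PendingPacket& out)
    {
        Purge(now);
        auto it = std::find_if(m_queue.begin(), m_queue.end(), [destination](const PendingPacket& p) {
            return p.destination == destination;
        });
        if (it == m_queue.end())
        {
            return false;
        }
        out = *it;
        m_queue.erase(it);
        return true;
    }

    std::size_t Size() const
    {
        return m_queue.size();
    }

  private:
    // Garbage-collect entries whose lifetime has elapsed.
    void Purge(double now)
    {
        m_queue.erase(std::remove_if(m_queue.begin(),
                                     m_queue.end(),
                                     [now](const PendingPacket& p) { return p.expireTime <= now; }),
                      m_queue.end());
    }

    std::size_t m_maxLen;
    double m_timeout;
    std::deque<PendingPacket> m_queue;
};
```

Purging on every access keeps the sketch short; an event-driven simulator could equally schedule the clean-up with a timer.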

The routing table implementation supports garbage collection of old entries and the state machine defined in the standard. It is implemented as an STL map container. The key is a destination IP address.

Some elements of protocol operation aren’t described in the RFC. These elements generally concern cooperation of different OSI model layers. The model uses the following heuristics:

  • This AODV implementation can detect the presence of unidirectional links and avoid them if necessary. If the node the model receives an RREQ for is a neighbor, the cause may be a unidirectional link. This heuristic is taken from the AODV-UU implementation and can be disabled.

  • Protocol operation strongly depends on the broken link detection mechanism. The model implements two such heuristics. First, this implementation supports HELLO messages. However, HELLO messages are not a good way to perform neighbor sensing in a wireless environment (at least not over 802.11). Therefore, one may experience bad performance when running over wireless. There are several reasons for this: 1) HELLO messages are broadcast. In 802.11, broadcasting is often done at a lower bit rate than unicasting, thus HELLO messages can travel further than unicast data. 2) HELLO messages are small, thus less prone to bit errors than data transmissions, and 3) Broadcast transmissions are not guaranteed to be bidirectional, unlike unicast transmissions. Second, we use layer 2 feedback when possible. A link is considered to be broken if frame transmission results in a transmission failure for all retries. This mechanism is meant for active links and works faster than the first method.

The layer 2 feedback implementation relies on the TxErrHeader trace source, currently supported in AdhocWifiMac only.

4.1.2. Scope and Limitations

The model is for IPv4 only. The following optional protocol optimizations are not implemented:

  1. Local link repair.

  2. RREP, RREQ and HELLO message extensions.

These techniques require direct access to IP header, which contradicts the assertion from the AODV RFC that AODV works over UDP. This model uses UDP for simplicity, hindering the ability to implement certain protocol optimizations. The model doesn’t use low layer raw sockets because they are not portable.

4.1.3. Future Work

No announced plans.

5. 3GPP HTTP applications

5.1. Model Description

The model is a part of the applications library. The HTTP model is based on a commonly used 3GPP model in standardization [4].

5.1.1. Design

This traffic generator simulates web browsing traffic using the Hypertext Transfer Protocol (HTTP). It consists of one or more ThreeGppHttpClient applications which connect to a ThreeGppHttpServer application. The client models a web browser which requests web pages from the server. The server is then responsible for serving the web pages as requested. Please refer to ThreeGppHttpClientHelper and ThreeGppHttpServerHelper for usage instructions.

Technically speaking, the client transmits request objects to demand a service from the server. Depending on the type of request received, the server transmits either:

  • a main object, i.e., the HTML file of the web page; or

  • an embedded object, e.g., an image referenced by the HTML file.

The main and embedded object sizes are illustrated in figures 3GPP HTTP main object size histogram and 3GPP HTTP embedded object size histogram.

_images/http-main-object-size.png

3GPP HTTP main object size histogram

_images/http-embedded-object-size.png

3GPP HTTP embedded object size histogram

A major portion of the traffic pattern is reading time, which does not generate any traffic. Because of this, one may need to simulate a good number of clients and/or sufficiently long simulation duration in order to generate any significant traffic in the system. Reading time is illustrated in 3GPP HTTP reading time histogram.

_images/http-reading-time.png

3GPP HTTP reading time histogram

5.1.1.1. 3GPP HTTP server description

3GPP HTTP server is a model application which simulates the traffic of a web server. This application works in conjunction with ThreeGppHttpClient applications.

The application works by responding to requests. Each request is a small packet of data which contains a ThreeGppHttpHeader. The value of the content type field of the header determines the type of object that the client is requesting. The possible type is either a main object or an embedded object.

The application is responsible for generating the right type of object and sending it back to the client. The size of each object to be sent is randomly determined (see ThreeGppHttpVariables). Each object may be sent as multiple packets due to limited socket buffer space.

To assist with the transmission, the application maintains several instances of ThreeGppHttpServerTxBuffer. Each instance keeps track of the object type to be served and the number of bytes left to be sent.

The application accepts connection requests from clients. Every connection is kept open until the client disconnects.

Maximum transmission unit (MTU) size is configurable in ThreeGppHttpServer or in ThreeGppHttpVariables. By default, the low variant is 536 bytes and the high variant is 1460 bytes. The default values are set with the intention of having a TCP header (size of which is 40 bytes) added to the packet in such a way that lower layers can avoid splitting packets. A change of MTU size affects all TCP sockets after the server application has started. It is mainly visible in the sizes of packets received by ThreeGppHttpClient applications.
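As a worked check of the default values, adding the 40-byte header overhead stated above to each variant gives the resulting packet sizes. The interpretation of the totals is our assumption rather than something stated by the model: 576 bytes is the datagram size that every IPv4 host is required to accept, and 1500 bytes is the usual Ethernet MTU, which is presumably why lower layers can carry such packets without splitting them.

```latex
536 + 40 = 576 \text{ bytes}, \qquad 1460 + 40 = 1500 \text{ bytes}
```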

5.1.1.2. 3GPP HTTP client description

3GPP HTTP client is a model application which simulates the traffic of a web browser. This application works in conjunction with a ThreeGppHttpServer application.

In summary, the application works as follows.

  1. Upon start, it opens a connection to the destination web server (ThreeGppHttpServer).

  2. After the connection is established, the application immediately requests a main object from the server by sending a request packet.

  3. After receiving a main object (which can take some time if it consists of several packets), the application “parses” the main object. Parsing time is illustrated in figure 3GPP HTTP parsing time histogram.

  4. The parsing takes a short time (randomly determined) to determine the number of embedded objects (also randomly determined) in the web page. The number of embedded objects is illustrated in the figure 3GPP HTTP number of embedded objects histogram.

    • If at least one embedded object is determined, the application requests the first embedded object from the server. The request for the next embedded object follows after the previous embedded object has been completely received.

    • If there are no more embedded objects to request, the application enters the reading time.

  5. Reading time is a long delay (again, randomly determined) where the application does not induce any network traffic, thus simulating the user reading the downloaded web page.

  6. After the reading time is finished, the process repeats from step 2.
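The request cycle in the steps above can be sketched as a small state machine. This is only an illustration: the state names, the ClientStep structure and the NextStep function are hypothetical and are not the model's own API, and the randomly determined number of embedded objects is passed in as a plain argument.

```cpp
#include <cstdint>

// Hypothetical state names for illustration; not the model's own enum.
enum class ClientState
{
    Connecting,
    ExpectingMainObject,
    ParsingMainObject,
    ExpectingEmbeddedObject,
    Reading
};

struct ClientStep
{
    ClientState state;
    uint32_t embeddedLeft; // embedded objects still to be requested
};

// Advance one step of the request cycle. `embeddedDrawn` stands in for the
// randomly determined number of embedded objects found while parsing; it is
// only used when leaving the parsing state.
ClientStep
NextStep(ClientStep current, uint32_t embeddedDrawn)
{
    switch (current.state)
    {
    case ClientState::Connecting: // steps 1-2: connected, main object requested
        return {ClientState::ExpectingMainObject, 0u};
    case ClientState::ExpectingMainObject: // step 3: main object fully received
        return {ClientState::ParsingMainObject, 0u};
    case ClientState::ParsingMainObject: // step 4: parsing finished
        if (embeddedDrawn > 0)
        {
            // The first embedded object is requested right away.
            return {ClientState::ExpectingEmbeddedObject, embeddedDrawn - 1u};
        }
        return {ClientState::Reading, 0u};
    case ClientState::ExpectingEmbeddedObject: // an embedded object was received
        if (current.embeddedLeft > 0)
        {
            return {ClientState::ExpectingEmbeddedObject, current.embeddedLeft - 1u};
        }
        return {ClientState::Reading, 0u};
    case ClientState::Reading: // steps 5-6: reading time over, back to step 2
        return {ClientState::ExpectingMainObject, 0u};
    }
    return current; // not reached; keeps compilers quiet
}
```

Starting from Connecting and drawing two embedded objects, the sketch passes through ExpectingMainObject, ParsingMainObject, ExpectingEmbeddedObject (twice) and Reading before returning to ExpectingMainObject.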

_images/http-parsing-time.png

3GPP HTTP parsing time histogram

_images/http-num-of-embedded-objects.png

3GPP HTTP number of embedded objects histogram

The client models HTTP persistent connection, i.e., HTTP 1.1, where the connection to the server is maintained and used for transmitting and receiving all objects.

Each request by default has a constant size of 350 bytes. A ThreeGppHttpHeader is attached to each request packet. The header contains information such as the content type requested (either main object or embedded object) and the timestamp when the packet is transmitted (which will be used to compute the delay and RTT of the packet).

5.1.2. References

Many aspects of the traffic are randomly determined by ThreeGppHttpVariables. A separate instance of this object is used by the HTTP server and client applications. These characteristics are based on a legacy 3GPP specification. The description can be found in the following references:

[1] 3GPP TR 25.892, “Feasibility Study for Orthogonal Frequency Division Multiplexing (OFDM) for UTRAN enhancement”

[2] IEEE 802.16m, “Evaluation Methodology Document (EMD)”, IEEE 802.16m-08/004r5, July 2008.

[3] NGMN Alliance, “NGMN Radio Access Performance Evaluation Methodology”, v1.0, January 2008.

[4] 3GPP2-TSGC5, “HTTP, FTP and TCP models for 1xEV-DV simulations”, 2001.

5.2. Usage

The three-gpp-http-example can be referenced to see basic usage of the HTTP applications. In summary, the ThreeGppHttpServerHelper and ThreeGppHttpClientHelper allow the user to easily install ThreeGppHttpServer and ThreeGppHttpClient applications on nodes. The helper objects can be used to configure attribute values for the client and server objects, but not for the ThreeGppHttpVariables object. The variables are configured by modifying attributes of ThreeGppHttpVariables, which should be done before the helpers install the applications on nodes.

The client and server provide a number of ns-3 trace sources such as “Tx”, “Rx”, “RxDelay”, and “StateTransition” on the server side, and a large number on the client side (“ConnectionEstablished”, “ConnectionClosed”, “TxMainObjectRequest”, “TxEmbeddedObjectRequest”, “RxMainObjectPacket”, “RxMainObject”, “RxEmbeddedObjectPacket”, “RxEmbeddedObject”, “Rx”, “RxDelay”, “RxRtt”, “StateTransition”).

5.2.1. Building the 3GPP HTTP applications

Building the applications does not require any special steps to be taken. It suffices to enable the applications module.

5.2.2. Examples

For an example demonstrating HTTP applications run:

$ ./ns3 run 'three-gpp-http-example'

By default, the example prints the web page requests sent by the client, the responses sent by the server, and the content packets received by the client, using the LOG_INFO output of ThreeGppHttpServer and ThreeGppHttpClient.

5.2.3. Tests

For testing HTTP applications, three-gpp-http-client-server-test is provided. Run:

$ ./test.py -s three-gpp-http-client-server-test

The test consists of simple Internet nodes having HTTP server and client applications installed. Multiple scenario variants are tested: the delay is 3 ms, 30 ms or 300 ms; the bit error rate is 0 or 5.0*10^(-6); the MTU size is 536 or 1460 bytes; and either IPv4 or IPv6 is used. A simulation with each combination of these parameters is run multiple times to verify functionality with different random variables.

The test cases themselves are rather simple: each test verifies that the HTTP object packet bytes sent match the total bytes received by the client, and that the ThreeGppHttpHeader matches the expected packet.

6. Bridge NetDevice

Placeholder chapter

Some examples of the use of Bridge NetDevice can be found in the examples/csma/ directory.

7. BRITE Integration

This model implements an interface to BRITE, the Boston university Representative Internet Topology gEnerator [1]. BRITE is a standard tool for generating realistic internet topologies. The ns-3 model, described herein, provides a helper class to facilitate generating ns-3 specific topologies using BRITE configuration files. BRITE builds the original graph, which is stored as nodes and edges in the ns-3 BriteTopologyHelper class. In the ns-3 integration of BRITE, the generator generates a topology and then provides access to leaf nodes for each AS generated. ns-3 users can then attach custom topologies to these leaf nodes, either by creating them manually or by using topology generators provided in ns-3.

There are three major types of topologies available in BRITE: Router, AS, and Hierarchical, which is a combination of AS and Router. For the purposes of ns-3 simulation, the most useful are likely to be Router and Hierarchical. Router level topologies can be generated using either the Waxman model or the Barabasi-Albert model. Each model has different parameters that affect topology creation. For flat router topologies, all nodes are considered to be in the same AS.

BRITE Hierarchical topologies contain two levels. The first is the AS level. This level can also be created by using either the Waxman model or the Barabasi-Albert model. Then, for each node in the AS topology, a router level topology is constructed. These router level topologies can again use either the Waxman model or the Barabasi-Albert model. BRITE interconnects these separate router topologies as specified by the AS level topology. Once the hierarchical topology is constructed, it is flattened into a large router level topology.

Further information can be found in the BRITE user manual: http://www.cs.bu.edu/brite/publications/usermanual.pdf

7.1. Model Description

The model relies on building an external BRITE library, and then building some ns-3 helpers that call out to the library. The source code for the ns-3 helpers lives in the directory src/brite/helper.

7.1.1. Design

To generate the BRITE topology, ns-3 helpers call out to the external BRITE library. Using a standard BRITE configuration file, the BRITE code builds a graph with nodes and edges according to this configuration file. Please see the BRITE documentation or the example configuration files in src/brite/examples/conf_files to get a better grasp of BRITE configuration options. The graph built by BRITE is returned to ns-3, and an ns-3 implementation of the graph is built. Leaf nodes for each AS are available for the user to either attach custom topologies or install ns-3 applications directly.

7.1.2. References

[1] Alberto Medina, Anukool Lakhina, Ibrahim Matta, and John Byers. BRITE: An Approach to Universal Topology Generation. In Proceedings of the International Workshop on Modeling, Analysis and Simulation of Computer and Telecommunications Systems (MASCOTS ‘01), Cincinnati, Ohio, August 2001.

7.2. Usage

The brite-generic-example can be referenced to see basic usage of the BRITE interface. In summary, the BriteTopologyHelper is used as the interface point by passing in a BRITE configuration file. Along with the configuration file, a BRITE-formatted random seed file can also be passed in. If a seed file is not passed in, the helper will create a seed file using ns-3’s UniformRandomVariable. Once the topology has been generated by BRITE, BuildBriteTopology() is called to create the ns-3 representation. Next, IP addresses can be assigned to the topology using either AssignIpv4Addresses() or AssignIpv6Addresses(). Note that each point-to-point link in the topology is treated as a new network; therefore, for IPv4 a /30 subnet should be used to avoid wasting a large amount of the available address space.
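To see why a /30 is the economical choice for point-to-point links, the following sketch counts addresses per subnet. A /30 holds four addresses, two of which are usable by the two link endpoints. The function names are hypothetical helpers written for this illustration and are not part of ns-3 or BRITE.

```cpp
#include <cstdint>

// Total number of IPv4 addresses in a subnet with the given prefix length (0..32).
uint64_t
AddressesInSubnet(unsigned prefixLength)
{
    return uint64_t{1} << (32 - prefixLength);
}

// Number of /30 link subnets that fit inside an address block with the given
// prefix length (which must be 30 or shorter).
uint64_t
PointToPointLinksInBlock(unsigned blockPrefixLength)
{
    return AddressesInSubnet(blockPrefixLength) / AddressesInSubnet(30);
}
```

For instance, a /16 block has room for 16384 links when each uses a /30, but only 256 links if each link were given a /24.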

Example BRITE configuration files can be found in /src/brite/examples/conf_files/. ASBarabasi and ASWaxman are examples of AS-only topologies. The RTBarabasi and RTWaxman files are examples of router-only topologies. Finally, the TD_ASBarabasi_RTWaxman configuration file is an example of a Hierarchical topology that uses the Barabasi-Albert model for the AS level and the Waxman model for each of the router level topologies. Information on the BRITE parameters used in these files can be found in the BRITE user manual.

7.2.1. Building BRITE Integration

The first step is to download and build the ns-3 specific BRITE repository:

$ hg clone http://code.nsnam.org/BRITE
$ cd BRITE
$ make

This will build BRITE and create a library, libbrite.so, within the BRITE directory.

Once BRITE has been built successfully, we proceed to configure ns-3 with BRITE support. Change to your ns-3 directory:

$ ./ns3 configure --with-brite=/your/path/to/brite/source --enable-examples

Make sure it says ‘enabled’ beside ‘BRITE Integration’. If it does not, then something has gone wrong. Either you have forgotten to build BRITE first following the steps above, or ns-3 could not find your BRITE directory.

Next, build ns-3:

$ ./ns3 build

7.2.2. Examples

For an example demonstrating BRITE integration run:

$ ./ns3 run 'brite-generic-example'

By enabling the verbose parameter, the example will print out the node and edge information in a similar format to standard BRITE output. There are many other command-line parameters including confFile, tracing, and nix, described below:

confFile

A BRITE configuration file. Many different BRITE configuration file examples exist in the src/brite/examples/conf_files directory, for example, RTBarabasi20.conf and RTWaxman.conf. Please refer to the conf_files directory for more examples.

tracing

Enables ascii tracing.

nix

Enables nix-vector routing. Global routing is used by default.

The generic BRITE example also supports visualization using pyviz, assuming Python bindings in ns-3 are enabled:

$ ./ns3 run brite-generic-example --vis

Simulations involving BRITE can also be used with MPI. The total number of MPI instances is passed to the BRITE topology helper, where a modulo divide is used to assign the nodes for each AS to an MPI instance. An example can be found in src/brite/examples:

$ mpirun -np 2 ./ns3 run brite-MPI-example

Please see the ns-3 MPI documentation for information on setting up MPI with ns-3.
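The modulo assignment mentioned above can be pictured with the sketch below. The function name and the use of a zero-based AS index are assumptions made for illustration; the helper's actual indexing may differ.

```cpp
#include <cstdint>

// Map an AS to an MPI instance by modulo division.
// Assumption: ASes are numbered from 0 and mpiInstances is greater than 0.
uint32_t
MpiInstanceForAs(uint32_t asIndex, uint32_t mpiInstances)
{
    return asIndex % mpiInstances;
}
```

With two MPI instances, even-numbered ASes land on instance 0 and odd-numbered ASes on instance 1, so all nodes of one AS are simulated by the same instance.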

8. Buildings Module


8.1. Design documentation

8.1.1. Overview

The Buildings module provides:

  1. a new class (Building) that models the presence of a building in a simulation scenario;

  2. a new class (MobilityBuildingInfo) that makes it possible to specify the location, size and characteristics of buildings present in the simulated area, and allows the placement of nodes inside those buildings;

  3. a container class (BuildingsPropagationLossModel) with the definition of the most useful pathloss models and the corresponding variables;

  4. a new propagation model (HybridBuildingsPropagationLossModel) working with the mobility model just introduced, that models the phenomenon of indoor/outdoor propagation in the presence of buildings;

  5. a simplified model working only with Okumura Hata (OhBuildingsPropagationLossModel) that considers the phenomenon of indoor/outdoor propagation in the presence of buildings;

  6. a channel condition model (BuildingsChannelConditionModel) which determines the LOS/NLOS channel condition based on the Building objects deployed in the scenario;

  7. hybrid channel condition models (ThreeGppV2vUrbanChannelConditionModel and ThreeGppV2vHighwayChannelConditionModel) specifically designed to model vehicular environments (more information can be found in the documentation of the propagation module).

The models have been designed with LTE in mind, though their implementation is in fact independent from any LTE-specific code, and can be used with other ns-3 wireless technologies as well (e.g., wifi, wimax).

The HybridBuildingsPropagationLossModel pathloss model is obtained through a combination of several well-known pathloss models in order to mimic different environmental scenarios such as urban, suburban and open areas. Moreover, the model considers both indoor and outdoor communication, since an HeNB might be installed either inside or outside a building. For outdoor <-> indoor communication, the model also has to consider the type of building according to some general criteria, such as the wall penetration losses of common materials; moreover, it includes some general configuration for the internal walls in indoor communications.

The OhBuildingsPropagationLossModel pathloss model has been created to simplify the previous one by removing the thresholds for switching from one model to another. To do this, it uses only one of the available propagation models (i.e., Okumura Hata). The presence of buildings is still considered in the model; therefore, all the considerations above regarding the building type are still valid. The same holds for the environmental scenario and frequency, since both are parameters of the model considered.

8.1.2. The Building class

The model includes a specific class called Building, which contains an ns-3 Box object defining the dimensions of the building. In order to implement the characteristics of the pathloss models included, the Building class supports the following attributes:

  • building type:

    • Residential (default value)

    • Office

    • Commercial

  • external walls type

    • Wood

    • ConcreteWithWindows (default value)

    • ConcreteWithoutWindows

    • StoneBlocks

  • number of floors (default value 1, which means only ground-floor)

  • number of rooms in x-axis (default value 1)

  • number of rooms in y-axis (default value 1)

The Building class is based on the following assumptions:

  • a building is represented as a rectangular parallelepiped (i.e., a box)

  • the walls are parallel to the x, y, and z axes

  • a building is divided into a grid of rooms, identified by the following parameters:

    • number of floors

    • number of rooms along the x-axis

    • number of rooms along the y-axis

  • the z axis is the vertical axis, i.e., floor numbers increase for increasing z axis values

  • the x and y room indices start from 1 and increase along the x and y axis respectively

  • all rooms in a building have equal size
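Under these assumptions, the room (or floor) that contains a point follows from simple arithmetic on the building bounds. The sketch below illustrates this for one axis; GridIndex is a hypothetical helper written for this example, not a method of the Building class, and the treatment of a point lying exactly on the far wall is an assumption.

```cpp
#include <cmath>
#include <cstdint>

// 1-based index of the grid cell that contains `pos` along one axis.
// `min` and `max` are the building bounds on that axis and `cells` is the
// number of equally sized rooms (or floors) along it.
// Assumes min <= pos <= max and cells >= 1.
uint32_t
GridIndex(double pos, double min, double max, uint32_t cells)
{
    if (pos >= max)
    {
        return cells; // assumption: a point on the far wall belongs to the last cell
    }
    const double cellSize = (max - min) / cells;
    return static_cast<uint32_t>(std::floor((pos - min) / cellSize)) + 1;
}
```

For a building spanning 0 to 10 m along x with three rooms on that axis, a node at x = 5 m falls in room 2; applying the same function to the z axis gives the floor number.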

8.1.3. The MobilityBuildingInfo class

The MobilityBuildingInfo class, which inherits from the ns-3 class Object, is in charge of maintaining information about the position of a node with respect to buildings. The information managed by MobilityBuildingInfo is:

  • whether the node is indoor or outdoor

  • if indoor:

    • in which building the node is

    • in which room the node is positioned (x, y and floor room indices)

The class MobilityBuildingInfo is used by the BuildingsPropagationLossModel class, which inherits from the ns-3 class PropagationLossModel and manages the pathloss computation of the single components and their composition according to the nodes’ positions. Moreover, it also implements the shadowing, that is, the loss due to obstacles in the main path (e.g., vegetation, buildings, etc.).

Note that MobilityBuildingInfo can be used by any other propagation model. However, at the time of this writing, only the models defined in the buildings module are designed to consider the constraints introduced by the buildings.

8.1.4. ItuR1238PropagationLossModel

This class implements a building-dependent indoor propagation loss model based on the ITU P.1238 model, which includes losses due to type of building (i.e., residential, office and commercial). The analytical expression is given in the following.

L_\mathrm{total} = 20\log f + N\log d + L_f(n)- 28 [dB]

where:

N = \left\{ \begin{array}{lll} 28 & residential \\ 30 & office \\ 22 & commercial\end{array} \right. : power loss coefficient [dB]

L_f = \left\{ \begin{array}{lll} 4n & residential \\ 15+4(n-1) & office \\ 6+3(n-1) & commercial\end{array} \right.

n : number of floors between base station and mobile (n\ge 1)

f : frequency [MHz]

d : distance (where d > 1) [m]
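As a worked illustration of the expression above, the following sketch evaluates the total loss for each building type. It assumes base-10 logarithms, as is usual for quantities in dB; the enum and function names are hypothetical and do not correspond to the ItuR1238PropagationLossModel API.

```cpp
#include <cmath>
#include <cstdint>

enum class BuildingKind
{
    Residential,
    Office,
    Commercial
};

// Power loss coefficient N for the given building type.
double
PowerLossCoefficient(BuildingKind kind)
{
    switch (kind)
    {
    case BuildingKind::Residential:
        return 28.0;
    case BuildingKind::Office:
        return 30.0;
    case BuildingKind::Commercial:
        return 22.0;
    }
    return 28.0; // not reached
}

// Floor penetration loss L_f(n) in dB, for n >= 1 floors between the nodes.
double
FloorLossDb(BuildingKind kind, uint32_t n)
{
    switch (kind)
    {
    case BuildingKind::Residential:
        return 4.0 * n;
    case BuildingKind::Office:
        return 15.0 + 4.0 * (n - 1);
    case BuildingKind::Commercial:
        return 6.0 + 3.0 * (n - 1);
    }
    return 0.0; // not reached
}

// Total loss in dB: frequency in MHz, distance in metres (d > 1), n >= 1.
double
ItuR1238LossDb(BuildingKind kind, double frequencyMhz, double distanceM, uint32_t n)
{
    return 20.0 * std::log10(frequencyMhz) +
           PowerLossCoefficient(kind) * std::log10(distanceM) + FloorLossDb(kind, n) - 28.0;
}
```

For example, in a residential building at f = 2000 MHz, d = 10 m and n = 1, the terms are 66.02 + 28 + 4 - 28, giving a total loss of about 70.02 dB.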

8.1.5. BuildingsPropagationLossModel

The BuildingsPropagationLossModel provides an additional set of building-dependent pathloss model elements that are used to implement different pathloss logics. These pathloss model elements are described in the following subsections.

8.1.5.1. External Wall Loss (EWL)

This component models the penetration loss through walls for indoor to outdoor communications and vice-versa. The values are taken from the [cost231] model.

  • Wood ~ 4 dB

  • Concrete with windows (not metallized) ~ 7 dB

  • Concrete without windows ~ 15 dB (spans between 10 and 20 in COST231)

  • Stone blocks ~ 12 dB

8.1.5.2. Internal Walls Loss (IWL)

This component models the penetration loss occurring in indoor-to-indoor communications within the same building. The total loss is calculated assuming that each single internal wall has a constant penetration loss L_{siw}, and approximating the number of walls that are penetrated with the Manhattan distance (in number of rooms) between the transmitter and the receiver. In detail, let x_1, y_1 and x_2, y_2 denote the room numbers along the x and y axes for user 1 and user 2, respectively; the total loss L_{IWL} is calculated as

L_{IWL} = L_{siw} (|x_1 -x_2| + |y_1 - y_2|)
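The formula can be sketched directly in code. The function name is hypothetical, and the per-wall loss is passed in as a parameter because its value is a configuration choice; the 5 dB used in the example afterwards is only an illustrative figure.

```cpp
#include <cstdlib>

// Internal walls loss in dB: per-wall loss times the Manhattan distance,
// counted in rooms, between the two nodes' room indices.
double
InternalWallsLossDb(double lossPerWallDb, int x1, int y1, int x2, int y2)
{
    const int wallsCrossed = std::abs(x1 - x2) + std::abs(y1 - y2);
    return lossPerWallDb * wallsCrossed;
}
```

With an illustrative L_{siw} of 5 dB, nodes in rooms (1, 1) and (3, 2) are separated by 2 + 1 = 3 walls, giving L_{IWL} = 15 dB; two nodes in the same room see no internal wall loss.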

8.1.5.3. Height Gain Model (HG)

This component models the gain due to the fact that the transmitting device is on a floor above the ground. In the literature [turkmani] this gain has been evaluated as about 2 dB per floor. This gain can be applied to all indoor-to-outdoor communications and vice versa.

8.1.5.4. Shadowing Model

The shadowing is modeled according to a log-normal distribution with variable standard deviation as function of the relative position (indoor or outdoor) of the MobilityModel instances involved. One random value is drawn for each pair of MobilityModels, and stays constant for that pair during the whole simulation. Thus, the model is appropriate for static nodes only.

The model considers that the mean of the shadowing loss in dB is always 0. For the variance, the model considers three possible values of standard deviation, in detail:

  • outdoor (m_shadowingSigmaOutdoor, default value of 7 dB) \rightarrow X_\mathrm{O} \sim N(\mu_\mathrm{O}, \sigma_\mathrm{O}^2).

  • indoor (m_shadowingSigmaIndoor, default value of 10 dB) \rightarrow X_\mathrm{I} \sim N(\mu_\mathrm{I}, \sigma_\mathrm{I}^2).

  • external walls penetration (m_shadowingSigmaExtWalls, default value 5 dB) \rightarrow X_\mathrm{W} \sim N(\mu_\mathrm{W}, \sigma_\mathrm{W}^2)

The simulator generates a shadowing value for each active link according to the nodes’ positions the first time the link is used for transmitting. In case of transmissions from outdoor nodes to indoor ones, and vice versa, the standard deviation (\sigma_\mathrm{IO}) has to be calculated as the square root of the sum of the squares of the standard deviation for outdoor nodes and the one for the external walls penetration. This is because the components producing the shadowing are independent of each other; therefore, the variance of a distribution resulting from the sum of two independent normal ones is the sum of the variances.

X \sim N(\mu,\sigma^2) \mbox{ and } Y \sim N(\nu,\tau^2)

Z = X + Y \sim N(\mu + \nu, \sigma^2 + \tau^2)

\Rightarrow \sigma_\mathrm{IO} = \sqrt{\sigma_\mathrm{O}^2 + \sigma_\mathrm{W}^2}
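As a worked example, inserting the default standard deviations listed above (7 dB for outdoor nodes and 5 dB for external wall penetration) into this expression gives:

```latex
\sigma_\mathrm{IO} = \sqrt{\sigma_\mathrm{O}^2 + \sigma_\mathrm{W}^2}
                   = \sqrt{7^2 + 5^2}
                   = \sqrt{74}
                   \approx 8.6 \ \mathrm{dB}
```

The combined standard deviation is therefore larger than either component but smaller than their plain sum of 12 dB.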

8.1.6. Pathloss logics

In the following we describe the different pathloss logics that are implemented by inheriting from BuildingsPropagationLossModel.

8.1.6.1. HybridBuildingsPropagationLossModel

The HybridBuildingsPropagationLossModel pathloss model included is obtained through a combination of several well known pathloss models in order to mimic different outdoor and indoor scenarios, as well as indoor-to-outdoor and outdoor-to-indoor scenarios. In detail, the class HybridBuildingsPropagationLossModel integrates the following pathloss models:

  • OkumuraHataPropagationLossModel (OH) (at frequencies > 2.3 GHz substituted by Kun2600MhzPropagationLossModel)

  • ItuR1411LosPropagationLossModel and ItuR1411NlosOverRooftopPropagationLossModel (I1411)

  • ItuR1238PropagationLossModel (I1238)

  • the pathloss elements of the BuildingsPropagationLossModel (EWL, HG, IWL)

The following pseudo-code illustrates how the different pathloss model elements described above are integrated in HybridBuildingsPropagationLossModel:

if (txNode is outdoor)
  then
    if (rxNode is outdoor)
      then
        if (distance > 1 km)
          then
            if (rxNode or txNode is below the rooftop)
              then
                L = I1411
              else
                L = OH
          else
            L = I1411
      else (rxNode is indoor)
        if (distance > 1 km)
          then
            if (rxNode or txNode is below the rooftop)
              then
                L = I1411 + EWL + HG
              else
                L = OH + EWL + HG
          else
            L = I1411 + EWL + HG
  else (txNode is indoor)
    if (rxNode is indoor)
      then
        if (same building)
          then
            L = I1238 + IWL
          else
            L = I1411 + 2*EWL
      else (rxNode is outdoor)
        if (distance > 1 km)
          then
            if (rxNode or txNode is below the rooftop)
              then
                L = I1411 + EWL + HG
              else
                L = OH + EWL + HG
          else
            L = I1411 + EWL
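The decision tree above can also be written as a compact function that returns which components are combined for a given link. This is a sketch for reading the logic only: the LinkGeometry structure and HybridComposition function are hypothetical, and the function returns the composition as text rather than computing a loss.

```cpp
#include <string>

struct LinkGeometry
{
    bool txIndoor;
    bool rxIndoor;
    bool sameBuilding; // only meaningful when both nodes are indoor
    bool belowRooftop; // true if rxNode or txNode is below the rooftop
    double distanceM;  // distance between the nodes in metres
};

// Returns the combination of pathloss elements selected by the pseudo-code.
std::string
HybridComposition(const LinkGeometry& g)
{
    const bool beyond1Km = g.distanceM > 1000.0;
    // OH is used only beyond 1 km with both nodes above the rooftop.
    const std::string outdoorModel = (beyond1Km && !g.belowRooftop) ? "OH" : "I1411";

    if (!g.txIndoor && !g.rxIndoor)
    {
        return outdoorModel;
    }
    if (g.txIndoor && g.rxIndoor)
    {
        return g.sameBuilding ? "I1238 + IWL" : "I1411 + 2*EWL";
    }
    if (g.txIndoor && !beyond1Km)
    {
        return "I1411 + EWL"; // indoor tx, outdoor rx, short distance: no HG term
    }
    return outdoorModel + " + EWL + HG";
}
```

For example, an outdoor transmitter above the rooftop and an indoor receiver 1.5 km away yield "OH + EWL + HG", while two nodes in the same building yield "I1238 + IWL".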

We note that, for the case of communication between two nodes below rooftop level at a distance greater than 1 km, we still consider the I1411 model, since OH is specifically designed for macro cells and therefore for antennas above the rooftop level.

For the ITU-R P.1411 model we consider both the LoS and NLoS versions. In particular, we consider LoS propagation for distances that are shorter than a tunable threshold (m_itu1411NlosThreshold). In the case of NLoS propagation, the over-the-rooftop model is used for modeling both macro BS and SC. For NLoS, several scenario-dependent parameters have been included, such as average street width, orientation, etc. The values of such parameters have to be properly set according to the scenario implemented; the model does not calculate their values natively. If no values are provided, the standard ones are used, apart from the heights of the mobile and the BS, whose validity is instead checked directly in the code (i.e., they have to be greater than zero).

We also note that the use of different propagation models (OH, I1411, I1238 with their variants) in HybridBuildingsPropagationLossModel can result in discontinuities of the pathloss with respect to distance. A proper tuning of the attributes (especially the distance threshold attributes) can avoid these discontinuities. However, since the behavior of each model depends on several other parameters (frequency, node height, etc), there is no default value of these thresholds that can avoid the discontinuities in all possible configurations. Hence, an appropriate tuning of these parameters is left to the user.

8.1.6.2. OhBuildingsPropagationLossModel

The OhBuildingsPropagationLossModel class has been created as a simple means to solve the discontinuity problems of HybridBuildingsPropagationLossModel without doing scenario-specific parameter tuning. The solution is to use only one propagation loss model (i.e., Okumura Hata), while retaining the structure of the pathloss logic for the calculation of other path loss components (such as wall penetration losses). The result is a model that is free of discontinuities (except those due to walls), but that is less realistic overall for a generic scenario with buildings and outdoor/indoor users, e.g., because Okumura Hata is suitable neither for indoor communications nor for outdoor communications below rooftop level.

In detail, the class OhBuildingsPropagationLossModel integrates the following pathloss models:

  • OkumuraHataPropagationLossModel (OH)

  • the pathloss elements of the BuildingsPropagationLossModel (EWL, HG, IWL)

The following pseudo-code illustrates how the different pathloss model elements described above are integrated in OhBuildingsPropagationLossModel:

if (txNode is outdoor)
  then
    if (rxNode is outdoor)
      then
        L = OH
      else (rxNode is indoor)
        L = OH + EWL
  else (txNode is indoor)
    if (rxNode is indoor)
      then
        if (same building)
          then
            L = OH + IWL
          else
            L = OH + 2*EWL
      else (rxNode is outdoor)
        L = OH + EWL
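A numeric reading of this pseudo-code is sketched below. The function is hypothetical and takes the already computed OH, EWL and IWL values in dB as inputs; like the pseudo-code, it uses a single EWL value for both walls in the different-building case.

```cpp
// Total loss in dB following the OhBuildingsPropagationLossModel pseudo-code.
// ohDb, ewlDb and iwlDb are the Okumura Hata loss, the external wall loss and
// the internal walls loss, each computed elsewhere.
double
OhBuildingsLossDb(bool txIndoor,
                  bool rxIndoor,
                  bool sameBuilding,
                  double ohDb,
                  double ewlDb,
                  double iwlDb)
{
    if (txIndoor && rxIndoor)
    {
        // Same building: internal walls only; otherwise two external walls.
        return sameBuilding ? ohDb + iwlDb : ohDb + 2.0 * ewlDb;
    }
    if (txIndoor || rxIndoor)
    {
        return ohDb + ewlDb; // exactly one external wall is crossed
    }
    return ohDb; // both nodes outdoor
}
```

With illustrative values OH = 120 dB, EWL = 7 dB and IWL = 10 dB, the four branches give 120, 127, 130 and 134 dB respectively.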

We note that OhBuildingsPropagationLossModel is a significant simplification with respect to HybridBuildingsPropagationLossModel, due to the fact that OH is used always. While this gives a less accurate model in some scenarios (especially below rooftop and indoor), it effectively avoids the issue of pathloss discontinuities that affects HybridBuildingsPropagationLossModel.

8.2. User Documentation

8.2.1. How to use buildings in a simulation

In this section we explain the basic usage of the buildings model within a simulation program.

8.2.1.1. Include the headers

Add this at the beginning of your simulation program:

#include <ns3/buildings-module.h>

8.2.1.2. Create a building

As an example, let’s create a residential 10 x 20 x 10 building:

double x_min = 0.0;
double x_max = 10.0;
double y_min = 0.0;
double y_max = 20.0;
double z_min = 0.0;
double z_max = 10.0;
Ptr<Building> b = CreateObject<Building>();
b->SetBoundaries(Box(x_min, x_max, y_min, y_max, z_min, z_max));
b->SetBuildingType(Building::Residential);
b->SetExtWallsType(Building::ConcreteWithWindows);
b->SetNFloors(3);
b->SetNRoomsX(3);
b->SetNRoomsY(2);

This building has three floors and an internal 3 x 2 grid of rooms of equal size.

The helper class GridBuildingAllocator is also available to easily create a set of buildings with identical characteristics placed on a rectangular grid. Here’s an example of how to use it:

Ptr<GridBuildingAllocator>  gridBuildingAllocator;
gridBuildingAllocator = CreateObject<GridBuildingAllocator>();
gridBuildingAllocator->SetAttribute("GridWidth", UintegerValue(3));
gridBuildingAllocator->SetAttribute("LengthX", DoubleValue(7));
gridBuildingAllocator->SetAttribute("LengthY", DoubleValue(13));
gridBuildingAllocator->SetAttribute("DeltaX", DoubleValue(3));
gridBuildingAllocator->SetAttribute("DeltaY", DoubleValue(3));
gridBuildingAllocator->SetAttribute("Height", DoubleValue(6));
gridBuildingAllocator->SetBuildingAttribute("NRoomsX", UintegerValue(2));
gridBuildingAllocator->SetBuildingAttribute("NRoomsY", UintegerValue(4));
gridBuildingAllocator->SetBuildingAttribute("NFloors", UintegerValue(2));
gridBuildingAllocator->SetAttribute("MinX", DoubleValue(0));
gridBuildingAllocator->SetAttribute("MinY", DoubleValue(0));
gridBuildingAllocator->Create(6);

This will create a 3x2 grid of 6 buildings, each 7 x 13 x 6 m with 2 x 4 rooms inside and 2 floors; the buildings are spaced by 3 m on both the x and the y axis.
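The resulting layout can be checked with a little arithmetic. The sketch below computes the lower-left corner of each building, assuming that buildings are placed row by row with GridWidth buildings per row; the structure and function names are hypothetical and are not part of GridBuildingAllocator.

```cpp
#include <cstdint>

struct Corner
{
    double x;
    double y;
};

// Lower-left corner of building `index` (0-based), assuming a row-by-row
// layout with `gridWidth` buildings per row.
Corner
BuildingCorner(uint32_t index,
               uint32_t gridWidth,
               double minX,
               double minY,
               double lengthX,
               double lengthY,
               double deltaX,
               double deltaY)
{
    const uint32_t column = index % gridWidth;
    const uint32_t row = index / gridWidth;
    return {minX + column * (lengthX + deltaX), minY + row * (lengthY + deltaY)};
}
```

With the attribute values used above, the last building (index 5) starts at (20, 16) m, so the whole grid spans 27 m along x and 29 m along y.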

8.2.1.3. Setup nodes and mobility models

Nodes and mobility models are configured as usual. However, in order to use them with the buildings model, you need an additional call to BuildingsHelper::Install(), so that the mobility model includes the information on the node positions with respect to the buildings. Here is an example:

// Create the nodes, install a mobility model, then attach the building information.
NodeContainer ueNodes;
ueNodes.Create(2);
MobilityHelper mobility;
mobility.SetMobilityModel("ns3::ConstantPositionMobilityModel");
mobility.Install(ueNodes);
BuildingsHelper::Install(ueNodes);

Note that any mobility model can be used. However, the user is advised to make sure that the behavior of the mobility model being used is consistent with the presence of buildings. For example, using a simple random mobility model over the whole simulation area in the presence of buildings might easily result in nodes moving in and out of buildings, regardless of the presence of walls.

One dedicated buildings-aware mobility model is the RandomWalk2dOutdoorMobilityModel. This class is similar to the RandomWalk2dMobilityModel but avoids placing the trajectory on a path that would intersect a building wall. If a boundary is encountered (either the bounding box or a building wall), the model rebounds with a random direction and speed that ensures that the trajectory stays outside the buildings. An example program that demonstrates the use of this model is src/buildings/examples/outdoor-random-walk-example.cc, which has an associated shell script to plot the traces generated. Another example program demonstrates how this outdoor mobility model can be used as the basis of a group mobility model, with the outdoor buildings-aware model serving as the parent or reference mobility model, and with additional nodes defining a child mobility model providing the offset from the reference mobility model. This example, src/buildings/examples/outdoor-group-mobility-example.cc, also has an associated shell script (outdoor-group-mobility-animate.sh) that can be used to generate an animated GIF of the group’s movement.

8.2.1.4. Place some nodes

You can place nodes in your simulation using several methods, which are described in the following.

8.2.1.4.1. Legacy positioning methods

Any legacy ns-3 positioning method can be used to place nodes in the simulation. The important additional step is to call BuildingsHelper::Install() on the nodes, as described above. For example, you can place nodes manually like this:

NodeContainer ueNodes;
ueNodes.Create(2);

MobilityHelper mobility;
mobility.SetMobilityModel("ns3::ConstantPositionMobilityModel");
mobility.Install(ueNodes);
BuildingsHelper::Install(ueNodes);

// Retrieve the installed mobility models and set the positions manually.
Ptr<ConstantPositionMobilityModel> mm0 = ueNodes.Get(0)->GetObject<ConstantPositionMobilityModel>();
Ptr<ConstantPositionMobilityModel> mm1 = ueNodes.Get(1)->GetObject<ConstantPositionMobilityModel>();
mm0->SetPosition(Vector(5.0, 5.0, 1.5));
mm1->SetPosition(Vector(30.0, 40.0, 1.5));

Alternatively, you could use any existing PositionAllocator class. The coordinates of the node will determine whether it is placed outdoor or indoor and, if indoor, in which building and room it is placed.

8.2.1.4.2. Building-specific positioning methods

The following position allocator classes are available to place nodes in special positions with respect to buildings:

  • RandomBuildingPositionAllocator: Allocate each position by randomly choosing a building from the list of all buildings, and then randomly choosing a position inside the building.

  • RandomRoomPositionAllocator: Allocate each position by randomly choosing a room from the list of rooms in all buildings, and then randomly choosing a position inside the room.

  • SameRoomPositionAllocator: Walks a given NodeContainer sequentially, and for each node allocates a new position randomly in the same room as that node.

  • FixedRoomPositionAllocator: Generate a random position uniformly distributed in the volume of a chosen room inside a chosen building.

8.2.1.5. Making the Mobility Model Consistent for a node

Initially, the mobility model of a node is made consistent when the node is initialized, which eventually triggers a call to the DoInitialize method of the MobilityBuildingInfo class. In particular, it calls the MakeMobilityModelConsistent method, which goes through the list of all buildings, determines whether the node is indoor or outdoor and, if indoor, also determines the building in which the node is located and the corresponding floor number inside the building. Moreover, this method also caches the position of the node, which is used to make the mobility model consistent for a moving node whenever the IsInside method of the MobilityBuildingInfo class is called.
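The search over the list of buildings can be pictured as follows. This is a simplified sketch, not the MobilityBuildingInfo implementation: the Point and BoxBounds structures and both functions are hypothetical, and walls are assumed to count as inside the building.

```cpp
#include <cstddef>
#include <vector>

struct Point
{
    double x;
    double y;
    double z;
};

struct BoxBounds
{
    double xMin, xMax;
    double yMin, yMax;
    double zMin, zMax;
};

// True if the point lies within the box (bounds included).
bool
Contains(const BoxBounds& b, const Point& p)
{
    return p.x >= b.xMin && p.x <= b.xMax && p.y >= b.yMin && p.y <= b.yMax &&
           p.z >= b.zMin && p.z <= b.zMax;
}

// Index of the first building that contains the point, or -1 if the point is
// outdoor, i.e., inside none of the buildings.
int
FindBuilding(const std::vector<BoxBounds>& buildings, const Point& p)
{
    for (std::size_t i = 0; i < buildings.size(); ++i)
    {
        if (Contains(buildings[i], p))
        {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

Once the containing building is known, the room and floor indices follow from the position relative to that building's bounds.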

8.2.1.6. Building-aware pathloss model

After you have placed buildings and nodes in a simulation, you can use a building-aware pathloss model exactly in the same way you would use any regular path loss model. How to do this is specific to the wireless module that you are considering (lte, wifi, wimax, etc.), so please refer to the documentation of that module for specific instructions.

8.2.1.7. Building-aware channel condition models

The class BuildingsChannelConditionModel implements a channel condition model which determines the LOS/NLOS channel state based on the buildings deployed in the scenario. In addition, based on the wall material of the building, low/high building penetration losses are considered, as defined in 3GPP TR 38.901, Section 7.4.3.1. In particular, for the O2I condition, in case of Wood or ConcreteWithWindows material, low losses are considered in the pathloss calculation. In case the material has been set to ConcreteWithoutWindows or StoneBlocks, high losses are considered. Notice that in certain corner cases, such as the I2O2I interference, the model underestimates losses by applying either low or high losses based on the wall material of the involved nodes. For a more accurate estimation the model can be further extended.
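The material-to-loss mapping stated above can be summarized in a few lines. This is a standalone sketch, not ns-3 code: the WallMaterial and PenetrationLoss enumerations and the O2iLossClass function are hypothetical names that only mirror the material names used in the text.

```cpp
// Hypothetical local enumerations mirroring the wall materials named above.
enum class WallMaterial
{
    Wood,
    ConcreteWithWindows,
    ConcreteWithoutWindows,
    StoneBlocks
};

enum class PenetrationLoss
{
    Low,
    High
};

// Select the O2I penetration loss class from the external wall material.
PenetrationLoss
O2iLossClass(WallMaterial material)
{
    switch (material)
    {
    case WallMaterial::Wood:
    case WallMaterial::ConcreteWithWindows:
        return PenetrationLoss::Low;
    case WallMaterial::ConcreteWithoutWindows:
    case WallMaterial::StoneBlocks:
        return PenetrationLoss::High;
    }
    return PenetrationLoss::High; // unreachable for valid enum values
}
```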

The classes ThreeGppV2vUrbanChannelConditionModel and ThreeGppV2vHighwayChannelConditionModel implement hybrid channel condition models, specifically designed to model vehicular environments. More information can be found in the documentation of the propagation module.

8.2.2. Main configurable attributes

The Building class has the following configurable parameters:

  • building type: Residential, Office and Commercial.

  • external walls type: Wood, ConcreteWithWindows, ConcreteWithoutWindows and StoneBlocks.

  • building bounds: a Box class with the building bounds.

  • number of floors.

  • number of rooms in x-axis and y-axis (rooms can be placed only in a grid way).

The BuildingMobilityLossModel parameter that is configurable with the ns-3 attribute system is the bound of the simulation area (string Bounds), given as a Box class with the area bounds. Moreover, the following parameters can be configured by means of its methods:

  • the number of the floor on which the node is placed (default 0).

  • the position in the rooms grid.

The BuildingPropagationLossModel class has the following parameters configurable with the attribute system:

  • Frequency: reference frequency (default 2160 MHz). Note that setting the frequency automatically sets the wavelength accordingly, and vice versa.

  • Lambda: the wavelength (0.139 meters, considering the above frequency).

  • ShadowSigmaOutdoor: the standard deviation of the shadowing for outdoor nodes (default 7.0).

  • ShadowSigmaIndoor: the standard deviation of the shadowing for indoor nodes (default 8.0).

  • ShadowSigmaExtWalls: the standard deviation of the shadowing due to external walls penetration for outdoor to indoor communications (default 5.0).

  • RooftopLevel: the level of the rooftop of the building in meters (default 20 meters).

  • Los2NlosThr: the distance of the switching point between line-of-sight and non-line-of-sight propagation models in meters (default 200 meters).

  • ITU1411DistanceThr: the distance of the switching point between short range (ITU 1411) communications and long range (Okumura Hata) in meters (default 200 meters).

  • MinDistance: the minimum distance in meters between two nodes for evaluating the pathloss (considered negligible below this threshold) (default 0.5 meters).

  • Environment: the environment scenario among Urban, SubUrban and OpenAreas (default Urban).

  • CitySize: the dimension of the city among Small, Medium, Large (default Large).

In order to use the hybrid mode, the class to be used is the HybridBuildingMobilityLossModel, which allows the selection of the proper pathloss model according to the pathloss logic presented in the design chapter. However, this solution has the problem that the pathloss model switching points might present discontinuities due to the different characteristics of the models. This implies that, depending on the specific scenario, the thresholds used for switching have to be properly tuned. The simple OhBuildingMobilityLossModel overcomes this problem by using only the Okumura Hata model and the wall penetration losses.

8.3. Testing Documentation

8.3.1. Overview

To test and validate the ns-3 Building Pathloss module, several test suites are provided which are integrated with the ns-3 test framework. To run them, you need to have configured the build of the simulator in this way:

$ ./ns3 configure --enable-tests --enable-modules=buildings
$ ./test.py

The above will run not only the test suites belonging to the buildings module, but also those belonging to all the other ns-3 modules on which the buildings module depends. See the ns-3 manual for generic information on the testing framework.

You can get a more detailed report in HTML format in this way:

$ ./test.py -w results.html

After the above command has run, you can view the detailed result for each test by opening the file results.html with a web browser.

You can run each test suite separately using this command:

$ ./test.py -s test-suite-name

For more details about test.py and the ns-3 testing framework, please refer to the ns-3 manual.

8.3.2. Description of the test suites

8.3.2.1. BuildingsHelper test

The test suite buildings-helper checks that the method BuildingsHelper::MakeAllInstancesConsistent () works properly, i.e., that the BuildingsHelper is successful in determining whether nodes are outdoors or indoors and, if indoors, that they are located in the correct building, room and floor. Several test cases are provided with different buildings (having different size, position, rooms and floors) and different node positions. The test passes if every node is located correctly.

8.3.2.2. BuildingPositionAllocator test

The test suite building-position-allocator features two test cases that check that RandomRoomPositionAllocator and SameRoomPositionAllocator, respectively, work properly. Each test case involves a single 2x3x2 room building (12 rooms in total) at known coordinates, with 24 and 48 nodes respectively. Both tests check that the number of nodes allocated in each room is the expected one and that the position of the nodes is also correct.

8.3.2.3. Buildings Pathloss tests

The test suite buildings-pathloss-model provides different unit tests that compare the expected results of the buildings pathloss module in specific scenarios with precalculated values obtained offline with an Octave script (test/reference/buildings-pathloss.m). The tests are considered passed if the two values are equal up to a tolerance of 0.1, which is deemed appropriate for the typical usage of pathloss values (which are in dB).

The scenarios considered are detailed in the following. They have been selected to cover the wide set of possible pathloss logic combinations, so the pathloss logic is implicitly tested as well.

8.3.2.3.1. Test #1 Okumura Hata

This test checks the standard Okumura Hata model; therefore both eNB and UE are placed outside at a distance of 2000 m. The frequency used is the E-UTRA band #5, which corresponds to 869 MHz (see table 5.5-1 of 36.101). The test also includes the validation of the area extensions (i.e., urban, suburban and open-areas) and of the city size (small, medium and large).

8.3.2.3.2. Test #2 COST231 Model

This test is aimed at validating the COST231 model. The test is similar to the Okumura Hata one, except that the frequency used is the E-UTRA band #1 (2140 MHz) and that the test can be performed only for large and small cities in urban scenarios due to model limitations.

8.3.2.3.3. Test #3 2.6 GHz model

This test validates the 2.6 GHz Kun model. The test is similar to the Okumura Hata one, except that the frequency is the E-UTRA band #7 (2620 MHz) and the test can be performed only in the urban scenario.

8.3.2.3.4. Test #4 ITU1411 LoS model

This test is aimed at validating the ITU1411 model in case of line of sight within street canyons transmissions. In this case the UE is placed 100 meters from the eNB, since the threshold for switching between LoS and NLoS is left at its default value (i.e., 200 m).

8.3.2.3.5. Test #5 ITU1411 NLoS model

This test is aimed at validating the ITU1411 model in case of non line of sight over the rooftop transmissions. In this case the UE is placed 900 meters from the eNB, in order to be above the threshold for switching between LoS and NLoS, which is left at its default value (i.e., 200 m).

8.3.2.3.6. Test #6 ITUP1238 model

This test is aimed at validating the ITUP1238 model in case of indoor transmissions. In this case both the UE and the eNB are placed in a residential building with walls made of concrete with windows. The UE is placed on the second floor, 30 meters from the eNB, which is placed on the first floor.

8.3.2.3.7. Test #7 Outdoor -> Indoor with Okumura Hata model

This test validates the outdoor to indoor transmissions for large distances. In this case the UE is placed in a residential building with walls made of concrete with windows, 2000 meters from the outdoor eNB.

8.3.2.3.8. Test #8 Outdoor -> Indoor with ITU1411 model

This test validates the outdoor to indoor transmissions for short distances. In this case the UE is placed in a residential building with walls made of concrete with windows, 100 meters from the outdoor eNB.

8.3.2.3.9. Test #9 Indoor -> Outdoor with ITU1411 model

This test validates the indoor to outdoor transmissions for very short distances. In this case the eNB is placed on the second floor of a residential building with walls made of concrete with windows, 100 meters from the outdoor UE (i.e., LoS communication). Therefore the height gain has to be included in the pathloss evaluation.

8.3.2.3.10. Test #10 Indoor -> Outdoor with ITU1411 model

This test validates the indoor to outdoor transmissions for short distances. In this case the eNB is placed on the second floor of a residential building with walls made of concrete with windows, 500 meters from the outdoor UE (i.e., NLoS communication). Therefore the height gain has to be included in the pathloss evaluation.

8.3.2.4. Buildings Shadowing Test

The test suite buildings-shadowing-test is a unit test intended to verify the statistical distribution of the shadowing model implemented by BuildingsPathlossModel. The shadowing is modeled according to a normal distribution with mean \mu = 0 and variable standard deviation \sigma, according to models commonly used in literature. Three test cases are provided, which cover the cases of indoor, outdoor and indoor-to-outdoor communications. Each test case generates 1000 different samples of shadowing for different pairs of MobilityModel instances in a given scenario. Shadowing values are obtained by subtracting from the total loss value returned by HybridBuildingsPathlossModel the path loss component, which is constant and pre-determined for each test case. The test verifies that the sample mean and sample variance of the shadowing values fall within the 99% confidence interval of the sample mean and sample variance. The test also verifies that the shadowing values returned at successive times for the same pair of MobilityModel instances are constant.
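As a rough illustration of this kind of check, the sketch below computes the sample mean and unbiased sample variance of a set of values and tests the mean against a two-sided 99% interval around the expected mean. It is a standalone sketch under stated assumptions, not the code of the test suite: the names SampleStats, ComputeStats and MeanWithin99Ci are hypothetical, and the normal-quantile interval is one common construction that may differ from what the actual test uses.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct SampleStats
{
    double mean;
    double variance; // unbiased estimate (divides by n - 1)
};

// Compute sample mean and unbiased sample variance; requires at least 2 samples.
SampleStats
ComputeStats(const std::vector<double>& samples)
{
    double sum = 0.0;
    for (double v : samples)
    {
        sum += v;
    }
    const double mean = sum / samples.size();
    double squares = 0.0;
    for (double v : samples)
    {
        squares += (v - mean) * (v - mean);
    }
    return SampleStats{mean, squares / (samples.size() - 1)};
}

// True if the sample mean of n draws lies within the two-sided 99% interval
// around mu for a normal distribution with standard deviation sigma.
bool
MeanWithin99Ci(const SampleStats& stats, std::size_t n, double mu, double sigma)
{
    const double z99 = 2.576; // two-sided 99% quantile of the standard normal
    return std::fabs(stats.mean - mu) <= z99 * sigma / std::sqrt(static_cast<double>(n));
}
```

For 1000 samples and a standard deviation of 7.0, the interval half-width for the mean is about 0.57 dB.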

8.3.2.5. Buildings Channel Condition Model Test

The BuildingsChannelConditionModelTestSuite tests the class BuildingsChannelConditionModel. It checks if the channel condition between two nodes is correctly determined when a building is deployed.

8.4. References

[turkmani] Turkmani A.M.D., J.D. Parson and D.G. Lewis, “Radio propagation into buildings at 441, 900 and 1400 MHz”, in Proc. of 4th Int. Conference on Land Mobile Radio, 1987.

9. Click Modular Router Integration

Click is a software architecture for building configurable routers. By using different combinations of packet processing units called elements, a Click router can be made to perform a specific kind of functionality. This flexibility provides a good platform for testing and experimenting with different protocols.

9.1. Model Description

The source code for the Click model lives in the directory src/click.

9.1.1. Design

ns-3’s design is well suited for an integration with Click due to the following reasons:

  • Packets in ns-3 are serialised/deserialised as they move up/down the stack. This allows ns-3 packets to be passed to and from Click as they are.

  • This also means that any kind of ns-3 traffic generator and transport should work easily on top of Click.

  • By striving to implement Click as an Ipv4RoutingProtocol instance, we can avoid significant changes to the LL and MAC layer of the ns-3 code.

The design goal was to make the ns-3-click public API simple enough such that the user needs to merely add an Ipv4ClickRouting instance to the node, and inform each Click node of the Click configuration file (.click file) that it is to use.

This model implements the interface to the Click Modular Router and provides the Ipv4ClickRouting class to allow a node to use Click for external routing. Unlike normal Ipv4RoutingProtocol sub types, Ipv4ClickRouting doesn’t use a RouteInput() method, but instead, receives a packet on the appropriate interface and processes it accordingly. Note that you need to have a routing table type element in your Click graph to use Click for external routing. This is needed by the RouteOutput() function inherited from Ipv4RoutingProtocol. Furthermore, a Click based node uses a different kind of L3 in the form of Ipv4L3ClickProtocol, which is a trimmed down version of Ipv4L3Protocol. Ipv4L3ClickProtocol passes on packets passing through the stack to Ipv4ClickRouting for processing.

9.1.1.1. Developing a Simulator API to allow ns-3 to interact with Click

Much of the API is already well defined, which allows Click to probe for information from the simulator (like a Node’s ID, an Interface ID and so forth). By retaining most of the methods, it should be possible to write new implementations specific to ns-3 for the same functionality.

Hence, for the Click integration with ns-3, a class named Ipv4ClickRouting will handle the interaction with Click. The code for the same can be found in src/click/model/ipv4-click-routing.{cc,h}.

9.1.1.2. Packet hand off between ns-3 and Click

There are four kinds of packet hand-offs that can occur between ns-3 and Click.

  • L4 to L3

  • L3 to L4

  • L3 to L2

  • L2 to L3

To handle these hand-offs, we implement Ipv4L3ClickProtocol, a stripped down version of Ipv4L3Protocol. Ipv4L3ClickProtocol passes packets to and from Ipv4ClickRouting appropriately to perform routing.

9.1.2. Scope and Limitations

  • In its current state, the ns-3 Click Integration is limited to use only with L3, leaving ns-3 to handle L2. We are currently working on adding Click MAC support as well. See the usage section to make sure that you design your Click graphs accordingly.

  • Furthermore, ns-3-click will work only with userlevel elements. The complete list of elements is available at https://web.archive.org/web/20171003052722/http://read.cs.ucla.edu/click/elements. Elements that have ‘all’, ‘userlevel’ or ‘ns’ mentioned beside them may be used.

  • As of now, the ns-3 interface to Click is Ipv4 only. We will be adding Ipv6 support in the future.

9.1.3. References

  • Eddie Kohler, Robert Morris, Benjie Chen, John Jannotti, and M. Frans Kaashoek. The click modular router. ACM Transactions on Computer Systems 18(3), August 2000, pages 263-297.

  • Lalith Suresh P., and Ruben Merz. Ns-3-click: click modular router integration for ns-3. In Proc. of 3rd International ICST Workshop on NS-3 (WNS3), Barcelona, Spain. March, 2011.

  • Michael Neufeld, Ashish Jain, and Dirk Grunwald. Nsclick: bridging network simulation and deployment. MSWiM ‘02: Proceedings of the 5th ACM international workshop on Modeling analysis and simulation of wireless and mobile systems, 2002, Atlanta, Georgia, USA. http://doi.acm.org/10.1145/570758.570772

9.2. Usage

9.2.1. Building Click

The first step is to clone Click from the github repository and build it:

$ git clone https://github.com/kohler/click
$ cd click/
$ ./configure --disable-linuxmodule --enable-nsclick --enable-wifi
$ make

The --enable-wifi flag may be skipped if you don’t intend on using Click with Wifi. Note: You don’t need to do a ‘make install’.

Once Click has been built successfully, change into the ns-3 directory and configure ns-3 with Click Integration support:

$ ./ns3 configure --enable-examples --enable-tests --with-nsclick=/path/to/click/source

Hint: If you have Click installed one directory above ns-3 (such as in the ns-3-allinone directory), and the name of the directory is ‘click’ (or a symbolic link to the directory is named ‘click’), then the --with-nsclick specifier is not necessary; the ns-3 build system will successfully find the directory.

If it says ‘enabled’ beside ‘NS-3 Click Integration Support’, then you’re good to go. Note: If running modular ns-3, the minimum set of modules required to run all ns-3-click examples is wifi, csma and config-store.

Next, try running one of the examples:

$ ./ns3 run nsclick-simple-lan

You may then view the resulting .pcap traces, which are named nsclick-simple-lan-0-0.pcap and nsclick-simple-lan-0-1.pcap.

9.2.2. Click Graph Instructions

The following should be kept in mind when making your Click graph:

  • Only userlevel elements can be used.

  • You will need to replace FromDevice and ToDevice elements with FromSimDevice and ToSimDevice elements.

  • Packets to the kernel are sent up using ToSimDevice(tap0,IP).

  • For any node, the device which sends/receives packets to/from the kernel, is named ‘tap0’. The remaining interfaces should be named eth0, eth1 and so forth (even if you’re using wifi). Please note that the device numbering should begin from 0. In future, this will be made flexible so that users can name devices in their Click file as they wish.

  • A routing table element is mandatory. The OUTports of the routing table element should correspond to the interface number of the device through which the packet will ultimately be sent out. Violating this rule will lead to really weird packet traces. This routing table element’s name should then be passed to the Ipv4ClickRouting protocol object as a simulation parameter. See the Click examples for details.

  • The current implementation leaves Click with mainly L3 functionality, with ns-3 handling L2. We will soon begin working to support the use of MAC protocols on Click as well. This means that as of now, Click’s Wifi specific elements cannot be used with ns-3.

9.2.3. Debugging Packet Flows from Click

From any point within a Click graph, you may use the Print (https://web.archive.org/web/20171003052722/http://read.cs.ucla.edu/click/elements/print) element and its variants for pretty printing of packet contents. Furthermore, you may generate pcap traces of packets flowing through a Click graph by using the ToDump (https://web.archive.org/web/20171003052722/http://read.cs.ucla.edu/click/elements/todump) element as well. For instance:

myarpquerier
 -> Print(fromarpquery,64)
 -> ToDump(out_arpquery,PER_NODE 1)
 -> ethout;

This will print the contents of packets that flow out of the ArpQuerier, then generate a pcap trace file which will have a suffix ‘out_arpquery’, for each node using the Click file, before pushing packets onto ‘ethout’.

9.2.4. Helper

To have a node run Click, the easiest way would be to use the ClickInternetStackHelper class in your simulation script. For instance:

ClickInternetStackHelper click;
click.SetClickFile(myNodeContainer, "nsclick-simple-lan.click");
click.SetRoutingTableElement(myNodeContainer, "u/rt");
click.Install(myNodeContainer);

The example scripts inside src/click/examples/ demonstrate the use of Click based nodes in different scenarios. The helper source can be found inside src/click/helper/click-internet-stack-helper.{h,cc}

9.2.5. Examples

The following examples have been written, which can be found in src/click/examples/:

  • nsclick-simple-lan.cc and nsclick-raw-wlan.cc: A Click based node communicating with a normal ns-3 node without Click, using Csma and Wifi respectively. It also demonstrates the use of TCP on top of Click, something which the original nsclick implementation for NS-2 couldn’t achieve.

  • nsclick-udp-client-server-csma.cc and nsclick-udp-client-server-wifi.cc: A 3 node LAN (Csma and Wifi respectively) wherein 2 Click based nodes run a UDP client, that sends packets to a third Click based node running a UDP server.

  • nsclick-routing.cc: One Click based node communicates to another via a third node that acts as an IP router (using the IP router Click configuration). This demonstrates routing using Click.

Scripts are available within <click-dir>/conf/ that allow you to generate Click files for some common scenarios. The IP Router used in nsclick-routing.cc was generated from the make-ip-conf.pl file and slightly adapted to work with ns-3-click.

9.3. Validation

This model has been tested as follows:

  • Unit tests have been written to verify the internals of Ipv4ClickRouting. These can be found in src/click/ipv4-click-routing-test.cc. They verify whether the methods inside Ipv4ClickRouting which deal with Device name to ID, IP Address from device name and Mac Address from device name bindings work as expected.

  • The examples have been used to test Click with actual simulation scenarios. These can be found in src/click/examples/. These tests cover the following: the use of different kinds of transports on top of Click, TCP/UDP, whether Click nodes can communicate with non-Click based nodes, whether Click nodes can communicate with each other, using Click to route packets using static routing.

  • Click has been tested with Csma, Wifi and Point-to-Point devices. Usage instructions are available in the preceding section.

10. CSMA NetDevice

This is the introduction to CSMA NetDevice chapter, to complement the CSMA model doxygen.

10.1. Overview of the CSMA model

The ns-3 CSMA device models a simple bus network in the spirit of Ethernet. Although it does not model any real physical network you could ever build or buy, it does provide some very useful functionality.

Typically, when one thinks of a bus network, Ethernet or IEEE 802.3 comes to mind. Ethernet uses CSMA/CD (Carrier Sense Multiple Access with Collision Detection) with exponentially increasing backoff to contend for the shared transmission medium. The ns-3 CSMA device models only a portion of this process, using the nature of the globally available channel to provide instantaneous (faster than light) carrier sense and priority-based collision “avoidance.” Collisions in the sense of Ethernet never happen and so the ns-3 CSMA device does not model collision detection, nor will any transmission in progress be “jammed.”

10.1.1. CSMA Layer Model

There are a number of conventions in use for describing layered communications architectures in the literature and in textbooks. The most common layering model is the ISO seven layer reference model. In this view the CsmaNetDevice and CsmaChannel pair occupies the lowest two layers – at the physical (layer one), and data link (layer two) positions. Another important reference model is that specified by RFC 1122, “Requirements for Internet Hosts – Communication Layers.” In this view the CsmaNetDevice and CsmaChannel pair occupies the lowest layer – the link layer. There is also a seemingly endless litany of alternative descriptions found in textbooks and in the literature. We adopt the naming conventions used in the IEEE 802 standards which speak of LLC, MAC, MII and PHY layering. These acronyms are defined as:

  • LLC: Logical Link Control;

  • MAC: Media Access Control;

  • MII: Media Independent Interface;

  • PHY: Physical Layer.

In this case the LLC and MAC are sublayers of the OSI data link layer and the MII and PHY are sublayers of the OSI physical layer.

The “top” of the CSMA device defines the transition from the network layer to the data link layer. This transition is performed by higher layers by calling either CsmaNetDevice::Send or CsmaNetDevice::SendFrom.

In contrast to the IEEE 802.3 standards, there is no precisely specified PHY in the CSMA model in the sense of wire types, signals or pinouts. The “bottom” interface of the CsmaNetDevice can be thought of as a kind of Media Independent Interface (MII) as seen in the “Fast Ethernet” (IEEE 802.3u) specifications. This MII interface fits into a corresponding media independent interface on the CsmaChannel. You will not find the equivalent of a 10BASE-T or a 1000BASE-LX PHY.

The CsmaNetDevice calls the CsmaChannel through a media independent interface. There is a method defined to tell the channel when to start “wiggling the wires” using the method CsmaChannel::TransmitStart, and a method to tell the channel when the transmission process is done and the channel should begin propagating the last bit across the “wire”: CsmaChannel::TransmitEnd.

When the TransmitEnd method is executed, the channel will model a single uniform signal propagation delay in the medium and deliver copies of the packet to each of the devices attached to the channel via the CsmaNetDevice::Receive method.

There is a “pin” in the device media independent interface corresponding to “COL” (collision). The state of the channel may be sensed by calling CsmaChannel::GetState. Each device will look at this “pin” before starting a send and will perform appropriate backoff operations if required.

Properly received packets are forwarded up to higher levels from the CsmaNetDevice via a callback mechanism. The callback function is initialized by the higher layer (when the net device is attached) using CsmaNetDevice::SetReceiveCallback and is invoked upon “proper” reception of a packet by the net device in order to forward the packet up the protocol stack.

10.2. CSMA Channel Model

The class CsmaChannel models the actual transmission medium. There is no fixed limit for the number of devices connected to the channel. The CsmaChannel models a data rate and a speed-of-light delay which can be accessed via the attributes “DataRate” and “Delay” respectively. The data rate provided to the channel is used to set the data rates used by the transmitter sections of the CSMA devices connected to the channel. There is no way to independently set data rates in the devices. Since the data rate is only used to calculate a delay time, there is no limitation (other than by the data type holding the value) on the speed at which CSMA channels and devices can operate; and no restriction based on any kind of PHY characteristics.

The CsmaChannel has three states, IDLE, TRANSMITTING and PROPAGATING. These three states are “seen” instantaneously by all devices on the channel. By this we mean that if one device begins or ends a simulated transmission, all devices on the channel are immediately aware of the change in state. There is no time during which one device may see an IDLE channel while another device physically further away in the collision domain may have begun transmitting with the associated signals not propagated down the channel to other devices. Thus there is no need for collision detection in the CsmaChannel model and it is not implemented in any way.

We do, as the name indicates, have a Carrier Sense aspect to the model. Since the simulator is single threaded, access to the common channel will be serialized by the simulator. This provides a deterministic mechanism for contending for the channel. The channel is allocated (transitioned from state IDLE to state TRANSMITTING) on a first-come first-served basis. The channel always goes through a three state process:

IDLE -> TRANSMITTING -> PROPAGATING -> IDLE

The TRANSMITTING state models the time during which the source net device is actually wiggling the signals on the wire. The PROPAGATING state models the time after the last bit was sent, when the signal is propagating down the wire to the “far end.”
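The three-state cycle can be sketched as a tiny state machine. This is a simplified standalone model, not the CsmaChannel class: ToyChannel and PropagationComplete are hypothetical names, and the real channel also tracks packets and devices and schedules the end of propagation as a simulator event.

```cpp
enum class WireState
{
    IDLE,
    TRANSMITTING,
    PROPAGATING
};

// Simplified stand-in for the channel: only the state transitions are modeled.
class ToyChannel
{
  public:
    // Claim the channel; returns false if it is busy, so the caller must back off.
    bool TransmitStart()
    {
        if (m_state != WireState::IDLE)
        {
            return false;
        }
        m_state = WireState::TRANSMITTING;
        return true;
    }

    // The last bit has been put on the wire; the signal is now propagating.
    void TransmitEnd()
    {
        if (m_state == WireState::TRANSMITTING)
        {
            m_state = WireState::PROPAGATING;
        }
    }

    // The propagation delay has elapsed; the channel is free again.
    void PropagationComplete()
    {
        if (m_state == WireState::PROPAGATING)
        {
            m_state = WireState::IDLE;
        }
    }

    WireState GetState() const
    {
        return m_state;
    }

  private:
    WireState m_state = WireState::IDLE;
};
```

Because the first caller of TransmitStart on an IDLE channel wins and every later caller is refused until the cycle completes, access is granted on a first-come first-served basis.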

The transition to the TRANSMITTING state is driven by a call to CsmaChannel::TransmitStart which is called by the net device that transmits the packet. It is the responsibility of that device to end the transmission with a call to CsmaChannel::TransmitEnd at the appropriate simulation time that reflects the time elapsed to put all of the packet bits on the wire. When TransmitEnd is called, the channel schedules an event corresponding to a single speed-of-light delay. This delay applies to all net devices on the channel identically. You can think of a symmetrical hub in which the packet bits propagate to a central location and then back out equal length cables to the other devices on the channel. The single “speed of light” delay then corresponds to the time it takes for: 1) a signal to propagate from one CsmaNetDevice through its cable to the hub; plus 2) the time it takes for the hub to forward the packet out a port; plus 3) the time it takes for the signal in question to propagate to the destination net device.

The CsmaChannel models a broadcast medium so the packet is delivered to all of the devices on the channel (including the source) at the end of the propagation time. It is the responsibility of the sending device to determine whether or not it receives a packet broadcast over the channel.

The CsmaChannel provides the following Attributes:

  • DataRate: The bitrate for packet transmission on connected devices;

  • Delay: The speed of light transmission delay for the channel.

10.3. CSMA Net Device Model

The CSMA network device appears somewhat like an Ethernet device. The CsmaNetDevice provides the following Attributes:

  • Address: The Mac48Address of the device;

  • SendEnable: Enable packet transmission if true;

  • ReceiveEnable: Enable packet reception if true;

  • EncapsulationMode: Type of link layer encapsulation to use;

  • RxErrorModel: The receive error model;

  • TxQueue: The transmit queue used by the device;

  • InterframeGap: The optional time to wait between “frames”;

  • Rx: A trace source for received packets;

  • Drop: A trace source for dropped packets.

The CsmaNetDevice supports the assignment of a “receive error model.” This is an ErrorModel object that is used to simulate data corruption on the link.

Packets sent over the CsmaNetDevice are always routed through the transmit queue to provide a trace hook for packets sent out over the network. This transmit queue can be set (via attribute) to model different queuing strategies.

Also configurable by attribute is the encapsulation method used by the device. Every packet gets an EthernetHeader that includes the destination and source MAC addresses, and a length/type field. Every packet also gets an EthernetTrailer which includes the FCS. Data in the packet may be encapsulated in different ways.

By default, or by setting the “EncapsulationMode” attribute to “Dix”, the encapsulation is according to the DEC, Intel, Xerox standard. This is sometimes called EthernetII framing and is the familiar destination MAC, source MAC, EtherType, Data, CRC format.

If the “EncapsulationMode” attribute is set to “Llc”, the encapsulation is by LLC SNAP. In this case, a SNAP header is added that contains the EtherType (IP or ARP).

The other implemented encapsulation modes are IP_ARP (set “EncapsulationMode” to “IpArp”) in which the length type of the Ethernet header receives the protocol number of the packet; or ETHERNET_V1 (set “EncapsulationMode” to “EthernetV1”) in which the length type of the Ethernet header receives the length of the packet. A “Raw” encapsulation mode is defined but not implemented – use of the RAW mode results in an assertion.

Note that all net devices on a channel must be set to the same encapsulation mode for correct results. The encapsulation mode is not sensed at the receiver.

The CsmaNetDevice implements a random exponential backoff algorithm that is executed if the channel is determined to be busy (TRANSMITTING or PROPAGATING) when the device wants to start propagating. This results in a random delay of up to pow (2, retries) - 1 microseconds before a retry is attempted. The default maximum number of retries is 1000.
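
As a rough illustration of the bound quoted above, the following standalone C++ sketch computes the largest delay, pow (2, retries) - 1 microseconds, and draws a uniformly distributed delay below it. It is not the ns-3 backoff implementation: the function names, the uniform draw and the exponent ceiling of 10 are assumptions made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <random>

// Upper bound on the backoff delay, in microseconds, after `retries`
// failed attempts: pow(2, retries) - 1. The exponent is clamped to
// `ceiling` (an assumption of this sketch) so the shift cannot overflow.
std::uint64_t MaxBackoffMicroseconds(std::uint32_t retries, std::uint32_t ceiling = 10)
{
    assert(ceiling < 64);
    const std::uint32_t exponent = std::min(retries, ceiling);
    return (std::uint64_t{1} << exponent) - 1;
}

// Draw a delay uniformly from [0, MaxBackoffMicroseconds(retries)].
std::uint64_t DrawBackoffMicroseconds(std::uint32_t retries, std::mt19937_64& rng)
{
    std::uniform_int_distribution<std::uint64_t> dist(0, MaxBackoffMicroseconds(retries));
    return dist(rng);
}
```

With these definitions, the third retry waits at most 7 microseconds, and every retry from the tenth onward waits at most 1023 microseconds.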

10.4. Using the CsmaNetDevice

The CSMA net devices and channels are typically created and configured using the associated CsmaHelper object. The various ns-3 device helpers generally work in a similar way, and their use is seen in many of our example programs.

The conceptual model of interest is that of a bare computer “husk” into which you plug net devices. The bare computers are created using a NodeContainer helper. You just ask this helper to create as many computers (we call them Nodes) as you need on your network:

NodeContainer csmaNodes;
csmaNodes.Create(nCsmaNodes);

Once you have your nodes, you need to instantiate a CsmaHelper and set any attributes you may want to change:

CsmaHelper csma;
csma.SetChannelAttribute("DataRate", StringValue("100Mbps"));
csma.SetChannelAttribute("Delay", TimeValue(NanoSeconds(6560)));

csma.SetDeviceAttribute("EncapsulationMode", StringValue("Dix"));
csma.SetDeviceAttribute("Mtu", UintegerValue(2000));

Once the attributes are set, all that remains is to create the devices and install them on the required nodes, and to connect the devices together using a CSMA channel. When we create the net devices, we add them to a container to allow you to use them in the future. This all takes just one line of code:

NetDeviceContainer csmaDevices = csma.Install(csmaNodes);

We recommend thinking carefully about changing these Attributes, since it can result in behavior that surprises users. We allow this because we believe flexibility is important. As an example of a possibly surprising effect of changing Attributes, consider the following:

The Mtu Attribute indicates the Maximum Transmission Unit to the device. This is the size of the largest Protocol Data Unit (PDU) that the device can send. This Attribute defaults to 1500 bytes and corresponds to a number found in RFC 894, “A Standard for the Transmission of IP Datagrams over Ethernet Networks.” The number is actually derived from the maximum packet size for 10Base5 (full-spec Ethernet) networks – 1518 bytes. If you subtract DIX encapsulation overhead for Ethernet packets (18 bytes) you will end up with a maximum possible data size (MTU) of 1500 bytes. One can also find that the MTU for IEEE 802.3 networks is 1492 bytes. This is because LLC/SNAP encapsulation adds an extra eight bytes of overhead to the packet. In both cases, the underlying network hardware is limited to 1518 bytes, but the MTU is different because the encapsulation is different.

If one leaves the Mtu Attribute at 1500 bytes and changes the encapsulation mode Attribute to Llc, the result will be a network that encapsulates 1500 byte PDUs with LLC/SNAP framing resulting in packets of 1526 bytes. This would be illegal in many networks, but we allow you to do this. This results in a simulation that quite subtly does not reflect what you might be expecting since a real device would balk at sending a 1526 byte packet.
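
The arithmetic in the two paragraphs above can be checked with a small standalone C++ sketch. The overhead figures (18 bytes for DIX, eight more for LLC/SNAP) are the ones quoted in the text; the type and function names are hypothetical and are not part of the ns-3 API.

```cpp
#include <cassert>
#include <cstdint>

enum class Encapsulation { Dix, Llc };

// Per-frame overhead in bytes, using the figures quoted in the text:
// 18 bytes for DIX framing, plus 8 more bytes for LLC/SNAP.
std::uint32_t OverheadBytes(Encapsulation mode)
{
    return mode == Encapsulation::Llc ? 18 + 8 : 18;
}

// Size of the frame that carries a PDU of `mtu` bytes.
std::uint32_t FrameBytes(std::uint32_t mtu, Encapsulation mode)
{
    return mtu + OverheadBytes(mode);
}

// Largest MTU that still fits in a frame of `maxFrame` bytes.
std::uint32_t LargestMtu(std::uint32_t maxFrame, Encapsulation mode)
{
    assert(maxFrame >= OverheadBytes(mode));
    return maxFrame - OverheadBytes(mode);
}
```

Here FrameBytes(1500, Encapsulation::Llc) gives the 1526 byte packet discussed above, while LargestMtu(1518, Encapsulation::Llc) gives the 1492 byte MTU of IEEE 802.3 networks.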

There also exist jumbo frames (1500 < MTU <= 9000 bytes) and super-jumbo (MTU > 9000 bytes) frames that are not officially sanctioned by IEEE but are available in some high-speed (Gigabit) networks and NICs. In the CSMA model, one could leave the encapsulation mode set to Dix, and set the Mtu to 64000 bytes – even though an associated CsmaChannel DataRate was left at 10 megabits per second (certainly not Gigabit Ethernet). This would essentially model an Ethernet switch made out of vampire-tapped 1980s-style 10Base5 networks that support super-jumbo datagrams, which is certainly not something that was ever made, nor is likely to ever be made; however it is quite easy for you to configure.

Be careful about assumptions regarding what CSMA is actually modelling and how configuration (Attributes) may allow you to swerve considerably away from reality.

10.5. CSMA Tracing

Like all ns-3 devices, the CSMA Model provides a number of trace sources. These trace sources can be hooked using your own custom trace code, or you can use our helper functions to arrange for tracing to be enabled on devices you specify.

10.5.1. Upper-Level (MAC) Hooks

From the point of view of tracing in the net device, there are several interesting points to insert trace hooks. A convention inherited from other simulators is that packets destined for transmission onto attached networks pass through a single “transmit queue” in the net device. We provide trace hooks at this point in packet flow, which corresponds (abstractly) only to a transition from the network to data link layer, and call them collectively the device MAC hooks.

When a packet is sent to the CSMA net device for transmission it always passes through the transmit queue. The transmit queue in the CsmaNetDevice inherits from Queue, and therefore inherits three trace sources:

  • An Enqueue operation source (see Queue::m_traceEnqueue);

  • A Dequeue operation source (see Queue::m_traceDequeue);

  • A Drop operation source (see Queue::m_traceDrop).

The upper-level (MAC) trace hooks for the CsmaNetDevice are, in fact, exactly these three trace sources on the single transmit queue of the device.

The m_traceEnqueue event is triggered when a packet is placed on the transmit queue. This happens at the time that CsmaNetDevice::Send or CsmaNetDevice::SendFrom is called by a higher layer to queue a packet for transmission.

The m_traceDequeue event is triggered when a packet is removed from the transmit queue. Dequeues from the transmit queue can happen in three situations: 1) If the underlying channel is idle when the CsmaNetDevice::Send or CsmaNetDevice::SendFrom is called, a packet is dequeued from the transmit queue and immediately transmitted; 2) If the underlying channel is idle, a packet may be dequeued and immediately transmitted in an internal TransmitCompleteEvent that functions much like a transmit complete interrupt service routine; or 3) from the random exponential backoff handler if a timeout is detected.

Case (3) implies that a packet is dequeued from the transmit queue if it is unable to be transmitted according to the backoff rules. It is important to understand that this will appear as a Dequeued packet and it is easy to incorrectly assume that the packet was transmitted since it passed through the transmit queue. In fact, a packet is actually dropped by the net device in this case. This behavior follows from the definition of the Queue Drop event. The m_traceDrop event is, by definition, fired when a packet cannot be enqueued on the transmit queue because it is full. This event only fires if the queue is full and we do not overload this event to indicate that the CsmaChannel is “full.”
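
To make the distinction concrete, here is a minimal standalone C++ sketch of a queue with three hooks that behave as described above. TinyTxQueue and its callbacks are hypothetical stand-ins and not the ns-3 Queue class; the point is only that the Drop hook fires on a full queue, while the Dequeue hook fires regardless of what later happens to the packet.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <string>

// Hypothetical stand-in for a transmit queue with three trace hooks.
class TinyTxQueue
{
  public:
    using Trace = std::function<void(const std::string&)>;

    TinyTxQueue(std::size_t capacity, Trace onEnqueue, Trace onDequeue, Trace onDrop)
        : m_capacity(capacity), m_onEnqueue(onEnqueue), m_onDequeue(onDequeue), m_onDrop(onDrop)
    {
        assert(capacity > 0);
    }

    // The Drop hook fires only when the queue is full.
    bool Enqueue(const std::string& packet)
    {
        if (m_packets.size() >= m_capacity)
        {
            m_onDrop(packet);
            return false;
        }
        m_packets.push_back(packet);
        m_onEnqueue(packet);
        return true;
    }

    // The Dequeue hook fires whether the caller goes on to transmit the
    // packet or discards it after a backoff timeout.
    bool Dequeue(std::string& packet)
    {
        if (m_packets.empty())
        {
            return false;
        }
        packet = m_packets.front();
        m_packets.pop_front();
        m_onDequeue(packet);
        return true;
    }

  private:
    std::size_t m_capacity;
    std::deque<std::string> m_packets;
    Trace m_onEnqueue;
    Trace m_onDequeue;
    Trace m_onDrop;
};
```

A trace consumer that counts transmitted packets from Dequeue events alone would therefore overcount whenever the backoff handler gives up on a packet.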

10.5.2. Lower-Level (PHY) Hooks

Similar to the upper level trace hooks, there are trace hooks available at the lower levels of the net device. We call these the PHY hooks. These events fire from the device methods that talk directly to the CsmaChannel.

The trace source m_dropTrace is called to indicate a packet that is dropped by the device. This happens in two cases: First, if the receive side of the net device is not enabled (see CsmaNetDevice::m_receiveEnable and the associated attribute “ReceiveEnable”).

The m_dropTrace is also used to indicate that a packet was discarded as corrupt if a receive error model is used (see CsmaNetDevice::m_receiveErrorModel and the associated attribute “ReceiveErrorModel”).

The other low-level trace source fires on reception of an accepted packet (see CsmaNetDevice::m_rxTrace). A packet is accepted if it is destined for the broadcast address, a multicast address, or to the MAC address assigned to the net device.
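
The acceptance rule can be sketched in standalone C++ as follows. This is an illustration rather than the CsmaNetDevice code: MacAddress and the function names are hypothetical, and multicast addresses are assumed to be recognised by the group bit (the least significant bit of the first octet), as in standard Ethernet addressing.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using MacAddress = std::array<std::uint8_t, 6>;

// True for ff:ff:ff:ff:ff:ff.
bool IsBroadcast(const MacAddress& address)
{
    for (std::uint8_t octet : address)
    {
        if (octet != 0xff)
        {
            return false;
        }
    }
    return true;
}

// Assumption: group (multicast) addresses have the least significant
// bit of the first octet set.
bool IsGroup(const MacAddress& address)
{
    return (address[0] & 0x01) != 0;
}

// A packet is accepted if it is addressed to the broadcast address,
// to a multicast address, or to the device's own address.
bool IsAccepted(const MacAddress& destination, const MacAddress& own)
{
    assert(!IsGroup(own)); // a device address is expected to be unicast
    return IsBroadcast(destination) || IsGroup(destination) || destination == own;
}
```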

10.6. Summary

The ns-3 CSMA model is a simplistic model of an Ethernet-like network. It supports a Carrier-Sense function and allows for Multiple Access to a shared medium. It is not physical in the sense that the state of the medium is instantaneously shared among all devices. This means that there is no collision detection required in this model and none is implemented. There will never be a “jam” of a packet already on the medium. Access to the shared channel is on a first-come first-served basis as determined by the simulator scheduler. If the channel is determined to be busy by looking at the global state, a random exponential backoff is performed and a retry is attempted.

ns-3 Attributes provide a mechanism for setting various parameters in the device and channel such as addresses, encapsulation modes and error model selection. Trace hooks are provided in the usual manner with a set of upper level hooks corresponding to a transmit queue and used in ASCII tracing; and also a set of lower level hooks used in pcap tracing.

Although the ns-3 CsmaChannel and CsmaNetDevice do not model any kind of network you could build or buy, they do provide us with some useful functionality. You should, however, understand that the model is explicitly not Ethernet or any flavor of IEEE 802.3 but an interesting subset.

11. DSDV Routing

Destination-Sequenced Distance Vector (DSDV) routing protocol is a pro-active, table-driven routing protocol for MANETs developed by Charles E. Perkins and Pravin Bhagwat in 1994. It uses the hop count as metric in route selection.

This model was developed by the ResiliNets research group at the University of Kansas. A paper on this model exists at this URL.

11.1. DSDV Routing Overview

DSDV Routing Table: Every node maintains a table listing all the other nodes it knows of, either directly or through some neighbors. Each known node has a single entry in the routing table. The entry holds the node’s IP address, last known sequence number and the hop count to reach that node. Along with these details, the table also keeps track of the next-hop neighbor to reach the destination node and the timestamp of the last update received for that node.

The DSDV update message consists of three fields, Destination Address, Sequence Number and Hop Count.

Each node uses two mechanisms to send out the DSDV updates. They are:

  1. Periodic Updates

    Periodic updates are sent out after every m_periodicUpdateInterval (default: 15s). In this update the node broadcasts out its entire routing table.

  2. Trigger Updates

    Trigger Updates are small updates in-between the periodic updates. These updates are sent out whenever a node receives a DSDV packet that caused a change in its routing table. The original paper did not clearly state which changes in the table should cause a DSDV update to be sent out. The current implementation sends out an update regardless of what changed in the routing table.

The updates are accepted based on the metric for a particular node. The first factor determining the acceptance of an update is the sequence number. A node has to accept the update if the sequence number of the update message is higher, irrespective of the metric. If an update with the same sequence number is received, then the update with the least metric (hopCount) is given precedence.
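
The acceptance rule above can be expressed as a short standalone C++ sketch. RouteInfo and ShouldAcceptUpdate are hypothetical names chosen for this illustration and are not taken from the ns-3 DSDV sources.

```cpp
#include <cstdint>

// What a node knows, or is being told, about one destination.
struct RouteInfo
{
    std::uint32_t seqNo;    // destination sequence number
    std::uint32_t hopCount; // metric
};

// Decide whether an incoming update should replace the current entry.
bool ShouldAcceptUpdate(const RouteInfo& current, const RouteInfo& update)
{
    if (update.seqNo > current.seqNo)
    {
        return true; // fresher information wins, whatever the metric
    }
    if (update.seqNo == current.seqNo)
    {
        return update.hopCount < current.hopCount; // same freshness: fewer hops wins
    }
    return false; // older sequence number: stale information
}
```

For instance, an entry with sequence number 10 and hop count 3 is replaced by an update with sequence number 12 and hop count 7, but not by one with sequence number 10 and hop count 5.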

In highly mobile scenarios, there is a high chance of route fluctuations, thus we have the concept of weighted settling time where an update with change in metric will not be advertised to neighbors. The node waits for the settling time to make sure that it did not receive the update from its old neighbor before sending out that update.

The current implementation covers all the above features of DSDV. The current implementation also has a request queue to buffer packets that have no routes to destination. The default is set to buffer up to 5 packets per destination.
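
A per-destination buffer of this kind might look like the following standalone C++ sketch. The class name, the use of an integer as destination identifier and the choice to reject a new packet when the buffer is full are all assumptions of this example; only the default limit of 5 packets per destination comes from the text.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <map>
#include <string>
#include <utility>

// Buffers packets that have no route yet, with a cap per destination.
class RequestQueue
{
  public:
    explicit RequestQueue(std::size_t maxPerDestination = 5)
        : m_maxPerDestination(maxPerDestination)
    {
        assert(maxPerDestination > 0);
    }

    // Returns false (packet not buffered) once the destination's buffer is full.
    bool Enqueue(std::uint32_t destination, const std::string& packet)
    {
        std::deque<std::string>& pending = m_pending[destination];
        if (pending.size() >= m_maxPerDestination)
        {
            return false;
        }
        pending.push_back(packet);
        return true;
    }

    // Called once a route is known: hands back everything that was buffered.
    std::deque<std::string> Flush(std::uint32_t destination)
    {
        std::deque<std::string> released;
        auto it = m_pending.find(destination);
        if (it != m_pending.end())
        {
            released = std::move(it->second);
            m_pending.erase(it);
        }
        return released;
    }

  private:
    std::size_t m_maxPerDestination;
    std::map<std::uint32_t, std::deque<std::string>> m_pending;
};
```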

11.2. References

Link to the Paper: http://portal.acm.org/citation.cfm?doid=190314.190336

12. DSR Routing

Dynamic Source Routing (DSR) protocol is a reactive routing protocol designed specifically for use in multi-hop wireless ad hoc networks of mobile nodes.

This model was developed by the ResiliNets research group at the University of Kansas.

12.1. DSR Routing Overview

This model implements the base specification of the Dynamic Source Routing (DSR) protocol. Implementation is based on RFC 4728, with some extensions and modifications to the RFC specifications.

DSR operates on demand. Therefore, our DSR model buffers all packets while a route request packet (RREQ) is disseminated. We implement a packet buffer in dsr-rsendbuff.cc. The packet queue implements garbage collection of old packets and a queue size limit. When a packet is sent out from the send buffer, it is queued in the maintenance buffer for next hop acknowledgment.

The maintenance buffer then buffers the already sent out packets and waits for the notification of packet delivery. Protocol operation strongly depends on the broken link detection mechanism. We implement the three heuristics recommended by the RFC as follows:

First, we use link layer feedback when possible, which is also the fastest of these three mechanisms to detect link errors. A link is considered to be broken if frame transmission results in a transmission failure for all retries. This mechanism is meant for active links and detects failures much faster than is possible in its absence. DSR is able to detect the link layer transmission failure and mark the link as broken. Recalculation of routes will be triggered when needed. If the user does not want to use link layer acknowledgment, it can be disabled by setting the “LinkAcknowledgment” attribute to false in “dsr-routing.cc”.

Second, passive acknowledgment should be used whenever possible. The node turns on “promiscuous” receive mode, in which it can receive packets not destined for itself, and when the node assures the delivery of that data packet to its destination, it cancels the passive acknowledgment timer.

Last, we use a network layer acknowledgment scheme to notify the receipt of a packet. Route request packets are not acknowledged or retransmitted.

The Route Cache implementation supports garbage collection of old entries and a state machine, as defined in the standard. It is implemented as an STL map container. The key is the destination IP address.

DSR operates with direct access to the IP header, and operates between the network and transport layers. When a packet is sent out from the transport layer, it is passed to DSR and a DSR header is appended.

We have two caching mechanisms: path cache and link cache. The path cache saves the whole path in the cache. The paths are sorted based on the hop count, and whenever one path is not able to be used, we change to the next path. The link cache is a slightly better design in the sense that it uses different subpaths. The link cache is implemented using the Dijkstra algorithm, and this part is implemented by Song Luan <lsuper@mail.ustc.edu.cn>.
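
As an illustration of how a link cache can combine individual links into a route, the following standalone C++ sketch runs Dijkstra's algorithm over a set of cached links. It is not the ns-3 implementation: the LinkCache type, the integer node identifiers, the treatment of links as directed and the per-link cost are assumptions of this example.

```cpp
#include <algorithm>
#include <cstdint>
#include <functional>
#include <map>
#include <queue>
#include <utility>
#include <vector>

using NodeId = std::uint32_t;
// For each node, the neighbours reachable over a cached link and the link cost.
using LinkCache = std::map<NodeId, std::vector<std::pair<NodeId, std::uint32_t>>>;

// Returns the lowest-cost path from source to destination (both included),
// or an empty vector if the cached links do not connect them.
std::vector<NodeId> ShortestPath(const LinkCache& links, NodeId source, NodeId destination)
{
    using Entry = std::pair<std::uint64_t, NodeId>; // (cost so far, node)
    std::map<NodeId, std::uint64_t> dist;
    std::map<NodeId, NodeId> previous;
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> frontier;

    dist[source] = 0;
    frontier.push(Entry(0, source));
    while (!frontier.empty())
    {
        const Entry top = frontier.top();
        frontier.pop();
        const std::uint64_t cost = top.first;
        const NodeId node = top.second;
        if (cost > dist[node])
        {
            continue; // outdated queue entry
        }
        if (node == destination)
        {
            break; // the destination's cost is final
        }
        const auto it = links.find(node);
        if (it == links.end())
        {
            continue; // no cached links leave this node
        }
        for (const auto& link : it->second)
        {
            const std::uint64_t candidate = cost + link.second;
            const auto known = dist.find(link.first);
            if (known == dist.end() || candidate < known->second)
            {
                dist[link.first] = candidate;
                previous[link.first] = node;
                frontier.push(Entry(candidate, link.first));
            }
        }
    }

    if (dist.find(destination) == dist.end())
    {
        return {};
    }
    std::vector<NodeId> path;
    for (NodeId at = destination; at != source; at = previous[at])
    {
        path.push_back(at);
    }
    path.push_back(source);
    std::reverse(path.begin(), path.end());
    return path;
}
```

Because routes are assembled from individual links, a link learned from one overheard route can be reused as part of a route to a different destination, which a path cache cannot do.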

The following optional protocol optimizations aren’t implemented:

  • Flow state

  • First Hop External (F), Last Hop External (L) flags

  • Handling unknown DSR options

  • Two types of error headers:
    1. flow state not supported option

    2. unsupported option (not going to happen in simulation)

12.1.1. DSR update in ns-3.17

We originally used “TxErrHeader” in Ptr<WifiMac> to indicate the transmission error of a specific packet in the link layer; however, it was not working quite correctly, since even when the packet was dropped, this header was not recorded in the trace file. We took a different path to implement the link layer notification mechanism: we look into the trace file for the packet receive event. If we find one receive event for the data packet, we count that as the indicator of successful data delivery.

12.1.2. Useful parameters

+--------------------------+------------------------------------+-------------+
| Parameter                | Description                        | Default     |
+==========================+====================================+=============+
| MaxSendBuffLen           | Maximum number of packets that can | 64          |
|                          | be stored in send buffer           |             |
+--------------------------+------------------------------------+-------------+
| MaxSendBuffTime          | Maximum time packets can be queued | Seconds(30) |
|                          | in the send buffer                 |             |
+--------------------------+------------------------------------+-------------+
| MaxMaintLen              | Maximum number of packets that can | 50          |
|                          | be stored in maintenance buffer    |             |
+--------------------------+------------------------------------+-------------+
| MaxMaintTime             | Maximum time packets can be queued | Seconds(30) |
|                          | in maintenance buffer              |             |
+--------------------------+------------------------------------+-------------+
| MaxCacheLen              | Maximum number of route entries    | 64          |
|                          | that can be stored in route cache  |             |
+--------------------------+------------------------------------+-------------+
| RouteCacheTimeout        | Maximum time a route entry can be  | Seconds(300)|
|                          | kept in the route cache            |             |
+--------------------------+------------------------------------+-------------+
| RreqRetries              | Maximum number of retransmissions  | 16          |
|                          | for request discovery of a route   |             |
+--------------------------+------------------------------------+-------------+
| CacheType                | Use Link Cache or use Path Cache   | "LinkCache" |
|                          |                                    |             |
+--------------------------+------------------------------------+-------------+
| LinkAcknowledgment       | Enable Link layer acknowledgment   | True        |
|                          | mechanism                          |             |
+--------------------------+------------------------------------+-------------+

12.1.3. Implementation modification

  • The DsrFsHeader has three added fields: message type, source id and destination id. These changes are only for post-processing:
    1. Message type is used to distinguish data packets from control packets.

    2. Source id is used to identify the real source of the data packet, since packets are delivered hop-by-hop and the Ipv4Header does not carry the real source and destination IP addresses.

    3. Destination id is needed for the same reason.

  • The Route Reply header is not word-aligned in the DSR RFC; the implementation changes it to be word-aligned.

  • DSR works as a shim header between the transport and network protocols and needs its own forwarding mechanism. Because packet transmission is changed to hop-by-hop delivery, two fields were added to the DSR fixed header to notify packet delivery.

12.1.4. Current Route Cache implementation

This implementation used “path cache”, which is simple to implement and ensures loop-free paths:

  • the path cache has automatic expire policy

  • the cache saves multiple route entries for a certain destination and sorts the entries based on hop counts

  • the MaxEntriesEachDst can be tuned to change the maximum entries saved for a single destination

  • when adding multiple routes for one destination, the routes are compared based on hop count and expire time; the one with the lower hop count or the relatively newer route is favored

  • Future implementation may include “link cache” as another possibility
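
The policy in the list above can be sketched in standalone C++ as follows. The class and its members are hypothetical and simplified: time is a plain number of seconds, the number of nodes in a stored path stands in for the hop count, and, among routes of equal length, the one that expires later is treated as the newer one. Only the MaxEntriesEachDst name comes from the text.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

using NodeId = std::uint32_t;

struct CachedRoute
{
    std::vector<NodeId> hops; // nodes on the path, ending at the destination
    double expiresAt;         // absolute expiry time, in seconds
};

class PathCache
{
  public:
    explicit PathCache(std::size_t maxEntriesEachDst)
        : m_maxEntriesEachDst(maxEntriesEachDst)
    {
        assert(maxEntriesEachDst > 0);
    }

    // Insert a route, keep the list sorted and trim it to MaxEntriesEachDst.
    void Add(NodeId destination, const CachedRoute& route)
    {
        std::vector<CachedRoute>& routes = m_routes[destination];
        routes.push_back(route);
        std::stable_sort(routes.begin(), routes.end(),
                         [](const CachedRoute& a, const CachedRoute& b) {
                             if (a.hops.size() != b.hops.size())
                             {
                                 return a.hops.size() < b.hops.size(); // shorter path first
                             }
                             return a.expiresAt > b.expiresAt; // then the newer route
                         });
        if (routes.size() > m_maxEntriesEachDst)
        {
            routes.resize(m_maxEntriesEachDst);
        }
    }

    // Purge expired routes, then return the best remaining one (or nullptr).
    const CachedRoute* Lookup(NodeId destination, double now)
    {
        auto it = m_routes.find(destination);
        if (it == m_routes.end())
        {
            return nullptr;
        }
        std::vector<CachedRoute>& routes = it->second;
        routes.erase(std::remove_if(routes.begin(), routes.end(),
                                    [now](const CachedRoute& r) { return r.expiresAt <= now; }),
                     routes.end());
        return routes.empty() ? nullptr : &routes.front();
    }

  private:
    std::size_t m_maxEntriesEachDst;
    std::map<NodeId, std::vector<CachedRoute>> m_routes;
};
```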

12.2. DSR Instructions

The following should be kept in mind when running DSR as routing protocol:

  • NodeTraversalTime is the time it takes to traverse two neighboring nodes and should be chosen to fit the transmission range

  • PassiveAckTimeout is the time a packet in the maintenance buffer waits for passive acknowledgment, normally set to twice the NodeTraversalTime

  • RouteCacheTimeout should be set to a smaller value when the nodes’ velocity becomes higher. The default value is 300s.

12.3. Helper

To have a node run DSR, the easiest way would be to use the DsrHelper and DsrMainHelper in your simulation script. For instance:

DsrHelper dsr;
DsrMainHelper dsrMain;
dsrMain.Install(dsr, adhocNodes);

The example scripts inside src/dsr/examples/ demonstrate the use of DSR-based nodes in different scenarios. The helper source can be found inside src/dsr/helper/dsr-main-helper.{h,cc} and src/dsr/helper/dsr-helper.{h,cc}.

12.4. Examples

The example can be found in src/dsr/examples/:

  • dsr.cc uses DSR as the routing protocol within a traditional MANET environment [3].

DSR is also built in the routing comparison case in examples/routing/:

  • manet-routing-compare.cc is a comparison case with built in MANET routing protocols and can generate its own results.

12.5. Validation

This model has been tested as follows:

  • Unit tests have been written to verify the internals of DSR. These can be found in src/dsr/test/dsr-test-suite.cc. These tests verify whether the methods inside the DSR module that deal with the packet buffer and headers work correctly.

  • Simulation cases similar to [3] have been tested and have comparable results.

  • manet-routing-compare.cc has been used to compare DSR with three other routing protocols.

A paper was presented on these results at the Workshop on ns-3 in 2011.

12.6. Limitations

The model is not fully compliant with RFC 4728. As an example, the DSR fixed size header has been extended and is four octets longer than the RFC specification. As a consequence, the DSR headers cannot be correctly decoded by Wireshark.

Full compliance of the model with the RFC is planned for the future.

13. Emulation Overview

ns-3 has been designed for integration into testbed and virtual machine environments. We have addressed this need by providing two kinds of net devices. The first kind of device is a file descriptor net device (FdNetDevice), which is a generic device type that can read and write from a file descriptor. By associating this file descriptor with different things on the host system, different capabilities can be provided. For instance, the FdNetDevice can be associated with an underlying packet socket to provide emulation capabilities. This allows ns-3 simulations to send data on a “real” network. The second kind, called a TapBridge NetDevice, allows a “real” host to participate in an ns-3 simulation as if it were one of the simulated nodes. An ns-3 simulation may be constructed with any combination of simulated or emulated devices.

Note: Prior to ns-3.17, the emulation capability was provided by a special device called an Emu NetDevice; the Emu NetDevice has been replaced by the FdNetDevice.

One of the use-cases we want to support is that of a testbed. A concrete example of an environment of this kind is the ORBIT testbed. ORBIT is a laboratory emulator/field trial network arranged as a two dimensional grid of 400 802.11 radio nodes. We integrate with ORBIT by using their “imaging” process to load and run ns-3 simulations on the ORBIT array. We can use our EmuFdNetDevice to drive the hardware in the testbed and we can accumulate results either using the ns-3 tracing and logging functions, or the native ORBIT data gathering techniques. See http://www.orbit-lab.org/ for details on the ORBIT testbed.

A simulation of this kind is shown in the following figure:

_images/testbed.png

Example Implementation of Testbed Emulation.

You can see that there are separate hosts, each running a subset of a “global” simulation. Instead of an ns-3 channel connecting the hosts, we use real hardware provided by the testbed. This allows ns-3 applications and protocol stacks attached to a simulation node to communicate over real hardware.

We expect the primary use for this configuration will be to generate repeatable experimental results in a real-world network environment that includes all of the ns-3 tracing, logging, visualization and statistics gathering tools.

In what can be viewed as essentially an inverse configuration, we allow “real” machines running native applications and protocol stacks to integrate with an ns-3 simulation. This allows for the simulation of large networks connected to a real machine, and also enables virtualization. A simulation of this kind is shown in the following figure:

_images/emulated-channel.png

Implementation overview of emulated channel.

Here, you will see that there is a single host with a number of virtual machines running on it. An ns-3 simulation is shown running in the virtual machine shown in the center of the figure. This simulation has a number of nodes with associated ns-3 applications and protocol stacks that are talking to an ns-3 channel through native simulated ns-3 net devices.

There are also two virtual machines shown at the far left and far right of the figure. These VMs are running native (Linux) applications and protocol stacks. The VM is connected into the simulation by a Linux Tap net device. The user-mode handler for the Tap device is instantiated in the simulation and attached to a proxy node that represents the native VM in the simulation. These handlers allow the Tap devices on the native VMs to behave as if they were ns-3 net devices in the simulation VM. This, in turn, allows the native software and protocol suites in the native VMs to believe that they are connected to the simulated ns-3 channel.

We expect the typical use case for this environment will be to analyze the behavior of native applications and protocol suites in the presence of large simulated ns-3 networks.

The basic testbed mode of emulation uses raw sockets. Two other variants (netmap-based and DPDK-based emulation) have been recently added; these rely on more recent network interface cards with directly-mapped memory capabilities to improve packet processing efficiency.

For more details, see the sections that follow.

13.1. File Descriptor NetDevice

The src/fd-net-device module provides the FdNetDevice class, which is able to read and write traffic using a file descriptor provided by the user. This file descriptor can be associated to a TAP device, to a raw socket, to a user space process generating/consuming traffic, etc. The user has full freedom to define how external traffic is generated and ns-3 traffic is consumed.

Different mechanisms to associate a simulation to external traffic can be provided through helper classes. Two specific helpers are provided:

  • EmuFdNetDeviceHelper (to associate the ns-3 device with a physical device in the host machine)

  • TapFdNetDeviceHelper (to associate the ns-3 device with the file descriptor from a tap device in the host machine)

13.1.1. Model Description

The source code for this module lives in the directory src/fd-net-device.

The FdNetDevice is a special type of ns-3 NetDevice that reads traffic from and writes traffic to a file descriptor. That is, unlike pure simulation NetDevice objects that exchange frames with a simulated channel, this FdNetDevice directs frames out of the simulation to a file descriptor. The file descriptor may be associated with a Linux TUN/TAP device, a socket, or a user-space process.

It is up to the user of this device to provide a file descriptor. The type of file descriptor being provided determines what is being modelled. For instance, if the file descriptor provides a raw socket to a WiFi card on the host machine, the device being modelled is a WiFi device.

From the conceptual “top” of the device looking down, it looks to the simulated node like a device supporting a 48-bit IEEE MAC address that can be bridged, supports broadcast, and uses IPv4 ARP or IPv6 Neighbor Discovery, although these attributes can be tuned on a per-use-case basis.

13.1.1.1. Design

The FdNetDevice implementation makes use of a reader object, extended from the FdReader class in the ns-3 src/core module, which manages a separate thread from the main ns-3 execution thread, in order to read traffic from the file descriptor.

Upon invocation of the StartDevice method, the reader object is initialized and starts the reading thread. Before device start, a file descriptor must be previously associated to the FdNetDevice with the SetFileDescriptor invocation.

The creation and configuration of the file descriptor can be left to a number of helpers, described in more detail below. When this is done, the invocation of SetFileDescriptor is the responsibility of the helper and must not be directly invoked by the user.

Upon reading an incoming frame from the file descriptor, the reader will pass the frame to the ReceiveCallback method, whose task it is to schedule the reception of the frame by the device as an ns-3 simulation event. Since the new frame is passed from the reader thread to the main ns-3 simulation thread, thread-safety issues are avoided by using the ScheduleWithContext call instead of the regular Schedule call.

In order to avoid overwhelming the scheduler when the incoming data rate is too high, a counter is kept with the number of frames that are currently scheduled to be received by the device. If this counter reaches the value given by the RxQueueSize attribute in the device, then the new frame will be dropped silently.
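
The counter described above can be sketched in standalone C++ as follows. PendingFrameLimiter is a hypothetical class written for this illustration, not the FdNetDevice code; the use of a mutex to protect the counter is an assumption, made because the counter is touched by both the reader thread and the simulation thread.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>

// Bounds the number of frames scheduled for reception but not yet received.
class PendingFrameLimiter
{
  public:
    explicit PendingFrameLimiter(std::uint32_t rxQueueSize)
        : m_limit(rxQueueSize)
    {
    }

    // Called from the reader thread for each frame read from the file
    // descriptor. Returns false when the frame has to be dropped.
    bool TryAdmit()
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        if (m_pending >= m_limit)
        {
            return false;
        }
        ++m_pending;
        return true;
    }

    // Called from the simulation thread once a scheduled frame has been
    // received by the device.
    void Release()
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        assert(m_pending > 0);
        --m_pending;
    }

  private:
    std::mutex m_mutex;
    std::uint32_t m_limit;
    std::uint32_t m_pending = 0;
};
```

The reader would call TryAdmit before scheduling the reception event and discard the frame on a false return; the reception event would call Release.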

The actual reception of the new frame by the device occurs when the scheduled ForwardUp method is invoked by the simulator. This method acts as if a new frame had arrived from a channel attached to the device. The device then decapsulates the frame, removing any layer 2 headers, and forwards it to upper network stack layers of the node. The ForwardUp method will remove the frame headers, according to the frame encapsulation type defined by the EncapsulationMode attribute, and invoke the receive callback passing an IP packet.

An extra header, the PI header, can be present when the file descriptor is associated to a TAP device that was created without setting the IFF_NO_PI flag. This extra header is removed if EncapsulationMode is set to DIXPI value.
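
To give a feel for how much is stripped in each mode, the following standalone C++ sketch adds up the header sizes involved. The sizes are assumptions based on the standard formats (a 14 byte Ethernet II header, an 8 byte LLC/SNAP header and a 4 byte TAP PI header) and are not read from the ns-3 sources; the names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

enum class FdEncapsulation { Dix, Llc, DixPi };

// Bytes of header preceding the IP packet for each encapsulation mode.
// Assumed sizes: Ethernet II header 14, LLC/SNAP header 8, TAP PI header 4.
std::size_t Layer2HeaderBytes(FdEncapsulation mode)
{
    const std::size_t ethernet = 14;
    const std::size_t llcSnap = 8;
    const std::size_t tapPi = 4;
    switch (mode)
    {
    case FdEncapsulation::Dix:
        return ethernet;
    case FdEncapsulation::Llc:
        return ethernet + llcSnap;
    case FdEncapsulation::DixPi:
        return tapPi + ethernet;
    }
    return 0; // not reached for valid modes
}

// Size of the IP packet left once the headers have been removed.
std::size_t PayloadBytes(std::size_t frameBytes, FdEncapsulation mode)
{
    assert(frameBytes >= Layer2HeaderBytes(mode));
    return frameBytes - Layer2HeaderBytes(mode);
}
```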

In the opposite direction, packets generated inside the simulation that are sent out through the device, will be passed to the Send method, which will in turn invoke the SendFrom method. The latter method will add the necessary layer 2 headers, and simply write the newly created frame to the file descriptor.

13.1.1.2. Scope and Limitations

Users of this device are cautioned that there is no flow control across the file descriptor boundary when it is used in emulation mode. That is, in a Linux system, if the speed of writing network packets exceeds the ability of the underlying physical device to buffer the packets, backpressure up to the writing application will be applied to avoid local packet loss. No such flow control is provided across the file descriptor interface, so users must be aware of this limitation.

As explained before, the RxQueueSize attribute limits the number of packets that can be pending to be received by the device. Frames read from the file descriptor while the number of pending packets is at its maximum will be silently dropped.

The mtu of the device defaults to the Ethernet II MTU value. However, helpers are supposed to set the mtu to the right value to reflect the characteristics of the network interface associated to the file descriptor. If no helper is used, then the responsibility of setting the correct mtu value for the device falls back to the user. The size of the read buffer on the file descriptor reader is set to the mtu value in the StartDevice method.

The FdNetDevice class currently supports three encapsulation modes, DIX for Ethernet II frames, LLC for 802.2 LLC/SNAP frames, and DIXPI for Ethernet II frames with an additional TAP PI header. This means that traffic traversing the file descriptor is expected to be Ethernet II compatible. IEEE 802.1q (VLAN) tagging is not supported. Attaching an FdNetDevice to a wireless interface is possible as long as the driver provides Ethernet II frames to the socket API. Note that to associate an FdNetDevice to a wireless card in ad-hoc mode, the MAC address of the device must be set to the real card MAC address; otherwise any incoming traffic addressed to a fake MAC address will be discarded by the driver.

As mentioned before, two specific helpers are provided with the fd-net-device module. Each individual helper (file descriptor type) may have platform limitations. For instance, threading, real-time simulation mode, and the ability to create TUN/TAP devices are prerequisites to using the provided helpers. Support for these modes can be found in the output of the ns3 configure step, e.g.:

Threading Primitives          : enabled
Real Time Simulator           : enabled
Emulated Net Device           : enabled
Tap Bridge                    : enabled

It is important to mention that while testing the FdNetDevice we found an upper bound of 60 Mbps on TCP throughput when using 1 Gb Ethernet links. This limit is most likely due to the processing power of the computers involved in the tests.

13.1.2. Usage

The usage pattern for this type of device is similar to other net devices with helpers that install to node pointers or node containers. When using the base FdNetDeviceHelper, the user is responsible for creating and setting the file descriptor.

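FdNetDeviceHelper helper;
NetDeviceContainer devices = helper.Install(nodes);

// file descriptor generation
...

// fd holds the descriptor produced by the step above
Ptr<FdNetDevice> device = devices.Get(0)->GetObject<FdNetDevice>();
device->SetFileDescriptor(fd);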
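The descriptor generation step is left to the user. One possibility, sketched below under the assumption that the two ends of a Unix socketpair are used, is to create a connected pair of datagram sockets and hand one end to each device. CreateDescriptorPair and Relay are hypothetical helper names; only the POSIX calls are real.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>
#include <vector>

// Creates a connected pair of datagram sockets; fds[0] and fds[1] are the two ends.
bool CreateDescriptorPair(int fds[2])
{
    return socketpair(AF_UNIX, SOCK_DGRAM, 0, fds) == 0;
}

// Writes one frame to txFd and reads it back from rxFd, the way a frame written
// by one device would be read by the device holding the other end.
std::vector<std::uint8_t> Relay(int txFd, int rxFd, const std::vector<std::uint8_t>& frame)
{
    assert(!frame.empty());
    if (write(txFd, frame.data(), frame.size()) != static_cast<ssize_t>(frame.size()))
    {
        return {};
    }
    std::vector<std::uint8_t> buffer(1500); // read buffer sized like an Ethernet MTU
    ssize_t n = read(rxFd, buffer.data(), buffer.size());
    if (n < 0)
    {
        n = 0;
    }
    buffer.resize(static_cast<std::size_t>(n));
    return buffer;
}
```

In a simulation, each element of fds would be passed to SetFileDescriptor on a different device instead of being read and written directly.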

Most commonly, an FdNetDevice will be used to interact with the host system. In these cases it is almost certain that the user will want to run in real-time emulation mode, and to enable checksum computations. The typical program statements are as follows:

GlobalValue::Bind("SimulatorImplementationType", StringValue("ns3::RealtimeSimulatorImpl"));
GlobalValue::Bind("ChecksumEnabled", BooleanValue(true));

The easiest way to set up an experiment that interacts with a Linux host system is to use the Emu and Tap helpers. Perhaps the most unusual part of these helper implementations relates to the requirement for executing some of the code with super-user permissions. Rather than force the user to execute the entire simulation as root, we provide a small “creator” program that runs as root and does any required high-permission sockets work. The easiest way to set the right privileges for the “creator” programs is to pass the --enable-sudo flag when running ns3 configure.

We do a similar thing for both the Emu and the Tap devices. The high-level view is that the CreateFileDescriptor method creates a local interprocess (Unix) socket, forks, and executes the small creation program. The small program, which runs as suid root, creates a raw socket and sends back the raw socket file descriptor over the Unix socket that is passed to it as a parameter. The raw socket is passed as a control message (sometimes called ancillary data) of type SCM_RIGHTS.
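The descriptor hand-off can be sketched with the standard POSIX ancillary-data calls. The code below is an illustration of SCM_RIGHTS passing in general, not the code of the ns-3 creator programs; SendFd, RecvFd and FdControl are hypothetical names.

```cpp
#include <cassert>
#include <cstring>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

// Control buffer large enough for one descriptor, suitably aligned.
union FdControl
{
    char buf[CMSG_SPACE(sizeof(int))];
    cmsghdr align;
};

// Sends fdToSend over the Unix socket sock. Returns true on success.
bool SendFd(int sock, int fdToSend)
{
    assert(sock >= 0 && fdToSend >= 0);
    char payload = 'F'; // one byte of ordinary data accompanies the control message
    iovec iov{};
    iov.iov_base = &payload;
    iov.iov_len = sizeof(payload);

    FdControl control;
    std::memset(&control, 0, sizeof(control));

    msghdr msg{};
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = control.buf;
    msg.msg_controllen = sizeof(control.buf);

    cmsghdr* cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    std::memcpy(CMSG_DATA(cmsg), &fdToSend, sizeof(int));

    return sendmsg(sock, &msg, 0) == static_cast<ssize_t>(sizeof(payload));
}

// Receives a descriptor from the Unix socket sock. Returns -1 on failure.
int RecvFd(int sock)
{
    char payload = 0;
    iovec iov{};
    iov.iov_base = &payload;
    iov.iov_len = sizeof(payload);

    FdControl control;
    std::memset(&control, 0, sizeof(control));

    msghdr msg{};
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = control.buf;
    msg.msg_controllen = sizeof(control.buf);

    if (recvmsg(sock, &msg, 0) <= 0)
    {
        return -1;
    }
    for (cmsghdr* cmsg = CMSG_FIRSTHDR(&msg); cmsg != nullptr; cmsg = CMSG_NXTHDR(&msg, cmsg))
    {
        if (cmsg->cmsg_level == SOL_SOCKET && cmsg->cmsg_type == SCM_RIGHTS)
        {
            int fd = -1;
            std::memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
            return fd;
        }
    }
    return -1;
}
```

The receiver obtains a new descriptor number that refers to the same open socket as the sender's descriptor, which is what lets the unprivileged simulation use a raw socket that was opened by the privileged program.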

13.1.2.1. Helpers

13.1.2.1.1. EmuFdNetDeviceHelper

The EmuFdNetDeviceHelper creates a raw socket to an underlying physical device, and provides the socket descriptor to the FdNetDevice. This allows the ns-3 simulation to read frames from and write frames to a network device on the host.

The emulation helper makes it possible to transparently integrate a simulated ns-3 node into a network composed of real nodes.

+----------------------+     +-----------------------+
|         host 1       |     |         host 2        |
+----------------------+     +-----------------------+
|    ns-3 simulation   |     |                       |
+----------------------+     |         Linux         |
|       ns-3 Node      |     |     Network Stack     |
|  +----------------+  |     |   +----------------+  |
|  |    ns-3 TCP    |  |     |   |       TCP      |  |
|  +----------------+  |     |   +----------------+  |
|  |    ns-3 IP     |  |     |   |       IP       |  |
|  +----------------+  |     |   +----------------+  |
|  |   FdNetDevice  |  |     |   |                |  |
|  |    10.1.1.1    |  |     |   |                |  |
|  +----------------+  |     |   +    ETHERNET    +  |
|  |   raw socket   |  |     |   |                |  |
|--+----------------+--|     |   +----------------+  |
|       | eth0 |       |     |        | eth0 |       |
+-------+------+-------+     +--------+------+-------+

        10.1.1.11                     10.1.1.12

            |                            |
            +----------------------------+

This helper replaces the functionality of the EmuNetDevice found in ns-3 prior to ns-3.17, by bringing this type of device into the common framework of the FdNetDevice. The EmuNetDevice was deprecated in favor of this new helper.

The device is configured to perform MAC spoofing to separate simulation network traffic from other network traffic that may be flowing to and from the host.

One can use this helper in a testbed situation where the host on which the simulation is running has a specific interface of interest which drives the testbed hardware. You would also need to set this specific interface into promiscuous mode and provide an appropriate device name to the ns-3 simulation. Additionally, hardware offloading of segmentation and checksums should be disabled.

The helper only works if the underlying interface is up and in promiscuous mode. Packets will be sent out over the device, but we use MAC spoofing. The MAC addresses will be generated (by default) using the Organizationally Unique Identifier (OUI) 00:00:00 as a base. This vendor code is not assigned to any organization and so should not conflict with any real hardware.

It is always up to the user to determine that using these MAC addresses is acceptable on the network and won’t conflict with anything else (including another simulation using such devices) on that network. If you are using the emulated FdNetDevice configuration in separate simulations, you must consider global MAC address assignment issues and ensure that MAC addresses are unique across all simulations. The emulated net device respects the MAC address provided in the Address attribute so you can do this manually. For larger simulations, you may want to set the OUI in the MAC address allocation function.
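One way to keep addresses unique across simulations is to give each simulation its own OUI and allocate the lower 24 bits sequentially. The sketch below only illustrates that idea; OuiMacAllocator, MacBytes and ToString are hypothetical names, not ns-3 API, and the OUI values are placeholders.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

using MacBytes = std::array<std::uint8_t, 6>;

// Hypothetical allocator: a fixed 3-byte OUI followed by a 24-bit counter.
class OuiMacAllocator
{
  public:
    explicit OuiMacAllocator(const std::array<std::uint8_t, 3>& oui)
        : m_oui(oui)
    {
    }

    MacBytes Allocate()
    {
        ++m_next;
        assert(m_next <= 0xFFFFFFu); // only 24 bits are available below the OUI
        MacBytes mac{};
        mac[0] = m_oui[0];
        mac[1] = m_oui[1];
        mac[2] = m_oui[2];
        mac[3] = static_cast<std::uint8_t>((m_next >> 16) & 0xff);
        mac[4] = static_cast<std::uint8_t>((m_next >> 8) & 0xff);
        mac[5] = static_cast<std::uint8_t>(m_next & 0xff);
        return mac;
    }

  private:
    std::array<std::uint8_t, 3> m_oui;
    std::uint32_t m_next = 0;
};

// Formats an address as colon-separated lowercase hexadecimal.
std::string ToString(const MacBytes& mac)
{
    char text[18];
    std::snprintf(text,
                  sizeof(text),
                  "%02x:%02x:%02x:%02x:%02x:%02x",
                  static_cast<unsigned>(mac[0]),
                  static_cast<unsigned>(mac[1]),
                  static_cast<unsigned>(mac[2]),
                  static_cast<unsigned>(mac[3]),
                  static_cast<unsigned>(mac[4]),
                  static_cast<unsigned>(mac[5]));
    return text;
}
```

Two simulations constructed with the placeholder OUIs 02:00:00 and 02:00:01 (the 0x02 bit in the first octet marks a locally administered address) would never produce the same address; the resulting value could then be set through the Address attribute.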

Before invoking the Install method, the correct device name must be configured on the helper using the SetDeviceName method. The device name is required to identify which physical device should be used to open the raw socket.

EmuFdNetDeviceHelper emu;
emu.SetDeviceName(deviceName);
NetDeviceContainer devices = emu.Install(node);
Ptr<NetDevice> device = devices.Get(0);
device->SetAttribute("Address", Mac48AddressValue(Mac48Address::Allocate()));

13.1.2.1.2. TapFdNetDeviceHelper

A Tap device is a special type of Linux device for which one end of the device appears to the kernel as a virtual net_device, and the other end is provided as a file descriptor to user-space. This file descriptor can be passed to the FdNetDevice. Packets forwarded to the TAP device by the kernel will show up in the FdNetDevice in ns-3.

Users should note that this usage of TAP devices is different from that provided by the TapBridge NetDevice found in src/tap-bridge. The model in this helper is as follows:

+-------------------------------------+
|                host                 |
+-------------------------------------+
|    ns-3 simulation   |              |
+----------------------+              |
|      ns-3 Node       |              |
|  +----------------+  |              |
|  |    ns-3 TCP    |  |              |
|  +----------------+  |              |
|  |    ns-3 IP     |  |              |
|  +----------------+  |              |
|  |   FdNetDevice  |  |              |
|--+----------------+--+    +------+  |
|       | TAP  |            | eth0 |  |
|       +------+            +------+  |
|     192.168.0.1               |     |
+-------------------------------|-----+
                                |
                                |
                                ------------ (Internet) -----

In the above, the configuration requires that the host be able to forward traffic generated by the simulation to the Internet.

The model in TapBridge (in another module) is as follows:

+--------+
|  Linux |
|  host  |                    +----------+
| ------ |                    |   ghost  |
|  apps  |                    |   node   |
| ------ |                    | -------- |
|  stack |                    |    IP    |     +----------+
| ------ |                    |   stack  |     |   node   |
|  TAP   |                    |==========|     | -------- |
| device | <----- IPC ------> |   tap    |     |    IP    |
+--------+                    |  bridge  |     |   stack  |
                              | -------- |     | -------- |
                              |   ns-3   |     |   ns-3   |
                              |   net    |     |   net    |
                              |  device  |     |  device  |
                              +----------+     +----------+
                                   ||               ||
                              +---------------------------+
                              |        ns-3 channel       |
                              +---------------------------+

In the above, packets instead traverse ns-3 NetDevices and Channels.

The usage pattern for this example is that the user sets the MAC address and either (or both) the IPv4 and IPv6 addresses and masks on the device, and the PI header if needed. For example:

TapFdNetDeviceHelper helper;
helper.SetDeviceName(deviceName);
helper.SetModePi(modePi);
helper.SetTapIpv4Address(tapIp);
helper.SetTapIpv4Mask(tapMask);
...
helper.Install(node);

13.1.2.2. Attributes

The FdNetDevice provides a number of attributes:

  • Address: The MAC address of the device

  • Start: The simulation time at which to spin up the device thread

  • Stop: The simulation time at which to stop the device thread

  • EncapsulationMode: Link-layer encapsulation format

  • RxQueueSize: The buffer size of the read queue on the file descriptor thread (default of 1000 packets)

Start and Stop do not normally need to be specified unless the user wants to limit the time during which this device is active. Address needs to be set to some kind of unique MAC address if the simulation will be interacting with other real devices somehow using real MAC addresses. Typical code:

device->SetAttribute("Address", Mac48AddressValue(Mac48Address::Allocate()));

13.1.2.3. Output

ASCII and PCAP tracing is provided, as for the other ns-3 NetDevice types, through the helpers, e.g.:

EmuFdNetDeviceHelper emu;
NetDeviceContainer devices = emu.Install(node);
...
emu.EnablePcap("emu-ping", device, true);

The standard set of Mac-level NetDevice trace sources is provided.

  • MacTx: Trace source triggered when ns-3 provides the device with a new frame to send

  • MacTxDrop: Trace source triggered if the write to the file descriptor fails

  • MacPromiscRx: Trace source triggered whenever any valid MAC frame is received

  • MacRx: Trace source triggered whenever a valid MAC frame is received for this device

  • Sniffer: Non-promiscuous packet sniffer

  • PromiscSniffer: Promiscuous packet sniffer (for tcpdump-like traces)

13.1.2.4. Examples

Several examples are provided:

  • dummy-network.cc: This simple example creates two nodes and interconnects them with a Unix pipe by passing the file descriptors from the socketpair into the FdNetDevice objects of the respective nodes.

  • realtime-dummy-network.cc: Same as dummy-network.cc but uses the real time simulator implementation instead of the default one.

  • fd2fd-onoff.cc: This example is aimed at measuring the throughput of the FdNetDevice in a pure simulation. For this purpose two FdNetDevices, attached to different nodes but in the same simulation, are connected using a socket pair. TCP traffic is sent at a saturating data rate.

  • fd-emu-onoff.cc: This example is aimed at measuring the throughput of the FdNetDevice when using the EmuFdNetDeviceHelper to attach the simulated device to a real device in the host machine. This is achieved by saturating the channel with TCP traffic.

  • fd-emu-ping.cc: This example uses the EmuFdNetDeviceHelper to send ICMP traffic over a real channel.

  • fd-emu-udp-echo.cc: This example uses the EmuFdNetDeviceHelper to send UDP traffic over a real channel.

  • fd-tap-ping.cc: This example uses the TapFdNetDeviceHelper to send ICMP traffic over a real channel.

13.2. Netmap NetDevice

The fd-net-device module provides the NetmapNetDevice class, a class derived from the FdNetDevice which is able to read and write traffic using a netmap file descriptor. This netmap file descriptor must be associated with a real Ethernet device in the host machine. The NetmapNetDeviceHelper class supports the configuration of a NetmapNetDevice.

netmap is a fast packet processing capability that bypasses the host networking stack and gains direct access to the network device. netmap was developed by Luigi Rizzo [Rizzo2012] and is maintained as an open source project on GitHub at https://github.com/luigirizzo/netmap.

The NetmapNetDevice for ns-3 [Imputato2019] was developed by Pasquale Imputato in the 2017-19 timeframe. The use of NetmapNetDevice requires that the host system has netmap support (and, for best performance, that the device uses a netmap-enabled driver). Users can expect emulation using netmap to sustain a higher packet rate than emulation using FdNetDevice with raw sockets (which pass through the Linux kernel networking stack).

Rizzo2012

Luigi Rizzo, “netmap: A Novel Framework for Fast Packet I/O”, Proceedings of 2012 USENIX Annual Technical Conference, June 2012.

Imputato2019

Pasquale Imputato, Stefano Avallone, Enhancing the fidelity of network emulation through direct access to device buffers, Journal of Network and Computer Applications, Volume 130, 2019, Pages 63-75, (http://www.sciencedirect.com/science/article/pii/S1084804519300220)

13.2.1. Model Description

13.2.1.1. Design

Because netmap uses file descriptor based communication to interact with the real device, the straightforward approach to design a new NetDevice around netmap is to have it inherit from the existing FdNetDevice and implement a specialized version of the operations specific to netmap. The operations that require a specialized implementation are the initialization, because the NIC has to be put in netmap mode, and the read/write methods, which have to make use of the netmap API to coordinate the exchange of packets with the netmap rings.

In the initialization stage, the network device is switched to netmap mode, so that ns-3 is able to send/receive packets to/from the real network device by writing/reading them to/from the netmap rings. Following the design of the FdNetDevice, a separate reading thread is started during the initialization. The task of the reading thread is to wait for new incoming packets in the netmap receiver rings, in order to schedule the events of packet reception. In the initialization of the NetmapNetDevice, an additional thread, the sync thread, is started. The sync thread is required because, in order to reduce the cost of the system calls, netmap does not automatically transfer a packet written to a slot of the netmap ring to the transmission ring or to the installed qdisc. It is up to the user process to periodically request a synchronization of the netmap ring. Therefore, the purpose of the sync thread is to periodically make a TXSYNC ioctl request, so that pending packets in the netmap ring are transferred to the transmission ring, if in native mode, or to the installed qdisc, if in generic mode. Also, as described further below, the sync thread is exploited to perform flow control and notify the BQL library about the amount of bytes that have been transferred to the network device.

The read method is called by the reading thread to retrieve new incoming packets stored in the netmap receiver ring and pass them to the appropriate ns-3 protocol handler for further processing within the simulator’s network stack. After retrieving packets, the reading thread also synchronizes the netmap receiver ring, so that the retrieved packets can be removed from the netmap receiver ring.

The NetmapNetDevice also specializes the write method, i.e., the method used to transmit a packet received from the upper layer (the ns-3 traffic control layer). The write method uses the netmap API to write the packet to a free slot in the netmap transmission ring. After writing a packet, the write method checks whether there is enough room in the netmap transmission ring for another packet. If not, the NetmapNetDevice stops its queue so that the ns-3 traffic control layer does not attempt to send a packet that could not be stored in the netmap transmission ring.

A stopped NetmapNetDevice queue needs to be restarted as soon as some room is made in the netmap transmission ring. The sync thread can be exploited for this purpose, given that it periodically synchronizes the netmap transmission ring. In particular, the sync thread also checks the number of free slots in the netmap transmission ring in case the NetmapNetDevice queue is stopped. If the number of free slots exceeds a configurable value, the sync thread restarts the NetmapNetDevice queue and wakes the associated ns-3 qdisc. The NetmapNetDevice also supports BQL: the write method notifies the BQL library of the amount of bytes that have been written to the netmap transmission ring, while the sync thread notifies the BQL library of the amount of bytes that have been removed from the netmap transmission ring and transferred to the NIC since the previous notification.
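The interplay between the write method and the sync thread can be summarized in a single-threaded model. The sketch below is an assumption-laden simplification: NetmapTxModel is a hypothetical class, the ring is reduced to a slot counter, and the BQL notifications are represented by two byte counters rather than calls into a BQL library.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical model of the transmission-side flow control; not the ns-3 class.
class NetmapTxModel
{
  public:
    NetmapTxModel(std::size_t ringSlots, std::size_t wakeThreshold)
        : m_freeSlots(ringSlots),
          m_wakeThreshold(wakeThreshold)
    {
        assert(ringSlots > 0 && wakeThreshold < ringSlots);
    }

    // Write path: store one packet of the given size in a free slot.
    // Returns false if the queue is stopped and the packet is refused.
    bool Write(std::size_t bytes)
    {
        if (m_stopped || m_freeSlots == 0)
        {
            return false;
        }
        --m_freeSlots;
        m_inRing.push_back(bytes);
        m_bytesQueued += bytes; // amount the write method would report as queued
        if (m_freeSlots == 0)
        {
            m_stopped = true; // no room for another packet: stop the queue
        }
        return true;
    }

    // Sync path: nicDrained packets have left the ring since the last sync.
    void Sync(std::size_t nicDrained)
    {
        for (std::size_t i = 0; i < nicDrained && !m_inRing.empty(); ++i)
        {
            m_bytesCompleted += m_inRing.front(); // amount reported as completed
            m_inRing.pop_front();
            ++m_freeSlots;
        }
        if (m_stopped && m_freeSlots > m_wakeThreshold)
        {
            m_stopped = false; // enough room again: restart the queue
        }
    }

    bool Stopped() const { return m_stopped; }
    std::size_t BytesQueued() const { return m_bytesQueued; }
    std::size_t BytesCompleted() const { return m_bytesCompleted; }

  private:
    std::size_t m_freeSlots;
    std::size_t m_wakeThreshold;
    bool m_stopped = false;
    std::size_t m_bytesQueued = 0;
    std::size_t m_bytesCompleted = 0;
    std::deque<std::size_t> m_inRing;
};
```

With 4 slots and a threshold of 2, the fourth write stops the queue; a sync that frees 2 slots leaves it stopped, and it restarts only once a third slot has been freed.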

13.2.1.2. Scope and Limitations

The main purpose of NetmapNetDevice is to support flow control between the physical device and the upper layer while making the best use of the computational resources available to process packets. However, the (Linux) system and network device must support netmap to make use of this feature.

13.2.2. Usage

The installation of netmap itself on a host machine is out of scope for this document. Refer to the netmap GitHub README for instructions.

The ns-3 netmap code has only been tested on Linux; it is not clear whether other operating systems can be supported.

If ns-3 is able to detect the presence of netmap on the system, it will report that:

Netmap emulation FdNetDevice  : enabled

If not, it will report:

Netmap emulation FdNetDevice  : not enabled (needs net/netmap_user.h)

To run FdNetDevice-enabled simulations, one must pass the --enable-sudo option to ./ns3 configure, or else run the simulations with root privileges.

13.2.2.1. Helpers

ns-3 netmap support uses a NetmapNetDeviceHelper helper object to install the NetmapNetDevice. In other respects, the API and usage are similar to those of the EmuFdNetDeviceHelper.

13.2.2.2. Attributes

There is one attribute specialized to NetmapNetDevice, named SyncAndNotifyQueuePeriod. This value takes an integer number of microseconds, and is used as the period of time after which the device syncs the netmap ring and notifies queue status. The value should be close to the interrupt coalescence period of the real device. Users may want to tune this parameter for their own system; it should be a compromise between CPU usage and accuracy in the ring sync (if it is too high, the device goes into starvation and lower throughput occurs).

13.2.2.3. Output

The NetmapNetDevice does not provide any specialized output, but supports the FdNetDevice output and traces (such as a promiscuous sniffer trace).

13.2.2.4. Examples

Several examples are provided:

  • fd-emu-onoff.cc: This example is aimed at measuring the throughput of the NetmapNetDevice when using the NetmapNetDeviceHelper to attach the simulated device to a real device in the host machine. This is achieved by saturating the channel with TCP or UDP traffic.

  • fd-emu-ping.cc: This example uses the NetmapNetDevice to send ICMP traffic over a real device.

  • fd-emu-tc.cc: This example configures a router on a machine with two interfaces in emulated mode through netmap. The aim is to explore the behaviour of different qdiscs on the backlog of the emulated device on the bottleneck side.

  • fd-emu-send.cc: This example builds a node with a device in emulation mode through netmap. The aim is to measure the maximum transmit rate in packets per second (pps) achievable with NetmapNetDevice on a specific machine.

Note that all the examples can run in emulation mode through either netmap (with NetmapNetDevice) or raw sockets (with FdNetDevice).

13.3. DPDK NetDevice

Data Plane Development Kit (DPDK) is a library hosted by The Linux Foundation to accelerate packet processing workloads (https://www.dpdk.org/).

The DpdkNetDevice class provides the implementation of a network device which uses DPDK’s fast packet processing abilities and bypasses the kernel. This class is included in the src/fd-net-device module. The DpdkNetDevice class inherits from the FdNetDevice class and overrides the functions which are required by ns-3 to interact with the DPDK environment.

The DpdkNetDevice for ns-3 [Patel2019] was developed by Harsh Patel, Hrishikesh Hiraskar and Mohit P. Tahiliani. They were supported by Intel Technology India Pvt. Ltd., Bangalore for this work.

Patel2019

Harsh Patel, Hrishikesh Hiraskar, Mohit P. Tahiliani, “Extending Network Emulation Support in ns-3 using DPDK”, Proceedings of the 2019 Workshop on ns-3, ACM, Pages 17-24, (https://dl.acm.org/doi/abs/10.1145/3321349.3321358)

13.3.1. Model Description

DpdkNetDevice is a network device which provides network emulation capabilities, i.e., it allows simulated nodes to interact with real hosts and vice versa. The main feature of the DpdkNetDevice is that it uses the Environment Abstraction Layer (EAL) provided by DPDK to perform fast packet processing. The EAL hides device-specific attributes from applications and provides an interface through which applications can interact directly with the Network Interface Card (NIC). This allows ns-3 to send/receive packets directly to/from the NIC without kernel involvement.

13.3.1.1. Design

DpdkNetDevice is designed to act as an interface between ns-3 and DPDK environment. There are 3 main phases in the life cycle of DpdkNetDevice:

  • Initialization

  • Packet Transfer - Read and Write

  • Termination

13.3.1.1.1. Initialization

The DpdkNetDeviceHelper is responsible for the initialization of the DpdkNetDevice. After this, the EAL is initialized, a memory pool is allocated, access to the Ethernet port is obtained and the port is initialized, reception (Rx) and transmission (Tx) queues are set up on the port, Rx and Tx buffers are set up, and the LaunchCore method is called, which launches the HandleRx method to read packets in bursts.

13.3.1.1.2. Packet Transfer

DPDK handles packets in the form of mbufs, a data structure that it provides, while ns-3 handles packets in the form of raw buffers. The packet transfer functions take care of converting between DPDK mbufs and ns-3 buffers. The functions are read and write.

  • Read: The HandleRx method takes care of reading packets from the NIC and transferring them to the ns-3 Internet stack. This function is called by the LaunchCore method, which is launched during initialization. It continuously polls the NIC for packets to read using the DPDK API. It reads the mbuf packets in bursts from the NIC Rx ring and places them into the Rx buffer. Each mbuf packet in the Rx buffer is then converted to an ns-3 raw buffer and forwarded to the ns-3 Internet stack.

  • Write: The Write method handles transmission of packets. ns-3 provides the packet in the form of a buffer, which is converted to a packet mbuf and placed in the Tx buffer. These packets are transferred to the NIC Tx ring when the Tx buffer is full, from where they are transmitted by the NIC. However, there may not be enough packets to fill the Tx buffer, which leaves stale packet mbufs in the buffer. In such cases, the Write function schedules a manual flush of these stale packet mbufs to the NIC Tx ring, which occurs after a timeout period. The default value of this timeout is 2 ms.
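The flush-on-full and flush-on-timeout behaviour of the write path can be modelled without DPDK. In the sketch below, TxBurstBuffer is a hypothetical class, packets are reduced to a counter, and time is passed in explicitly in microseconds; the burst size of 64 and the timeout of 2000 us used in the usage note correspond to the defaults quoted in this chapter.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical model of a Tx buffer that is flushed when full or when stale.
class TxBurstBuffer
{
  public:
    TxBurstBuffer(std::size_t maxBurst, std::uint64_t timeoutUs)
        : m_maxBurst(maxBurst),
          m_timeoutUs(timeoutUs)
    {
        assert(maxBurst > 0);
    }

    // Buffers one packet at time nowUs.
    // Returns the number of packets handed to the NIC Tx ring by this call.
    std::size_t Write(std::uint64_t nowUs)
    {
        if (m_pending == 0)
        {
            m_deadlineUs = nowUs + m_timeoutUs; // arm the flush timer on the first packet
        }
        ++m_pending;
        if (m_pending >= m_maxBurst)
        {
            return Flush(); // buffer full: transmit the burst immediately
        }
        return 0;
    }

    // Timer callback: flushes stale packets once the deadline has passed.
    std::size_t OnTimer(std::uint64_t nowUs)
    {
        if (m_pending > 0 && nowUs >= m_deadlineUs)
        {
            return Flush();
        }
        return 0;
    }

    std::size_t Pending() const { return m_pending; }

  private:
    std::size_t Flush()
    {
        std::size_t sent = m_pending;
        m_pending = 0;
        return sent;
    }

    std::size_t m_maxBurst;
    std::uint64_t m_timeoutUs;
    std::uint64_t m_deadlineUs = 0;
    std::size_t m_pending = 0;
};
```

With TxBurstBuffer(64, 2000), the 64th consecutive write returns 64, whereas two packets written at 100 us and 150 us stay buffered until a timer call at or after 2100 us. This is why a lower timeout can help low-rate traffic, at the cost of smaller bursts.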

13.3.1.1.3. Termination

When ns-3 is done using DpdkNetDevice, the DpdkNetDevice will stop polling for Rx, free the allocated mbuf packets and then the mbuf pool. Lastly, it will stop the Ethernet device and close the port.

13.3.1.2. Scope and Limitations

The current implementation supports only one NIC bound to DPDK, with a single Rx queue and a single Tx queue on the NIC. This can be extended to support multiple NICs and multiple Rx/Tx queues simultaneously. Currently there is no support for jumbo frames, which can be added. Offloading and scheduling features can also be added. Flow control and support for qdiscs can be added to provide a more extensive model for network testing.

13.3.2. DPDK Installation

This section contains information on downloading DPDK source code and setting up DPDK for DpdkNetDevice to work.

13.3.2.1. Is my NIC supported by DPDK?

Check Supported Devices.

13.3.2.2. Not supported? Use Virtual Machine instead

Install Oracle VM VirtualBox. Create a new VM and install Ubuntu on it. Open settings and create a network adapter with the following configuration:

  • Attached to: Bridged Adapter

  • Name: The host network device you want to use

  • In Advanced
    • Adapter Type: Intel PRO/1000 MT Server (82545EM) or any other DPDK supported NIC

    • Promiscuous Mode: Allow All

    • Select Cable Connected

The rest of the steps are the same as those that follow.

DPDK can be installed in 2 ways:

  • Install DPDK on Ubuntu

  • Compile DPDK from source

13.3.2.3. Install DPDK on Ubuntu

To install DPDK on Ubuntu, run the following command:

apt-get install dpdk dpdk-dev libdpdk-dev dpdk-igb-uio-dkms

Ubuntu 20.04 packages DPDK v19.11 LTS, which has been tested with this module; DpdkNetDevice will only be enabled if this version is available.

13.3.2.4. Compile from Source

To compile DPDK from source, you need to perform the following 4 steps:

13.3.2.4.1. 1. Download the source

Visit the DPDK Downloads page to download the latest stable source. (This module has been tested with version 19.11 LTS and DpdkNetDevice will only be enabled if this version is available.)

13.3.2.4.2. 2. Configure DPDK as a shared library

In the DPDK directory, edit the config/common_base file to change the following line to compile DPDK as a shared library:

# Compile to shared library
CONFIG_RTE_BUILD_SHARED_LIB=y

13.3.2.4.3. 3. Install the source

Refer to Installation for detailed instructions.

For a 64-bit Linux machine with gcc, run:

make install T=x86_64-native-linuxapp-gcc DESTDIR=install

13.3.2.4.4. 4. Export DPDK Environment variables

Export the following environment variables:

  • RTE_SDK as your DPDK source folder.

  • RTE_TARGET as the build target directory.

For example:

export RTE_SDK=/home/username/dpdk/dpdk-stable-19.11.1
export RTE_TARGET=x86_64-native-linuxapp-gcc

(Note: In case DPDK is moved, ns-3 needs to be reconfigured using ./ns3 configure [options])

It is advisable that you export these variables in .bashrc or similar for reusability.

13.3.2.5. Load DPDK Drivers to kernel

Execute the following:

sudo modprobe uio_pci_generic
sudo modprobe uio
sudo modprobe vfio-pci

sudo modprobe igb_uio # for ubuntu package
# OR
sudo insmod $RTE_SDK/$RTE_TARGET/kmod/igb_uio.ko # for dpdk source

These steps must be repeated every time you reboot your system.

13.3.2.6. Configure hugepages

Refer to System Requirements for detailed instructions.

To allocate hugepages at runtime, write a value such as ‘256’ to the following:

echo 256 | sudo tee /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages

To allocate hugepages at boot time, edit /etc/default/grub, and add the following to GRUB_CMDLINE_LINUX_DEFAULT:

hugepages=256

We suggest a minimum of 256 hugepages to run our applications. (This is the number used to test an application running at 1 Gbps on a 1 Gbps NIC.) You can use any number of hugepages based on your system capacity and application requirements.
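To relate the page count to memory, note that the hugepages configured here are 2048 kB each, so 256 of them reserve 512 MiB. The small helpers below (ReservedMiB and PagesNeeded are hypothetical names) make the arithmetic explicit and can be used to pick a page count for a different memory budget.

```cpp
#include <cstdint>

// Memory reserved, in MiB, by a number of hugepages of the given size in kB.
constexpr std::uint64_t ReservedMiB(std::uint64_t pages, std::uint64_t pageSizeKb)
{
    return pages * pageSizeKb / 1024;
}

// Number of hugepages needed to cover wantedMiB, rounded up.
constexpr std::uint64_t PagesNeeded(std::uint64_t wantedMiB, std::uint64_t pageSizeKb)
{
    return (wantedMiB * 1024 + pageSizeKb - 1) / pageSizeKb;
}

// 256 pages of 2048 kB reserve 512 MiB.
static_assert(ReservedMiB(256, 2048) == 512, "256 x 2048 kB = 512 MiB");
```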

Then update the grub configurations using:

sudo update-grub

OR

sudo update-grub2

You will need to reboot your system in order to see these changes.

To check allocation of hugepages, run:

cat /proc/meminfo | grep HugePages

You will see the number of hugepages allocated; it should equal the number you used above.

Once the hugepage memory is reserved (at either runtime or boot time), to make the memory available for DPDK use, perform the following steps:

sudo mkdir /mnt/huge
sudo mount -t hugetlbfs nodev /mnt/huge

The mount point can be made permanent across reboots, by adding the following line to the /etc/fstab file:

nodev /mnt/huge hugetlbfs defaults 0 0

13.3.3. Usage

The status of DPDK support is shown in the output of ./ns3 configure. If it is found, a user should see:

DPDK NetDevice                : enabled

The DpdkNetDeviceHelper class supports the configuration of DpdkNetDevice.

+----------------------+
|         host 1       |
+----------------------+
|   ns-3 simulation    |
+----------------------+
|       ns-3 Node      |
|  +----------------+  |
|  |    ns-3 TCP    |  |
|  +----------------+  |
|  |    ns-3 IP     |  |
|  +----------------+  |
|  |  DpdkNetDevice |  |
|  |    10.1.1.1    |  |
|  +----------------+  |
|  |   raw socket   |  |
|--+----------------+--|
|       | eth0 |       |
+-------+------+-------+

        10.1.1.11

            |
            +-------------- ( Internet ) ----

Initialization of the DPDK driver requires initialization of the EAL. The EAL requires a PMD (Poll Mode Driver) library to use the NIC. DPDK supports multiple Poll Mode Drivers, and you can use one that works for your NIC. The PMD library can be set via DpdkNetDeviceHelper::SetPmdLibrary, as follows:

DpdkNetDeviceHelper* dpdk = new DpdkNetDeviceHelper();
dpdk->SetPmdLibrary("librte_pmd_e1000.so");

Also, the NIC should be bound to a DPDK driver in order to be used with the EAL. The default driver is uio_pci_generic, which supports most NICs. You can change it using DpdkNetDeviceHelper::SetDpdkDriver, as follows:

DpdkNetDeviceHelper* dpdk = new DpdkNetDeviceHelper();
dpdk->SetDpdkDriver("igb_uio");

13.3.3.1. Attributes

The DpdkNetDevice provides a number of attributes:

  • TxTimeout - The time to wait before transmitting a burst from the Tx buffer (in us). (default - 2000) This attribute is only used to flush out the buffer in case it is not filled. This attribute can be decreased for low data rate traffic. For high data rate traffic, this attribute needs no change.

  • MaxRxBurst - Size of Rx Burst. (default - 64) This attribute can be increased for higher data rates.

  • MaxTxBurst - Size of Tx Burst. (default - 64) This attribute can be increased for higher data rates.

  • MempoolCacheSize - Size of mempool cache. (default - 256) This attribute can be increased for higher data rates.

  • NbRxDesc - Number of Rx descriptors. (default - 1024) This attribute can be increased for higher data rates.

  • NbTxDesc - Number of Tx descriptors. (default - 1024) This attribute can be increased for higher data rates.

Note: Default values work well with 1Gbps traffic.

13.3.3.2. Output

As DpdkNetDevice is inherited from FdNetDevice, all the output methods provided by FdNetDevice can be used directly.

13.3.3.3. Examples

The following examples are provided:

  • fd-emu-ping.cc: This example can be configured to use the DpdkNetDevice to send ICMP traffic bypassing the kernel over a real channel.

  • fd-emu-onoff.cc: This example can be configured to measure the throughput of the DpdkNetDevice by sending traffic from the simulated node to a real device using the ns3::OnOffApplication while leveraging DPDK’s fast packet processing abilities. This is achieved by saturating the channel with TCP/UDP traffic.

13.4. Tap NetDevice

The Tap NetDevice can be used to allow a host system or virtual machines to interact with a simulation.

13.4.1. TapBridge Model Overview

The Tap Bridge is designed to integrate “real” internet hosts (or more precisely, hosts that support Tun/Tap devices) into ns-3 simulations. The goal is to make it appear to a “real” host node that it has an ns-3 net device as a local device. The concept of a “real host” is a bit slippery since the “real host” may actually be virtualized using readily available technologies such as VMware, VirtualBox or OpenVZ.

Since we are, in essence, connecting the inputs and outputs of an ns-3 net device to the inputs and outputs of a Linux Tap net device, we call this arrangement a Tap Bridge.

There are three basic operating modes of this device available to users. Basic functionality is essentially identical, but the modes are different in details regarding how the arrangement is created and configured; and what devices can live on which side of the bridge.

We call these three modes the ConfigureLocal, UseLocal and UseBridge modes. The first “word” in the camel case mode identifier indicates who has the responsibility for creating and configuring the taps. For example, the “Configure” in ConfigureLocal mode indicates that it is the TapBridge that has responsibility for configuring the tap. In UseLocal and UseBridge modes, the “Use” prefix indicates that the TapBridge is asked to “Use” an existing configuration.

In other words, in ConfigureLocal mode, the TapBridge has the responsibility for creating and configuring the TAP devices. In UseBridge or UseLocal modes, the user provides a configuration and the TapBridge adapts to that configuration.

13.4.1.1. TapBridge ConfigureLocal Mode

In the ConfigureLocal mode, the configuration of the tap device is ns-3 configuration-centric. Configuration information is taken from a device in the ns-3 simulation and a tap device matching the ns-3 attributes is automatically created. In this case, a Linux computer is made to appear as if it were directly connected to a simulated ns-3 network.

This is illustrated below:

+--------+
|  Linux |
|  host  |                    +----------+
| ------ |                    |   ghost  |
|  apps  |                    |   node   |
| ------ |                    | -------- |
|  stack |                    |    IP    |     +----------+
| ------ |                    |   stack  |     |   node   |
|  TAP   |                    |==========|     | -------- |
| device | <----- IPC ------> |   tap    |     |    IP    |
+--------+                    |  bridge  |     |   stack  |
                              | -------- |     | -------- |
                              |   ns-3   |     |   ns-3   |
                              |   net    |     |   net    |
                              |  device  |     |  device  |
                              +----------+     +----------+
                                   ||               ||
                              +---------------------------+
                              |        ns-3 channel       |
                              +---------------------------+

In this case, the “ns-3 net device” in the “ghost node” appears as if it were actually replacing the TAP device in the Linux host. The ns-3 simulation creates the TAP device on the underlying Linux OS and configures the IP and MAC addresses of the TAP device to match the values assigned to the simulated ns-3 net device. The “IPC” link shown above is the network tap mechanism in the underlying OS. The whole arrangement acts as a conventional bridge; but a bridge between devices that happen to have the same shared MAC and IP addresses.

Here, the user is not required to provide any configuration information specific to the tap. A tap device will be created and configured by ns-3 according to its defaults, and the tap device will have its name assigned by the underlying operating system according to its defaults.

If the user has a requirement to access the created tap device, he or she may optionally provide a “DeviceName” attribute. In this case, the created OS tap device will be named accordingly.

The ConfigureLocal mode is the default operating mode of the Tap Bridge.

13.4.1.2. TapBridge UseLocal Mode

The UseLocal mode is quite similar to the ConfigureLocal mode. The significant difference is, as the mode name implies, the TapBridge is going to “Use” an existing tap device previously created and configured by the user. This mode is particularly useful when a virtualization scheme automatically creates tap devices and ns-3 is used to provide simulated networks for those devices.

+--------+
|  Linux |
|  host  |                    +----------+
| ------ |                    |   ghost  |
|  apps  |                    |   node   |
| ------ |                    | -------- |
|  stack |                    |    IP    |     +----------+
| ------ |                    |   stack  |     |   node   |
|  TAP   |                    |==========|     | -------- |
| device | <----- IPC ------> |   tap    |     |    IP    |
| MAC X  |                    |  bridge  |     |   stack  |
+--------+                    | -------- |     | -------- |
                              |   ns-3   |     |   ns-3   |
                              |   net    |     |   net    |
                              |  device  |     |  device  |
                              |  MAC Y   |     |  MAC Z   |
                              +----------+     +----------+
                                   ||               ||
                              +---------------------------+
                              |        ns-3 channel       |
                              +---------------------------+

In this case, the pre-configured MAC address of the “Tap device” (MAC X) will not be the same as that of the bridged “ns-3 net device” (MAC Y) shown in the illustration above. In order to bridge to ns-3 net devices which do not support SendFrom() (especially wireless STA nodes) we impose a requirement that only one Linux device (with one unique MAC address – here X) generates traffic that flows across the IPC link. This is because the MAC addresses of traffic across the IPC link will be “spoofed” or changed to make it appear to Linux and ns-3 that they have the same address. That is, traffic moving from the Linux host to the ns-3 ghost node will have its MAC address changed from X to Y and traffic from the ghost node to the Linux host will have its MAC address changed from Y to X. Since there is a one-to-one correspondence between devices, there may only be one MAC source flowing from the Linux side. This means that Linux bridges with more than one net device added are incompatible with UseLocal mode.

In UseLocal mode, the user is expected to create and configure a tap device completely outside the scope of the ns-3 simulation using something like:

$ sudo ip tuntap add mode tap tap0
$ sudo ip address add 10.1.1.1/24 dev tap0
$ sudo ip link set dev tap0 address 08:00:2e:00:00:01 up

To tell the TapBridge what is going on, the user sets the “DeviceName” attribute, either directly on the TapBridge or via the TapBridgeHelper. In the case of the configuration above, the “DeviceName” attribute would be set to “tap0” and the “Mode” attribute would be set to “UseLocal”.

One particular use case for this mode is in the OpenVZ environment. There it is possible to create a Tap device on the “Hardware Node” and move it into a Virtual Private Server. If the TapBridge is able to use an existing tap device it is then possible to avoid the overhead of an OS bridge in that environment.

13.4.1.3. TapBridge UseBridge Mode

The simplest mode for those familiar with Linux networking is the UseBridge mode. Again, the “Use” prefix indicates that the TapBridge is going to Use an existing configuration. In this case, the TapBridge is going to logically extend a Linux bridge into ns-3.

This is illustrated below:

+---------+
|  Linux  |                             +----------+
| ------- |                             |   ghost  |
|  apps   |                             |   node   |
| ------- |                             | -------- |
|  stack  |                             |    IP    |     +----------+
| ------- | +--------+                  |   stack  |     |   node   |
| Virtual | |  TAP   |                  |==========|     | -------- |
| Device  | | Device | <---- IPC -----> |   tap    |     |    IP    |
+---------+ +--------+                  |  bridge  |     |   stack  |
    ||          ||                      | -------- |     | -------- |
+--------------------+                  |   ns-3   |     |   ns-3   |
|     OS Bridge      |                  |   net    |     |   net    |
+--------------------+                  |  device  |     |  device  |
                                        +----------+     +----------+
                                             ||               ||
                                        +---------------------------+
                                        |        ns-3 channel       |
                                        +---------------------------+

In this case, a computer running Linux applications, protocols, etc., is connected to an ns-3 simulated network in such a way as to make it appear to the Linux host that the TAP device is a real network device participating in the Linux bridge.

In the ns-3 simulation, a TapBridge is created to match each TAP Device. The name of the TAP Device is assigned to the Tap Bridge using the “DeviceName” attribute. The TapBridge then logically extends the OS bridge to encompass the ns-3 net device.

Since this mode logically extends an OS bridge, there may be many Linux net devices on the non-ns-3 side of the bridge. Therefore, like a net device on any bridge, the ns-3 net device must deal with the possibility of many source addresses. Thus, ns-3 devices must support SendFrom() (NetDevice::SupportsSendFrom() must return true) in order to be configured for use in UseBridge mode.

It is expected that the user will do something like the following to configure the bridge and tap completely outside ns-3:

$ sudo ip link add mybridge type bridge
$ sudo ip address add 10.1.1.1/24 dev mybridge
$ sudo ip tuntap add mode tap mytap
$ sudo ip link set dev mytap address 00:00:00:00:00:01 up
$ sudo ip link set dev mytap master mybridge
$ sudo ip link set dev ... master mybridge
$ sudo ip link set dev mybridge up

To tell the TapBridge what is going on, the user sets the “DeviceName” attribute, either directly on the TapBridge or via the TapBridgeHelper. In the case of the configuration above, the “DeviceName” attribute would be set to “mytap” and the “Mode” attribute would be set to “UseBridge”.

This mode is especially useful in the case of virtualization where the configuration of the virtual hosts may be dictated by another system and not be changeable to suit ns-3. For example, a particular VM scheme may create virtual “vethx” or “vmnetx” devices that appear local to virtual hosts. In order to connect to such systems, one would need to manually create TAP devices on the host system and bridge these TAP devices to the existing (VM) virtual devices. The job of the Tap Bridge in this case is to extend the bridge to join an ns-3 net device.

13.4.1.4. TapBridge ConfigureLocal Operation

In ConfigureLocal mode, the TapBridge, and therefore its associated ns-3 net device, appears to the Linux host computer as a network device just like any arbitrary “eth0” or “ath0” might appear. The creation and configuration of the TAP device is done by the ns-3 simulation and no manual configuration is required by the user. The IP addresses, MAC addresses, gateways, etc., for created TAP devices are extracted from the simulation itself by querying the configuration of the ns-3 device and the TapBridge Attributes.

Since the MAC addresses are identical on the Linux side and the ns-3 side, we can use Send() on the ns-3 device which is available on all ns-3 net devices. Since the MAC addresses are identical there is no requirement to hook the promiscuous callback on the receive side. Therefore there are no restrictions on the kinds of net device that are usable in ConfigureLocal mode.

The TapBridge appears to an ns-3 simulation as a channel-less net device. This device must not have an IP address associated with it, but the bridged (ns-3) net device must have an IP address. Be aware that this is the inverse of an ns-3 BridgeNetDevice (or a conventional bridge in general) which demands that its bridge ports not have IP addresses, but allows the bridge device itself to have an IP address.

The host computer will appear in a simulation as a “ghost” node that contains one TapBridge for each NetDevice that is being bridged. From the perspective of a simulation, the only difference between a ghost node and any other node will be the presence of the TapBridge devices. Note however, that the presence of the TapBridge does affect the connectivity of the net device to the IP stack of the ghost node.

Configuration of address information and the ns-3 devices is not changed in any way if a TapBridge is present. A TapBridge will pick up the addressing information from the ns-3 net device to which it is connected (its “bridged” net device) and use that information to create and configure the TAP device on the real host.

The end result of this is a situation where one can, for example, use the standard ping utility on a real host to ping a simulated ns-3 node. If correct routes are added to the internet host (this is expected to be done automatically in future ns-3 releases), the routing systems in ns-3 will enable correct routing of the packets across simulated ns-3 networks. For an example of this, see the example program, tap-wifi-dumbbell.cc in the ns-3 distribution.

The Tap Bridge lives in a kind of gray world somewhere between a Linux host and an ns-3 bridge device. From the Linux perspective, this code appears as the user mode handler for a TAP net device. In ConfigureLocal mode, this Tap device is automatically created by the ns-3 simulation. When the Linux host writes to one of these automatically created /dev/tap devices, the write is redirected into the TapBridge that lives in the ns-3 world; and from this perspective, the packet write on Linux becomes a packet read in the Tap Bridge. In other words, a Linux process writes a packet to a tap device and this packet is redirected by the network tap mechanism to an ns-3 process where it is received by the TapBridge as a result of a read operation there. The TapBridge then writes the packet to the ns-3 net device to which it is bridged; and therefore it appears as if the Linux host sent a packet directly through an ns-3 net device onto an ns-3 network.

In the other direction, a packet received by the ns-3 net device connected to the Tap Bridge is sent via a receive callback to the TapBridge. The TapBridge then takes that packet and writes it back to the host using the network tap mechanism. This write to the device will appear to the Linux host as if a packet has arrived on a net device; and therefore as if a packet received by the ns-3 net device during a simulation has appeared on a real Linux net device.

The upshot is that the Tap Bridge appears to bridge a tap device on a Linux host in the “real world” to an ns-3 net device in the simulation. Because the TAP device and the bridged ns-3 net device have the same MAC address and the network tap IPC link is not externalized, this particular kind of bridge makes it appear that an ns-3 net device is actually installed in the Linux host.

In order to implement this on the ns-3 side, we need a “ghost node” in the simulation to hold the bridged ns-3 net device and the TapBridge. This node should not actually do anything else in the simulation since its job is simply to make the net device appear in Linux. This is not just arbitrary policy, it is because:

  • Bits sent to the TapBridge from higher layers in the ghost node (using the TapBridge Send method) are completely ignored. The TapBridge is not itself connected to any network, either in Linux or in ns-3. You can never send or receive data over a TapBridge from the ghost node.

  • The bridged ns-3 net device has its receive callback disconnected from the ns-3 node and reconnected to the Tap Bridge. All data received by a bridged device will then be sent to the Linux host and will not be received by the node. From the perspective of the ghost node, you can send over this device but you cannot ever receive.

Of course, if you understand all of the issues you can take control of your own destiny and do whatever you want – we do not actively prevent you from using the ghost node for anything you decide. You will be able to perform typical ns-3 operations on the ghost node if you so desire. The internet stack, for example, must be there and functional on that node in order to participate in IP address assignment and global routing. However, as mentioned above, interfaces talking to any TapBridge or associated bridged net devices will not work completely. If you understand exactly what you are doing, you can set up other interfaces and devices on the ghost node and use them; or take advantage of the operational send side of the bridged devices to create traffic generators. We generally recommend that you treat this node as a ghost of the Linux host and leave it to itself, though.

13.4.1.5. TapBridge UseLocal Mode Operation

As described above, the TapBridge acts like a bridge from the “real” world into the simulated ns-3 world. In the case of the ConfigureLocal mode, life is easy since the IP address of the Tap device matches the IP address of the ns-3 device and the MAC address of the Tap device matches the MAC address of the ns-3 device; and there is a one-to-one relationship between the devices.

Things are slightly complicated when a Tap device is externally configured with a different MAC address than the ns-3 net device. The conventional way to deal with this kind of difference is to use promiscuous mode in the bridged device to receive packets destined for the different MAC address and forward them off to Linux. In order to move packets the other way, the conventional solution is SendFrom() which allows a caller to “spoof” or change the source MAC address to match the different Linux MAC address.

We do have a specific requirement to be able to bridge Linux Virtual Machines onto wireless STA nodes. Unfortunately, the 802.11 spec doesn’t provide a good way to implement SendFrom(), so we have to work around that problem.

To this end, we provide the UseLocal mode of the Tap Bridge. This mode allows you to approach the problem as if you were creating a bridge with a single net device. A single allowed address on the Linux side is remembered in the TapBridge, and all packets coming from the Linux side are repeated out the ns-3 side using the ns-3 device MAC source address. All packets coming in from the ns-3 side are repeated out the Linux side using the remembered MAC address. This allows us to use Send() on the ns-3 device side, which is available on all ns-3 net devices.

UseLocal mode is identical to the ConfigureLocal mode except for the creation and configuration of the tap device and the MAC address spoofing.
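The address handling described for UseLocal mode can be sketched in standalone C++ with no ns-3 dependency. This is only an illustration of the description above, not the TapBridge source code: the Frame struct, the UseLocalRewriter class and its method names are hypothetical, and the choice to leave frames not addressed to the ns-3 device (for example, broadcasts) untouched is an assumption.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical 48-bit MAC address and minimal frame header, for illustration only.
using Mac = std::array<std::uint8_t, 6>;

struct Frame
{
    Mac dst;
    Mac src;
};

// Sketch of UseLocal address handling: a single Linux-side address (X) is
// remembered and frames are rewritten against the ns-3 device address (Y).
class UseLocalRewriter
{
  public:
    explicit UseLocalRewriter(const Mac& ns3DeviceMac)
        : m_ns3Mac(ns3DeviceMac),
          m_linuxMac(),
          m_learned(false)
    {
    }

    // Linux -> ns-3: remember X from the first frame, then replace the source
    // with Y. Returns false if a second Linux-side source address appears,
    // since only one MAC source is allowed in this mode.
    bool FromLinux(Frame& frame)
    {
        if (!m_learned)
        {
            m_linuxMac = frame.src;
            m_learned = true;
        }
        else if (frame.src != m_linuxMac)
        {
            return false;
        }
        frame.src = m_ns3Mac;
        return true;
    }

    // ns-3 -> Linux: frames addressed to Y are re-addressed to the remembered X.
    // Returns false if no Linux-side address has been learned yet.
    bool ToLinux(Frame& frame) const
    {
        if (!m_learned)
        {
            return false;
        }
        if (frame.dst == m_ns3Mac)
        {
            frame.dst = m_linuxMac;
        }
        return true;
    }

  private:
    Mac m_ns3Mac;   // Y: address of the bridged ns-3 net device
    Mac m_linuxMac; // X: the single remembered Linux-side address
    bool m_learned; // true once X has been seen
};
```

Because only one Linux-side address is remembered, a Linux bridge with several attached devices would trip the check in FromLinux(), which mirrors the incompatibility noted for UseLocal mode.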

13.4.1.6. TapBridge UseBridge Operation

As described in the ConfigureLocal mode section, when the Linux host writes to one of the /dev/tap devices, the write is redirected into the TapBridge that lives in the ns-3 world. In the case of the UseBridge mode, these packets will need to be sent out on the ns-3 network as if they were sent on a device participating in the Linux bridge. This means calling the SendFrom() method on the bridged device and providing the source MAC address found in the packet.

In the other direction, a packet received by an ns-3 net device is hooked via callback to the TapBridge. This must be done in promiscuous mode since the goal is to bridge the ns-3 net device onto the OS bridge of which the TAP device is a part.

For these reasons, only ns-3 net devices that support SendFrom() and have a hookable promiscuous receive callback are allowed to participate in UseBridge mode TapBridge configurations.
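The requirements of the three modes can be summarized in a small standalone C++ sketch. The enum and function names below are hypothetical and exist only to restate the rules given in this section; they are not part of the TapBridge API.

```cpp
// Hypothetical enum mirroring the three operating modes named in the text.
enum class TapMode
{
    ConfigureLocal,
    UseLocal,
    UseBridge
};

// Only in ConfigureLocal mode does the TapBridge create and configure the tap;
// in the "Use" modes the user supplies an existing tap device.
constexpr bool CreatesTapDevice(TapMode mode)
{
    return mode == TapMode::ConfigureLocal;
}

// Only UseBridge mode must cope with many source addresses, so only it needs
// SendFrom() and a promiscuous receive hook on the bridged device.
constexpr bool RequiresSendFrom(TapMode mode)
{
    return mode == TapMode::UseBridge;
}

// True if a device with the given capabilities can be bridged in this mode.
constexpr bool IsUsable(TapMode mode, bool supportsSendFrom, bool hasPromiscuousHook)
{
    return !RequiresSendFrom(mode) || (supportsSendFrom && hasPromiscuousHook);
}

// A device without SendFrom() (such as a wireless STA) works in the local modes only.
static_assert(IsUsable(TapMode::ConfigureLocal, false, false), "any device is usable");
static_assert(IsUsable(TapMode::UseLocal, false, false), "any device is usable");
static_assert(!IsUsable(TapMode::UseBridge, false, true), "SendFrom() is required");
```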

13.4.2. Tap Bridge Channel Model

There is no channel model associated with the Tap Bridge. In fact, the intention is to make it appear that the real internet host is connected to the channel of the bridged net device.

13.4.3. Tap Bridge Tracing Model

Unlike most ns-3 devices, the TapBridge does not provide any standard trace sources. This is because the bridge is an intermediary that is essentially one function call away from the bridged device. We expect that the trace hooks in the bridged device will be sufficient for most users.

13.4.4. Using the TapBridge

We expect that most users will interact with the TapBridge device through the TapBridgeHelper. Users of other helper classes, such as CSMA or Wifi, should be comfortable with the idioms used there.

14. Energy Framework

Energy is a key issue for wireless devices, and network researchers often need to investigate the energy consumption at a node or in the overall network while running network simulations in ns-3. This requires ns-3 to support an energy framework. Further, as concepts such as fuel cells and energy scavenging are becoming viable for low power wireless devices, incorporating the effect of these emerging technologies into simulations requires support for diverse energy models in ns-3. The ns-3 energy framework provides the basis for energy storage, consumption, and harvesting.

14.1. Model Description

The framework is implemented in the src/energy/ folder.

The ns-3 energy framework is composed of 3 essential parts:

  • Energy source models. Represent energy storage sources such as batteries or capacitors.

  • Energy consumption models. Represent a portion of a device that draws energy from energy sources. Examples include sensors, radio transceivers, vehicles, UAVs, etc.

  • Energy harvesting models. Represent devices that provide energy to energy sources. For example, solar panels and chargers.

_images/energyFramework.png

ns-3 energy framework

14.1.1. Energy Source Models

An energy source represents a power supply. In ns-3, nodes can have one or more energy sources. Likewise, energy sources can be connected to multiple energy consumption models (device energy models). Connecting an energy source to a device energy model implies that the corresponding device draws power from the source. When energy is completely drained from the energy source, it notifies the device energy models on the node so that each device energy model can react to this event. Further, each node can access the energy source objects for information such as remaining capacity, voltage or state of charge (SoC). This enables the implementation of energy-aware protocols in ns-3.

In order to model a wide range of power supplies such as batteries, the energy source must be able to capture characteristics of these supplies. There are 2 important characteristics or effects related to practical batteries:

  • Rate Capacity Effect. Decrease of battery lifetime when the current draw is higher than the rated value of the battery.

  • Recovery Effect. Increase of battery lifetime when the battery is alternating between discharge and idle states.

In order to incorporate the Rate Capacity Effect, the Energy Source uses the current draw from all the devices on the same node to calculate energy consumption. Moreover, multiple Energy Harvesters can be connected to the Energy Source in order to replenish its energy. The Energy Source periodically polls all the devices and energy harvesters on the same node to calculate the total current drain and hence the energy consumption. When a device changes state, its corresponding Device Energy Model will notify the Energy Source of this change and a new total current draw will be calculated. Similarly, every Energy Harvester update triggers an update to the connected Energy Source.

The EnergySource base class keeps a list of devices (DeviceEnergyModel objects) and energy harvesters (EnergyHarvester objects) that are using the particular Energy Source as power supply. When energy is completely drained, the Energy Source will notify all devices on this list. Each device can then handle this event independently, based on the desired behavior that should be followed in case of power outage.
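The bookkeeping described above can be illustrated with a minimal standalone C++ sketch. It assumes a constant supply voltage and the simple relation E = V · I_total · Δt; the class and method names are hypothetical and do not reproduce the EnergySource API.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified energy source with a constant supply voltage.
class SimpleEnergySource
{
  public:
    SimpleEnergySource(double initialEnergyJ, double supplyVoltageV)
        : m_remainingJ(initialEnergyJ),
          m_voltageV(supplyVoltageV)
    {
        assert(initialEnergyJ >= 0.0 && supplyVoltageV > 0.0);
    }

    // Accounts for elapsedS seconds during which the attached devices drew
    // deviceCurrentsA (amperes) and the attached harvesters supplied
    // harvestedPowerW (watts). Returns true if the source is depleted
    // afterwards, which is when the attached devices would be notified.
    bool Update(const std::vector<double>& deviceCurrentsA, double harvestedPowerW, double elapsedS)
    {
        assert(elapsedS >= 0.0);
        double totalCurrentA = 0.0;
        for (double currentA : deviceCurrentsA)
        {
            totalCurrentA += currentA;
        }
        const double consumedJ = m_voltageV * totalCurrentA * elapsedS;
        const double harvestedJ = harvestedPowerW * elapsedS;
        m_remainingJ += harvestedJ - consumedJ;
        if (m_remainingJ <= 0.0)
        {
            m_remainingJ = 0.0;
            return true;
        }
        return false;
    }

    double GetRemainingEnergyJ() const
    {
        return m_remainingJ;
    }

  private:
    double m_remainingJ; // remaining energy in joules
    double m_voltageV;   // constant supply voltage in volts
};
```

A real battery model would also vary the voltage with the state of charge; the constant-voltage assumption here keeps the arithmetic visible.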

14.1.1.1. Generic Battery Model

The Generic Battery Model can represent 4 basic battery chemistries: Lithium-Ion (LiIon) or Lithium Polymer (LiPo), Nickel Cadmium (NiCd), Lead Acid, and Nickel-Metal Hydride (NiMH). The main differences between these batteries are the shape of the discharge curve under a constant discharge current and the hysteresis phenomenon, which is also modeled for NiCd and NiMH batteries. The Peukert effect, aging, temperature, and variable battery impedance are not considered for any battery type. Batteries with similar discharge behavior can also be represented, but one of the 4 basic archetypes must be chosen.

The Generic Battery Model is based directly on the work of Tremblay et al. Tremblay’s model is in turn based on a popular battery model created by Shepherd. Tremblay’s model consists of visually identifying a set of points on the discharge curves in battery manufacturers’ datasheets.

_images/dischargeCurve.png

ns-3 Generic Battery Model Points in battery discharge curve.

The 3 basic sets of points (each a voltage and capacity pair) that require identification in a datasheet are:

  • V_{full}: The full battery voltage

  • Q: The maximum battery capacity

  • V_{exp}: The voltage at the end of the exponential zone

  • Q_{exp}: The capacity at the end of the exponential zone

  • V_{nom}: The voltage at the end of the nominal zone

  • Q_{nom}: The capacity at the end of the nominal zone

Additionally, it is also necessary to set the values of:

  • R: The battery impedance (The battery internal resistance)

  • i_{typical}: The typical current value used to discharge the battery, this value is used to calculate some of the constants used in the model.

  • cutoffVoltage: Required if we want to inform connected energy consumption models that the battery has reached its discharged point.

  • i: The discharge current used to discharge the battery. This value is provided by the energy consumption model attached to the battery.

The value of R is typically included in datasheets. However, because the variability of R is not modeled in ns-3 (the resistance is fixed), it may be necessary to adjust its value to obtain the desired discharge curves. The value of i_{typical} can be inferred from the discharge curves shown in datasheets. When modeling the behavior of a new battery, it is important to choose values that satisfy more than one curve; trial-and-error adjustments might be necessary to obtain the desired results.

Attributes:

  • FullVoltage: Represents the V_{full} value.

  • MaxCapacity: Represents the Q value.

  • ExponentialVoltage: Represents the V_{exp} value.

  • ExponentialCapacity: Represents the Q_{exp} value.

  • NominalVoltage: Represents the V_{nom} value.

  • NominalCapacity: Represents the Q_{nom} value.

  • InternalResistance: Represents the R value.

  • TypicalDischargeCurrent: Represents the i_{typical} value.

  • CutoffVoltage: The voltage where the battery is considered depleted.

  • BatteryType: Indicates the battery type used.

  • PeriodicEnergyUpdateInterval: Indicates how often the update values are obtained.

  • LowBatteryThreshold: Additional voltage threshold to indicate when the battery has low energy.

The process described above can be simplified by using helpers to install battery presets of previously tested batteries. Helper usage is described in the following sections.

14.1.1.2. RV Battery Model

Attributes:

  • RvBatteryModelPeriodicEnergyUpdateInterval: RV battery model sampling interval.

  • RvBatteryModelOpenCircuitVoltage: RV battery model open circuit voltage.

  • RvBatteryModelCutoffVoltage: RV battery model cutoff voltage.

  • RvBatteryModelAlphaValue: RV battery model alpha value.

  • RvBatteryModelBetaValue: RV battery model beta value.

  • RvBatteryModelNumOfTerms: The number of terms of the infinite sum for estimating battery level.

14.1.1.3. Basic Energy Source

Attributes:

  • BasicEnergySourceInitialEnergyJ: Initial energy stored in basic energy source.

  • BasicEnergySupplyVoltageV: Initial supply voltage for basic energy source.

  • PeriodicEnergyUpdateInterval: Time between two consecutive periodic energy updates.

14.1.2. Energy Consumption Models

A DeviceEnergyModel is the energy consumption model of a device installed on the node. It is designed to be a state-based model where each device is assumed to have a number of states, and each state is associated with a power consumption value. Whenever the state of the device changes, the corresponding DeviceEnergyModel will notify the associated EnergySourceModel of the new current draw of the device. The EnergySourceModel will then calculate the new total current draw and update the remaining energy. A DeviceEnergyModel can also be used for devices that do not have a finite number of states. For example, in an electric vehicle, the current draw of the motor is determined by its speed. Since the vehicle’s speed can take continuous values within a certain range, it is infeasible to define a set of discrete states of operation. However, by converting the speed value into current draw directly, the same set of DeviceEnergyModel APIs can still be used.

14.1.2.1. WiFi Radio Energy Model

The WiFi Radio Energy Model is the energy consumption model of a Wifi net device. It provides a state for each of the available states of the PHY layer: Idle, CcaBusy, Tx, Rx, ChannelSwitch, Sleep, Off. Each of these states is associated with a current draw value in amperes (see below for the corresponding attribute names). A Wifi Radio Energy Model PHY Listener is registered to the Wifi PHY in order to be notified of every Wifi PHY state transition. At every transition, the energy consumed in the previous state is computed and the energy source is notified in order to update its remaining energy.
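The per-transition accounting can be sketched in standalone C++ as follows. The RadioEnergyTracker class is hypothetical, it assumes a constant supply voltage, and it treats any state without a configured current (for example, Off) as drawing no current; none of this is taken from the ns-3 implementation.

```cpp
#include <cassert>
#include <map>

// The PHY states listed in the text.
enum class RadioState
{
    Idle,
    CcaBusy,
    Tx,
    Rx,
    ChannelSwitch,
    Sleep,
    Off
};

// Hypothetical tracker: on every state transition, the energy consumed in the
// previous state is computed as V * I_state * duration and accumulated.
class RadioEnergyTracker
{
  public:
    RadioEnergyTracker(double supplyVoltageV, const std::map<RadioState, double>& stateCurrentA)
        : m_voltageV(supplyVoltageV),
          m_stateCurrentA(stateCurrentA),
          m_state(RadioState::Idle),
          m_lastUpdateS(0.0),
          m_totalJ(0.0)
    {
        assert(supplyVoltageV > 0.0);
    }

    // Called when the PHY moves to newState at simulation time nowS (seconds).
    void ChangeState(RadioState newState, double nowS)
    {
        assert(nowS >= m_lastUpdateS);
        m_totalJ += m_voltageV * CurrentOf(m_state) * (nowS - m_lastUpdateS);
        m_state = newState;
        m_lastUpdateS = nowS;
    }

    double GetTotalEnergyJ() const
    {
        return m_totalJ;
    }

  private:
    // States without a configured current are assumed to draw nothing.
    double CurrentOf(RadioState state) const
    {
        std::map<RadioState, double>::const_iterator it = m_stateCurrentA.find(state);
        return it == m_stateCurrentA.end() ? 0.0 : it->second;
    }

    double m_voltageV;
    std::map<RadioState, double> m_stateCurrentA;
    RadioState m_state;   // state the radio has been in since m_lastUpdateS
    double m_lastUpdateS; // time of the last transition
    double m_totalJ;      // energy consumed so far
};
```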

The Wifi Tx Current Model gives the possibility to compute the current draw in the transmit state as a function of the nominal tx power (in dBm), as observed in several experimental measurements. To this purpose, the Wifi Radio Energy Model PHY Listener is notified of the nominal tx power used to transmit the current frame and passes such a value to the Wifi Tx Current Model which takes care of updating the current draw in the Tx state. Hence, the energy consumption is correctly computed even if the Wifi Remote Station Manager performs per-frame power control. Currently, a Linear Wifi Tx Current Model is implemented which computes the tx current as a linear function (according to parameters that can be specified by the user) of the nominal tx power in dBm.
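As an illustration of a linear tx current computation, the standalone C++ sketch below converts the nominal tx power from dBm to watts and applies I_tx = P_tx / (V · eta) + I_idle. This particular form, and the parameters voltageV, eta (amplifier efficiency) and idleCurrentA, are assumptions made for the example; consult the Linear Wifi Tx Current Model documentation for the parameters it actually exposes.

```cpp
#include <cassert>
#include <cmath>

// Convert a power value from dBm to watts.
double DbmToW(double dbm)
{
    return std::pow(10.0, (dbm - 30.0) / 10.0);
}

// Assumed linear model: current grows linearly with the tx power in watts.
//   I_tx = P_tx / (V * eta) + I_idle
double LinearTxCurrentA(double txPowerDbm, double voltageV, double eta, double idleCurrentA)
{
    assert(voltageV > 0.0 && eta > 0.0);
    return DbmToW(txPowerDbm) / (voltageV * eta) + idleCurrentA;
}
```

With per-frame power control, this function would be evaluated for each transmitted frame so that the Tx state current follows the power actually used.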

The Wifi Radio Energy Model offers the possibility to specify a callback that is invoked when the energy source is depleted. If such a callback is not specified when the Wifi Radio Energy Model Helper is used to install the model on a device, a callback is implicitly made so that the Wifi PHY is put in the OFF mode (hence no frame is transmitted nor received afterwards) when the energy source is depleted. Likewise, it is possible to specify a callback that is invoked when the energy source is recharged (which might occur in case an energy harvester is connected to the energy source). If such a callback is not specified when the Wifi Radio Energy Model Helper is used to install the model on a device, a callback is implicitly made so that the Wifi PHY is resumed from the OFF mode when the energy source is recharged.

Attributes:

  • IdleCurrentA: The default radio Idle current in Ampere.

  • CcaBusyCurrentA: The default radio CCA Busy State current in Ampere.

  • TxCurrentA: The radio Tx current in Ampere.

  • RxCurrentA: The radio Rx current in Ampere.

  • SwitchingCurrentA: The default radio Channel Switch current in Ampere.

  • SleepCurrentA: The radio Sleep current in Ampere.

  • TxCurrentModel: A pointer to the attached tx current model.

14.1.3. Energy Harvesting Models

The energy harvester represents the elements that supply energy from the environment and recharge an energy source to which it is connected. The energy harvester includes the complete implementation of the actual energy harvesting device (e.g., a solar panel) and the environment (e.g., the solar radiation). This means that, in implementing an energy harvester, the energy contribution of the environment and the additional energy requirements of the energy harvesting device, such as the conversion efficiency and the internal power consumption of the device, need to be jointly modeled.
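A minimal standalone C++ sketch of such a joint model is shown below. The function name, its parameters, and the choice to clamp the result at zero (a harvester that cannot drain its source) are assumptions made for illustration only.

```cpp
#include <algorithm>
#include <cassert>

// Net power (in watts) delivered to the energy source: the environmental
// contribution scaled by the conversion efficiency, minus the power the
// harvesting device itself consumes. Clamped at zero by assumption.
double NetHarvestedPowerW(double environmentPowerW,
                          double conversionEfficiency,
                          double internalConsumptionW)
{
    assert(environmentPowerW >= 0.0 && internalConsumptionW >= 0.0);
    assert(conversionEfficiency >= 0.0 && conversionEfficiency <= 1.0);
    const double netW = environmentPowerW * conversionEfficiency - internalConsumptionW;
    return std::max(netW, 0.0);
}
```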

14.2. Usage

The main way that ns-3 users will typically interact with the Energy Framework is through the helper API and through the publicly visible attributes of the framework. The helper API is defined in src/energy/helper/*.h.

In order to use the energy framework, the user must install an Energy Source for the node of interest, the corresponding Device Energy Model for the network devices and, if necessary, one or more Energy Harvesters. Energy Source objects are aggregated onto each node by the Energy Source Helper. In order to allow multiple energy sources per node, we aggregate an Energy Source Container rather than directly aggregating a source object.

The Energy Source object keeps a list of the Device Energy Model and Energy Harvester objects that use the source as their power supply. Device Energy Model objects are installed onto the Energy Source by the Device Energy Model Helper, while Energy Harvester objects are installed by the Energy Harvester Helper. Users can access the Device Energy Model objects through the Energy Source object to obtain energy consumption information of individual devices. Moreover, users can access the Energy Harvester objects in order to gather information regarding the current harvestable power and the total energy harvested by the harvester.

14.2.1. Helpers

Energy Source Helper:

Base helper class for Energy Source objects. This helper aggregates Energy Source objects onto a node. Child implementations of this class create the actual Energy Source object.

Device Energy Model Helper:

Base helper class for Device Energy Model objects. This helper attaches Device Energy Model objects onto Energy Source objects. Child implementations of this class create the actual Device Energy Model object.

Energy Harvesting Helper:

Base helper class for Energy Harvester objects. This helper attaches Energy Harvester objects onto Energy Source objects. Child implementations of this class create the actual Energy Harvester object.

Generic Battery Model Helper:

The GenericBatteryModelHelper can be used to easily install an energy source of one of four chemistries (Li-Ion, Lead Acid, NiCd, NiMH) into a node or node container. Users must use one of the available presets, each of which represents a specific battery.

GenericBatteryModelHelper batteryHelper;
EnergySourceContainer energySourceContainer =
    batteryHelper.Install(nodeContainer, PANASONIC_CGR18650DA_LION);
batteryHelper.SetCellPack(energySourceContainer, 2, 2);

In the previous example, the GenericBatteryModelHelper was used to install a Panasonic CGR18650DA Li-Ion battery. The helper is then used to define a cell pack of 4 batteries: 2 batteries connected in series and 2 connected in parallel (2S, 2P).
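The effect of a series/parallel arrangement on the nominal figures of a pack can be sketched with basic circuit arithmetic. The following standalone C++ snippet is only an illustration of that arithmetic; the types Cell and Pack and the function MakeCellPack are hypothetical and do not reproduce what SetCellPack does internally.

```cpp
#include <cassert>

// Nominal figures of a single cell (values supplied by the caller).
struct Cell
{
    double nominalVoltageV;
    double capacityAh;
};

// Nominal figures of a pack built from identical cells.
struct Pack
{
    int cellCount;
    double nominalVoltageV;
    double capacityAh;
};

// Cells in series add their voltages; strings in parallel add their capacities.
Pack MakeCellPack(const Cell& cell, int series, int parallel)
{
    assert(series >= 1 && parallel >= 1);
    return Pack{series * parallel, series * cell.nominalVoltageV, parallel * cell.capacityAh};
}
```

Under these assumptions, a 2S, 2P pack of 3.6 V cells uses 4 cells, doubles the nominal voltage to 7.2 V and doubles the capacity of a single cell.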

Another option is to manually configure the values that make up a preset:

auto node = CreateObject<Node>();
auto devicesEnergyModel = CreateObject<SimpleDeviceEnergyModel>();
auto batteryModel = CreateObject<GenericBatteryModel>();
batteryModel->SetAttribute("FullVoltage", DoubleValue(1.39));            // Vfull
batteryModel->SetAttribute("MaxCapacity", DoubleValue(7.0));             // Q
batteryModel->SetAttribute("NominalVoltage", DoubleValue(1.18));         // Vnom
batteryModel->SetAttribute("NominalCapacity", DoubleValue(6.25));        // QNom
batteryModel->SetAttribute("ExponentialVoltage", DoubleValue(1.28));     // Vexp
batteryModel->SetAttribute("ExponentialCapacity", DoubleValue(1.3));     // Qexp
batteryModel->SetAttribute("InternalResistance", DoubleValue(0.0046));   // R
batteryModel->SetAttribute("TypicalDischargeCurrent", DoubleValue(1.3)); // i typical
batteryModel->SetAttribute("CutoffVoltage", DoubleValue(1.0));           // End of charge.
batteryModel->SetAttribute("BatteryType", EnumValue(NIMH_NICD));         // General battery type
// Connect the device energy model to the manually configured battery.
devicesEnergyModel->SetEnergySource(batteryModel);
batteryModel->AppendDeviceEnergyModel(devicesEnergyModel);
devicesEnergyModel->SetNode(node);

Both types of configuration are shown in generic-battery-discharge-example.cc. The following table lists the presets available in ns-3:

Preset Name                 Description
--------------------------  -----------------------------------------------
PANASONIC_CGR18650DA_LION   Panasonic Li-Ion (3.6V, 2450mAh, Size A)
PANASONIC_HHR650D_NIMH      Panasonic NiMH HHR650D (1.2V, 6.5Ah, Size D)
CSB_GP1272_LEADACID         CSB Lead Acid GP1272 (12V, 7.2Ah)
PANASONIC_N700AAC_NICD      Panasonic NiCd N-700AAC (1.2V, 700mAh, Size AA)
RSPRO_LGP12100_LEADACID     RS Pro Lead Acid LGP12100 (12V, 100Ah)

14.2.2. Tracing

Traced values differ between Energy Source, Device Energy Model and Energy Harvester implementations. Please look at the specific child class for details.

Basic Energy Source

  • RemainingEnergy: Remaining energy at BasicEnergySource.

RV Battery Model

  • RvBatteryModelBatteryLevel: RV battery model battery level.

  • RvBatteryModelBatteryLifetime: RV battery model battery lifetime.

WiFi Radio Energy Model

  • TotalEnergyConsumption: Total energy consumption of the radio device.

Basic Energy Harvester

  • HarvestedPower: Current power provided by the BasicEnergyHarvester.

  • TotalEnergyHarvested: Total energy harvested by the BasicEnergyHarvester.

14.2.3. Examples

The following examples have been written.

Examples in src/energy/examples:

  • basic-energy-model-test.cc: Demonstrates the use of a Basic energy source with a Wifi radio model.

  • generic-battery-discharge-example.cc: Demonstrates the installation of battery energy sources. The output of this example shows the discharge curve of 5 different batteries.

  • generic-battery-discharge-example.py: A simplified version of the previous example but using python bindings.

  • generic-battery-wifiradio-example.cc: Demonstrates the use and installation of the Generic Battery Model with the WifiRadio model.

  • rv-battery-model-test.cc: Discharge example of a RV energy source model.

Examples in examples/energy:

  • energy-model-example.cc

  • energy-model-with-harvesting-example.cc: Shows the harvesting model usage. Only usable with BasicEnergySource objects.

14.2.4. Tests

The tests written for the energy framework can be found in src/energy/tests/.

14.3. Validation

The RV battery model is validated by comparing results with those presented in the original RV battery model paper. The generic battery model is validated by superimposing the obtained discharge curves on the plots from the manufacturers’ datasheets. The following figures show the results of generic-battery-discharge-example.cc superimposed on the manufacturers’ datasheet charts:

_images/leadacid.png

Lead acid battery discharge curve (CSB GP1272)

_images/liion.png

Li-Ion battery discharge curve (Panasonic CGR18650DA)

_images/nicd.png

NiCd battery discharge curve (Panasonic N-700AAC)

_images/nimh.png

NiMH battery discharge curve (Panasonic HHR650D)

14.3.1. Scope and Limitations

  • In the GenericBatteryModel, charging behavior (voltage as a function of SoC) is included but has not been thoroughly tested. Testing requires the implementation of a harvesting device (a charger) capable of providing the CCCV charging method typically used with batteries.

  • In the GenericBatteryModel, impedance (battery resistance) is constant; battery aging and temperature effects are not considered.

  • The RV battery model has some reported issues (See: issue #164)

  • The harvesting mode can only be used with basic energy sources because it does not consider the current capacity or voltage of the battery.

14.3.2. Future Work

  • Support of device energy models for PHY layers (lr-wpan, WiMax, etc) and other pieces of hardware (UAV, sensors, CPU).

  • Support for realistic battery charging in the GenericBatteryModel.

  • Support for devices capable of charging batteries (e.g. chargers with CCCV capabilities).

  • Implement an energy harvester that recharges the energy sources according to the power levels defined in a user customizable dataset of real measurements.

14.3.3. References

Energy source models and energy consumption models:

[1] H. Wu, S. Nabar and R. Poovendran. An Energy Framework for the Network Simulator 3 (ns-3).

[2] M. Handy and D. Timmermann. Simulation of mobile wireless networks with accurate modelling of non-linear battery effects. In Proc. of Applied simulation and Modeling (ASM), 2003.

[3] D. N. Rakhmatov and S. B. Vrudhula. An analytical high-level battery model for use in energy management of portable electronic systems. In Proc. of IEEE/ACM International Conference on Computer Aided Design (ICCAD’01), pages 488-493, November 2001.

[4] D. N. Rakhmatov, S. B. Vrudhula, and D. A. Wallach. Battery lifetime prediction for energy-aware computing. In Proc. of the 2002 International Symposium on Low Power Electronics and Design (ISLPED’02), pages 154-159, 2002.

[5] Olivier Tremblay and Louis-A. Dessaint. 2009. Experimental Validation of a Battery Dynamic Model for EV Applications. World Electric Vehicle Journal 3, 2 (2009), 289–298. https://doi.org/10.3390/wevj3020289

[6] Olivier Tremblay, Louis-A. Dessaint, and Abdel-Illah Dekkiche. 2007. A Generic Battery Model for the Dynamic Simulation of Hybrid Electric Vehicles. In 2007 IEEE Vehicle Power and Propulsion Conference. 284–289. https://doi.org/10.1109/VPPC.2007.4544139

[7] MathWorks Simulink Generic Battery Model

[8] C. M. Shepherd. 1965. Design of Primary and Secondary Cells: II . An Equation Describing Battery Discharge. Journal of The Electrochemical Society 112, 7 (jul 1965), 657. https://doi.org/10.1149/1.2423659

[9] Alberto Gallegos Ramonet, Alexander Guzman Urbina, and Kazuhiko Kinoshita. 2023. Evaluation and Extension of ns-3 Battery Framework. In Proceedings of the 2023 Workshop on ns-3 (WNS3 ‘23). Association for Computing Machinery, New York, NY, USA, 102–108. https://doi.org/10.1145/3592149.3592156

Energy Harvesting Models:

[10] C. Tapparello, H. Ayatollahi and W. Heinzelman. Extending the Energy Framework for Network Simulator 3 (ns-3). Workshop on ns-3 (WNS3), Poster Session, Atlanta, GA, USA. May, 2014.

[11] C. Tapparello, H. Ayatollahi and W. Heinzelman. Energy Harvesting Framework for Network Simulator 3 (ns-3). 2nd International Workshop on Energy Neutral Sensing Systems (ENSsys), Memphis, TN, USA. November 6, 2014.

15. Flow Monitor

15.1. Model Description

The source code for the module lives in the directory src/flow-monitor.

The goal of the Flow Monitor module is to provide a flexible system to measure the performance of network protocols. The module uses probes, installed in network nodes, to track the packets exchanged by the nodes, and it measures a number of parameters. Packets are divided according to the flow they belong to, where each flow is defined according to the probe’s characteristics (e.g., for IP, a flow is defined as the packets with the same {protocol, source (IP, port), destination (IP, port)} tuple).
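The flow definition used for IP can be sketched as a lookup keyed on the five-tuple. The following standalone C++ snippet is a simplified illustration, not the ns3::FlowClassifier implementation; the names FiveTuple and SimpleClassifier are hypothetical, and numbering flow ids from 1 is an assumption made for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <tuple>

// Five-tuple identifying an IP flow (addresses stored as host-order integers).
struct FiveTuple
{
    uint8_t protocol;
    uint32_t srcAddr;
    uint16_t srcPort;
    uint32_t dstAddr;
    uint16_t dstPort;

    // Lexicographic ordering so the tuple can be used as a std::map key.
    bool operator<(const FiveTuple& o) const
    {
        return std::tie(protocol, srcAddr, srcPort, dstAddr, dstPort) <
               std::tie(o.protocol, o.srcAddr, o.srcPort, o.dstAddr, o.dstPort);
    }
};

// Hypothetical classifier: assigns consecutive flow ids, starting at 1,
// the first time each five-tuple is seen.
class SimpleClassifier
{
  public:
    uint32_t Classify(const FiveTuple& t)
    {
        auto it = m_flows.find(t);
        if (it == m_flows.end())
        {
            it = m_flows.emplace(t, static_cast<uint32_t>(m_flows.size() + 1)).first;
        }
        return it->second;
    }

  private:
    std::map<FiveTuple, uint32_t> m_flows;
};
```

Two packets that differ in any field of the tuple (for example TCP versus UDP between the same endpoints) end up in different flows, while every packet with an identical tuple maps to the same flow id.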

The statistics collected for each flow can be exported in XML format. Moreover, the user can access the probes directly to request specific stats about each flow.

15.1.1. Design

The Flow Monitor module is designed in a modular way. It can be extended by subclassing ns3::FlowProbe and ns3::FlowClassifier. Typically, a subclass of ns3::FlowProbe works by listening to the traces of the appropriate class, and then uses its own ns3::FlowClassifier subclass to classify the packets passing through each node.

Each probe can try to listen to the traces of other classes (e.g., ns3::Ipv4FlowProbe will try to use any ns3::NetDevice trace named TxQueue/Drop), but the user should not rely on this blindly, because the trace is not guaranteed to exist in every type of ns3::NetDevice. As an example, CsmaNetDevice and PointToPointNetDevice have a TxQueue/Drop trace, while WifiNetDevice does not.

The full module design is described in [FlowMonitor].

15.1.2. Scope and Limitations

At the moment, probes and classifiers are available only for IPv4 and IPv6.

IPv4 and IPv6 probes will classify packets at four points:

  • When a packet is sent (SendOutgoing IPv[4,6] traces)

  • When a packet is forwarded (UnicastForward IPv[4,6] traces)

  • When a packet is received (LocalDeliver IPv[4,6] traces)

  • When a packet is dropped (Drop IPv[4,6] traces)

Since the packets are tracked at IP level, any retransmission caused by L4 protocols (e.g., TCP) will be seen by the probe as a new packet.

A Tag will be added to the packet (ns3::Ipv[4,6]FlowProbeTag). The tag carries basic data about the packet, which is useful for its classification.

It must be underlined that only L4 (TCP, UDP) packets are, so far, classified. Moreover, only unicast packets will be classified. These limitations may be removed in the future.

The data collected for each flow are:

  • timeFirstTxPacket: when the first packet in the flow was transmitted;

  • timeLastTxPacket: when the last packet in the flow was transmitted;

  • timeFirstRxPacket: when the first packet in the flow was received by an end node;

  • timeLastRxPacket: when the last packet in the flow was received;

  • delaySum: the sum of all end-to-end delays for all received packets of the flow;

  • jitterSum: the sum of all end-to-end delay jitter (delay variation) values for all received packets of the flow, as defined in RFC 3393;

  • txBytes, txPackets: total number of transmitted bytes / packets for the flow;

  • rxBytes, rxPackets: total number of received bytes / packets for the flow;

  • lostPackets: total number of packets that are assumed to be lost (not reported over 10 seconds);

  • timesForwarded: the number of times a packet has been reportedly forwarded;

  • delayHistogram, jitterHistogram, packetSizeHistogram: histogram versions for the delay, jitter, and packet sizes, respectively;

  • packetsDropped, bytesDropped: the number of lost packets and bytes, divided according to the loss reason code (defined in the probe).

It is worth pointing out that the probes measure the packet bytes including IP headers. The L2 headers are not included in the measure.

These stats will be written in XML form upon request (see the Usage section).

Due to the above design, FlowMonitor cannot generate statistics when used with the DSR routing protocol (because DSR forwards packets using broadcast addresses).

15.1.2.1. The “lost” packets problem

At the end of a simulation, Flow Monitor may report “lost” packets, i.e., packets that Flow Monitor has lost track of.

It is important to keep in mind that Flow Monitor records the packets statistics by intercepting them at a given network level - let’s say at IP level. When the simulation ends, any packet queued for transmission below the IP level will be considered as lost.

It is strongly suggested to consider this point when using Flow Monitor. The user can choose to:

  • Ignore the lost packets (if their number is a statistically irrelevant quantity), or

  • Stop the Applications before the actual Simulation End time, leaving enough time between the two for the queued packets to be processed.

The second method is the suggested one. Usually a few seconds are enough (the exact value depends on the network type).

It is important to stress that “lost” packets could be anywhere in the network, and could count toward the received packets or the dropped ones. Ideally, their number should be zero or a minimal fraction of the other ones, i.e., they should be “statistically irrelevant”.

15.1.3. References

[FlowMonitor] G. Carneiro, P. Fortuna, and M. Ricardo. 2009. FlowMonitor: a network monitoring framework for the network simulator 3 (NS-3). In Proceedings of the Fourth International ICST Conference on Performance Evaluation Methodologies and Tools (VALUETOOLS '09). http://dx.doi.org/10.4108/ICST.VALUETOOLS2009.7493 (Full text: https://dl.acm.org/doi/abs/10.4108/ICST.VALUETOOLS2009.7493)

15.2. Usage

Using the module is simple: the helper takes care of almost everything.

The typical use is:

// Flow monitor
Ptr<FlowMonitor> flowMonitor;
FlowMonitorHelper flowHelper;
flowMonitor = flowHelper.InstallAll();

// Replace yourApplicationsContainer with your own ApplicationContainer.
yourApplicationsContainer.Stop(Seconds(stop_time));
Simulator::Stop(Seconds(stop_time+cleanup_time));
Simulator::Run();

flowMonitor->SerializeToXmlFile("NameOfFile.xml", true, true);

The 2nd and 3rd parameters of the SerializeToXmlFile() function activate or deactivate the histograms and the per-probe detailed stats, respectively. Other possible alternatives can be found in the Doxygen documentation. Here, cleanup_time is the time needed by in-flight packets to reach their destinations.

15.2.1. Helpers

The helper API follows the usage pattern of normal helpers. Through the helper you can install the monitor in the nodes, set the monitor attributes, and print the statistics.

One important point: the ns3::FlowMonitorHelper must be instantiated only once in the main program.

15.2.2. Attributes

The module provides the following attributes in ns3::FlowMonitor:

  • MaxPerHopDelay (Time, default 10s): The maximum per-hop delay that should be considered;

  • StartTime (Time, default 0s): The time when the monitoring starts;

  • DelayBinWidth (double, default 0.001): The width used in the delay histogram;

  • JitterBinWidth (double, default 0.001): The width used in the jitter histogram;

  • PacketSizeBinWidth (double, default 20.0): The width used in the packetSize histogram;

  • FlowInterruptionsBinWidth (double, default 0.25): The width used in the flowInterruptions histogram;

  • FlowInterruptionsMinTime (double, default 0.5): The minimum inter-arrival time that is considered a flow interruption.
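The role of the bin width attributes can be illustrated with a fixed-width histogram. The following standalone C++ sketch is not the ns-3 Histogram class; the class name SimpleHistogram, the rule "bin index = floor(value / width)" and the interpretation of the delay bin width as seconds are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Fixed-width histogram: bin i covers the interval [i * width, (i + 1) * width).
class SimpleHistogram
{
  public:
    explicit SimpleHistogram(double binWidth)
        : m_binWidth(binWidth)
    {
        assert(binWidth > 0.0);
    }

    // Count one sample; the bin vector grows on demand.
    void AddValue(double value)
    {
        assert(value >= 0.0);
        auto index = static_cast<std::size_t>(std::floor(value / m_binWidth));
        if (index >= m_bins.size())
        {
            m_bins.resize(index + 1, 0);
        }
        ++m_bins[index];
    }

    std::size_t GetNBins() const { return m_bins.size(); }

    uint32_t GetBinCount(std::size_t i) const { return m_bins.at(i); }

  private:
    double m_binWidth;
    std::vector<uint32_t> m_bins;
};
```

With a width of 0.001, delays of 0.4 ms would fall in the first bin while delays of 1.6 ms and 1.9 ms would share the second one; a smaller width gives finer resolution at the cost of more bins in the report.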

15.2.3. Output

The main model output is an XML formatted report about flow statistics. An example is:

<?xml version="1.0" ?>
<FlowMonitor>
  <FlowStats>
  <Flow flowId="1" timeFirstTxPacket="+0.0ns" timeFirstRxPacket="+20067198.0ns" timeLastTxPacket="+2235764408.0ns" timeLastRxPacket="+2255831606.0ns" delaySum="+138731526300.0ns" jitterSum="+1849692150.0ns" lastDelay="+20067198.0ns" txBytes="2149400" rxBytes="2149400" txPackets="3735" rxPackets="3735" lostPackets="0" timesForwarded="7466">
  </Flow>
  </FlowStats>
  <Ipv4FlowClassifier>
  <Flow flowId="1" sourceAddress="10.1.3.1" destinationAddress="10.1.2.2" protocol="6" sourcePort="49153" destinationPort="50000" />
  </Ipv4FlowClassifier>
  <Ipv6FlowClassifier>
  </Ipv6FlowClassifier>
  <FlowProbes>
  <FlowProbe index="0">
    <FlowStats  flowId="1" packets="3735" bytes="2149400" delayFromFirstProbeSum="+0.0ns" >
    </FlowStats>
  </FlowProbe>
  <FlowProbe index="2">
    <FlowStats  flowId="1" packets="7466" bytes="2224020" delayFromFirstProbeSum="+199415389258.0ns" >
    </FlowStats>
  </FlowProbe>
  <FlowProbe index="4">
    <FlowStats  flowId="1" packets="3735" bytes="2149400" delayFromFirstProbeSum="+138731526300.0ns" >
    </FlowStats>
  </FlowProbe>
  </FlowProbes>
</FlowMonitor>

The output was generated by a TCP flow from 10.1.3.1 to 10.1.2.2.

It is worth noticing that the index 2 probe is reporting more packets and more bytes than the other probes. That’s a perfectly normal behaviour, as packets are fragmented at IP level in that node.

It should also be observed that the receiving node’s probe (index 4) doesn’t count the fragments, as the reassembly is done before the probing point.
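The per-flow fields of the report are raw sums and counters, so the usual performance metrics have to be derived from them. The following standalone C++ sketch shows one possible way to do so; the FlowRecord structure and the function names are hypothetical, and the choice of measuring throughput between the first transmission and the last reception is only one of several reasonable conventions.

```cpp
#include <cassert>
#include <cstdint>

// Subset of the per-flow fields found in the XML report (times in nanoseconds).
struct FlowRecord
{
    double timeFirstTxNs;
    double timeLastRxNs;
    double delaySumNs;
    uint64_t rxBytes;
    uint32_t txPackets;
    uint32_t rxPackets;
};

// Mean end-to-end delay in milliseconds (0 if nothing was received).
double MeanDelayMs(const FlowRecord& f)
{
    return f.rxPackets > 0 ? f.delaySumNs / f.rxPackets / 1e6 : 0.0;
}

// Average received throughput in Mbit/s over the interval [first tx, last rx].
double ThroughputMbps(const FlowRecord& f)
{
    const double durationS = (f.timeLastRxNs - f.timeFirstTxNs) / 1e9;
    return durationS > 0.0 ? f.rxBytes * 8.0 / durationS / 1e6 : 0.0;
}

// Fraction of transmitted packets that were not (or not yet) received.
double LossRatio(const FlowRecord& f)
{
    return f.txPackets > 0 ? 1.0 - static_cast<double>(f.rxPackets) / f.txPackets : 0.0;
}
```

Applied to the flow in the example report (delaySum of 138731526300 ns over 3735 received packets, 2149400 bytes received by 2255831606 ns), these functions give a mean delay of roughly 37.1 ms, a throughput of roughly 7.6 Mbit/s and a loss ratio of zero.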

15.2.4. Examples

The examples are located in src/flow-monitor/examples.

Moreover, the following examples use the flow-monitor module:

  • examples/matrix-topology/matrix-topology.cc

  • examples/routing/manet-routing-compare.cc

  • examples/routing/simple-global-routing.cc

  • examples/tcp/tcp-variants-comparison.cc

  • examples/wireless/wifi-multirate.cc

  • examples/wireless/wifi-hidden-terminal.cc

15.2.5. Troubleshooting

Do not define more than one ns3::FlowMonitorHelper in the simulation.

15.3. Validation

The paper in the references contains a full description of the module validation against a test network.

Tests are provided to verify that the Histogram class works correctly.

16. Internet Models (IP, TCP, Routing, UDP)

16.1. Internet Stack

16.1.1. Internet stack aggregation

A bare class Node is not very useful as-is; other objects must be aggregated to it to provide useful node functionality.

The ns-3 source code directory src/internet provides implementation of TCP/IPv4- and IPv6-related components. These include IPv4, ARP, UDP, TCP, IPv6, Neighbor Discovery, and other related protocols.

Internet Nodes are not subclasses of class Node; they are simply Nodes that have had a bunch of IP-related objects aggregated to them. They can be put together by hand, or via a helper function InternetStackHelper::Install () which does the following to all nodes passed in as arguments:

void
InternetStackHelper::Install(Ptr<Node> node) const
{
  if (m_ipv4Enabled)
    {
      /* IPv4 stack */
      if (node->GetObject<Ipv4>() != 0)
        {
          NS_FATAL_ERROR("InternetStackHelper::Install(): Aggregating "
                         "an InternetStack to a node with an existing Ipv4 object");
          return;
        }

      CreateAndAggregateObjectFromTypeId(node, "ns3::ArpL3Protocol");
      CreateAndAggregateObjectFromTypeId(node, "ns3::Ipv4L3Protocol");
      CreateAndAggregateObjectFromTypeId(node, "ns3::Icmpv4L4Protocol");
      // Set routing
      Ptr<Ipv4> ipv4 = node->GetObject<Ipv4>();
      Ptr<Ipv4RoutingProtocol> ipv4Routing = m_routing->Create(node);
      ipv4->SetRoutingProtocol(ipv4Routing);
    }

  if (m_ipv6Enabled)
    {
      /* IPv6 stack */
      if (node->GetObject<Ipv6>() != 0)
        {
          NS_FATAL_ERROR("InternetStackHelper::Install(): Aggregating "
                         "an InternetStack to a node with an existing Ipv6 object");
          return;
        }

      CreateAndAggregateObjectFromTypeId(node, "ns3::Ipv6L3Protocol");
      CreateAndAggregateObjectFromTypeId(node, "ns3::Icmpv6L4Protocol");
      // Set routing
      Ptr<Ipv6> ipv6 = node->GetObject<Ipv6>();
      Ptr<Ipv6RoutingProtocol> ipv6Routing = m_routingv6->Create(node);
      ipv6->SetRoutingProtocol(ipv6Routing);

      /* register IPv6 extensions and options */
      ipv6->RegisterExtensions();
      ipv6->RegisterOptions();
    }

  if (m_ipv4Enabled || m_ipv6Enabled)
    {
      /* UDP and TCP stacks */
      CreateAndAggregateObjectFromTypeId(node, "ns3::UdpL4Protocol");
      node->AggregateObject(m_tcpFactory.Create<Object>());
      Ptr<PacketSocketFactory> factory = CreateObject<PacketSocketFactory>();
      node->AggregateObject(factory);
    }
}

Where multiple implementations exist in ns-3 (TCP, IP routing), these objects are added by a factory object (TCP) or by a routing helper (m_routing).

Note that the routing protocol is configured and set outside this function. By default, the following protocols are added:

void InternetStackHelper::Initialize()
{
  SetTcp("ns3::TcpL4Protocol");
  Ipv4StaticRoutingHelper staticRouting;
  Ipv4GlobalRoutingHelper globalRouting;
  Ipv4ListRoutingHelper listRouting;
  Ipv6ListRoutingHelper listRoutingv6;
  Ipv6StaticRoutingHelper staticRoutingv6;
  listRouting.Add(staticRouting, 0);
  listRouting.Add(globalRouting, -10);
  listRoutingv6.Add(staticRoutingv6, 0);
  SetRoutingHelper(listRouting);
  SetRoutingHelper(listRoutingv6);
}

By default, IPv4 and IPv6 are enabled.

16.1.1.1. Internet Node structure

An IP-capable Node (an ns-3 Node augmented by aggregation to have one or more IP stacks) has the following internal structure.

16.1.1.1.1. Layer-3 protocols

At the lowest layer, sitting above the NetDevices, are the “layer 3” protocols, including IPv4, IPv6, ARP and so on. The class Ipv4L3Protocol is an implementation class whose public interface is typically class Ipv4, but the Ipv4L3Protocol public API is also used internally at present.

In class Ipv4L3Protocol, one method described below is Receive ():

/**
  * Lower layer calls this method after calling L3Demux::Lookup
  * The ARP subclass needs to know from which NetDevice this
  * packet is coming to:
  *    - implement a per-NetDevice ARP cache
  *    - send back arp replies on the right device
  */
void Receive(Ptr<NetDevice> device, Ptr<const Packet> p, uint16_t protocol,
             const Address& from, const Address& to, NetDevice::PacketType packetType);

First, note that the Receive () function has a signature matching the ReceiveCallback in the class Node. This function pointer is inserted into the Node’s protocol handler when AddInterface () is called. The actual registration is done with a statement such as the following:

RegisterProtocolHandler(MakeCallback(&Ipv4L3Protocol::Receive, ipv4),
                         Ipv4L3Protocol::PROT_NUMBER, 0);

The Ipv4L3Protocol object is aggregated to the Node; there is only one such Ipv4L3Protocol object. Higher-layer protocols that have a packet to send down to the Ipv4L3Protocol object can call GetObject<Ipv4L3Protocol>() to obtain a pointer, as follows:

Ptr<Ipv4L3Protocol> ipv4 = m_node->GetObject<Ipv4L3Protocol>();
if (ipv4 != 0)
  {
    ipv4->Send(packet, saddr, daddr, PROT_NUMBER);
  }

This class nicely demonstrates two techniques we exploit in ns-3 to bind objects together: callbacks, and object aggregation.

Once IPv4 routing has determined that a packet is for the local node, it forwards it up the stack. This is done with the following function:

void
Ipv4L3Protocol::LocalDeliver(Ptr<const Packet> packet, Ipv4Header const&ip, uint32_t iif)

The first step is to find the right Ipv4L4Protocol object, based on the IP protocol number. For instance, TCP is registered in the demux as protocol number 6. Finally, the Receive() function on the Ipv4L4Protocol (such as TcpL4Protocol::Receive) is called.

We have not yet introduced the class Ipv4Interface. Basically, each NetDevice is paired with an IPv4 representation of such device. In Linux, this class Ipv4Interface roughly corresponds to the struct in_device; the main purpose is to provide address-family specific information (addresses) about an interface.

All the classes have appropriate traces in order to track sent, received and lost packets. Users are encouraged to use them to find out if (and where) a packet is dropped. A common mistake is to forget the effects of local queues when sending packets, e.g., the ARP queue. This can be particularly puzzling when sending jumbo packets or packet bursts using UDP. The ARP cache pending queue is limited (3 datagrams) and IP packets might be fragmented, easily overfilling the ARP cache queue. In those cases it is useful to increase the ARP cache pending queue size to a proper value, e.g.:

Config::SetDefault("ns3::ArpCache::PendingQueueSize", UintegerValue(MAX_BURST_SIZE/L2MTU*3));
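To estimate a suitable queue size, it helps to count how many IP fragments a burst produces. The following standalone C++ sketch performs that arithmetic under stated assumptions: a 20-byte IPv4 header without options, and one pending-queue entry per fragment. The function names are hypothetical and are not part of the ns-3 API.

```cpp
#include <cassert>
#include <cstdint>

// Number of IPv4 fragments needed to carry ipPayloadBytes (the L4 header plus data)
// over a link with the given MTU, assuming a 20-byte IPv4 header without options.
uint32_t Ipv4FragmentCount(uint32_t ipPayloadBytes, uint32_t mtu)
{
    const uint32_t ipHeaderBytes = 20;
    assert(mtu >= ipHeaderBytes + 8);
    if (ipPayloadBytes + ipHeaderBytes <= mtu)
    {
        return 1; // fits in a single packet, no fragmentation
    }
    // Every fragment except the last must carry a multiple of 8 payload bytes.
    const uint32_t perFragment = ((mtu - ipHeaderBytes) / 8) * 8;
    return (ipPayloadBytes + perFragment - 1) / perFragment;
}

// Queue entries needed to hold a whole burst while address resolution is pending,
// assuming each fragment occupies one entry.
uint32_t PendingEntriesForBurst(uint32_t datagrams, uint32_t ipPayloadBytes, uint32_t mtu)
{
    return datagrams * Ipv4FragmentCount(ipPayloadBytes, mtu);
}
```

For example, a maximum-size UDP datagram (65515 bytes of IP payload, UDP header included) sent over a 1500-byte MTU splits into 45 fragments, which is far more than the default limit of 3 pending entries mentioned above.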

The IPv6 implementation follows a similar architecture. Dual-stacked nodes (one with support for both IPv4 and IPv6) will allow an IPv6 socket to receive IPv4 connections as a standard dual-stacked system does. A socket bound and listening to an IPv6 endpoint can receive an IPv4 connection and will return the remote address as an IPv4-mapped address. Support for the IPV6_V6ONLY socket option does not currently exist.
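The IPv4-mapped address format mentioned above is defined in RFC 4291: 80 zero bits, 16 one bits, then the 32-bit IPv4 address (written ::ffff:a.b.c.d). The following standalone C++ sketch builds and recognizes such addresses on raw byte arrays; it is an illustration of the format only, and the function names are hypothetical rather than part of the ns-3 Ipv6Address API.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Build the IPv4-mapped IPv6 address ::ffff:a.b.c.d from an IPv4 address
// given in host byte order (e.g. 0x0A010101 for 10.1.1.1).
std::array<uint8_t, 16> MakeIpv4Mapped(uint32_t ipv4)
{
    std::array<uint8_t, 16> addr{}; // the first 80 bits stay zero
    addr[10] = 0xff;
    addr[11] = 0xff;
    addr[12] = static_cast<uint8_t>(ipv4 >> 24);
    addr[13] = static_cast<uint8_t>(ipv4 >> 16);
    addr[14] = static_cast<uint8_t>(ipv4 >> 8);
    addr[15] = static_cast<uint8_t>(ipv4);
    return addr;
}

// True if the address carries the ::ffff:0:0/96 prefix.
bool IsIpv4Mapped(const std::array<uint8_t, 16>& addr)
{
    for (std::size_t i = 0; i < 10; ++i)
    {
        if (addr[i] != 0)
        {
            return false;
        }
    }
    return addr[10] == 0xff && addr[11] == 0xff;
}
```

An application receiving a connection on an IPv6 socket could use a check of this kind on the remote address to tell whether the peer actually connected over IPv4.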

16.1.1.1.2. Layer-4 protocols and sockets

We next describe how the transport protocols, sockets, and applications tie together. In summary, each transport protocol implementation is a socket factory. An application that needs a new socket queries the node for the corresponding socket factory and asks it to create one.

For instance, to create a UDP socket, an application would use a code snippet such as the following:

Ptr<Udp> udpSocketFactory = GetNode()->GetObject<Udp>();
Ptr<Socket> m_socket = udpSocketFactory->CreateSocket();
m_socket->Bind(m_local_address);
...

The above will query the node to get a pointer to its UDP socket factory, will create one such socket, and will use the socket with an API similar to the C-based sockets API, such as Connect() and Send(). The address passed to the Bind(), Connect(), or Send() functions may be an Ipv4Address, Ipv6Address, or Address. If an Address is passed in and contains anything other than an Ipv4Address or Ipv6Address, these functions will return an error. The Bind() and Bind6() functions bind to “0.0.0.0” and “::” respectively.

The socket can also be bound to a specific NetDevice through the BindToNetDevice(Ptr<NetDevice> netdevice) function. BindToNetDevice(Ptr<NetDevice> netdevice) will bind the socket to “0.0.0.0” and “::” (equivalent to calling Bind() and Bind6()), unless the socket has already been bound to a specific address. In summary, the correct sequence is:

Ptr<Udp> udpSocketFactory = GetNode()->GetObject<Udp>();
Ptr<Socket> m_socket = udpSocketFactory->CreateSocket();
m_socket->BindToNetDevice(n_netDevice);
...

or:

Ptr<Udp> udpSocketFactory = GetNode()->GetObject<Udp>();
Ptr<Socket> m_socket = udpSocketFactory->CreateSocket();
m_socket->Bind(m_local_address);
m_socket->BindToNetDevice(n_netDevice);
...

The following raises an error:

Ptr<Udp> udpSocketFactory = GetNode()->GetObject<Udp>();
Ptr<Socket> m_socket = udpSocketFactory->CreateSocket();
m_socket->BindToNetDevice(n_netDevice);
m_socket->Bind(m_local_address);
...

See the chapter on ns-3 sockets for more information.

So far we have described a socket factory (e.g. class Udp) and a socket, which may be specialized (e.g., class UdpSocket). There are a few more key objects that relate to the specialized task of demultiplexing a packet to one or more receiving sockets. The key object in this task is class Ipv4EndPointDemux. This demultiplexer stores objects of class Ipv4EndPoint. This class holds the addressing/port tuple (local port, local address, destination port, destination address) associated with the socket, and a receive callback. This receive callback has a receive function registered by the socket. The Lookup() function of Ipv4EndPointDemux returns a list of Ipv4EndPoint objects (there may be a list since more than one socket may match the packet). The layer-4 protocol copies the packet to each Ipv4EndPoint and calls its ForwardUp() method, which then calls the Receive() function registered by the socket.

An issue that arises when working with the sockets API on real systems is the need to manage the reading from a socket, using some type of I/O (e.g., blocking, non-blocking, asynchronous, …). ns-3 implements an asynchronous model for socket I/O; the application sets a callback to be notified of received data ready to be read, and the callback is invoked by the transport protocol when data is available. This callback is specified as follows:

void Socket::SetRecvCallback(Callback<void, Ptr<Socket>,
                             Ptr<Packet>,
                             const Address&> receivedData);

The data being received is conveyed in the Packet data buffer. An example usage is in class PacketSink:

m_socket->SetRecvCallback(MakeCallback(&PacketSink::HandleRead, this));
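The notification pattern behind this API can be sketched without any ns-3 dependency. In the following standalone C++ snippet, ToySocket is a hypothetical stand-in (not the ns-3 Socket class) that uses std::function in place of the ns-3 Callback template: the transport side queues data and invokes the callback, and the application drains the queue from inside it.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <string>
#include <utility>

// Toy stand-in for a socket: data is queued by the "transport" side and the
// application is notified through a callback instead of blocking on a read.
class ToySocket
{
  public:
    using RecvCallback = std::function<void(ToySocket&)>;

    // Register the function to call whenever new data is ready to be read.
    void SetRecvCallback(RecvCallback cb) { m_onRecv = std::move(cb); }

    // Called by the transport layer when data arrives.
    void Deliver(std::string data)
    {
        m_rxQueue.push(std::move(data));
        if (m_onRecv)
        {
            m_onRecv(*this);
        }
    }

    // Called by the application, typically from inside the callback.
    // Returns false when no data is pending.
    bool Recv(std::string& out)
    {
        if (m_rxQueue.empty())
        {
            return false;
        }
        out = std::move(m_rxQueue.front());
        m_rxQueue.pop();
        return true;
    }

  private:
    RecvCallback m_onRecv;
    std::queue<std::string> m_rxQueue;
};
```

A handler in the style of PacketSink::HandleRead would loop on Recv() until it returns false, so that every chunk queued since the last notification is consumed.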

To summarize, internally, the UDP implementation is organized as follows:

  • a UdpImpl class that implements the UDP socket factory functionality

  • a UdpL4Protocol class that implements the protocol logic that is socket-independent

  • a UdpSocketImpl class that implements socket-specific aspects of UDP

  • a class called Ipv4EndPoint that stores the addressing tuple (local port, local address, destination port, destination address) associated with the socket, and a receive callback for the socket.

16.1.1.2. IP-capable node interfaces

Many of the implementation details, or internal objects themselves, of IP-capable Node objects are not exposed at the simulator public API. This allows for different implementations; for instance, replacing the native ns-3 models with ported TCP/IP stack code.

The C++ public APIs of all of these objects are found in the src/network directory, including principally:

  • address.h

  • socket.h

  • node.h

  • packet.h

These are typically base class objects that implement the default values used in the implementation, implement access methods to get/set state variables, host attributes, and implement publicly-available methods exposed to clients such as CreateSocket.

16.1.1.3. Example path of a packet

These two figures show an example stack trace of how packets flow through the Internet Node objects.

_images/internet-node-send.png

Send path of a packet.

_images/internet-node-recv.png

Receive path of a packet.

16.2. IPv4

This chapter describes the ns-3 IPv4 address assignment and basic components tracking.

16.2.1. IPv4 addresses assignment

In order to use IPv4 on a network, the first thing to do is to assign IPv4 addresses.

Any IPv4-enabled ns-3 node will have at least one NetDevice: the ns3::LoopbackNetDevice. The loopback device address is 127.0.0.1. All the other NetDevices will have one (or more) IPv4 addresses.

Note that, as of today, ns-3 does not have a NAT module, and it does not follow the rules about filtering private addresses (RFC 1918): 10.0.0.0/8, 172.16.0.0/12, and 192.168.0.0/16. These addresses are routed like any other address. This behaviour could change in the future.

IPv4 global addresses can be:

  • manually assigned

  • assigned through DHCP

ns-3 can use both methods, and it’s quite important to understand the implications of both.

16.2.1.1. Manually assigned IPv4 addresses

This is probably the easiest and most used method. As an example:

Ptr<Node> n0 = CreateObject<Node>();
Ptr<Node> n1 = CreateObject<Node>();
NodeContainer net(n0, n1);
CsmaHelper csma;
NetDeviceContainer ndc = csma.Install(net);

// The IPv4 stack must be installed on the nodes before addresses can be assigned.
InternetStackHelper internet;
internet.Install(net);

NS_LOG_INFO("Assign IPv4 Addresses.");
Ipv4AddressHelper ipv4;
ipv4.SetBase(Ipv4Address("192.168.1.0"), Ipv4Mask("/24"));
Ipv4InterfaceContainer ic = ipv4.Assign(ndc);

This method will add two global IPv4 addresses to the nodes.

Note that the addresses are assigned in sequence. As a consequence, the first Node / NetDevice will have “192.168.1.1”, the second “192.168.1.2” and so on.
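The sequential numbering can be illustrated without ns-3. The following standalone sketch is not the Ipv4AddressHelper implementation; the function names ParseIpv4, FormatIpv4, and NthHostAddress are hypothetical and exist only for this illustration. It assumes that the n-th device simply receives the network address with host part n.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical helper: parse dotted-quad text into a host-order 32-bit value.
uint32_t ParseIpv4(const std::string& text)
{
    std::istringstream in(text);
    uint32_t value = 0;
    for (int i = 0; i < 4; ++i)
    {
        uint32_t octet = 0;
        char dot = 0;
        in >> octet;
        if (i < 3)
        {
            in >> dot; // consume the '.' separator
        }
        value = (value << 8) | (octet & 0xffu);
    }
    return value;
}

// Hypothetical helper: format a host-order 32-bit value as dotted-quad text.
std::string FormatIpv4(uint32_t value)
{
    std::ostringstream out;
    out << ((value >> 24) & 0xffu) << '.' << ((value >> 16) & 0xffu) << '.'
        << ((value >> 8) & 0xffu) << '.' << (value & 0xffu);
    return out.str();
}

// Address given to the n-th device (n starts at 1) on the given network.
std::string NthHostAddress(const std::string& network, unsigned prefixLength, uint32_t n)
{
    uint32_t mask = prefixLength == 0 ? 0u : ~uint32_t{0} << (32 - prefixLength);
    return FormatIpv4((ParseIpv4(network) & mask) | (n & ~mask));
}
```

With the base used in the example above, NthHostAddress("192.168.1.0", 24, 1) yields the address of the first device and NthHostAddress("192.168.1.0", 24, 2) that of the second.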

It is possible to repeat the above to assign more than one address to a node. However, due to the Ipv4AddressHelper singleton nature, one should first assign all the addresses of a network, then change the network base (SetBase), then do a new assignment.

Alternatively, it is possible to assign a specific address to a node:

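Ptr<Node> n0 = CreateObject<Node>();
NodeContainer net(n0);
CsmaHelper csma;
NetDeviceContainer ndc = csma.Install(net);

// The IPv4 stack must be installed before the Ipv4 object can be retrieved.
InternetStackHelper internet;
internet.Install(net);

NS_LOG_INFO("Specifically Assign an IPv4 Address.");
Ptr<NetDevice> device = ndc.Get(0);
Ptr<Node> node = device->GetNode();
Ptr<Ipv4> ipv4proto = node->GetObject<Ipv4>();
// GetInterfaceForDevice() returns -1 when the device has no IPv4 interface yet.
int32_t ifIndex = ipv4proto->GetInterfaceForDevice(device);
if (ifIndex == -1)
{
    ifIndex = ipv4proto->AddInterface(device);
}
Ipv4InterfaceAddress ipv4Addr = Ipv4InterfaceAddress(Ipv4Address("192.168.1.42"), Ipv4Mask("/24"));
ipv4proto->AddAddress(ifIndex, ipv4Addr);
ipv4proto->SetUp(ifIndex); // bring the interface up so that it can send and receive

16.2.1.2. DHCP assigned IPv4 addresses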

DHCP is available in the internet-apps module. In order to use DHCP, you need a DhcpServer application in one node (the DHCP server node) and a DhcpClient application in each of the client nodes. Note that it is not necessary for all the nodes in a subnet to use DHCP. Some nodes can have static addresses.

All the DHCP setup is performed through the DhcpHelper class. A complete example is in src/internet-apps/examples/dhcp-example.cc.

Further info about the DHCP functionalities can be found in the internet-apps model documentation.

16.2.2. Tracing in the IPv4 Stack

The internet stack provides a number of trace sources in its various protocol implementations. These trace sources can be hooked using your own custom trace code, or you can use our helper functions in some cases to arrange for tracing to be enabled.

16.2.2.1. Tracing in ARP

ARP provides two trace hooks, one in the cache, and one in the layer three protocol. The trace accessor in the cache is given the name “Drop.” When a packet is transmitted over an interface that requires ARP, it is first queued for transmission in the ARP cache until the required MAC address is resolved. A number of retries may be made while trying to get the address, and if the maximum retry count is exceeded, the packet in question is dropped by ARP. The single trace hook in the ARP cache fires in the following case:

  • If an outbound packet is placed in the ARP cache pending address resolution and no resolution can be made within the maximum retry count, the outbound packet is dropped and this trace is fired.

A second trace hook lives in the ARP L3 protocol (also named “Drop”) and may be called for a number of reasons.

  • If an ARP reply is received for an entry that is not waiting for a reply, the ARP reply packet is dropped and this trace is fired;

  • If an ARP reply is received for a non-existent entry, the ARP reply packet is dropped and this trace is fired;

  • If an ARP cache entry is in the DEAD state (has timed out) and an ARP reply packet is received, the reply packet is dropped and this trace is fired.

  • Each ARP cache entry has a queue of pending packets. If the size of the queue is exceeded, the outbound packet is dropped and this trace is fired.

16.2.2.2. Tracing in IPv4

The IPv4 layer three protocol provides three trace hooks. These are the “Tx” (ns3::Ipv4L3Protocol::m_txTrace), “Rx” (ns3::Ipv4L3Protocol::m_rxTrace) and “Drop” (ns3::Ipv4L3Protocol::m_dropTrace) trace sources.

The “Tx” trace is fired in a number of situations, all of which indicate that a given packet is about to be sent down to a given ns3::Ipv4Interface.

  • In the case of a packet destined for the broadcast address, the Ipv4InterfaceList is iterated and for every interface that is up and can fragment the packet or has a large enough MTU to transmit the packet, the trace is hit. See ns3::Ipv4L3Protocol::Send.

  • In the case of a packet that needs routing, the “Tx” trace may be fired just before a packet is sent to the interface appropriate to the default gateway. See ns3::Ipv4L3Protocol::SendRealOut.

  • Also in the case of a packet that needs routing, the “Tx” trace may be fired just before a packet is sent to the outgoing interface appropriate to the discovered route. See ns3::Ipv4L3Protocol::SendRealOut.

The “Rx” trace is fired when a packet is passed from the device up to the ns3::Ipv4L3Protocol::Receive function.

  • In the receive function, the Ipv4InterfaceList is iterated, and if the Ipv4Interface corresponding to the receiving device is found to be in the UP state, the trace is fired.

The “Drop” trace is fired in any case where the packet is dropped (in both the transmit and receive paths).

  • In the ns3::Ipv4Interface::Receive function, the packet is dropped and the drop trace is hit if the interface corresponding to the receiving device is in the DOWN state.

  • Also in the ns3::Ipv4Interface::Receive function, the packet is dropped and the drop trace is hit if the checksum is found to be bad.

  • In ns3::Ipv4L3Protocol::Send, an outgoing packet bound for the broadcast address is dropped and the “Drop” trace is fired if the “don’t fragment” bit is set and fragmentation is available and required.

  • Also in ns3::Ipv4L3Protocol::Send, an outgoing packet destined for the broadcast address is dropped and the “Drop” trace is hit if fragmentation is not available and is required (MTU < packet size).

  • In the case of a broadcast address, an outgoing packet is cloned for each outgoing interface. If any of the interfaces is in the DOWN state, the “Drop” trace event fires with a reference to the copied packet.

  • In the case of a packet requiring a route, an outgoing packet is dropped and the “Drop” trace event fires if no route to the remote host is found.

  • In ns3::Ipv4L3Protocol::SendRealOut, an outgoing packet being routed is dropped and the “Drop” trace is fired if the “don’t fragment” bit is set and fragmentation is available and required.

  • Also in ns3::Ipv4L3Protocol::SendRealOut, an outgoing packet being routed is dropped and the “Drop” trace is hit if fragmentation is not available and is required (MTU < packet size).

  • An outgoing packet being routed is dropped and the “Drop” trace event fires if the required Ipv4Interface is in the DOWN state.

  • If a packet is being forwarded, and the TTL is exceeded (see ns3::Ipv4L3Protocol::DoForward), the packet is dropped and the “Drop” trace event is fired.

16.2.3. Explicit Congestion Notification (ECN) bits

  • In IPv4, ECN bits are the last 2 bits of the TOS field and occupy the 14th and 15th bits of the header.

  • The IPv4 header class defines an EcnType enum with all four ECN codepoints (ECN_NotECT, ECN_ECT1, ECN_ECT0, ECN_CE) mentioned in RFC 3168, and also a setter and getter method to handle ECN values in the TOS field.
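As an illustration of the bit layout, the following standalone sketch reads and writes the ECN field of a TOS byte. It is not the ns-3 Ipv4Header code: GetEcn and SetEcn are hypothetical free functions, and the numeric codepoint values are those of RFC 3168.

```cpp
#include <cstdint>

// Codepoint values follow RFC 3168; the enumerator names mirror those quoted above.
enum EcnType : uint8_t
{
    ECN_NotECT = 0x00,
    ECN_ECT1 = 0x01,
    ECN_ECT0 = 0x02,
    ECN_CE = 0x03
};

// The ECN field is the two least significant bits of the TOS byte.
EcnType GetEcn(uint8_t tos)
{
    return static_cast<EcnType>(tos & 0x03);
}

// Replace the ECN bits while preserving the upper six bits of the TOS byte.
uint8_t SetEcn(uint8_t tos, EcnType ecn)
{
    return static_cast<uint8_t>((tos & 0xfc) | (ecn & 0x03));
}
```

Masking with 0x03 isolates the ECN field, and masking with 0xfc keeps the rest of the TOS byte untouched when a new codepoint is written.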

16.2.4. Ipv4QueueDiscItem

The traffic control sublayer in ns-3 handles objects of class QueueDiscItem which are used to hold an ns3::Packet and an ns3::Header. This is done to facilitate the marking of packets for Explicit Congestion Notification. The Mark () method is implemented in Ipv4QueueDiscItem. It returns true if marking the packet is successful, i.e., it successfully sets the CE bit in the IPv4 header. The Mark () method will return false, however, if the IPv4 header indicates the ECN_NotECT codepoint.
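The marking rule can be sketched in a few lines of standalone C++. This is only an illustration of the behaviour described above, not the Ipv4QueueDiscItem source: SketchHeader is a hypothetical stand-in for the IPv4 header, and the free function Mark is an assumption of this sketch.

```cpp
#include <cstdint>

// Hypothetical stand-in for an IP header; only the TOS byte matters here.
struct SketchHeader
{
    uint8_t tos;
};

// Returns true and sets the CE codepoint when the header carries ECT(0), ECT(1), or CE.
// Returns false and leaves the header untouched when it carries Not-ECT.
bool Mark(SketchHeader& header)
{
    const uint8_t ecnMask = 0x03; // the two least significant bits of the TOS byte
    if ((header.tos & ecnMask) == 0x00)
    {
        return false; // Not-ECT: the packet cannot be marked
    }
    header.tos |= ecnMask; // both bits set is the CE codepoint
    return true;
}
```

A queue discipline that receives false from such a call would typically fall back to another action, such as dropping the packet; that choice is outside the scope of this sketch.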

16.2.5. RFC 6621 duplicate packet detection

To support mesh network protocols over broadcast-capable networks (e.g. Wi-Fi), it is useful to have support for duplicate packet detection and filtering, since nodes in a network may receive multiple copies of flooded multicast packets arriving on different paths. The Ipv4L3Protocol model in ns-3 has a model for hash-based duplicate packet detection (DPD) based on Section 6.2.2 of (RFC 6621). The model, disabled by default, must be enabled by setting EnableRFC6621 to true. A second attribute, DuplicateExpire, sets the expiration delay for erasing the cache entry of a packet in the duplicate cache; the delay value defaults to 1ms.
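The idea of a hash-keyed duplicate cache with expiring entries can be sketched as follows. This standalone example is an assumption-laden illustration, not the Ipv4L3Protocol implementation: the class DuplicateCache, its method IsDuplicate, and the use of plain integer microsecond timestamps are all hypothetical, and the computation of the packet hash itself is left out.

```cpp
#include <cstdint>
#include <unordered_map>

// Hypothetical duplicate cache: maps a packet hash to the time (in microseconds)
// at which its entry expires.
class DuplicateCache
{
  public:
    explicit DuplicateCache(uint64_t expireUs)
        : m_expireUs(expireUs)
    {
    }

    // Returns true if the hash was already seen and its entry has not expired.
    // Otherwise records (or refreshes) the hash and returns false.
    bool IsDuplicate(uint64_t packetHash, uint64_t nowUs)
    {
        auto it = m_entries.find(packetHash);
        if (it != m_entries.end() && nowUs < it->second)
        {
            return true;
        }
        m_entries[packetHash] = nowUs + m_expireUs;
        return false;
    }

  private:
    uint64_t m_expireUs;                              // lifetime of a cache entry
    std::unordered_map<uint64_t, uint64_t> m_entries; // hash -> expiry time
};
```

With an expiration delay of 1000 microseconds (the 1 ms default mentioned above), a second copy of a packet arriving 500 microseconds after the first would be flagged as a duplicate, while a copy arriving 1500 microseconds later would be treated as new.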

16.2.6. NeighborCache

NeighborCacheHelper provides a way to generate the ARP cache automatically. It generates the needed ARP cache entries before the simulation starts, which avoids the delay and message overhead of address resolution in simulations that focus on other performance aspects. Entries generated by NeighborCacheHelper are in the STATIC_AUTOGENERATED state, which is similar to PERMANENT, except that the user does not add or remove them manually; they are managed by NeighborCacheHelper whenever the user asks for a pre-generated cache.

When neighbor caches are generated globally, they are updated dynamically as IPv4 addresses are added or removed. When neighbor caches are generated partially, NeighborCacheHelper takes care of address removal only. After adding an address, the user may either rerun PopulateNeighborCache() with a reduced scope to pick up the new IP address, or add an entry manually to keep the neighbor cache up to date. The reason is that, when PopulateNeighborCache() has previously been run with a scope less than global, the code does not know whether that scope was a Channel, a NetDeviceContainer, or an IP interface container.

The source code for NeighborCache is located in src/internet/helper/neighbor-cache-helper. A complete example is in src/internet/examples/neighbor-cache-example.cc.

16.2.6.1. Usage

The typical usages are:

  • Populate neighbor ARP caches for all devices:

NeighborCacheHelper neighborCache;
neighborCache.PopulateNeighborCache();

  • Populate neighbor ARP caches for a given channel:

NeighborCacheHelper neighborCache;
neighborCache.PopulateNeighborCache(channel); // channel is the Ptr<Channel> for which ARP caches are generated

  • Populate neighbor ARP caches for devices in a given NetDeviceContainer:

NeighborCacheHelper neighborCache;
neighborCache.PopulateNeighborCache(netDevices); // netDevices is the NetDeviceContainer for which ARP caches are generated

  • Populate neighbor ARP caches for a given Ipv4InterfaceContainer:

NeighborCacheHelper neighborCache;
neighborCache.PopulateNeighborCache(interfaces); // interfaces is the Ipv4InterfaceContainer for which ARP caches are generated

16.3. IPv6

This chapter describes the ns-3 IPv6 model capabilities and limitations along with its usage and examples.

16.3.1. IPv6 model description

The IPv6 model is loosely patterned after the Linux implementation; the implementation is not complete as some features of IPv6 are not of much interest to simulation studies, and some features of IPv6 are simply not modeled yet in ns-3.

The base class Ipv6 defines a generic API, while the class Ipv6L3Protocol is the actual class implementing the protocol. The actual classes used by the IPv6 stack are located mainly in the directory src/internet.

The implementation of IPv6 is contained in the following files:

src/internet/model/icmpv6-header.{cc,h}
src/internet/model/icmpv6-l4-protocol.{cc,h}
src/internet/model/ipv6.{cc,h}
src/internet/model/ipv6-address-generator.{cc,h}
src/internet/model/ipv6-autoconfigured-prefix.{cc,h}
src/internet/model/ipv6-end-point.{cc,h}
src/internet/model/ipv6-end-point-demux.{cc,h}
src/internet/model/ipv6-extension.{cc,h}
src/internet/model/ipv6-extension-demux.{cc,h}
src/internet/model/ipv6-extension-header.{cc,h}
src/internet/model/ipv6-header.{cc,h}
src/internet/model/ipv6-interface.{cc,h}
src/internet/model/ipv6-interface-address.{cc,h}
src/internet/model/ipv6-l3-protocol.{cc,h}
src/internet/model/ipv6-list-routing.{cc,h}
src/internet/model/ipv6-option.{cc,h}
src/internet/model/ipv6-option-demux.{cc,h}
src/internet/model/ipv6-option-header.{cc,h}
src/internet/model/ipv6-packet-info-tag.{cc,h}
src/internet/model/ipv6-pmtu-cache.{cc,h}
src/internet/model/ipv6-raw-socket-factory.{cc,h}
src/internet/model/ipv6-raw-socket-factory-impl.{cc,h}
src/internet/model/ipv6-raw-socket-impl.{cc,h}
src/internet/model/ipv6-route.{cc,h}
src/internet/model/ipv6-routing-protocol.{cc,h}
src/internet/model/ipv6-routing-table-entry.{cc,h}
src/internet/model/ipv6-static-routing.{cc,h}
src/internet/model/ndisc-cache.{cc,h}
src/network/utils/inet6-socket-address.{cc,h}
src/network/utils/ipv6-address.{cc,h}

Also some helpers are involved with IPv6:

src/internet/helper/internet-stack-helper.{cc,h}
src/internet/helper/ipv6-address-helper.{cc,h}
src/internet/helper/ipv6-interface-container.{cc,h}
src/internet/helper/ipv6-list-routing-helper.{cc,h}
src/internet/helper/ipv6-routing-helper.{cc,h}
src/internet/helper/ipv6-static-routing-helper.{cc,h}

The model files can be roughly divided into:

  • protocol models (e.g., ipv6, ipv6-l3-protocol, icmpv6-l4-protocol, etc.)

  • routing models (i.e., anything with ‘routing’ in its name)

  • sockets and interfaces (e.g., ipv6-raw-socket, ipv6-interface, ipv6-end-point, etc.)

  • address-related things

  • headers, option headers, extension headers, etc.

  • accessory classes (e.g., ndisc-cache)

16.3.2. Usage

The following description is based on using the typical helpers found in the example code.

IPv6 does not need to be activated in a node. It is automatically added to the available protocols once the Internet Stack is installed.

In order to not install IPv6 along with IPv4, it is possible to use ns3::InternetStackHelper method SetIpv6StackInstall (bool enable) before installing the InternetStack in the nodes.

Note that to have an IPv6-only network (i.e., to not install the IPv4 stack in a node) one should use ns3::InternetStackHelper method SetIpv4StackInstall (bool enable) before the stack installation.

As an example, in the following code node 0 will have both IPv4 and IPv6, node 1 only IPv6, and node 2 only IPv4:

NodeContainer n;
n.Create(3);

InternetStackHelper internet;
InternetStackHelper internetV4only;
InternetStackHelper internetV6only;

internetV4only.SetIpv6StackInstall(false);
internetV6only.SetIpv4StackInstall(false);

internet.Install(n.Get(0));
internetV6only.Install(n.Get(1));
internetV4only.Install(n.Get(2));

16.3.2.1. IPv6 addresses assignment

In order to use IPv6 on a network, the first thing to do is to assign IPv6 addresses.

Any IPv6-enabled ns-3 node will have at least one NetDevice: the ns3::LoopbackNetDevice. The loopback device address is ::1. All the other NetDevices will have one or more IPv6 addresses:

  • One link-local address: fe80::interface ID, where interface ID is derived from the NetDevice MAC address.

  • Zero or more global addresses, e.g., 2001:db8::1.

Typically the first address on an interface will be the link-local one, with the global address(es) being the following ones.

IPv6 global addresses might be:

  • manually assigned

  • auto-generated

ns-3 can use both methods, and it’s quite important to understand the implications of both.

16.3.2.1.1. Manually assigned IPv6 addresses

This is probably the easiest and most used method. As an example:

Ptr<Node> n0 = CreateObject<Node>();
Ptr<Node> n1 = CreateObject<Node>();
NodeContainer net(n0, n1);
CsmaHelper csma;
NetDeviceContainer ndc = csma.Install(net);

// The IPv6 stack must be installed on the nodes before addresses can be assigned.
InternetStackHelper internet;
internet.Install(net);

NS_LOG_INFO("Assign IPv6 Addresses.");
Ipv6AddressHelper ipv6;
ipv6.SetBase(Ipv6Address("2001:db8::"), Ipv6Prefix(64));
Ipv6InterfaceContainer ic = ipv6.Assign(ndc);

This method will add two global IPv6 addresses to the nodes. Note that, as usual for IPv6, all the nodes will also have a link-local address. Typically the first address on an interface will be the link-local one, with the global address(es) being the following ones.

Note that the global addresses will be derived from the MAC address. As a consequence, expect to have addresses similar to 2001:db8::200:ff:fe00:1.
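The sample address above is consistent with the conventional modified EUI-64 construction: the two bytes ff:fe are inserted in the middle of the 48-bit MAC address and the universal/local bit of the first byte is flipped. The following standalone sketch illustrates that construction under this assumption; InterfaceIdFromMac48 is a hypothetical function name and not part of the ns-3 API.

```cpp
#include <array>
#include <cstdint>

// Build the 64-bit interface identifier from a 48-bit MAC address:
// insert ff:fe in the middle and flip the universal/local bit (0x02) of the first byte.
std::array<uint8_t, 8> InterfaceIdFromMac48(const std::array<uint8_t, 6>& mac)
{
    return {static_cast<uint8_t>(mac[0] ^ 0x02), mac[1], mac[2], 0xff, 0xfe, mac[3], mac[4], mac[5]};
}
```

For the MAC address 00:00:00:00:00:01 this gives the bytes 02:00:00:ff:fe:00:00:01, which, appended to the prefix 2001:db8::/64, is written 2001:db8::200:ff:fe00:1.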

It is possible to repeat the above to assign more than one global address to a node. However, due to the Ipv6AddressHelper singleton nature, one should first assign all the addresses of a network, then change the network base (SetBase), then do a new assignment.

Alternatively, it is possible to assign a specific address to a node:

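Ptr<Node> n0 = CreateObject<Node>();
NodeContainer net(n0);
CsmaHelper csma;
NetDeviceContainer ndc = csma.Install(net);

// The IPv6 stack must be installed before the Ipv6 object can be retrieved.
InternetStackHelper internet;
internet.Install(net);

NS_LOG_INFO("Specifically Assign an IPv6 Address.");
Ptr<NetDevice> device = ndc.Get(0);
Ptr<Node> node = device->GetNode();
Ptr<Ipv6> ipv6proto = node->GetObject<Ipv6>();
// GetInterfaceForDevice() returns -1 when the device has no IPv6 interface yet.
int32_t ifIndex = ipv6proto->GetInterfaceForDevice(device);
if (ifIndex == -1)
{
    ifIndex = ipv6proto->AddInterface(device);
}
Ipv6InterfaceAddress ipv6Addr = Ipv6InterfaceAddress(Ipv6Address("2001:db8:f00d:cafe::42"), Ipv6Prefix(64));
ipv6proto->AddAddress(ifIndex, ipv6Addr);
ipv6proto->SetUp(ifIndex); // bring the interface up so that it can send and receive

16.3.2.1.2. Auto-generated IPv6 addresses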

This is accomplished by relying on the RADVD protocol, implemented by the class Radvd. A helper class is available to ease the most common tasks, e.g., setting up a prefix on an interface, choosing whether it is announced periodically, and choosing whether the router is the default router for that interface.

Fine-grained configuration is possible through the RadvdInterface class, which allows setting every parameter of the router advertisement announced on a given interface.

It is worth mentioning that the configurations must be set up before installing the application in the node.

With this method, the nodes dynamically acquire (i.e., during the simulation) one or more global addresses according to the RADVD configuration. These addresses are based on the prefix announced by RADVD and on the node’s EUI-64.

Examples of RADVD use are shown in examples/ipv6/radvd.cc and examples/ipv6/radvd-two-prefix.cc.

Note that the router (i.e., the node with Radvd) will have to have a global address, while the nodes using the auto-generated addresses (SLAAC) will have to have a link-local address. This is accomplished using Ipv6AddressHelper::AssignWithoutAddress, e.g.:

Ipv6AddressHelper ipv6;
NetDeviceContainer tmp;
tmp.Add (d1.Get(0)); /* n0 */
Ipv6InterfaceContainer iic1 = ipv6.AssignWithoutAddress(tmp); /* n0 interface */

16.3.2.1.2.1. Random-generated IPv6 addresses

While real IPv6 nodes use randomly generated addresses to protect privacy, ns-3 does NOT have this capability. So far, this feature has not been considered interesting for simulation.

16.3.2.1.4. Duplicate Address Detection (DAD)

Nodes will perform DAD (it can be disabled using an Icmpv6L4Protocol attribute). Upon receiving a DAD message, however, nodes will not react to it; in other words, the reaction to DAD is still incomplete. The main reason is the missing random-generated address capability. Moreover, since ns-3 nodes are usually well-behaved, there should not be any duplicate address. This might change in the future, so as to avoid issues with real-world integrated simulations.

16.3.2.2. Explicit Congestion Notification (ECN) bits in IPv6

  • In IPv6, ECN bits are the last 2 bits of the Traffic Class field and occupy the 10th and 11th bits of the header.

  • The IPv6 header class defines an EcnType enum with all four ECN codepoints (ECN_NotECT, ECN_ECT1, ECN_ECT0, ECN_CE) mentioned in RFC 3168, and also a setter and getter method to handle ECN values in the Traffic Class field.
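The position of these bits can be checked with a small standalone sketch that operates on the first 32-bit word of an IPv6 header (version, Traffic Class, flow label). It is not the ns-3 Ipv6Header code; TrafficClass and EcnBits are hypothetical helper names, and the word is assumed to be already converted to host byte order.

```cpp
#include <cstdint>

// First 32-bit word of an IPv6 header, in host order:
// version (4 bits) | traffic class (8 bits) | flow label (20 bits).
uint8_t TrafficClass(uint32_t firstWord)
{
    return static_cast<uint8_t>((firstWord >> 20) & 0xff);
}

// The ECN field is the two least significant bits of the traffic class, i.e.,
// bits 10 and 11 of the header when counting from 0 at the most significant bit.
uint8_t EcnBits(uint32_t firstWord)
{
    return static_cast<uint8_t>(TrafficClass(firstWord) & 0x03);
}
```

For instance, a first word of 0x60300000 (version 6, Traffic Class 0x03, flow label 0) carries both ECN bits set, which is the CE codepoint of RFC 3168.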

16.3.3. Ipv6QueueDiscItem

The traffic control sublayer in ns-3 handles objects of class QueueDiscItem which are used to hold an ns3::Packet and an ns3::Header. This is done to facilitate the marking of packets for Explicit Congestion Notification. The Mark () method is implemented in Ipv6QueueDiscItem. It returns true if marking the packet is successful, i.e., it successfully sets the CE bit in the IPv6 header. The Mark () method will return false, however, if the IPv6 header indicates the ECN_NotECT codepoint.

16.3.3.1. Host and Router behaviour in IPv6 and ns-3

In IPv6 there is a clear distinction between routers and hosts. As one might expect, routers can forward packets from an interface to another interface, while hosts drop packets not directed to them.

Unfortunately, forwarding is not the only thing affected by this distinction, and forwarding itself might be fine-tuned, e.g., to forward packets incoming from an interface and drop packets from another interface.

In ns-3 a node is configured to be a host by default. There are two main ways to change this behaviour:

  • Using ns3::Ipv6InterfaceContainer SetForwarding(uint32_t i, bool router) where i is the interface index in the container.

  • Changing the ns3::Ipv6 attribute IpForward.

Either one can be used during the simulation.

A fine-grained setup can be accomplished by using ns3::Ipv6Interface SetForwarding (bool forward), which allows changing the behaviour on a per-interface basis.

Note that the node-wide configuration only serves as a convenient method to enable/disable the ns3::Ipv6Interface specific setting. An Ipv6Interface added to a node with forwarding enabled will be set to be forwarding as well. This is really important when a node has interfaces added during the simulation.

According to the ns3::Ipv6Interface forwarding state, the following happens:

  • Forwarding OFF

    • The node will NOT reply to Router Solicitation

    • The node will react to Router Advertisement

    • The node will periodically send Router Solicitation

    • Routing protocols MUST DROP packets not directed to the node

  • Forwarding ON

    • The node will reply to Router Solicitation

    • The node will NOT react to Router Advertisement

    • The node will NOT send Router Solicitation

    • Routing protocols MUST forward packets

The behaviour matches ip-sysctl.txt (http://www.kernel.org/doc/Documentation/networking/ip-sysctl.txt), with the difference that it is not possible to override the behaviour using esoteric settings (e.g., forwarding but accepting router advertisements, accept_ra=2, or forwarding but sending router solicitations, forwarding=2).

Consider carefully the implications of packet forwarding. As an example, a node will NOT send ICMPv6 PACKET_TOO_BIG messages from an interface with forwarding off. This is completely normal, as the Routing protocol will drop the packet before attempting to forward it.

16.3.3.2. Helpers

Typically the helpers used in IPv6 setup are:

  • ns3::InternetStackHelper

  • ns3::Ipv6AddressHelper

  • ns3::Ipv6InterfaceContainer

The use is almost identical to the corresponding IPv4 case, e.g.:

NodeContainer n;
n.Create(4);

NS_LOG_INFO("Create IPv6 Internet Stack");
InternetStackHelper internetv6;
internetv6.Install(n);

NS_LOG_INFO("Create channels.");
CsmaHelper csma;
NetDeviceContainer d = csma.Install(n);

NS_LOG_INFO("Create networks and assign IPv6 Addresses.");
Ipv6AddressHelper ipv6;
ipv6.SetBase(Ipv6Address("2001:db8::"), Ipv6Prefix(64));
Ipv6InterfaceContainer iic = ipv6.Assign(d);

Additionally, a common task is to enable forwarding on one of the nodes and to set up a default route toward it in the other nodes, e.g.:

iic.SetForwarding(0, true);
iic.SetDefaultRouteInAllNodes(0);

This will enable forwarding on node 0 and will set up a default route in ns3::Ipv6StaticRouting on all the other nodes. Note that this requires that Ipv6StaticRouting is present in the nodes.

The IPv6 routing helpers enable the user to perform specific tasks on the particular routing algorithm and to print the routing tables.

16.3.3.3. Attributes

Many classes in the ns-3 IPv6 implementation contain attributes. The most useful ones are:

  • ns3::Ipv6

    • IpForward, boolean, default false. Globally enable or disable IP forwarding for all current and future IPv6 devices.

    • MtuDiscover, boolean, default true. If disabled, every interface will have its MTU set to 1280 bytes.

  • ns3::Ipv6L3Protocol

    • DefaultTtl, uint8_t, default 64. The TTL value set by default on all outgoing packets generated on this node.

    • SendIcmpv6Redirect, boolean, default true. Send the ICMPv6 Redirect when appropriate.

  • ns3::Icmpv6L4Protocol

    • DAD, boolean, default true. Always do DAD (Duplicate Address Detection) check.

  • ns3::NdiscCache

    • UnresolvedQueueSize, uint32_t, default 3. Size of the queue for packets pending an NA reply.

16.3.3.4. Output

The IPv6 stack provides some useful trace sources:

  • ns3::Ipv6L3Protocol

    • Tx, Send IPv6 packet to outgoing interface.

    • Rx, Receive IPv6 packet from incoming interface.

    • Drop, Drop IPv6 packet.

  • ns3::Ipv6Extension

    • Drop, Drop IPv6 packet.

The last trace source (Drop in ns3::Ipv6Extension) is fired when a packet contains an unknown option that blocks its processing.

Mind that ns3::NdiscCache could drop packets as well, and they are not logged in a trace source (yet). This might generate some confusion in the sent/received packets counters.

16.3.3.5. Advanced Usage

16.3.3.5.1. IPv6 maximum transmission unit (MTU) and fragmentation

ns-3 NetDevices define the MTU according to the L2 simulated Device. IPv6 requires that the minimum MTU is 1280 bytes, so all NetDevices are required to support at least this MTU. This is the link-MTU.

In order to support different MTUs in a source-destination path, ns-3 IPv6 model can perform fragmentation. This can be either triggered by receiving a packet bigger than the link-MTU from the L4 protocols (UDP, TCP, etc.), or by receiving an ICMPv6 PACKET_TOO_BIG message. The model mimics RFC 1981, with the following notable exceptions:

  • L4 protocols are not informed of the Path MTU change

  • TCP cannot change its Segment Size according to the Path-MTU.

Both limitations are going to be removed in due time.

The Path-MTU cache is currently based on the source-destination IPv6 addresses. Further classifications (e.g., flow label) are possible but not yet implemented.

The Path-MTU default validity time is 10 minutes. After the cache entry expiration, the Path-MTU information is removed and the next packet will (eventually) trigger a new ICMPv6 PACKET_TOO_BIG message. Note that 1) this is consistent with the RFC specification and 2) L4 protocols are responsible for retransmitting the packets.

16.3.3.6. Examples

The examples for IPv6 are in the directory examples/ipv6. These examples focus on the most interesting IPv6 peculiarities, such as fragmentation, redirect and so on.

Moreover, most TCP and UDP examples located in examples/udp, examples/tcp, etc. have a command-line option to use IPv6 instead of IPv4.

16.3.3.7. Troubleshooting

There are just a few pitfalls to avoid while using ns-3 IPv6.

16.3.3.7.1. Routing loops

Since the only routing scheme available for IPv6 (so far) is ns3::Ipv6StaticRouting, default routers have to be set up manually. When there are two or more routers in a network (e.g., node A and node B), avoid using the helper function SetDefaultRouteInAllNodes for more than one router.

The consequence would be to install a default route to B in A and a default route pointing to A in B, generating a loop.

16.3.3.7.2. Global address leakage

Remember that addresses in IPv6 are global by definition. When using IPv6 with an ns-3 emulation capability, avoid at all costs address leakage toward the global Internet. It is advisable to set up an external firewall to prevent leakage.

16.3.3.7.3. 2001:DB8::/32 addresses

The IPv6 standard (RFC 3849) reserves the 2001:DB8::/32 address prefix for documentation. This manual follows that convention. Addresses in this prefix are, however, only meant to be used in documents, and routers should discard them.

16.3.4. Validation

The IPv6 protocols have not yet been extensively validated against real implementations. The current tests mainly involve checking the .pcap trace files with Wireshark, and the results are positive.

16.3.5. NeighborCache

NeighborCacheHelper provides a way to generate the NDISC cache automatically. It generates the needed NDISC cache entries before the simulation starts, which avoids the delay and message overhead of neighbor discovery in simulations that focus on other performance aspects. Entries generated by NeighborCacheHelper are in the STATIC_AUTOGENERATED state, which is similar to PERMANENT, except that the user does not add or remove them manually; they are managed by NeighborCacheHelper whenever the user asks for a pre-generated cache.

When neighbor caches are generated globally, they are updated dynamically as IPv6 addresses are added or removed. When neighbor caches are generated partially, NeighborCacheHelper takes care of address removal only. After adding an address, the user may either rerun PopulateNeighborCache() with a reduced scope to pick up the new IP address, or add an entry manually to keep the neighbor cache up to date. The reason is that, when PopulateNeighborCache() has previously been run with a scope less than global, the code does not know whether that scope was a Channel, a NetDeviceContainer, or an IP interface container.

The source code for NeighborCache is located in src/internet/helper/neighbor-cache-helper. A complete example is in src/internet/examples/neighbor-cache-example.cc.

16.3.5.1. Usage

Generating the NDISC cache works almost the same way as generating the ARP cache; see src/internet/doc/ipv4.rst.

  • Populate neighbor NDISC caches for a given Ipv6InterfaceContainer:

NeighborCacheHelper neighborCache;
neighborCache.PopulateNeighborCache(interfaces); // interfaces is the Ipv6InterfaceContainer for which NDISC caches are generated

16.4. Routing overview

ns-3 is intended to support traditional routing approaches and protocols, support ports of open source routing implementations, and facilitate research into unorthodox routing techniques. The overall routing architecture is described below in Routing architecture. Users who wish to just read about how to configure global routing for wired topologies can read Global centralized routing. Unicast routing protocols are described in Unicast routing. Multicast routing is documented in Multicast routing.

16.4.1. Routing architecture

_images/routing.png

Overview of routing

Overview of routing shows the overall routing architecture for Ipv4. The key objects are Ipv4L3Protocol, Ipv4RoutingProtocol(s) (a class to which all routing/forwarding has been delegated from Ipv4L3Protocol), and Ipv4Route(s).

Ipv4L3Protocol must have at least one Ipv4RoutingProtocol added to it at simulation setup time. This is done explicitly by calling Ipv4::SetRoutingProtocol ().

The abstract base class Ipv4RoutingProtocol declares a minimal interface, consisting of two methods: RouteOutput () and RouteInput (). For packets traveling outbound from a host, the transport protocol will query Ipv4 for the Ipv4RoutingProtocol object interface, and will request a route via Ipv4RoutingProtocol::RouteOutput (). A Ptr to an Ipv4Route object is returned. This is analogous to a dst_cache entry in Linux. The Ipv4Route is carried down to the Ipv4L3Protocol to avoid a second lookup there. However, some cases (e.g. Ipv4 raw sockets) will require a call to RouteOutput() directly from Ipv4L3Protocol.

For packets received inbound for forwarding or delivery, the following steps occur. Ipv4L3Protocol::Receive() calls Ipv4RoutingProtocol::RouteInput(). This passes the packet ownership to the Ipv4RoutingProtocol object. There are four callbacks associated with this call:

  • LocalDeliver

  • UnicastForward

  • MulticastForward

  • Error

The Ipv4RoutingProtocol must eventually call one of these callbacks for each packet that it takes responsibility for. This is basically how the input routing process works in Linux.
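The "one callback per packet" contract can be illustrated with a standalone sketch that does not use the ns-3 API. The names InputCallbacks and ToyRouteInput, the use of plain 32-bit integers as addresses, and the decision rules themselves are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <set>
#include <string>

// Hypothetical stand-ins for the four RouteInput callbacks; not ns-3 types.
struct InputCallbacks
{
    std::function<void(uint32_t)> localDeliver;
    std::function<void(uint32_t)> unicastForward;
    std::function<void(uint32_t)> multicastForward;
    std::function<void(uint32_t)> error;
};

// Toy decision logic: exactly one callback fires for every packet handled.
// Returns the name of the callback that was invoked.
inline std::string
ToyRouteInput(uint32_t dst,
              const std::set<uint32_t>& localAddresses,
              const std::set<uint32_t>& knownDestinations,
              const InputCallbacks& cb)
{
    assert(static_cast<bool>(cb.localDeliver) && static_cast<bool>(cb.error));
    assert(static_cast<bool>(cb.unicastForward) && static_cast<bool>(cb.multicastForward));
    const bool isMulticast = (dst >> 28) == 0xE; // 224.0.0.0/4
    if (localAddresses.count(dst) > 0)
    {
        cb.localDeliver(dst);
        return "LocalDeliver";
    }
    if (isMulticast)
    {
        cb.multicastForward(dst);
        return "MulticastForward";
    }
    if (knownDestinations.count(dst) > 0)
    {
        cb.unicastForward(dst);
        return "UnicastForward";
    }
    cb.error(dst); // no route: the packet is still accounted for
    return "Error";
}
```

Calling ToyRouteInput() with a destination that is neither local, multicast, nor known ends in the error callback, so no packet is silently lost by the routing object.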

_images/routing-specialization.png

Ipv4Routing specialization.

This overall architecture is designed to support different routing approaches, including (in the future) a Linux-like policy-based routing implementation, proactive and on-demand routing protocols, and simple routing protocols for when the simulation user does not really care about routing.

Ipv4Routing specialization illustrates how multiple routing protocols derive from this base class. A class Ipv4ListRouting (implementation class Ipv4ListRoutingImpl) provides the existing list routing approach in ns-3. Its API is the same as that of the base class Ipv4RoutingProtocol, except for the ability to add multiple prioritized routing protocols (Ipv4ListRouting::AddRoutingProtocol(), Ipv4ListRouting::GetRoutingProtocol()).

The details of these routing protocols are described below in Unicast routing. For now, we will first start with a basic unicast routing capability that is intended to globally build routing tables at simulation time t=0 for simulation users who do not care about dynamic routing.

16.4.2. Unicast routing

The following unicast routing protocols are defined for IPv4 and IPv6:

  • classes Ipv4ListRouting and Ipv6ListRouting (used to store a prioritized list of routing protocols)

  • classes Ipv4StaticRouting and Ipv6StaticRouting (covering both unicast and multicast)

  • class Ipv4GlobalRouting (used to store routes computed by the global route manager, if that is used)

  • class Ipv4NixVectorRouting (a more efficient version of global routing that stores source routes in a packet header field)

  • class Rip - the IPv4 RIPv2 protocol (RFC 2453)

  • class RipNg - the IPv6 RIPng protocol (RFC 2080)

  • IPv4 Optimized Link State Routing (OLSR) (a MANET protocol defined in RFC 3626)

  • IPv4 Ad Hoc On Demand Distance Vector (AODV) (a MANET protocol defined in RFC 3561)

  • IPv4 Destination Sequenced Distance Vector (DSDV) (a MANET protocol)

  • IPv4 Dynamic Source Routing (DSR) (a MANET protocol)

In the future, this architecture should also allow someone to implement a Linux-like implementation with routing cache, or a Click modular router, but those are out of scope for now.

16.4.2.1. Ipv[4,6]ListRouting

This section describes the current default ns-3 Ipv[4,6]RoutingProtocol. Typically, multiple routing protocols are supported in user space and coordinate to write a single forwarding table in the kernel. Presently in ns-3, the implementation instead allows for multiple routing protocols to build/keep their own routing state, and the IP implementation will query each one of these routing protocols (in some order determined by the simulation author) until a route is found.

We chose this approach because it may better facilitate the integration of disparate routing approaches whose writes to a single table may be difficult to coordinate, approaches where more information than the destination IP address (e.g., source routing) is used to determine the next hop, and on-demand routing approaches where packets must be cached.

16.4.2.1.1. Ipv[4,6]ListRouting::AddRoutingProtocol

Classes Ipv4ListRouting and Ipv6ListRouting provide a pure virtual function declaration for the method that allows one to add a routing protocol:

void AddRoutingProtocol(Ptr<Ipv4RoutingProtocol> routingProtocol,
                        int16_t priority);

void AddRoutingProtocol(Ptr<Ipv6RoutingProtocol> routingProtocol,
                        int16_t priority);

These methods are implemented respectively by class Ipv4ListRoutingImpl and by class Ipv6ListRoutingImpl in the internet module.

The priority variable above governs the priority in which the routing protocols are inserted. Notice that it is a signed int. By default in ns-3, the helper classes will instantiate an Ipv[4,6]ListRoutingImpl object, and add to it an Ipv[4,6]StaticRoutingImpl object at priority zero. Internally, a list of Ipv[4,6]RoutingProtocols is stored, and the routing protocols are each consulted in decreasing order of priority to see whether a match is found. Therefore, if you want your Ipv4RoutingProtocol to have priority lower than the static routing, insert it with priority less than 0; e.g.:

Ptr<MyRoutingProtocol> myRoutingProto = CreateObject<MyRoutingProtocol>();
listRoutingPtr->AddRoutingProtocol(myRoutingProto, -10);

Upon calls to RouteOutput() or RouteInput(), the list routing object will search the list of routing protocols, in priority order, until a route is found. The routing protocol that finds the route will invoke the appropriate callback, and no further routing protocols will be searched.
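The priority-ordered search can be sketched without the ns-3 API as follows. ToyProtocol and ToyListRouting are hypothetical names; a protocol is modeled as a function that either fills in a next hop label or reports that it has no route. This is a simplified model of the idea, not the ns-3 implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of a routing protocol: fills nextHop and returns true
// when it has a route for dst, otherwise returns false.
using ToyProtocol = std::function<bool(uint32_t dst, std::string& nextHop)>;

class ToyListRouting
{
    using Entry = std::pair<int16_t, ToyProtocol>;
    std::vector<Entry> m_protocols; // kept sorted by decreasing priority

  public:
    // Insert the protocol so that higher (signed) priorities come first.
    void AddRoutingProtocol(ToyProtocol proto, int16_t priority)
    {
        assert(static_cast<bool>(proto));
        auto pos = std::find_if(m_protocols.begin(),
                                m_protocols.end(),
                                [priority](const Entry& e) { return e.first < priority; });
        m_protocols.insert(pos, Entry(priority, std::move(proto)));
    }

    // Consult each protocol in priority order and stop at the first match.
    bool RouteOutput(uint32_t dst, std::string& nextHop) const
    {
        for (const Entry& entry : m_protocols)
        {
            if (entry.second(dst, nextHop))
            {
                return true;
            }
        }
        return false;
    }
};
```

In this model, a protocol added at priority -10 is only asked for a route when the protocol at priority 0 has none, which mirrors the relationship between a user protocol and static routing described above.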

16.4.2.2. Global centralized routing

Global centralized routing is sometimes called “God” routing; it is a special implementation that walks the simulation topology and runs a shortest path algorithm, and populates each node’s routing tables. No actual protocol overhead (on the simulated links) is incurred with this approach. It does have a few constraints:

  • Wired only: It is not intended for use in wireless networks.

  • Unicast only: It does not do multicast.

  • Scalability: Some users of this on large topologies (e.g. 1000 nodes) have noticed that the current implementation is not very scalable. The global centralized routing will be modified in the future to reduce computations and improve runtime performance.

Presently, global centralized IPv4 unicast routing over both point-to-point and shared (CSMA) links is supported.

By default, when using the ns-3 helper API and the default InternetStackHelper, global routing capability will be added to the node, and global routing will be inserted as a routing protocol with lower priority than the static routes (i.e., users can insert routes via Ipv4StaticRouting API and they will take precedence over routes found by global routing).

16.4.2.2.1. Global Unicast Routing API

The public API is very minimal. User scripts include the following:

#include "ns3/internet-module.h"

If the default InternetStackHelper is used, then an instance of global routing will be aggregated to each node. After IP addresses are configured, the following function call will cause all of the nodes that have an Ipv4 interface to receive forwarding tables entered automatically by the GlobalRouteManager:

Ipv4GlobalRoutingHelper::PopulateRoutingTables();

Note: A reminder that the wifi NetDevice will work but does not take any wireless effects into account. For wireless, we recommend OLSR dynamic routing described below.

It is possible to call this function again in the midst of a simulation using the following additional public function:

Ipv4GlobalRoutingHelper::RecomputeRoutingTables();

which flushes the old tables, queries the nodes for new interface information, and rebuilds the routes.

For instance, this scheduling call will cause the tables to be rebuilt at time 5 seconds:

Simulator::Schedule(Seconds(5),
                    &Ipv4GlobalRoutingHelper::RecomputeRoutingTables);

There are two attributes that govern the behavior. The first is Ipv4GlobalRouting::RandomEcmpRouting. If set to true, packets are randomly routed across equal-cost multipath routes. If set to false (default), only one route is consistently used. The second is Ipv4GlobalRouting::RespondToInterfaceEvents. If set to true, dynamically recompute the global routes upon Interface notification events (up/down, or add/remove address). If set to false (default), routing may break unless the user manually calls RecomputeRoutingTables() after such events. The default is set to false to preserve legacy ns-3 program behavior.
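The effect of the RandomEcmpRouting attribute can be pictured with the following standalone sketch. The function name SelectEqualCostRoute, the use of std::mt19937 instead of an ns-3 random variable stream, and the choice of the first route when randomization is off are all assumptions of this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <random>

// Pick an index among nRoutes equal-cost routes. With randomEcmp == false the
// same route (here: index 0) is always used; with true one is drawn uniformly.
inline std::size_t
SelectEqualCostRoute(std::size_t nRoutes, bool randomEcmp, std::mt19937& rng)
{
    assert(nRoutes > 0);
    if (!randomEcmp)
    {
        return 0;
    }
    std::uniform_int_distribution<std::size_t> pick(0, nRoutes - 1);
    return pick(rng);
}
```

With randomization off, every packet toward a destination follows the same path, which keeps results repeatable; with it on, load is spread over the equal-cost alternatives at the price of possible packet reordering.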

16.4.2.2.2. Global Routing Implementation

This section is for those readers who care about how this is implemented. A singleton object (GlobalRouteManager) is responsible for populating the static routes on each node, using the public Ipv4 API of that node. It queries each node in the topology for a “globalRouter” interface. If found, it uses the API of that interface to obtain a “link state advertisement (LSA)” for the router. Link State Advertisements are used in OSPF routing, and we follow their formatting.

It is important to note that all of these computations are done before packets are flowing in the network. In particular, there are no overhead or control packets being exchanged when using this implementation. Instead, this global route manager just walks the list of nodes to build the necessary information and configure each node’s routing table.

The GlobalRouteManager populates a link state database with LSAs gathered from the entire topology. Then, for each router in the topology, the GlobalRouteManager executes the OSPF shortest path first (SPF) computation on the database, and populates the routing tables on each node.

The quagga (https://www.nongnu.org/quagga/) OSPF implementation was used as the basis for the routing computation logic. One benefit of following an existing OSPF SPF implementation is that OSPF already has defined link state advertisements for all common types of network links:

  • point-to-point (serial links)

  • point-to-multipoint (Frame Relay, ad hoc wireless)

  • non-broadcast multiple access (ATM)

  • broadcast (Ethernet)

Therefore, we think that enabling these other link types will be more straightforward now that the underlying OSPF SPF framework is in place.

Presently, we can handle IPv4 point-to-point, numbered links, as well as shared broadcast (CSMA) links. Equal-cost multipath is also supported. Although wireless link types are supported by the implementation, note that due to the nature of this implementation, any channel effects will not be considered and the routing tables will assume that every node on the same shared channel is reachable from every other node (i.e. it will be treated like a broadcast CSMA link).

The GlobalRouteManager first walks the list of nodes and aggregates a GlobalRouter interface to each one as follows:

typedef std::vector<Ptr<Node>>::iterator Iterator;
for (Iterator i = NodeList::Begin(); i != NodeList::End(); i++)
  {
    Ptr<Node> node = *i;
    Ptr<GlobalRouter> globalRouter = CreateObject<GlobalRouter>(node);
    node->AggregateObject(globalRouter);
  }

This interface is later queried and used to generate a Link State Advertisement for each router, and this link state database is fed into the OSPF shortest path computation logic. The Ipv4 API is finally used to populate the routes themselves.
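As a rough picture of the shortest path step, the sketch below runs Dijkstra's algorithm over a toy link state database and records, for each destination, the first hop to use from the root router. ToyLink, ToyLsdb and ComputeFirstHops are hypothetical names, routers are identified by small integers, and the code is not derived from the GlobalRouteManager sources.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

struct ToyLink
{
    int neighbor;  // index of the router at the other end
    uint32_t cost; // link metric
};

using ToyLsdb = std::vector<std::vector<ToyLink>>; // one link list per router

// Dijkstra from root; result[d] is the first-hop neighbor toward d,
// or -1 when d is the root itself or unreachable.
inline std::vector<int>
ComputeFirstHops(const ToyLsdb& lsdb, int root)
{
    assert(root >= 0 && static_cast<std::size_t>(root) < lsdb.size());
    const uint32_t inf = std::numeric_limits<uint32_t>::max();
    std::vector<uint32_t> dist(lsdb.size(), inf);
    std::vector<int> firstHop(lsdb.size(), -1);
    using Item = std::pair<uint32_t, int>; // (distance, router)
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> frontier;
    dist[root] = 0;
    frontier.push(Item(0, root));
    while (!frontier.empty())
    {
        const Item top = frontier.top();
        frontier.pop();
        const int u = top.second;
        if (top.first > dist[u])
        {
            continue; // stale queue entry
        }
        for (const ToyLink& link : lsdb[u])
        {
            const uint32_t candidate = dist[u] + link.cost;
            if (candidate < dist[link.neighbor])
            {
                dist[link.neighbor] = candidate;
                // Leaving the root, the neighbor is the first hop; otherwise inherit.
                firstHop[link.neighbor] = (u == root) ? link.neighbor : firstHop[u];
                frontier.push(Item(candidate, link.neighbor));
            }
        }
    }
    return firstHop;
}
```

Running ComputeFirstHops() once per router corresponds to the "for each router in the topology" loop described in this section; the returned first hops are what would be written into that router's table.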

16.4.2.3. RIP and RIPng

The RIPv2 protocol for IPv4 is described in RFC 2453, and it consolidates a number of improvements over the base protocol defined in RFC 1058.

RIPng, the IPv6 routing protocol (RFC 2080), is the evolution of the well-known RIPv1 (see RFC 1058 and RFC 1723) routing protocol for IPv4.

The protocols are very simple, and are normally suitable for flat, simple network topologies.

RIPv1, RIPv2, and RIPng have the very same goals and limitations. In particular, RIP considers any route with a metric equal to or greater than 16 as unreachable. As a consequence, the maximum number of hops in the network must not exceed 15 (the number of routers is not limited). Users are encouraged to read RFC 2080 and RFC 1058 to fully understand RIP behaviour and limitations.
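The role of the value 16 as "infinity" can be made concrete with a small standalone sketch. The helper names RipAddMetric and RipIsReachable are hypothetical; the rule they implement, adding the incoming interface metric and capping the result at 16, follows the RIP RFCs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

constexpr uint8_t kRipInfinity = 16; // metric treated as unreachable

// Metric a router stores for a route that was advertised with
// advertisedMetric and received over an interface of metric interfaceMetric.
inline uint8_t
RipAddMetric(uint8_t advertisedMetric, uint8_t interfaceMetric)
{
    assert(interfaceMetric >= 1);
    const unsigned sum = static_cast<unsigned>(advertisedMetric) + interfaceMetric;
    return static_cast<uint8_t>(std::min<unsigned>(sum, kRipInfinity));
}

inline bool
RipIsReachable(uint8_t metric)
{
    return metric < kRipInfinity;
}
```

For example, a route advertised with metric 14 over an interface of metric 1 is stored with metric 15 and is still usable, while one advertised with metric 15 becomes 16 and is treated as unreachable.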

16.4.2.3.1. Routing convergence

RIP uses a Distance-Vector algorithm, and routes are updated according to the Bellman-Ford algorithm (sometimes known as Ford-Fulkerson algorithm). The algorithm has a convergence time of O(|V|*|E|) where |V| and |E| are the number of vertices (routers) and edges (links) respectively. It should be stressed that the convergence time is the number of steps in the algorithm, and each step is triggered by a message. Since Triggered Updates (i.e., when a route is changed) have a 1-5 second cooldown, the topology can require some time to stabilize.
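To give a feeling for how the number of steps grows with the topology, the following standalone sketch runs a synchronous Bellman-Ford iteration toward a single destination and counts the rounds in which some router still changes its distance. It assumes unit-cost, bidirectional links and idealized lock-step message exchange; RoundsToConverge is a hypothetical name, and timers, triggered updates and losses are not modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Synchronous Bellman-Ford toward one destination over undirected unit-cost
// links. Returns the number of rounds in which at least one distance changed.
inline int
RoundsToConverge(std::size_t nRouters,
                 const std::vector<std::pair<int, int>>& links,
                 int destination,
                 std::vector<int>& distance)
{
    const int infinity = 16; // RIP's unreachable metric
    assert(destination >= 0 && static_cast<std::size_t>(destination) < nRouters);
    distance.assign(nRouters, infinity);
    distance[destination] = 0;
    int rounds = 0;
    bool changed = true;
    while (changed)
    {
        changed = false;
        const std::vector<int> previous = distance; // vectors heard last round
        for (const auto& link : links)
        {
            const int a = link.first;
            const int b = link.second;
            if (previous[b] + 1 < distance[a])
            {
                distance[a] = previous[b] + 1;
                changed = true;
            }
            if (previous[a] + 1 < distance[b])
            {
                distance[b] = previous[a] + 1;
                changed = true;
            }
        }
        if (changed)
        {
            ++rounds;
        }
    }
    return rounds;
}
```

On a chain of four routers with the destination at one end, this model needs three rounds, one per hop. In a simulation each such round costs at least one update interval or triggered-update cooldown, which is why data traffic should start only after a warm-up period.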

Users should be aware that, during routing table construction, the routers might drop packets. Data traffic should be sent only after a time long enough to allow RIP to build the network topology. Usually 80 seconds should be enough to have a suboptimal (but working) routing setup. This includes the time needed to propagate the routes to the most distant router (16 hops) with Triggered Updates.

If the network topology is changed (e.g., a link is broken), the recovery time might be quite high, and it might be even higher than the initial setup time. Moreover, the network topology recovery is affected by the Split Horizoning strategy.

The examples examples/routing/ripng-simple-network.cc and examples/routing/rip-simple-network.cc show both the network setup and network recovery phases.

16.4.2.3.2. Split Horizoning

Split Horizon is a strategy to prevent routing instability. Three options are possible:

  • No Split Horizon

  • Split Horizon

  • Poison Reverse

In the first case, routes are advertised on all the router’s interfaces. In the second case, routers will not advertise a route on the interface from which it was learned. Poison Reverse will advertise the route on the interface from which it was learned, but with a metric of 16 (infinity). For a full analysis of the three techniques, see RFC 1058, section 2.2.
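The three options differ only in what is advertised back on the interface a route was learned from. The standalone sketch below builds the advertisement for one outgoing interface under each option; the types ToyRoute and ToyAdvert, the enum SplitHorizonMode and the function BuildAdvertisement are hypothetical and are not the classes used by the ns-3 RIP implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

enum class SplitHorizonMode
{
    NoSplitHorizon,
    SplitHorizon,
    PoisonReverse
};

struct ToyRoute
{
    uint32_t prefix;
    uint8_t metric;
    uint32_t learnedOnInterface;
};

struct ToyAdvert
{
    uint32_t prefix;
    uint8_t metric;
};

// Build the advertisement sent out of outInterface under the chosen mode.
inline std::vector<ToyAdvert>
BuildAdvertisement(const std::vector<ToyRoute>& table,
                   uint32_t outInterface,
                   SplitHorizonMode mode)
{
    const uint8_t infinity = 16;
    std::vector<ToyAdvert> adverts;
    for (const ToyRoute& route : table)
    {
        assert(route.metric <= infinity);
        const bool learnedHere = (route.learnedOnInterface == outInterface);
        if (learnedHere && mode == SplitHorizonMode::SplitHorizon)
        {
            continue; // do not advertise back where the route came from
        }
        uint8_t metric = route.metric;
        if (learnedHere && mode == SplitHorizonMode::PoisonReverse)
        {
            metric = infinity; // advertise, but as unreachable
        }
        adverts.push_back(ToyAdvert{route.prefix, metric});
    }
    return adverts;
}
```

Poison Reverse produces larger updates than plain Split Horizon, since the poisoned routes are still sent, but it removes a stale two-router loop as soon as the poisoned advertisement is received rather than after a timeout.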

The examples are based on the network topology described in the RFC, but they do not show the effect described there.

The reason is the Triggered Updates, together with the fact that when a router invalidates a route, it will immediately propagate the route unreachability, thus preventing most of the issues described in the RFC.

However, with complex topologies, it is still possible to have route instability phenomena similar to the one described in the RFC after a link failure. As a consequence, all the considerations about Split Horizon remain valid.

16.4.2.3.3. Default routes

The RIP protocol should be installed only on routers. As a consequence, hosts will not know which router is the default router.

To overcome this limitation, users should either install the default route manually (e.g., by resorting to Ipv4StaticRouting or Ipv6StaticRouting), or use RADVd (in case of IPv6). RADVd is available in ns-3 in the Applications module, and it is strongly suggested.

16.4.2.3.4. Protocol parameters and options

The ns-3 RIP implementations allow users to change all the timers associated with route updates and route lifetimes.

Moreover, users can change the interface metrics on a per-node basis.

The type of Split Horizoning (to avoid routes back-propagation) can be selected on a per-node basis, with the choices being “no split horizon”, “split horizon” and “poison reverse”. See RFC 2080 for further details, and RFC 1058 for a complete discussion on the split horizoning strategies.

Moreover, it is possible to use a non-standard value for Link Down Value (i.e., the value after which a link is considered down). The default value is 16.

16.4.2.3.5. Limitations

There is no support for the Next Hop option (RFC 2080, Section 2.1.1). The Next Hop option is useful when RIP is not being run on all of the routers on a network. Support for this option may be considered in the future.

There is no support for CIDR prefix aggregation. As a result, both routing tables and route advertisements may be larger than necessary. Prefix aggregation may be added in the future.

16.4.2.4. Other routing protocols

Documentation for other routing protocols can be found in the respective module sections, e.g.:

  • AODV

  • Click

  • DSDV

  • DSR

  • NixVectorRouting

  • OLSR

  • etc.

16.4.3. Multicast routing

The following function is used to add a static multicast route to a node:

void
Ipv4StaticRouting::AddMulticastRoute(Ipv4Address origin,
                                     Ipv4Address group,
                                     uint32_t inputInterface,
                                     std::vector<uint32_t> outputInterfaces);

A multicast route must specify an origin IP address, a multicast group and an input network interface index as conditions and provide a vector of output network interface indices over which packets matching the conditions are sent.

Typically there are two main types of multicast routes:

  • Routes used during forwarding, and

  • Routes used in the originator node.

In the first case all the conditions must be explicitly provided.

In the second case, the route is equivalent to a unicast route, and must be added through Ipv4StaticRouting::AddHostRouteTo.

Another command sets the default multicast route:

void
Ipv4StaticRouting::SetDefaultMulticastRoute(uint32_t outputInterface);

This is the multicast equivalent of the unicast version SetDefaultRoute. We tell the routing system what to do in the case where a specific route to a destination multicast group is not found. The system forwards packets out the specified interface in the hope that “something out there” knows better how to route the packet. This method is only used in initially sending packets off of a host. The default multicast route is not consulted during forwarding – exact routes must be specified using AddMulticastRoute for that case.

Since we’re basically sending packets to some entity we think may know better what to do, we don’t pay attention to “subtleties” like origin address, nor do we worry about forwarding out multiple interfaces. If the default multicast route is set, it is returned as the selected route from LookupStatic irrespective of origin or multicast group if another specific route is not found.
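The lookup rules described in this section can be summarized with a standalone sketch: a specific route must match origin, group and input interface, and the default multicast route is used only as a fallback when a host originates a packet. The class ToyMulticastTable, its Lookup() method and the forwarding flag are hypothetical; addresses are plain 32-bit integers, and the code is not the LookupStatic implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

struct ToyMulticastRoute
{
    uint32_t origin;
    uint32_t group;
    uint32_t inputInterface;
    std::vector<uint32_t> outputInterfaces;
};

class ToyMulticastTable
{
    std::vector<ToyMulticastRoute> m_routes;
    bool m_hasDefault = false;
    uint32_t m_defaultOutput = 0;

  public:
    void AddMulticastRoute(uint32_t origin,
                           uint32_t group,
                           uint32_t inputInterface,
                           std::vector<uint32_t> outputInterfaces)
    {
        assert(!outputInterfaces.empty());
        m_routes.push_back(
            ToyMulticastRoute{origin, group, inputInterface, std::move(outputInterfaces)});
    }

    void SetDefaultMulticastRoute(uint32_t outputInterface)
    {
        m_hasDefault = true;
        m_defaultOutput = outputInterface;
    }

    // Returns the output interfaces, or an empty vector when nothing matches.
    // The default route is ignored while forwarding.
    std::vector<uint32_t> Lookup(uint32_t origin,
                                 uint32_t group,
                                 uint32_t inputInterface,
                                 bool forwarding) const
    {
        for (const ToyMulticastRoute& r : m_routes)
        {
            if (r.origin == origin && r.group == group && r.inputInterface == inputInterface)
            {
                return r.outputInterfaces;
            }
        }
        if (!forwarding && m_hasDefault)
        {
            return std::vector<uint32_t>(1, m_defaultOutput);
        }
        return std::vector<uint32_t>();
    }
};
```

A forwarding lookup for a group without an explicit route returns no interfaces in this model, even when a default multicast route is set, which reflects the requirement that forwarding routers be configured with AddMulticastRoute.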

Finally, a number of additional functions are provided to fetch and remove multicast routes:

uint32_t GetNMulticastRoutes() const;

Ipv4MulticastRoute *GetMulticastRoute(uint32_t i) const;

Ipv4MulticastRoute *GetDefaultMulticastRoute() const;

bool RemoveMulticastRoute(Ipv4Address origin,
                          Ipv4Address group,
                          uint32_t inputInterface);

void RemoveMulticastRoute(uint32_t index);

16.5. TCP models in ns-3

This chapter describes the TCP models available in ns-3.

16.5.1. Overview of support for TCP

ns-3 was written to support multiple TCP implementations. The implementations inherit from a few common header classes in the src/network directory, so that user code can swap out implementations with minimal changes to the scripts.

There are three important abstract base classes:

  • class TcpSocket: This is defined in src/internet/model/tcp-socket.{cc,h}. This class exists for hosting TcpSocket attributes that can be reused across different implementations. For instance, the attribute InitialCwnd can be used for any of the implementations that derive from class TcpSocket.

  • class TcpSocketFactory: This is used by the layer-4 protocol instance to create TCP sockets of the right type.

  • class TcpCongestionOps: This supports different variants of congestion control, a key topic of simulation-based TCP research.

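There are presently two active implementations of TCP available for ns-3: the native ns-3 TCP, and Linux kernel TCP implementations made available through Direct Code Execution.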

Direct Code Execution is limited in its support for newer kernels; at present, only Linux kernel 4.4 is supported. However, the TCP implementations in kernel 4.4 can still be used for ns-3 validation or for specialized simulation use cases.

It should also be mentioned that various ways of combining virtual machines with ns-3 also make some additional TCP implementations available, but those are out of scope for this chapter.

16.5.2. ns-3 TCP

In brief, the native ns-3 TCP model supports a full bidirectional TCP with connection setup and close logic. Several congestion control algorithms are supported, with CUBIC the default, and NewReno, Westwood, Hybla, HighSpeed, Vegas, Scalable, Veno, Binary Increase Congestion Control (BIC), Yet Another HighSpeed TCP (YeAH), Illinois, H-TCP, Low Extra Delay Background Transport (LEDBAT), TCP Low Priority (TCP-LP), Data Center TCP (DCTCP) and Bottleneck Bandwidth and RTT (BBR) also supported. The model also supports Selective Acknowledgements (SACK), Proportional Rate Reduction (PRR) and Explicit Congestion Notification (ECN). Multipath-TCP is not yet supported in the ns-3 releases.

16.5.2.1. Model history

Until the ns-3.10 release, ns-3 contained a port of the TCP model from GTNetS, developed initially by George Riley and ported to ns-3 by Raj Bhattacharjea. This implementation was substantially rewritten by Adrian Tam for ns-3.10. In 2015, the TCP module was redesigned in order to create a better environment for creating and carrying out automated tests. One of the main changes involves congestion control algorithms, and how they are implemented.

Before the ns-3.25 release, a congestion control was considered as a stand-alone TCP through an inheritance relation: each congestion control (e.g. TcpNewReno) was a subclass of TcpSocketBase, reimplementing some inherited methods. The architecture was redone to avoid this inheritance, by making each congestion control a separate class, and defining an interface to exchange important data between TcpSocketBase and the congestion modules. The Linux tcp_congestion_ops interface was used as the design reference.

Along with congestion control, Fast Retransmit and Fast Recovery algorithms have been modified; in previous releases, these algorithms were delegated to TcpSocketBase subclasses. Starting from ns-3.25, they have been merged inside TcpSocketBase. In future releases, they can be extracted as separate modules, following the congestion control design.

As of the ns-3.31 release, the default initial window was set to 10 segments (in previous releases, it was set to 1 segment). This aligns with current Linux default, and is discussed further in RFC 6928.

In the ns-3.32 release, the default recovery algorithm was set to Proportional Rate Reduction (PRR) from the classic ack-clocked Fast Recovery algorithm.

In the ns-3.34 release, the default congestion control algorithm was set to CUBIC from NewReno.

CUBIC was extended to support Reno-friendliness (see RFC 9438 Section 4.3) in the ns-3.41 release. This feature is called ‘TCP friendliness’ in earlier versions of the CUBIC RFCs, and in the Linux and ns-3 implementations.

16.5.2.2. Acknowledgments

As mentioned above, ns-3 TCP has had multiple authors and maintainers over the years. Several publications exist on aspects of ns-3 TCP, and users of ns-3 TCP are requested to cite one of the applicable papers when publishing new work.

A general reference on the current architecture is found in the following paper:

For an academic peer-reviewed paper on the SACK implementation in ns-3, please refer to:

  • Natale Patriciello. 2017. A SACK-based Conservative Loss Recovery Algorithm for ns-3 TCP: a Linux-inspired Proposal. In Proceedings of the Workshop on ns-3 (WNS3 ‘17). ACM, New York, NY, USA, 1-8. (https://dl.acm.org/citation.cfm?id=3067666)

16.5.2.3. Usage

In many cases, usage of TCP is set at the application layer by telling the ns-3 application which kind of socket factory to use.

Using the helper functions defined in src/applications/helper and src/network/helper, here is how one would create a TCP receiver:

// Create a packet sink on the star "hub" to receive these packets
uint16_t port = 50000;
Address sinkLocalAddress(InetSocketAddress(Ipv4Address::GetAny(), port));
PacketSinkHelper sinkHelper("ns3::TcpSocketFactory", sinkLocalAddress);
ApplicationContainer sinkApp = sinkHelper.Install(serverNode);
sinkApp.Start(Seconds(1.0));
sinkApp.Stop(Seconds(10.0));

Similarly, the snippet below configures an OnOffApplication traffic source to use TCP:

// Create the OnOff applications to send TCP to the server
OnOffHelper clientHelper("ns3::TcpSocketFactory", Address());

The careful reader will note above that we have specified the TypeId of an abstract base class TcpSocketFactory. How does the script tell ns-3 that it wants the native ns-3 TCP vs. some other one? Well, when internet stacks are added to the node, the default TCP implementation that is aggregated to the node is the ns-3 TCP. So, by default, when using the ns-3 helper API, the TCP that is aggregated to nodes with an Internet stack is the native ns-3 TCP.

To configure behavior of TCP, a number of parameters are exported through the ns-3 attribute system. These are documented in the Doxygen for class TcpSocket. For example, the maximum segment size is a settable attribute.

To set the default socket type before any internet stack-related objects are created, one may put the following statement at the top of the simulation program:

Config::SetDefault("ns3::TcpL4Protocol::SocketType", StringValue("ns3::TcpNewReno"));

For users who wish to have a pointer to the actual socket (so that socket operations like Bind(), setting socket options, etc. can be done on a per-socket basis), TCP sockets can be created by using the Socket::CreateSocket() method. The TypeId passed to CreateSocket() must be of type ns3::SocketFactory, so configuring the underlying socket type must be done by changing the attribute associated with the underlying TcpL4Protocol object. The easiest way to get at this would be through the attribute configuration system. In the example below, the Node container “n0n1” is accessed to get the zeroth element, and a socket is created on this node:

// Create and bind the socket...
TypeId tid = TypeId::LookupByName("ns3::TcpNewReno");
Config::Set("/NodeList/*/$ns3::TcpL4Protocol/SocketType", TypeIdValue(tid));
Ptr<Socket> localSocket =
  Socket::CreateSocket(n0n1.Get(0), TcpSocketFactory::GetTypeId());

Above, the “*” wild card for node number is passed to the attribute configuration system, so that all future sockets on all nodes are set to NewReno, not just on node ‘n0n1.Get (0)’. If one wants to limit it to just the specified node, one would have to do something like:

// Create and bind the socket...
TypeId tid = TypeId::LookupByName("ns3::TcpNewReno");
std::stringstream nodeId;
nodeId << n0n1.Get(0)->GetId();
std::string specificNode = "/NodeList/" + nodeId.str() + "/$ns3::TcpL4Protocol/SocketType";
Config::Set(specificNode, TypeIdValue(tid));
Ptr<Socket> localSocket =
  Socket::CreateSocket(n0n1.Get(0), TcpSocketFactory::GetTypeId());
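The only difference between the two snippets above is the node selector placed in the attribute path. As a small standalone illustration, the helpers below build that path string for either the wildcard or a single node id; the function names SocketTypePath and SocketTypePathForNode are hypothetical and are not part of ns-3.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// nodeSelector is either "*" (all nodes) or a decimal node id such as "3".
inline std::string
SocketTypePath(const std::string& nodeSelector)
{
    assert(!nodeSelector.empty());
    return "/NodeList/" + nodeSelector + "/$ns3::TcpL4Protocol/SocketType";
}

// Convenience wrapper for a single node id, e.g. the value of Node::GetId().
inline std::string
SocketTypePathForNode(uint32_t nodeId)
{
    return SocketTypePath(std::to_string(nodeId));
}
```

In a simulation script, the resulting string would be passed as the first argument of Config::Set(), exactly as specificNode is in the example above.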

Once a TCP socket is created, one will want to follow conventional socket logic and either connect() and send() (for a TCP client) or bind(), listen(), and accept() (for a TCP server). Please note that applications usually create the sockets they use automatically, so it is not straightforward to connect directly to them using pointers. Please refer to the source code of your preferred application to discover how and when it creates the socket.

16.5.2.3.1. TCP Socket interaction and interface with Application layer

The following is an analysis of the public interface of the TCP socket, and of how it can be used to interact with the socket itself. An analysis of the callbacks fired by the socket is also carried out. Please note that, for the sake of clarity, we will use the terminology “Sender” and “Receiver” to clearly divide the functionality of the socket. However, in TCP these two roles can be applied at the same time (i.e. a socket could be a sender and a receiver at the same time): our distinction does not lose generality, since the following definition can be applied to both sockets in case of full-duplex mode.


TCP state machine (for commodity use)

_images/tcp-state-machine.png

TCP State machine

In ns-3 we are fully compliant with the state machine depicted in Figure TCP State machine.
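For readers who prefer code to diagrams, the connection setup part of the standard TCP state machine can be written as a small transition function. This is a standalone sketch: the enums TcpState and TcpEvent and the functions NextState and Walk are hypothetical names, only five states are covered, and socket forking on the listening side is not modeled.

```cpp
#include <cassert>
#include <initializer_list>

enum class TcpState
{
    CLOSED,
    LISTEN,
    SYN_SENT,
    SYN_RCVD,
    ESTABLISHED
};

enum class TcpEvent
{
    AppListen,
    AppConnect,
    RecvSyn,
    RecvSynAck,
    RecvAck
};

// Connection-setup subset of the TCP state machine. In this sketch, events
// that are not valid in the current state leave the state unchanged.
inline TcpState
NextState(TcpState state, TcpEvent event)
{
    switch (state)
    {
    case TcpState::CLOSED:
        if (event == TcpEvent::AppListen)
        {
            return TcpState::LISTEN;
        }
        if (event == TcpEvent::AppConnect)
        {
            return TcpState::SYN_SENT;
        }
        break;
    case TcpState::LISTEN:
        if (event == TcpEvent::RecvSyn)
        {
            return TcpState::SYN_RCVD;
        }
        break;
    case TcpState::SYN_SENT:
        if (event == TcpEvent::RecvSynAck)
        {
            return TcpState::ESTABLISHED;
        }
        break;
    case TcpState::SYN_RCVD:
        if (event == TcpEvent::RecvAck)
        {
            return TcpState::ESTABLISHED;
        }
        break;
    case TcpState::ESTABLISHED:
        break;
    }
    return state;
}

// Apply a sequence of events; asserts that every event causes a transition.
inline TcpState
Walk(TcpState state, std::initializer_list<TcpEvent> events)
{
    for (TcpEvent e : events)
    {
        const TcpState next = NextState(state, e);
        assert(next != state);
        state = next;
    }
    return state;
}
```

Walking the events AppConnect and RecvSynAck from CLOSED gives the active open path, while AppListen, RecvSyn and RecvAck give the passive open path; both end in ESTABLISHED.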


Public interface for receivers (e.g. servers receiving data)

Bind()

Bind the socket to an address, or to a general endpoint. A general endpoint is an endpoint with an ephemeral port allocation (that is, a random port allocation) on the 0.0.0.0 IP address. For instance, in current applications, data senders usually bind automatically after a Connect() over a random port. Consequently, the connection will start from this random port towards the well-defined port of the receiver. The IP 0.0.0.0 is then translated by lower layers into the real IP of the device.

Bind6()

Same as Bind(), but for IPv6.

BindToNetDevice()

Bind the socket to the specified NetDevice, creating a general endpoint.

Listen()

Listen on the endpoint for an incoming connection. Please note that this function can be called only in the TCP CLOSED state, and it moves the socket to the LISTEN state. When an incoming request for connection is detected (i.e. the other peer invoked Connect()) the application will be signaled with the callback NotifyConnectionRequest (set in SetAcceptCallback() beforehand). If the connection is accepted (the default behavior, when the associated callback is a null one) the Socket will fork itself, i.e. a new socket is created to handle the incoming data/connection, in the state SYN_RCVD. Please note that this newly created socket is no longer connected to the callbacks on the “father” socket (e.g. DataSent, Recv); the pointer of the newly created socket is provided in the Callback NotifyNewConnectionCreated (set beforehand in SetAcceptCallback()), and should be used to connect new callbacks to interesting events (e.g. Recv callback). After receiving the ACK of the SYN-ACK, the socket will set the congestion control, move into ESTABLISHED state, and then notify the application with NotifyNewConnectionCreated.

ShutdownSend()

Signal a termination of send, or in other words prevent data from being added to the buffer. After this call, if the buffer is already empty, the socket will send a FIN; otherwise the FIN will be sent when the buffer empties. Please note that this is useful only for modeling “Sink” applications. If you have data to transmit, please refer to the Send() / Close() combination of API.

GetRxAvailable()

Get the amount of data that could be returned by the Socket in one or multiple calls to Recv or RecvFrom. Please use the Attribute system to configure the maximum available space on the receiver buffer (property “RcvBufSize”).

Recv()

Grab data from the TCP socket. Please remember that TCP is a stream socket, and it is allowed to concatenate multiple packets into bigger ones. If no data is present (i.e. GetRxAvailable returns 0) an empty packet is returned. Set the callback RecvCallback through SetRecvCallback() in order to have the application automatically notified when some data is ready to be read. It’s important to connect that callback to the newly created socket in case of forks.

RecvFrom()

Same as Recv, but with the source address as parameter.


Public interface for senders (e.g. clients uploading data)

Connect()

Set the remote endpoint, and try to connect to it. The local endpoint should be set before this call, or otherwise an ephemeral one will be created. The TCP then will be in the SYN_SENT state. If a SYN-ACK is received, the TCP will set up the congestion control, and then call the callback ConnectionSucceeded.

GetTxAvailable()

Return the amount of data that can be stored in the TCP Tx buffer. Set this property through the Attribute system (“SndBufSize”).

Send()

Send the data into the TCP Tx buffer. From there, the TCP rules will decide if, and when, this data will be transmitted. Please note that, if the tx buffer has enough data to fill the congestion (or the receiver) window, dynamically varying the rate at which data is injected in the TCP buffer does not have any noticeable effect on the amount of data transmitted on the wire, that will continue to be decided by the TCP rules.
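The observation about injection rate follows from how much TCP is allowed to transmit at a given instant. A simplified standalone model is sketched below; the function name BytesSendableNow and its parameters are hypothetical, and real senders apply further rules (segment boundaries, pacing, Nagle's algorithm) that are left out here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Bytes TCP may put on the wire now: limited by the smaller of the congestion
// and receiver windows, by what is already in flight, and by buffered data.
inline uint32_t
BytesSendableNow(uint32_t cwnd, uint32_t rwnd, uint32_t bytesInFlight, uint32_t bytesBuffered)
{
    const uint32_t window = std::min(cwnd, rwnd);
    const uint32_t room = (window > bytesInFlight) ? window - bytesInFlight : 0;
    const uint32_t sendable = std::min(room, bytesBuffered);
    assert(sendable <= window && sendable <= bytesBuffered);
    return sendable;
}
```

Once bytesBuffered exceeds the room left in the window, the result no longer depends on it: adding more data to the Tx buffer, or adding it faster, leaves the amount sent unchanged until ACKs open the window.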

SendTo()

Same as Send().

Close()

Terminate the local side of the connection, by sending a FIN (after all data in the tx buffer has been transmitted). This does not prevent the socket from receiving data, or from employing the retransmit mechanism if losses are detected. If the application calls Close() with unread data in its rx buffer, the socket will send a reset. If the socket is in the state SYN_SENT, CLOSING, LISTEN, FIN_WAIT_2, or LAST_ACK, after that call the application will be notified with NotifyNormalClose(). In other cases, the notification is delayed (see NotifyNormalClose()).


Public callbacks

These callbacks are called by the TCP socket to notify the application of interesting events. We will refer to these with the protected name used in socket.h, but we will also provide the API function that sets the pointer to each callback.

NotifyConnectionSucceeded: SetConnectCallback, 1st argument

Called in the SYN_SENT state, before moving to ESTABLISHED. In other words, we have sent the SYN, and we received the SYN-ACK: the socket prepares the sequence numbers, sends the ACK for the SYN-ACK, tries to send out more data (in another segment) and then invokes this callback. After this callback, it invokes the NotifySend callback.

NotifyConnectionFailed: SetConnectCallback, 2nd argument

Called after the SYN retransmission count goes to 0: the SYN packet has been lost multiple times, and the socket gives up.

NotifyNormalClose: SetCloseCallbacks, 1st argument

A normal close is invoked. A rare case is when we receive an RST segment (or a segment with bad flags) in normal states. All other cases are:

  • The application tries to Connect() over an already connected socket

  • Received an ACK for the FIN sent, with or without the FIN bit set (we are in LAST_ACK)

  • The socket reaches the maximum amount of retries in retransmitting the SYN (*)

  • We receive a timeout in the LAST_ACK state

  • Upon entering the TIME_WAIT state, before waiting the 2*Maximum Segment Lifetime seconds to finally deallocate the socket.

NotifyErrorClose: SetCloseCallbacks, 2nd argument

Invoked when we send an RST segment (for whatever reason) or we reached the maximum amount of data retries.

NotifyConnectionRequest: SetAcceptCallback, 1st argument

Invoked in the LISTEN state, when we receive a SYN. The return value indicates if the socket should accept the connection (return true) or should ignore it (return false).

NotifyNewConnectionCreated: SetAcceptCallback, 2nd argument

Invoked when the socket passes from SYN_RCVD to ESTABLISHED, after setting up the congestion control and the sequence numbers, and after processing the incoming ACK. If there is some space in the buffer, NotifySend is called shortly after this callback. The Socket pointer, passed with this callback, is the newly created socket, after a Fork().

NotifyDataSent: SetDataSentCallback

The Socket notifies the application that some bytes have been transmitted on the IP level. These bytes could still be lost in the node (traffic control layer) or in the network.

NotifySend: SetSendCallback

Invoked if there is some space in the tx buffer when entering the ESTABLISHED state (e.g. after the ACK for SYN-ACK is received), after the connection succeeds (e.g. after the SYN-ACK is received) and after each new ACK (i.e. that advances SND.UNA).

NotifyDataRecv: SetRecvCallback

Called when there are in-order bytes in the receiver buffer, and when the socket receives an in-sequence FIN (which can carry data) in FIN_WAIT_1 or FIN_WAIT_2.

16.5.2.4. Congestion Control Algorithms

Here follows a list of supported TCP congestion control algorithms. For an academic paper on many of these congestion control algorithms, see http://dl.acm.org/citation.cfm?id=2756518 .

16.5.2.4.1. NewReno

The NewReno algorithm introduces partial ACKs into the well-established Reno algorithm. This and other modifications are described in RFC 6582. There are two possible congestion window increment strategies: slow start and congestion avoidance. Taken from RFC 5681:

During slow start, a TCP increments cwnd by at most SMSS bytes for each ACK received that cumulatively acknowledges new data. Slow start ends when cwnd exceeds ssthresh (or, optionally, when it reaches it, as noted above) or when congestion is observed. While traditionally TCP implementations have increased cwnd by precisely SMSS bytes upon receipt of an ACK covering new data, we RECOMMEND that TCP implementations increase cwnd, per Equation (1), where N is the number of previously unacknowledged bytes acknowledged in the incoming ACK.

(1) cwnd += min (N, SMSS)

During congestion avoidance, cwnd is incremented by roughly 1 full-sized segment per round-trip time (RTT), and for each congestion event, the slow start threshold is halved.
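The two increment strategies can be sketched as a single per-ACK update. This is a minimal illustration, not the ns-3 code: the function names are hypothetical, the congestion avoidance step uses the common SMSS*SMSS/cwnd approximation of "one segment per RTT", and the threshold floor of two segments follows RFC 5681 (with cwnd standing in for the flight size as a simplification).

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an ns-3 function): window update for one ACK, in bytes.
// ackedBytes is N in Equation (1); smss is the sender maximum segment size.
uint32_t
NewRenoOnAck(uint32_t cwnd, uint32_t ssthresh, uint32_t ackedBytes, uint32_t smss)
{
    assert(cwnd > 0 && smss > 0);
    if (cwnd < ssthresh)
    {
        // Slow start: cwnd += min(N, SMSS)
        return cwnd + std::min(ackedBytes, smss);
    }
    // Congestion avoidance: roughly one SMSS per RTT, i.e. SMSS*SMSS/cwnd per ACK,
    // and at least one byte so that the window always makes progress.
    uint64_t adder = static_cast<uint64_t>(smss) * smss / cwnd;
    return cwnd + static_cast<uint32_t>(std::max<uint64_t>(adder, 1));
}

// Hypothetical helper: on a congestion event the slow start threshold is halved,
// but never set below two segments (simplification: cwnd replaces FlightSize).
uint32_t
NewRenoSsThresh(uint32_t cwnd, uint32_t smss)
{
    return std::max(cwnd / 2, 2 * smss);
}
```

With SMSS = 1000 bytes, an ACK covering 3000 bytes in slow start still grows the window by only 1000 bytes, while in congestion avoidance at cwnd = 10000 bytes each ACK adds 100 bytes.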

16.5.2.4.2. CUBIC

CUBIC (class TcpCubic) is the default TCP congestion control in Linux, macOS (since 2014), and Microsoft Windows (since 2017). CUBIC has two main differences with respect to a more classic TCP congestion control such as NewReno. First, during the congestion avoidance phase, the window size grows according to a cubic function (concave, then convex) with the latter convex portion designed to allow for bandwidth probing. Second, a hybrid slow start (HyStart) algorithm uses observations of delay increases in the slow start phase of window growth to try to exit slow start before window growth causes queue overflow.

CUBIC is documented in RFC 9438. The ns-3 implementation is patterned partly on the Linux implementation and partly on the RFC, and the Linux 4.4 kernel implementation (run through the Direct Code Execution environment) has been used to validate its behavior.

16.5.2.4.3. Linux Reno

TCP Linux Reno (class TcpLinuxReno) is designed to provide a Linux-like implementation of TCP NewReno. The implementation of class TcpNewReno in ns-3 follows RFC standards, and increases cwnd more conservatively than does Linux Reno. Linux Reno modifies slow start and congestion avoidance algorithms to increase cwnd based on the number of bytes being acknowledged by each arriving ACK, rather than by the number of ACKs that arrive. Another major difference in implementation is that Linux maintains the congestion window in units of segments, while the RFCs define the congestion window in units of bytes.

In slow start phase, on each incoming ACK at the TCP sender side cwnd is increased by the number of previously unacknowledged bytes ACKed by the incoming acknowledgment. In contrast, in ns-3 NewReno, cwnd is increased by one segment per acknowledgment. In standards terminology, this difference is referred to as Appropriate Byte Counting (RFC 3465); Linux follows Appropriate Byte Counting while ns-3 NewReno does not.

Linux Reno:

(2) cwnd += segAcked * segmentSize

ns-3 NewReno:

(3) cwnd += segmentSize

In congestion avoidance phase, the number of bytes that have been ACKed at the TCP sender side are stored in a ‘bytes_acked’ variable in the TCP control block. When ‘bytes_acked’ becomes greater than or equal to the value of the cwnd, ‘bytes_acked’ is reduced by the value of cwnd. Next, cwnd is incremented by a full-sized segment (SMSS). In contrast, in ns-3 NewReno, cwnd is increased by (1/cwnd) with a rounding off due to type casting into int.

Linux Reno cwnd update:

if (m_cWndCnt >= w)
{
    uint32_t delta = m_cWndCnt / w;

    m_cWndCnt -= delta * w;
    tcb->m_cWnd += delta * tcb->m_segmentSize;
    NS_LOG_DEBUG("Subtracting delta * w from m_cWndCnt " << delta * w);
}

NewReno cwnd update:

if (segmentsAcked > 0)
{
    double adder = static_cast<double>(tcb->m_segmentSize * tcb->m_segmentSize) / tcb->m_cWnd.Get();
    adder = std::max(1.0, adder);
    tcb->m_cWnd += static_cast<uint32_t>(adder);
    NS_LOG_INFO("In CongAvoid, updated to cwnd " << tcb->m_cWnd <<
                " ssthresh " << tcb->m_ssThresh);
}

So, there are two main differences between TCP Linux Reno and TCP NewReno in ns-3:

  1. In TCP Linux Reno, the delayed acknowledgment configuration does not affect congestion window growth, while in TCP NewReno, delayed acknowledgments cause slower congestion window growth.

  2. In the congestion avoidance phase, the arithmetic for counting the number of segments acked and deciding when to increment the cwnd is different for TCP Linux Reno and TCP NewReno.
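The effect of delayed acknowledgments on slow start can be illustrated with a small standalone sketch. The function names are hypothetical and the loops simply apply the Linux Reno and NewReno slow start increments a fixed number of times; they do not model ssthresh or losses.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: Linux Reno slow start, cwnd grows by the bytes each ACK covers.
uint32_t
LinuxRenoSlowStart(uint32_t cwnd, uint32_t segmentSize, uint32_t acks, uint32_t segPerAck)
{
    assert(segmentSize > 0);
    for (uint32_t i = 0; i < acks; ++i)
    {
        cwnd += segPerAck * segmentSize;
    }
    return cwnd;
}

// Hypothetical helper: ns-3 NewReno slow start, cwnd grows by one segment per ACK,
// regardless of how many segments that ACK covers.
uint32_t
NewRenoSlowStart(uint32_t cwnd, uint32_t segmentSize, uint32_t acks)
{
    assert(segmentSize > 0);
    for (uint32_t i = 0; i < acks; ++i)
    {
        cwnd += segmentSize;
    }
    return cwnd;
}
```

Starting from 10000 bytes with 1000-byte segments, five delayed ACKs that each cover two segments take Linux Reno to 20000 bytes but NewReno only to 15000 bytes.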

The following graph shows the behavior of window growth in TCP Linux Reno and TCP NewReno with delayed acknowledgment of 2 segments:

[Figure: ns-3 TCP NewReno vs. ns-3 TCP Linux Reno]

16.5.2.4.4. HighSpeed

TCP HighSpeed is designed for high-capacity channels or, in general, for TCP connections with large congestion windows. Conceptually, with respect to the standard TCP, HighSpeed makes the cWnd grow faster during the probing phases and accelerates the cWnd recovery from losses. This behavior is executed only when the window grows beyond a certain threshold, which allows TCP HighSpeed to be friendly with standard TCP in environments with heavy congestion, without introducing new dangers of congestion collapse.

Mathematically:

(4) cWnd = cWnd + \frac{a(cWnd)}{cWnd}

The function a() is calculated using a fixed RTT value of 100 ms (the lookup table for this function is taken from RFC 3649). For each congestion event, the slow start threshold is decreased by a value that depends on the size of the slow start threshold itself. Then, the congestion window is set to that value.

(5) cWnd = (1 - b(cWnd)) \cdot cWnd

The lookup table for the function b() is taken from the same RFC. More information at: http://dl.acm.org/citation.cfm?id=2756518

16.5.2.4.5. Hybla

The key idea behind TCP Hybla is to obtain for long RTT connections the same instantaneous transmission rate as a reference TCP connection with lower RTT. With analytical steps, it is shown that this goal can be achieved by modifying the time scale, so that the throughput is independent of the RTT. This independence is obtained through the use of a coefficient rho.

This coefficient is used to calculate both the slow start threshold and the congestion window when in slow start and in congestion avoidance, respectively.

More information at: http://dl.acm.org/citation.cfm?id=2756518

16.5.2.4.6. Westwood

Westwood and Westwood+ employ the AIAD (Additive Increase/Adaptive Decrease) congestion control paradigm. When a congestion episode happens, instead of halving the cwnd, these protocols try to estimate the network’s bandwidth and use the estimated value to adjust the cwnd. While Westwood performs the bandwidth sampling every ACK reception, Westwood+ samples the bandwidth every RTT.

The TCP Westwood model has been removed in ns-3.38 due to bugs that are impossible to fix without modifying the original Westwood model as presented in the published papers. For further info refer to https://gitlab.com/nsnam/ns-3-dev/-/issues/579

The Westwood+ model does not have such issues, and is still available.

WARNING: this TCP model lacks validation and regression tests; use with caution.

More information at: http://dl.acm.org/citation.cfm?id=381704 and http://dl.acm.org/citation.cfm?id=2512757

16.5.2.4.7. Vegas

TCP Vegas is a pure delay-based congestion control algorithm implementing a proactive scheme that tries to prevent packet drops by maintaining a small backlog at the bottleneck queue. Vegas continuously samples the RTT and computes the actual throughput a connection achieves using Equation (6) and compares it with the expected throughput calculated in Equation (7). The difference between these 2 sending rates in Equation (8) reflects the amount of extra packets being queued at the bottleneck.

(6) actual = \frac{cWnd}{RTT}

(7) expected = \frac{cWnd}{BaseRTT}

(8) diff = expected - actual

To avoid congestion, Vegas linearly increases/decreases its congestion window to ensure the diff value falls between the two predefined thresholds, alpha and beta. diff and another threshold, gamma, are used to determine when Vegas should change from its slow-start mode to linear increase/decrease mode. Following the implementation of Vegas in Linux, we use 2, 4, and 1 as the default values of alpha, beta, and gamma, respectively, but they can be modified through the Attribute system.
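A minimal sketch of the linear increase/decrease decision follows. It is not the ns-3 implementation: the function name is hypothetical, the slow-start/gamma logic is omitted, and the rate difference of Equation (8) is multiplied by BaseRTT so that it can be compared with thresholds expressed in segments (an assumption of this sketch).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: one Vegas adjustment of cwnd (in segments) per RTT.
// alpha and beta default to the values quoted in the text (2 and 4).
uint32_t
VegasAdjust(uint32_t cwndSeg, double rtt, double baseRtt, uint32_t alpha = 2, uint32_t beta = 4)
{
    assert(cwndSeg > 1 && baseRtt > 0.0 && rtt >= baseRtt);
    double expected = cwndSeg / baseRtt; // Equation (7)
    double actual = cwndSeg / rtt;       // Equation (6)
    // Equation (8), scaled by BaseRTT to obtain the number of queued segments.
    double diffSegments = (expected - actual) * baseRtt;
    if (diffSegments > beta)
    {
        return cwndSeg - 1; // too much backlog: back off linearly
    }
    if (diffSegments < alpha)
    {
        return cwndSeg + 1; // too little backlog: probe linearly
    }
    return cwndSeg; // backlog between alpha and beta: hold
}
```

For example, with BaseRTT = 100 ms and cwnd = 20 segments, an RTT of 200 ms corresponds to a backlog of about 10 segments, so the window is reduced by one segment.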

More information at: http://dx.doi.org/10.1109/49.464716

16.5.2.4.8. Scalable

Scalable improves TCP performance to better utilize the available bandwidth of a high-speed wide area network by altering the NewReno congestion window adjustment algorithm. When congestion has not been detected, for each ACK received in an RTT, Scalable increases its cwnd per:

(9) cwnd = cwnd + 0.01

Following the Linux implementation of Scalable, we use 50 instead of 100 (i.e., an increment of 1/50 rather than 1/100 = 0.01 per ACK) to account for delayed ACKs.

On the first detection of congestion in a given RTT, cwnd is reduced based on the following equation:

(10) cwnd = cwnd - ceil(0.125 \cdot cwnd)
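The increase and decrease rules can be sketched with integer segment arithmetic. The names below are hypothetical and the ACK counter is an assumption of this sketch: instead of adding a fraction of a segment per ACK, it adds one whole segment after aiFactor ACKs (50 by default, matching the value quoted above).

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical state: congestion window in segments plus a counter of ACKs seen.
struct ScalableState
{
    uint32_t cwndSeg;
    uint32_t ackCnt;
};

// Additive increase: one segment for every aiFactor ACKs (1/aiFactor per ACK).
void
ScalableOnAck(ScalableState& s, uint32_t aiFactor = 50)
{
    assert(aiFactor > 0);
    if (++s.ackCnt >= aiFactor)
    {
        s.ackCnt = 0;
        ++s.cwndSeg;
    }
}

// Multiplicative decrease on the first congestion event in an RTT, Equation (10).
uint32_t
ScalableOnLoss(uint32_t cwndSeg)
{
    return cwndSeg - static_cast<uint32_t>(std::ceil(0.125 * cwndSeg));
}
```

A window of 100 segments therefore becomes 101 after 50 ACKs, and drops to 87 segments on a congestion event.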

More information at: http://dl.acm.org/citation.cfm?id=956989

16.5.2.4.9. Veno

TCP Veno enhances the Reno algorithm to deal more effectively with random packet loss in wireless access networks. It employs Vegas’s method of estimating the backlog at the bottleneck queue to distinguish between congestive and non-congestive states.

The backlog (the number of packets accumulated at the bottleneck queue) is calculated using Equation (11):

(11) N = Actual \cdot (RTT - BaseRTT) = Diff \cdot BaseRTT

where:

(12) Diff = Expected - Actual = \frac{cWnd}{BaseRTT} - \frac{cWnd}{RTT}

Veno decides how to modify cwnd based on the calculated N and its predefined threshold beta.

Specifically, it refines the additive increase algorithm of Reno so that the connection can stay longer in the stable state by incrementing cwnd by 1/cwnd for every other new ACK received after the available bandwidth has been fully utilized, i.e. when N exceeds beta. Otherwise, Veno increases its cwnd by 1/cwnd upon every new ACK receipt as in Reno.

In the multiplicative decrease algorithm, when Veno is in the non-congestive state, i.e. when N is less than beta, Veno decrements its cwnd by only 1/5 because the loss encountered is more likely a corruption-based loss than a congestion-based one. Only when N is greater than beta does Veno halve its sending rate as in Reno.
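The backlog estimate and the loss response can be sketched as follows. The function names are hypothetical, beta is passed in explicitly rather than assuming a default, and the case N == beta (which the text leaves open) is treated as congestive in this sketch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: backlog N = Actual * (RTT - BaseRTT), with Actual = cwnd / RTT.
double
VenoBacklog(double cwndSeg, double rtt, double baseRtt)
{
    assert(baseRtt > 0.0 && rtt >= baseRtt);
    return cwndSeg / rtt * (rtt - baseRtt);
}

// Hypothetical helper: window (in segments) after a loss.
uint32_t
VenoOnLoss(uint32_t cwndSeg, double backlog, double beta)
{
    if (backlog < beta)
    {
        // Non-congestive state: the loss is probably random, reduce by only 1/5.
        return cwndSeg * 4 / 5;
    }
    // Congestive state (N >= beta in this sketch): halve as in Reno.
    return cwndSeg / 2;
}
```

With cwnd = 20 segments, RTT = 0.5 s and BaseRTT = 0.25 s, the backlog is 10 segments; a loss at that point halves the window, whereas a loss with a backlog of 2 segments (and beta = 3) only reduces it to 16.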

More information at: http://dx.doi.org/10.1109/JSAC.2002.807336

16.5.2.4.10. BIC

BIC (class TcpBic) is a predecessor of TCP CUBIC. In TCP BIC the congestion control problem is viewed as a search problem. Taking as a starting point the current window value and as a target point the last maximum window value (i.e. the cWnd value just before the loss event) a binary search technique can be used to update the cWnd value at the midpoint between the two, directly or using an additive increase strategy if the distance from the current window is too large.

This way, assuming a no-loss period, the congestion window logarithmically approaches the maximum value of cWnd until the difference between it and cWnd falls below a preset threshold. After reaching such a value (or if the maximum window is unknown, i.e. the binary search does not start at all) the algorithm switches to probing the new maximum window with a ‘slow start’ strategy.

If a loss occurs in either of these phases, the current window (before the loss) can be treated as the new maximum, and the reduced (with a multiplicative decrease factor Beta) window size can be used as the new minimum.

More information at: http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=1354672

16.5.2.4.11. YeAH

YeAH-TCP (Yet Another HighSpeed TCP) is a heuristic designed to balance various requirements of a state-of-the-art congestion control algorithm:

  1. fully exploit the link capacity of high BDP networks while inducing a small number of congestion events

  2. compete friendly with Reno flows

  3. achieve intra and RTT fairness

  4. be robust to random losses

  5. achieve high performance regardless of buffer size

YeAH operates between 2 modes: Fast and Slow mode. In the Fast mode when the queue occupancy is small and the network congestion level is low, YeAH increments its congestion window according to the aggressive HSTCP rule. When the number of packets in the queue grows beyond a threshold and the network congestion level is high, YeAH enters its Slow mode, acting as Reno with a decongestion algorithm. YeAH employs Vegas’ mechanism for calculating the backlog as in Equation (13). The estimation of the network congestion level is shown in Equation (14).

(13) Q = (RTT - BaseRTT) \cdot \frac{cWnd}{RTT}

(14) L = \frac{RTT - BaseRTT}{BaseRTT}

To ensure TCP friendliness, YeAH also implements an algorithm to detect the presence of legacy Reno flows. Upon the receipt of 3 duplicate ACKs, YeAH decreases its slow start threshold according to Equation (15) if it’s not competing with Reno flows. Otherwise, the ssthresh is halved as in Reno:

(15) ssthresh = min(max(\frac{cWnd}{8}, Q), \frac{cWnd}{2})
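The backlog and congestion-level estimates of Equations (13) and (14), and the resulting mode decision, can be sketched as follows. The function names and the two threshold parameters (qMax, lMax) are hypothetical; the text only states that Fast mode applies when both the backlog and the congestion level are low.

```cpp
#include <cassert>

// Hypothetical helper: queue backlog Q in segments, Equation (13).
double
YeahQueue(double cwndSeg, double rtt, double baseRtt)
{
    assert(baseRtt > 0.0 && rtt >= baseRtt);
    return (rtt - baseRtt) * cwndSeg / rtt;
}

// Hypothetical helper: network congestion level L, Equation (14).
double
YeahCongestionLevel(double rtt, double baseRtt)
{
    assert(baseRtt > 0.0 && rtt >= baseRtt);
    return (rtt - baseRtt) / baseRtt;
}

// Hypothetical helper: Fast mode while both estimates stay below their
// thresholds; otherwise YeAH is in Slow mode. qMax and lMax are assumed inputs.
bool
YeahIsFastMode(double q, double l, double qMax, double lMax)
{
    return q < qMax && l < lMax;
}
```

For instance, with cwnd = 40 segments, RTT = 0.5 s and BaseRTT = 0.25 s, Q is 20 segments and L is 1.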

More information: http://www.csc.lsu.edu/~sjpark/cs7601/4-YeAH_TCP.pdf

16.5.2.4.12. Illinois

TCP Illinois is a hybrid congestion control algorithm designed for high-speed networks. Illinois implements a Concave-AIMD (or C-AIMD) algorithm that uses packet loss as the primary congestion signal to determine the direction of window update and queueing delay as the secondary congestion signal to determine the amount of change.

The additive increase and multiplicative decrease factors (denoted as alpha and beta, respectively) are functions of the current average queueing delay da as shown in Equations (16) and (17). To improve the protocol robustness against sudden fluctuations in its delay sampling, Illinois allows the increment of alpha to alphaMax only if da stays below d1 for some amount of time (theta).

(16) alpha =
\begin{cases}
   alphaMax          & \text{if } da \le d1 \\
   k1 / (k2 + da)    & \text{otherwise}
\end{cases}

(17) beta =
\begin{cases}
   betaMin           & \text{if } da \le d2 \\
   k3 + k4 \, da     & \text{if } d2 < da < d3 \\
   betaMax           & \text{otherwise}
\end{cases}

where the calculations of k1, k2, k3, and k4 are shown in the following:

(18) k1 = \frac{(dm - d1) \cdot alphaMin \cdot alphaMax}{alphaMax - alphaMin}

(19) k2 = \frac{(dm - d1) \cdot alphaMin}{alphaMax - alphaMin} - d1

(20) k3 = \frac{betaMin \cdot d3 - betaMax \cdot d2}{d3 - d2}

(21) k4 = \frac{betaMax - betaMin}{d3 - d2}

Other parameters include da (the current average queueing delay), Ta (the average RTT, calculated as sumRtt / cntRtt in the implementation), and Tmin (baseRtt in the implementation), which is the minimum RTT ever seen. dm is the maximum (average) queueing delay, and Tmax (maxRtt in the implementation) is the maximum RTT ever seen.

(22) da = Ta - Tmin

(23) dm = Tmax - Tmin

(24) d_i = eta_i \cdot dm

Illinois only executes its adaptation of alpha and beta when cwnd exceeds a threshold called winThresh. Otherwise, it sets alpha and beta to the base values of 1 and 0.5, respectively.

Following the implementation of Illinois in the Linux kernel, we use the following default parameter settings:

  • alphaMin = 0.3 (0.1 in the Illinois paper)

  • alphaMax = 10.0

  • betaMin = 0.125

  • betaMax = 0.5

  • winThresh = 15 (10 in the Illinois paper)

  • theta = 5

  • eta1 = 0.01

  • eta2 = 0.1

  • eta3 = 0.8
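The alpha and beta curves can be sketched with the default parameters listed above. This is a standalone illustration with hypothetical names, not the ns-3 code: it ignores winThresh and the theta condition, and it writes k3 and k4 in terms of betaMin and betaMax so that beta is continuous at d2 and d3 (an assumption of this sketch).

```cpp
#include <cassert>
#include <cmath>

// Hypothetical parameter bundle, initialized with the defaults listed above.
struct IllinoisParams
{
    double alphaMin = 0.3;
    double alphaMax = 10.0;
    double betaMin = 0.125;
    double betaMax = 0.5;
    double eta1 = 0.01;
    double eta2 = 0.1;
    double eta3 = 0.8;
};

// Additive increase factor as a function of the average queueing delay da.
double
IllinoisAlpha(double da, double dm, const IllinoisParams& p = IllinoisParams())
{
    assert(dm > 0.0 && da >= 0.0);
    double d1 = p.eta1 * dm;
    if (da <= d1)
    {
        return p.alphaMax;
    }
    double k1 = (dm - d1) * p.alphaMin * p.alphaMax / (p.alphaMax - p.alphaMin);
    double k2 = (dm - d1) * p.alphaMin / (p.alphaMax - p.alphaMin) - d1;
    return k1 / (k2 + da);
}

// Multiplicative decrease factor as a function of the average queueing delay da.
double
IllinoisBeta(double da, double dm, const IllinoisParams& p = IllinoisParams())
{
    assert(dm > 0.0 && da >= 0.0);
    double d2 = p.eta2 * dm;
    double d3 = p.eta3 * dm;
    if (da <= d2)
    {
        return p.betaMin;
    }
    if (da >= d3)
    {
        return p.betaMax;
    }
    double k3 = (p.betaMin * d3 - p.betaMax * d2) / (d3 - d2);
    double k4 = (p.betaMax - p.betaMin) / (d3 - d2);
    return k3 + k4 * da;
}
```

With these definitions alpha falls from alphaMax at low delay to alphaMin when da reaches dm, and beta rises linearly from betaMin to betaMax between d2 and d3.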

More information: http://www.doi.org/10.1145/1190095.1190166

16.5.2.4.13. H-TCP

H-TCP has been designed for high BDP (Bandwidth-Delay Product) paths. It is a dual mode protocol. In normal conditions, it works like traditional TCP with the same rate of increment and decrement for the congestion window. However, in high BDP networks, when it finds no congestion on the path after deltal seconds, it increases the window size based on the alpha function in the following:

(25)alpha(delta)=1+10(delta-deltal)+0.5(delta-deltal)^2

where deltal is a threshold in seconds for switching between the modes and delta is the elapsed time from the last congestion. During congestion, it reduces the window size by multiplying by beta function provided in the reference paper. The calculated throughput between the last two consecutive congestion events is considered for beta calculation.

The transport TcpHtcp can be selected in the program examples/tcp/tcp-variants-comparison.cc to perform an experiment with H-TCP, although it is useful to increase the bandwidth in this example (e.g. to 20 Mb/s) to create a higher BDP link, such as:

./ns3 run "tcp-variants-comparison --transport_prot=TcpHtcp --bandwidth=20Mbps --duration=10"

More information (paper): http://www.hamilton.ie/net/htcp3.pdf

More information (Internet Draft): https://tools.ietf.org/html/draft-leith-tcp-htcp-06

16.5.2.4.14. LEDBAT

Low Extra Delay Background Transport (LEDBAT) is an experimental delay-based congestion control algorithm that seeks to utilize the available bandwidth on an end-to-end path while limiting the consequent increase in queueing delay on that path. LEDBAT uses changes in one-way delay measurements to limit congestion that the flow itself induces in the network.

As a first approximation, the LEDBAT sender operates as shown below:

On receipt of an ACK:

currentdelay = acknowledgement.delay;
basedelay = min(basedelay, currentdelay);
queuingdelay = currentdelay - basedelay;
offtarget = (TARGET - queuingdelay) / TARGET;
cWnd += GAIN * offtarget * bytesnewlyacked * MSS / cWnd;

TARGET is the maximum queueing delay that LEDBAT itself may introduce in the network, and GAIN determines the rate at which the cwnd responds to changes in queueing delay; offtarget is a normalized value representing the difference between the measured current queueing delay and the predetermined TARGET delay. offtarget can be positive or negative; consequently, cwnd increases or decreases in proportion to offtarget.
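The first-approximation sender logic can be written as a small standalone function. The names are hypothetical and the sketch mirrors only the five pseudocode steps: it does not include the base-delay history, noise filtering, or slow start behavior of a full implementation. Delays may be in any unit as long as target uses the same one.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical state: minimum delay seen so far and congestion window in bytes.
struct LedbatState
{
    double baseDelay;
    double cwnd;
};

// Hypothetical helper: apply one ACK to the state.
void
LedbatOnAck(LedbatState& s,
            double currentDelay,
            double bytesNewlyAcked,
            double mss,
            double target,
            double gain)
{
    assert(target > 0.0 && s.cwnd > 0.0);
    s.baseDelay = std::min(s.baseDelay, currentDelay);
    double queuingDelay = currentDelay - s.baseDelay;
    // Positive below the target delay, negative above it.
    double offTarget = (target - queuingDelay) / target;
    s.cwnd += gain * offTarget * bytesNewlyAcked * mss / s.cwnd;
}
```

For example, with a base delay of 20 ms, a target of 100 ms and a measured delay of 70 ms, offtarget is 0.5, so an ACK for one 1000-byte segment grows a 10000-byte window by 50 bytes; a measured delay of 220 ms shrinks it by 100 bytes instead.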

Following the recommendation of RFC 6817, the default values of the parameters are:

  • TargetDelay = 100 ms

  • baseHistoryLen = 10

  • noiseFilterLen = 4

  • Gain = 1

To enable LEDBAT on all TCP sockets, the following configuration can be used:

Config::SetDefault("ns3::TcpL4Protocol::SocketType", TypeIdValue(TcpLedbat::GetTypeId()));

To enable LEDBAT on a chosen TCP socket, the following configuration can be used:

Config::Set("$ns3::NodeListPriv/NodeList/1/$ns3::TcpL4Protocol/SocketType", TypeIdValue(TcpLedbat::GetTypeId()));

The following unit tests have been written to validate the implementation of LEDBAT:

  • LEDBAT should operate the same as NewReno during slow start

  • LEDBAT should operate the same as NewReno if timestamps are disabled

  • Test to validate cwnd increment in LEDBAT

In comparison to RFC 6817, the scope and limitations of the current LEDBAT implementation are:

  • It assumes that the clocks on the sender side and receiver side are synchronised

  • In line with Linux implementation, the one-way delay is calculated at the sender side by using the timestamps option in TCP header

  • Only the MIN function is used for noise filtering

More information about LEDBAT is available in RFC 6817: https://tools.ietf.org/html/rfc6817

16.5.2.4.15. TCP-LP

TCP-Low Priority (TCP-LP) is a delay-based congestion control protocol in which the low priority data utilizes only the excess bandwidth available on an end-to-end path. TCP-LP uses one-way delay measurements as an indicator of congestion because one-way delay is not affected by cross-traffic in the reverse direction.

On receipt of an ACK:

one way delay = receiver timestamp - receiver timestamp echo reply;
smoothed one way delay = 7/8 * old smoothed one way delay + 1/8 * one way delay;
if (smoothed one way delay > owdMin + 15 * (owdMax - owdMin) / 100)
{
    if (LP_WITHIN_INF)
    {
        cwnd = 1;
    }
    else
    {
        cwnd = cwnd / 2;
    }
    inference timer is set;
}

where owdMin and owdMax are the minimum and maximum one-way delays experienced throughout the connection, and LP_WITHIN_INF indicates whether TCP-LP is in the inference phase.
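The per-ACK logic can be sketched as a standalone function. The names are hypothetical, and several simplifications are assumptions of this sketch: the inference timer is modeled as a flag that is set but never expires, owdMin and owdMax are updated before the threshold test, and the halved window is floored at one segment.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical state for one TCP-LP connection.
struct TcpLpState
{
    double sowd;          // smoothed one-way delay
    double owdMin;        // minimum one-way delay seen so far
    double owdMax;        // maximum one-way delay seen so far
    bool withinInference; // LP_WITHIN_INF
    uint32_t cwndSeg;     // congestion window in segments
};

// Hypothetical helper: process one ACK carrying the one-way delay sample owd.
void
TcpLpOnAck(TcpLpState& s, double owd)
{
    assert(owd >= 0.0);
    s.owdMin = std::min(s.owdMin, owd);
    s.owdMax = std::max(s.owdMax, owd);
    s.sowd = 7.0 / 8.0 * s.sowd + 1.0 / 8.0 * owd;
    double threshold = s.owdMin + 15.0 * (s.owdMax - s.owdMin) / 100.0;
    if (s.sowd > threshold)
    {
        // Early congestion indication: collapse to one segment if already in the
        // inference phase, otherwise halve the window.
        s.cwndSeg = s.withinInference ? 1 : std::max<uint32_t>(1, s.cwndSeg / 2);
        s.withinInference = true; // stands in for "inference timer is set"
    }
}
```

With owdMin = 0 and owdMax = 100, the threshold is 15: a smoothed delay of 80 first halves a 10-segment window to 5, and a second such indication during the inference phase reduces it to 1.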

More information (paper): http://cs.northwestern.edu/~akuzma/rice/doc/TCP-LP.pdf

16.5.2.4.16. Data Center TCP (DCTCP)

DCTCP, specified in RFC 8257 and implemented in Linux, is a TCP congestion control algorithm for data center networks. It leverages Explicit Congestion Notification (ECN) to provide more fine-grained congestion feedback to the end hosts, and is intended to work with routers that implement a shallow congestion marking threshold (on the order of a few milliseconds) to achieve high throughput and low latency in the datacenter. However, because DCTCP does not react in the same way to notification of congestion experienced, there are coexistence (fairness) issues between it and legacy TCP congestion controllers, which is why it is recommended to only be used in controlled networking environments such as within data centers.

DCTCP extends the Explicit Congestion Notification signal to estimate the fraction of bytes that encounter congestion, rather than simply detecting that the congestion has occurred. DCTCP then scales the congestion window based on this estimate. This approach achieves high burst tolerance, low latency, and high throughput with shallow-buffered switches.

  • Receiver functionality: If CE is observed in the IP header of an incoming packet at the TCP receiver, the receiver sends congestion notification to the sender by setting ECE in TCP header. This processing is different from standard receiver ECN processing which sets and holds the ECE bit for every ACK until it observes a CWR signal from the TCP sender.

  • Sender functionality: The sender makes use of the modified receiver ECE semantics to maintain an estimate of the fraction of packets marked (\alpha) by using the exponential weighted moving average (EWMA) as shown below:

\alpha = (1 - g) * \alpha + g * F

In the above EWMA:

  • g is the estimation gain (between 0 and 1)

  • F is the fraction of packets marked in current RTT.

For send windows in which at least one ACK was received with ECE set, the sender should respond by reducing the congestion window as follows, once for every window of data:

cwnd = cwnd * (1 - \alpha / 2)

Following the recommendation of RFC 8257, the default values of the parameters are:

g = 0.0625 (i.e., 1/16)

initial alpha (\alpha) = 1
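The sender-side estimator and window reduction can be sketched in a few lines. The function names are hypothetical; the sketch only evaluates the two formulas above and does not model when a window of data ends or how F is measured.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: EWMA update of alpha from the marked fraction F of the
// last window; g defaults to 1/16 as recommended by RFC 8257.
double
DctcpUpdateAlpha(double alpha, double markedFraction, double g = 0.0625)
{
    assert(g > 0.0 && g < 1.0);
    assert(markedFraction >= 0.0 && markedFraction <= 1.0);
    return (1.0 - g) * alpha + g * markedFraction;
}

// Hypothetical helper: window reduction applied once per window of data in
// which at least one ACK carried ECE.
uint32_t
DctcpReduceCwnd(uint32_t cwnd, double alpha)
{
    assert(alpha >= 0.0 && alpha <= 1.0);
    return static_cast<uint32_t>(cwnd * (1.0 - alpha / 2.0));
}
```

Starting from the initial alpha of 1, a window with no marks lowers alpha to 0.9375. An alpha of 1 halves the window as in classic ECN, while an alpha of 0.5 reduces it by only a quarter.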

To enable DCTCP on all TCP sockets, the following configuration can be used:

Config::SetDefault("ns3::TcpL4Protocol::SocketType", TypeIdValue(TcpDctcp::GetTypeId()));

To enable DCTCP on a selected node, one can set the “SocketType” attribute on the TcpL4Protocol object of that node to the TcpDctcp TypeId.

ECN is enabled automatically when DCTCP is used, even if the user has not explicitly enabled it.

DCTCP depends on a simple queue management algorithm in routers / switches to mark packets. The current implementation of DCTCP in ns-3 can use RED with a simple configuration to achieve the desired queue management behavior.

To configure RED router for DCTCP:

Config::SetDefault("ns3::RedQueueDisc::UseEcn", BooleanValue(true));
Config::SetDefault("ns3::RedQueueDisc::QW", DoubleValue(1.0));
Config::SetDefault("ns3::RedQueueDisc::MinTh", DoubleValue(16));
Config::SetDefault("ns3::RedQueueDisc::MaxTh", DoubleValue(16));

There is also the option, when running CoDel or FqCoDel, to enable ECN on the queue and to set the “CeThreshold” value to a low value such as 1ms. The following example uses CoDel:

Config::SetDefault("ns3::CoDelQueueDisc::UseEcn", BooleanValue(true));
Config::SetDefault("ns3::CoDelQueueDisc::CeThreshold", TimeValue(MilliSeconds(1)));

The following unit tests have been written to validate the implementation of DCTCP:

  • ECT flags should be set for SYN, SYN+ACK, ACK and data packets for DCTCP traffic

  • ECT flags should not be set for SYN, SYN+ACK and pure ACK packets, but should be set on data packets for ECN enabled traditional TCP flows

  • ECE should be set only when CE flags are received at receiver and even if sender doesn’t send CWR, receiver should not send ECE if it doesn’t receive packets with CE flags

An example program, examples/tcp/tcp-validation.cc, can be used to experiment with DCTCP for long-running flows with different bottleneck link bandwidth, base RTTs, and queuing disciplines. A variant of this program has also been run using the ns-3 Direct Code Execution environment using DCTCP from Linux kernel 4.4, and the results were compared against ns-3 results.

An example program based on an experimental topology found in the original DCTCP SIGCOMM paper is provided in examples/tcp/dctcp-example.cc. This example uses a simple topology consisting of forty DCTCP senders and receivers and two ECN-enabled switches to examine throughput, fairness, and queue delay properties of the network.

This implementation was tested extensively against a version of DCTCP in the Linux kernel version 4.4 using the ns-3 direct code execution (DCE) environment. Some differences were noted:

  • Linux maintains its congestion window in segments and not bytes, and the arithmetic is not floating point, so small differences in the evolution of congestion window have been observed.

  • Linux uses pacing, where packets to be sent are paced out at regular intervals. However, if at any instant the number of segments that can be sent are less than two, Linux does not pace them and instead sends them back-to-back. Currently, ns-3 paces out all packets eligible to be sent in the same manner.

It is important to also note that the current model does not implement Section 3.5 of RFC 8257 regarding the handling of packet loss. This requirement states that DCTCP must react to lost packets in the same way as does a conventional TCP (as specified in RFC 5681). The current DCTCP model does not implement this, and should therefore only be used in simulations that do not involve any packet loss on the DCTCP flows.

More information about DCTCP is available in the RFC 8257: https://tools.ietf.org/html/rfc8257

16.5.2.4.17. BBR

BBR (class TcpBbr) is a congestion control algorithm that regulates the sending rate by deriving an estimate of the bottleneck’s available bandwidth and the RTT of the path. It seeks to operate at an optimal point where the sender experiences maximum delivery rate with minimum RTT. It creates a network model comprising the maximum delivery rate and the minimum RTT observed so far, and then estimates the BDP (maximum bandwidth * minimum RTT) to control the maximum amount of inflight data. BBR controls congestion by limiting the rate at which packets are sent. It caps the cwnd to one BDP and paces out packets at a rate which is adjusted based on the latest estimate of delivery rate. The BBR algorithm is agnostic to packet losses and ECN marks.

pacing_gain controls the rate of sending data and cwnd_gain controls the amount of data to send.

The following is a high level overview of BBR congestion control algorithm:

On receiving an ACK:

rtt = now - packet.sent_time;
update_minimum_rtt(rtt);
delivery_rate = estimate_delivery_rate(packet);
update_maximum_bandwidth(delivery_rate);

After transmitting a data packet:

bdp = max_bandwidth * min_rtt;
if (cwnd_gain * bdp < inflight)
{
    return;
}
if (now > nextSendTime)
{
    transmit(packet);
    nextSendTime = now + packet.size / (pacing_gain * max_bandwidth);
}
else
{
    return;
}
Schedule(nextSendTime, Send);
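The arithmetic behind the send decision can be isolated in a standalone sketch. The names are hypothetical, bandwidth is taken in bytes per second and times in seconds, and the inflight cap uses cwnd_gain (consistent with the description of cwnd_gain above); scheduling and the model updates on ACK are not included.

```cpp
#include <cassert>

// Hypothetical network model: the two estimates BBR maintains.
struct BbrModel
{
    double maxBandwidth; // bytes per second
    double minRtt;       // seconds
};

// Bandwidth-delay product in bytes.
double
BbrBdp(const BbrModel& m)
{
    return m.maxBandwidth * m.minRtt;
}

// True while the amount of inflight data is below cwnd_gain * BDP.
bool
BbrMaySend(const BbrModel& m, double inflightBytes, double cwndGain)
{
    return inflightBytes < cwndGain * BbrBdp(m);
}

// Earliest time the next packet may leave after sending packetBytes at time now.
double
BbrNextSendTime(const BbrModel& m, double now, double packetBytes, double pacingGain)
{
    assert(pacingGain > 0.0 && m.maxBandwidth > 0.0);
    return now + packetBytes / (pacingGain * m.maxBandwidth);
}
```

For a model with 1048576 bytes/s and a 62.5 ms minimum RTT, the BDP is 65536 bytes; with cwnd_gain = 2 the sender may keep up to 131072 bytes in flight, and a pacing_gain of 2 halves the gap between consecutive packets.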

To enable BBR on all TCP sockets, the following configuration can be used:

Config::SetDefault("ns3::TcpL4Protocol::SocketType", TypeIdValue(TcpBbr::GetTypeId()));

To enable BBR on a chosen TCP socket, the following configuration can be used (note that an appropriate Node ID must be used instead of 1):

Config::Set("$ns3::NodeListPriv/NodeList/1/$ns3::TcpL4Protocol/SocketType", TypeIdValue(TcpBbr::GetTypeId()));

The ns-3 implementation of BBR is based on its Linux implementation. The Linux 5.4 kernel implementation has been used to validate the behavior of the ns-3 implementation of BBR (see the Validation section below).

In addition, the following unit tests have been written to validate the implementation of BBR in ns-3:

  • BBR should enable (if not already done) TCP pacing feature.

  • Test to validate the values of pacing_gain and cwnd_gain in different phases of BBR.

An example program, examples/tcp/tcp-bbr-example.cc, is provided to experiment with BBR for one long running flow. This example uses a simple topology consisting of one sender, one receiver and two routers to examine congestion window, throughput and queue control. A program similar to this has been run using the Network Stack Tester (NeST) using BBR from Linux kernel 5.4, and the results were compared against ns-3 results.

More information about BBR is available in the following Internet Draft: https://tools.ietf.org/html/draft-cardwell-iccrg-bbr-congestion-control-00

More information about Delivery Rate Estimation is in the following draft: https://tools.ietf.org/html/draft-cheng-iccrg-delivery-rate-estimation-00

For an academic peer-reviewed paper on the BBR implementation in ns-3, please refer to:

16.5.2.5. Support for Explicit Congestion Notification (ECN)

ECN provides end-to-end notification of network congestion without dropping packets. It uses two bits in the IP header: ECN Capable Transport (ECT bit) and Congestion Experienced (CE bit), and two bits in the TCP header: Congestion Window Reduced (CWR) and ECN Echo (ECE).

More information is available in RFC 3168: https://tools.ietf.org/html/rfc3168

The following ECN states are declared in src/internet/model/tcp-socket-state.h:

enum EcnStates_t
{
    ECN_DISABLED = 0, //!< ECN disabled traffic
    ECN_IDLE,         //!< ECN is enabled but currently there is no action pertaining to ECE or CWR to be taken
    ECN_CE_RCVD,      //!< Last packet received had CE bit set in IP header
    ECN_SENDING_ECE,  //!< Receiver sends an ACK with ECE bit set in TCP header
    ECN_ECE_RCVD,     //!< Last ACK received had ECE bit set in TCP header
    ECN_CWR_SENT      //!< Sender has reduced the congestion window, and sent a packet with CWR bit set in TCP header. This is used for tracing.
};

The current implementation of ECN is based on RFC 3168 and is referred to as Classic ECN.

The following enum represents the mode of ECN:

enum EcnMode_t
{
    ClassicEcn,  //!< ECN functionality as described in RFC 3168.
    DctcpEcn,    //!< ECN functionality as described in RFC 8257. Note: this mode is specific to DCTCP.
};

The following are some important ECN parameters:

// ECN parameters
EcnMode_t              m_ecnMode {ClassicEcn}; //!< ECN mode
UseEcn_t               m_useEcn {Off};         //!< Socket ECN capability

16.5.2.5.1. Enabling ECN

By default, support for ECN is disabled in TCP sockets. To enable it, set the attribute ns3::TcpSocketBase::UseEcn to On. The supported values are listed below; this behavior is aligned with Linux: https://www.kernel.org/doc/Documentation/networking/ip-sysctl.txt

enum UseEcn_t
{
    Off        = 0,   //!< Disable
    On         = 1,   //!< Enable
    AcceptOnly = 2,   //!< Enable only when the peer endpoint is ECN capable
};

For example:

Config::SetDefault("ns3::TcpSocketBase::UseEcn", StringValue("On"));

16.5.2.5.2. ECN negotiation

ECN capability is negotiated during the three-way TCP handshake:

  1. Sender sends SYN + CWR + ECE

if (m_useEcn == UseEcn_t::On)
{
    SendEmptyPacket(TcpHeader::SYN | TcpHeader::ECE | TcpHeader::CWR);
}
else
{
    SendEmptyPacket(TcpHeader::SYN);
}
m_ecnState = ECN_DISABLED;

  2. Receiver sends SYN + ACK + ECE

if (m_useEcn != UseEcn_t::Off &&
    (tcpHeader.GetFlags() & (TcpHeader::CWR | TcpHeader::ECE)) == (TcpHeader::CWR | TcpHeader::ECE))
{
    SendEmptyPacket(TcpHeader::SYN | TcpHeader::ACK | TcpHeader::ECE);
    m_ecnState = ECN_IDLE;
}
else
{
    SendEmptyPacket(TcpHeader::SYN | TcpHeader::ACK);
    m_ecnState = ECN_DISABLED;
}

  3. Sender sends ACK

if (m_useEcn != UseEcn_t::Off &&
    (tcpHeader.GetFlags() & (TcpHeader::CWR | TcpHeader::ECE)) == (TcpHeader::ECE))
{
    m_ecnState = ECN_IDLE;
}
else
{
    m_ecnState = ECN_DISABLED;
}

Once the ECN-negotiation is successful, the sender sends data packets with ECT bits set in the IP header.

Note: As mentioned in Section 6.1.1 of RFC 3168, ECT bits should not be set during ECN negotiation. The ECN negotiation implemented in ns-3 follows this guideline.
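
The flag checks used in the three handshake steps can be isolated in a small standalone sketch. The helper names below (SynFlags, SynAckFlags, InitiatorEcnEnabled) and the UseEcn enum are hypothetical and only mirror the conditions shown above; they are not part of TcpSocketBase. The constants are the standard TCP header flag bits.

```cpp
#include <cstdint>

// TCP header flag bits as they appear in the TCP header.
constexpr std::uint8_t kSyn = 0x02;
constexpr std::uint8_t kAck = 0x10;
constexpr std::uint8_t kEce = 0x40;
constexpr std::uint8_t kCwr = 0x80;
constexpr std::uint8_t kEcnMask = kCwr | kEce;

// Hypothetical stand-in for UseEcn_t.
enum class UseEcn
{
    Off,
    On,
    AcceptOnly
};

// Step 1: flags placed on the SYN by the connection initiator.
std::uint8_t
SynFlags(UseEcn useEcn)
{
    return useEcn == UseEcn::On ? static_cast<std::uint8_t>(kSyn | kEce | kCwr) : kSyn;
}

// Step 2: flags placed on the SYN+ACK, given the flags seen on the SYN.
std::uint8_t
SynAckFlags(UseEcn useEcn, std::uint8_t synFlags)
{
    const bool peerRequestsEcn = (synFlags & kEcnMask) == kEcnMask;
    if (useEcn != UseEcn::Off && peerRequestsEcn)
    {
        return static_cast<std::uint8_t>(kSyn | kAck | kEce);
    }
    return static_cast<std::uint8_t>(kSyn | kAck);
}

// Step 3: the initiator treats ECN as negotiated only if the SYN+ACK
// carries ECE without CWR.
bool
InitiatorEcnEnabled(UseEcn useEcn, std::uint8_t synAckFlags)
{
    return useEcn != UseEcn::Off && (synAckFlags & kEcnMask) == kEce;
}
```

In this sketch, an AcceptOnly endpoint never sets ECE and CWR on its own SYN, but it still answers an ECN-setup SYN with ECE set.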

16.5.2.5.3. ECN State Transitions

  1. Initially both sender and receiver have their m_ecnState set as ECN_DISABLED

  2. Once the ECN negotiation is successful, their states are set to ECN_IDLE

  3. The receiver’s state changes to ECN_CE_RCVD when it receives a packet with CE bit set. The state then moves to ECN_SENDING_ECE when the receiver sends an ACK with ECE set. This state is retained until a CWR is received, following which the state changes to ECN_IDLE.

  4. When the sender receives an ACK with ECE bit set from receiver, its state is set as ECN_ECE_RCVD

  5. The sender’s state changes to ECN_CWR_SENT when it sends a packet with CWR bit set. It remains in this state until an ACK with valid ECE is received (i.e., ECE is received for a packet that belongs to a new window), following which, its state changes to ECN_ECE_RCVD.
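
The transitions listed above can be written as a small transition function. The following is a minimal standalone sketch: the EcnState and EcnEvent enums and the NextEcnState function are hypothetical names chosen for illustration, and any combination of state and event not named in the list is assumed to leave the state unchanged, which may differ from what TcpSocketBase does in corner cases.

```cpp
// Hypothetical mirror of EcnStates_t.
enum class EcnState
{
    Disabled,
    Idle,
    CeRcvd,
    SendingEce,
    EceRcvd,
    CwrSent
};

// Hypothetical events corresponding to the steps in the list above.
enum class EcnEvent
{
    NegotiationSucceeded, // handshake completed with ECN agreed
    CeReceived,           // receiver got a packet with the CE bit set
    EceAckSent,           // receiver sent an ACK with ECE set
    CwrReceived,          // receiver got a packet with CWR set
    EceReceived,          // sender got an ACK with a valid ECE
    CwrSent               // sender sent a packet with CWR set
};

// Returns the state that follows the given state and event.
EcnState
NextEcnState(EcnState state, EcnEvent event)
{
    switch (event)
    {
    case EcnEvent::NegotiationSucceeded:
        return state == EcnState::Disabled ? EcnState::Idle : state;
    case EcnEvent::CeReceived:
        return state == EcnState::Idle ? EcnState::CeRcvd : state;
    case EcnEvent::EceAckSent:
        return state == EcnState::CeRcvd ? EcnState::SendingEce : state;
    case EcnEvent::CwrReceived:
        return state == EcnState::SendingEce ? EcnState::Idle : state;
    case EcnEvent::EceReceived:
        return (state == EcnState::Idle || state == EcnState::CwrSent) ? EcnState::EceRcvd
                                                                       : state;
    case EcnEvent::CwrSent:
        return state == EcnState::EceRcvd ? EcnState::CwrSent : state;
    }
    return state;
}
```

Keeping the receiver-side and sender-side events in one function is a simplification; a single endpoint would normally only see the events that belong to its role for a given direction of data flow.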

16.5.2.5.4. RFC 3168 compliance

Based on the suggestions provided in RFC 3168, the following behavior has been implemented:

  1. Pure ACK packets should not have the ECT bit set (Section 6.1.4).

  2. In the current implementation, the sender only sends ECT(0) in the IP header.

  3. The sender should reduce the congestion window only once in each window (Section 6.1.2).

  4. The receiver should ignore the CE bits set in a packet arriving out of window (Section 6.1.5).

  5. The sender should ignore the ECE bits set in the packet arriving out of window (Section 6.1.2).

16.5.2.5.5. Open issues

The following issues are yet to be addressed:

  1. Retransmitted packets should not have the CWR bit set (Section 6.1.5).

  2. Despite the congestion window size being 1 MSS, the sender should reduce its congestion window by half when it receives a packet with the ECE bit set. The sender must reset the retransmit timer on receiving the ECN-Echo packet when the congestion window is one. The sending TCP will then be able to send a new packet only when the retransmit timer expires (Section 6.1.2).

  3. Support for separately handling the enabling of ECN on the incoming and outgoing TCP sessions (e.g. a TCP may perform ECN echoing but not set the ECT codepoints on its outbound data segments).

16.5.2.6. Support for Dynamic Pacing

TCP pacing refers to the sender-side practice of scheduling the transmission of a burst of eligible TCP segments across a time interval such as a TCP RTT, to avoid or reduce bursts. Historically, TCP used the natural ACK clocking mechanism to pace segments, but some network paths introduce aggregation (bursts of ACKs arriving) or ACK thinning, either of which disrupts ACK clocking. Some latency-sensitive congestion controls under development (Prague, BBR) require pacing to operate effectively.

Until recently, the state of the art in Linux was to support pacing in one of two ways:

  1. fq/pacing with sch_fq

  2. TCP internal pacing

The presentation by Dumazet and Cheng at IETF 88 summarizes: https://www.ietf.org/proceedings/88/slides/slides-88-tcpm-9.pdf

The first option was most often used when offloading (TSO) was enabled and when the sch_fq scheduler was used at the traffic control (qdisc) sublayer. In this case, TCP was responsible for setting the socket pacing rate, but the qdisc sublayer would enforce it. When TSO was enabled, the kernel would break a large burst into smaller chunks, with dynamic sizing based on the pacing rate, and hand off the segments to the fq qdisc for pacing.

The second option was used if sch_fq was not enabled; TCP would be responsible for internally pacing.

In 2018, Linux switched to an Early Departure Model (EDM): https://lwn.net/Articles/766564/.

TCP pacing in Linux was added in kernel 3.12, and authors chose to allow a pacing rate of 200% against the current rate, to allow probing for optimal throughput even during slow start phase. Some refinements were added in https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=43e122b014c9, in which Google reported that it was better to apply a different ratio (120%) in Congestion Avoidance phase. Furthermore, authors found that after cwnd reduction, it was helpful to become more conservative and switch to the conservative ratio (120%) as soon as cwnd >= ssthresh/2, as the initial ramp up (when ssthresh is infinite) still allows doubling cwnd every other RTT. Linux also does not pace the initial window (IW), typically 10 segments in practice.

Linux has also been observed to not pace if the number of eligible segments to be sent is exactly two; they will be sent back to back. If three or more, the first two are sent immediately, and additional segments are paced at the current pacing rate.

In ns-3, the model is as follows. There is no TSO/sch_fq model; only internal pacing according to current Linux policy.

Pacing may be enabled for any TCP congestion control, and a maximum pacing rate can be set. Furthermore, dynamic pacing is enabled for all TCP variants, according to the following guidelines.

  • Pacing of the initial window (IW) is not done by default but can be separately enabled.

  • Pacing of the initial slow start, after IW, is done according to the pacing rate of 200% of the current rate, to allow for window growth. This pacing rate can be configured to a different value than 200%.

  • Pacing of congestion avoidance phase is done at a pacing rate of 120% of current rate. This can be configured to a different value than 120%.

  • Pacing of subsequent slow start is done according to the following heuristic. If cwnd < ssthresh/2, such as after a timeout or idle period, pace at the slow start rate (200%). Otherwise, pace at the congestion avoidance rate.

Dynamic pacing is demonstrated by the example program examples/tcp/tcp-pacing.cc.
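
The ratio selection described in the guidelines above can be sketched as follows. This is a standalone illustration, not the TcpSocketBase code: the PacingConfig struct and both functions are hypothetical, and the "current rate" is assumed here to be cwnd divided by the RTT.

```cpp
#include <cstdint>

// Hypothetical holder for the two configurable ratios, in percent.
struct PacingConfig
{
    std::uint16_t ssRatio = 200; // slow start pacing ratio
    std::uint16_t caRatio = 120; // congestion avoidance pacing ratio
};

// Picks the pacing ratio (percent) using the cwnd < ssthresh/2 heuristic.
// With an "infinite" initial ssthresh the slow start ratio is always chosen.
std::uint16_t
SelectPacingRatio(const PacingConfig& cfg, std::uint32_t cwnd, std::uint32_t ssThresh)
{
    return cwnd < ssThresh / 2 ? cfg.ssRatio : cfg.caRatio;
}

// Pacing rate in bytes per second, assuming the current rate is cwnd / RTT.
double
PacingRateBytesPerSecond(const PacingConfig& cfg,
                         std::uint32_t cwnd,
                         std::uint32_t ssThresh,
                         double rttSeconds)
{
    const double ratio = SelectPacingRatio(cfg, cwnd, ssThresh) / 100.0;
    return ratio * cwnd / rttSeconds;
}
```

For example, a 10000-byte cwnd with a 30000-byte ssthresh falls below ssthresh/2, so this sketch paces at 200% of cwnd/RTT; once cwnd reaches 15000 bytes it drops to 120%.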

16.5.2.7. Validation

The following tests are found in the src/internet/test directory. In general, TCP tests inherit from a class called TcpGeneralTest, which provides common operations to set up test scenarios involving TCP objects. For more information on how to write new tests, see the section below on Writing TCP tests.

  • tcp: Basic transmission of string of data from client to server

  • tcp-bytes-in-flight-test: TCP correctly estimates bytes in flight under loss conditions

  • tcp-cong-avoid-test: TCP congestion avoidance for different packet sizes

  • tcp-datasentcb: Check TCP’s ‘data sent’ callback

  • tcp-endpoint-bug2211-test: A test for an issue that was causing stack overflow

  • tcp-fast-retr-test: Fast Retransmit testing

  • tcp-header: Unit tests on the TCP header

  • tcp-highspeed-test: Unit tests on the HighSpeed congestion control

  • tcp-htcp-test: Unit tests on the H-TCP congestion control

  • tcp-hybla-test: Unit tests on the Hybla congestion control

  • tcp-vegas-test: Unit tests on the Vegas congestion control

  • tcp-veno-test: Unit tests on the Veno congestion control

  • tcp-scalable-test: Unit tests on the Scalable congestion control

  • tcp-bic-test: Unit tests on the BIC congestion control

  • tcp-yeah-test: Unit tests on the YeAH congestion control

  • tcp-illinois-test: Unit tests on the Illinois congestion control

  • tcp-ledbat-test: Unit tests on the LEDBAT congestion control

  • tcp-lp-test: Unit tests on the TCP-LP congestion control

  • tcp-dctcp-test: Unit tests on the DCTCP congestion control

  • tcp-bbr-test: Unit tests on the BBR congestion control

  • tcp-option: Unit tests on TCP options

  • tcp-pkts-acked-test: Unit test the number of times that PktsAcked is called

  • tcp-rto-test: Unit test behavior after a RTO occurs

  • tcp-rtt-estimation-test: Check RTT calculations, including retransmission cases

  • tcp-slow-start-test: Check behavior of slow start

  • tcp-timestamp: Unit test on the timestamp option

  • tcp-wscaling: Unit test on the window scaling option

  • tcp-zero-window-test: Unit test persist behavior for zero window conditions

  • tcp-close-test: Unit test on the socket closing: both receiver and sender have to close their socket when all bytes are transferred

  • tcp-ecn-test: Unit tests on Explicit Congestion Notification

  • tcp-pacing-test: Unit tests on dynamic TCP pacing rate

Several tests have dependencies outside of the internet module, so they are located in a system test directory called src/test/ns3tcp.

  • ns3-tcp-loss: Check behavior of ns-3 TCP upon packet losses

  • ns3-tcp-no-delay: Check that ns-3 TCP Nagle’s algorithm works correctly and that it can be disabled

  • ns3-tcp-socket: Check that ns-3 TCP successfully transfers an application data write of various sizes

  • ns3-tcp-state: Check the operation of the TCP state machine for several cases

Several TCP validation test results can also be found in the wiki page describing this implementation.

The ns-3 implementation of TCP Linux Reno was validated against the NewReno implementation of Linux kernel 4.4.0 using ns-3 Direct Code Execution (DCE). DCE is a framework which allows the users to run kernel space protocol inside ns-3 without changing the source code.

In this validation, cwnd traces of DCE Linux Reno were compared to those of ns-3 Linux Reno and NewReno for a delayed acknowledgement configuration of 1 segment (in the ns-3 implementation; Linux does not allow direct configuration of this setting). It can be observed that cwnd traces for ns-3 Linux Reno closely overlap with DCE Linux Reno, while for ns-3 NewReno there was deviation in the congestion avoidance phase.

_images/dce-linux-reno-vs-ns3-linux-reno.png

DCE Linux Reno vs. ns-3 Linux Reno

_images/dce-linux-reno-vs-ns3-new-reno.png

DCE Linux Reno vs. ns-3 NewReno

The difference in the cwnd in the early stage of this flow is because of the way cwnd is plotted. As ns-3 provides a trace source for cwnd, an ns-3 Linux Reno cwnd sample is obtained every time the cwnd value changes, whereas for DCE Linux Reno, the kernel does not have a corresponding trace source. Instead, we use the “ss” command to obtain cwnd values from the Linux kernel; cwnd is sampled with “ss” at an interval of 0.5 seconds.

Figure DCTCP throughput for 10ms/50Mbps bottleneck, 1ms CE threshold shows a long-running file transfer using DCTCP over a 50 Mbps bottleneck (running CoDel queue disc with a 1ms CE threshold setting) with a 10 ms base RTT. The figure shows that DCTCP reaches link capacity very quickly and stays there for the duration with minimal change in throughput. In contrast, Figure DCTCP throughput for 80ms/50Mbps bottleneck, 1ms CE threshold plots the throughput for the same configuration except with an 80 ms base RTT. In this case, the DCTCP exits slow start early and takes a long time to build the flow throughput to the bottleneck link capacity. DCTCP is not intended to be used at such a large base RTT, but this figure highlights the sensitivity to RTT (and can be reproduced using the Linux implementation).

_images/dctcp-10ms-50mbps-tcp-throughput.png

DCTCP throughput for 10ms/50Mbps bottleneck, 1ms CE threshold

_images/dctcp-80ms-50mbps-tcp-throughput.png

DCTCP throughput for 80ms/50Mbps bottleneck, 1ms CE threshold

Similar to DCTCP, TCP CUBIC has been tested against the Linux kernel version 4.4 implementation. Figure CUBIC cwnd evolution for 50ms/50Mbps bottleneck, no ECN compares the congestion window evolution between ns-3 and Linux for a single flow operating over a 50 Mbps link with 50 ms base RTT and the CoDel AQM. Some differences can be observed between the peak of slow start window growth (ns-3 exits slow start earlier due to its HyStart implementation), and the window growth is a bit out-of-sync (likely due to different implementations of the algorithm), but the cubic concave/convex window pattern, and the signs of TCP CUBIC fast convergence algorithm (alternating patterns of cubic and concave window growth) can be observed. The ns-3 congestion window is maintained in bytes (unlike Linux which uses segments) but has been normalized to segments for these plots. Figure CUBIC cwnd evolution for 50ms/50Mbps bottleneck, with ECN displays the outcome of a similar scenario but with ECN enabled throughout.

_images/cubic-50ms-50mbps-tcp-cwnd-no-ecn.png

CUBIC cwnd evolution for 50ms/50Mbps bottleneck, no ECN

_images/cubic-50ms-50mbps-tcp-cwnd-ecn.png

CUBIC cwnd evolution for 50ms/50Mbps bottleneck, with ECN

TCP ECN operation is tested in the ARED and RED tests that are documented in the traffic-control module documentation.

Like DCTCP and TCP CUBIC, the ns-3 implementation of TCP BBR was validated against the BBR implementation of Linux kernel 5.4 using Network Stack Tester (NeST). NeST is a Python package which allows users to emulate kernel space protocols using Linux network namespaces. Figure Congestion window evolution: ns-3 BBR vs. Linux BBR (using NeST) compares the congestion window evolution between ns-3 and Linux for a single flow operating over a 10 Mbps link with 10 ms base RTT and FIFO queue discipline.

_images/ns3-bbr-vs-linux-bbr.png

Congestion window evolution: ns-3 BBR vs. Linux BBR (using NeST)

It can be observed that the congestion window traces for ns-3 BBR closely overlap with Linux BBR. The periodic drops in congestion window every 10 seconds depict the PROBE_RTT phase of the BBR algorithm. In this phase, BBR algorithm keeps the congestion window fixed to 4 segments.

The example program, examples/tcp/tcp-bbr-example.cc, has been used to obtain the congestion window curve shown in Figure Congestion window evolution: ns-3 BBR vs. Linux BBR (using NeST). Detailed instructions to reproduce the ns-3 plot and the NeST plot can be found at: https://github.com/mohittahiliani/BBR-Validation

16.5.2.8. Writing a new congestion control algorithm

Writing (or porting) a congestion control algorithm from scratch (or from other systems) is a process completely separated from the internals of TcpSocketBase.

All operations that are delegated to a congestion control are contained in the class TcpCongestionOps. It mimics the structure tcp_congestion_ops of Linux, and the following operations are defined:

virtual std::string GetName() const;
virtual uint32_t GetSsThresh(Ptr<const TcpSocketState> tcb, uint32_t bytesInFlight);
virtual void IncreaseWindow(Ptr<TcpSocketState> tcb, uint32_t segmentsAcked);
virtual void PktsAcked(Ptr<TcpSocketState> tcb, uint32_t segmentsAcked,const Time& rtt);
virtual Ptr<TcpCongestionOps> Fork();
virtual void CwndEvent(Ptr<TcpSocketState> tcb, const TcpSocketState::TcpCaEvent_t event);

The most interesting methods to write are GetSsThresh and IncreaseWindow. The latter is called when TcpSocketBase decides that it is time to increase the congestion window. Much information is available in the Transmission Control Block, and the method should increase cWnd and/or ssThresh based on the number of segments acked.

GetSsThresh is called whenever the socket needs an updated value of the slow start threshold. This happens after a loss; congestion control algorithms are then asked to lower such value, and to return it.

PktsAcked is used in case the algorithm needs timing information (such as RTT), and it is called each time an ACK is received.

CwndEvent is used in case the algorithm needs the state of the socket during different congestion window events.
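
To make the division of labor between GetSsThresh and IncreaseWindow concrete, the following standalone sketch shows Reno-style bodies for the two methods. It deliberately does not use the ns-3 classes: SketchState is a hypothetical stand-in for the few TcpSocketState fields involved, the function names are illustrative, and the logic is a simplified approximation (for instance, slow start is not capped at ssThresh).

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical stand-in for the TcpSocketState fields used below (bytes).
struct SketchState
{
    std::uint32_t cWnd;
    std::uint32_t ssThresh;
    std::uint32_t segmentSize;
};

// Called after a loss: return a lowered slow start threshold, here half of
// the bytes in flight but never less than two segments.
std::uint32_t
SketchGetSsThresh(const SketchState& tcb, std::uint32_t bytesInFlight)
{
    return std::max(2 * tcb.segmentSize, bytesInFlight / 2);
}

// Called when the window may grow: increase cWnd based on segmentsAcked.
void
SketchIncreaseWindow(SketchState& tcb, std::uint32_t segmentsAcked)
{
    if (tcb.cWnd == 0 || segmentsAcked == 0)
    {
        return; // nothing to do; also avoids dividing by zero below
    }
    if (tcb.cWnd < tcb.ssThresh)
    {
        // Slow start: grow by one segment for every segment acked.
        tcb.cWnd += segmentsAcked * tcb.segmentSize;
    }
    else
    {
        // Congestion avoidance: grow by roughly one segment per RTT.
        const std::uint64_t adder =
            static_cast<std::uint64_t>(tcb.segmentSize) * tcb.segmentSize / tcb.cWnd;
        tcb.cWnd += static_cast<std::uint32_t>(std::max<std::uint64_t>(1, adder));
    }
}
```

A real algorithm would implement these as overrides in a TcpCongestionOps subclass operating on Ptr<TcpSocketState>; the sketch only shows the kind of arithmetic each method is expected to perform.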

16.5.2.9. TCP SACK and non-SACK

To avoid code duplication and the effort of maintaining two different versions of the TCP core, namely RFC 6675 (TCP-SACK) and RFC 5681 (TCP congestion control), we have merged RFC 6675 in the current code base. If the receiver supports the option, the sender bases its retransmissions on the received SACK information. However, in the absence of that option, the best it can do is to follow the RFC 5681 specification (on Fast Retransmit/Recovery) and employ NewReno modifications in case of partial ACKs.

A similar concept is used in Linux with the function tcp_add_reno_sack. Our implementation resides in the TcpTxBuffer class that implements a scoreboard through two different lists of segments. TcpSocketBase actively uses the API provided by TcpTxBuffer to query the scoreboard; please refer to the Doxygen documentation (and to in-code comments) if you want to learn more about this implementation.

For an academic peer-reviewed paper on the SACK implementation in ns-3, please refer to https://dl.acm.org/citation.cfm?id=3067666.

16.5.2.10. Loss Recovery Algorithms

The following loss recovery algorithms are supported in ns-3 TCP. The current default (as of ns-3.32 release) is Proportional Rate Reduction (PRR), while the default for ns-3.31 and earlier was Classic Recovery.

16.5.2.10.1. Classic Recovery

Classic Recovery refers to the combination of NewReno algorithm described in RFC 6582 along with SACK based loss recovery algorithm mentioned in RFC 6675. SACK based loss recovery is used when sender and receiver support SACK options. In the case when SACK options are disabled, the NewReno modification handles the recovery.

At the start of recovery phase the congestion window is reduced differently for NewReno and SACK based recovery. For NewReno the reduction is done as given below:

cWnd = ssThresh

For SACK based recovery, this is done as follows:

cWnd = ssThresh + (dupAckCount * segmentSize)

While in the recovery phase, the congestion window is inflated by segmentSize on arrival of every ACK when NewReno is used. The congestion window is kept the same when SACK based loss recovery is used.

16.5.2.10.2. Proportional Rate Reduction

Proportional Rate Reduction (PRR) is a loss recovery algorithm described in RFC 6937 and currently used in Linux. The design of PRR helps in avoiding excess window adjustments and aims to keep the congestion window as close as possible to ssThresh.

PRR updates the congestion window by comparing the values of bytesInFlight and ssThresh. If the value of bytesInFlight is greater than ssThresh, congestion window is updated as shown below:

sndcnt = CEIL(prrDelivered * ssThresh / RecoverFS) - prrOut

cWnd = pipe + sndcnt

where RecoverFS is the value of bytesInFlight at the start of recovery phase, prrDelivered is the total bytes delivered during recovery phase, prrOut is the total bytes sent during recovery phase, pipe is the current estimate of the bytes in flight (bytesInFlight), and sndcnt represents the number of bytes to be sent in response to each ACK.

Otherwise, the congestion window is updated by either using Conservative Reduction Bound (CRB) or Slow Start Reduction Bound (SSRB) with SSRB being the default Reduction Bound. Each Reduction Bound calculates a maximum data sending limit. For CRB, the limit is calculated as shown below:

limit = prrDelivered - prrOut

For SSRB, it is calculated as:

limit = MAX(prrDelivered - prrOut, DeliveredData) + MSS

where DeliveredData represents the total number of bytes delivered to the receiver as indicated by the current ACK and MSS is the maximum segment size.

After limit calculation, the cWnd is updated as given below:

sndcnt = MIN (ssThresh - pipe, limit)

cWnd = pipe + sndcnt
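
The PRR formulas above can be combined into one function. The following standalone sketch is an illustration only and is not the ns-3 PRR recovery code: the PrrInput struct and function names are hypothetical, all quantities are in bytes, RecoverFS is assumed to be positive, and clamping sndcnt at zero is a defensive choice made for this sketch.

```cpp
#include <algorithm>
#include <cstdint>

enum class ReductionBound
{
    Crb, // Conservative Reduction Bound
    Ssrb // Slow Start Reduction Bound
};

// Hypothetical bundle of the quantities used by the formulas (bytes).
struct PrrInput
{
    std::int64_t pipe;          // bytes currently in flight
    std::int64_t ssThresh;      // slow start threshold
    std::int64_t recoverFs;     // bytes in flight at the start of recovery (> 0)
    std::int64_t prrDelivered;  // bytes delivered during recovery
    std::int64_t prrOut;        // bytes sent during recovery
    std::int64_t deliveredData; // bytes delivered as indicated by the current ACK
    std::int64_t mss;           // maximum segment size
};

// Number of bytes that may be sent in response to the current ACK.
std::int64_t
PrrSndCnt(const PrrInput& in, ReductionBound bound)
{
    std::int64_t sndcnt = 0;
    if (in.pipe > in.ssThresh)
    {
        // Proportional part: CEIL(prrDelivered * ssThresh / RecoverFS) - prrOut,
        // with the ceiling computed using integer arithmetic.
        const std::int64_t num = in.prrDelivered * in.ssThresh;
        sndcnt = (num + in.recoverFs - 1) / in.recoverFs - in.prrOut;
    }
    else
    {
        std::int64_t limit = in.prrDelivered - in.prrOut; // CRB
        if (bound == ReductionBound::Ssrb)
        {
            limit = std::max(limit, in.deliveredData) + in.mss;
        }
        sndcnt = std::min(in.ssThresh - in.pipe, limit);
    }
    return std::max<std::int64_t>(sndcnt, 0); // clamp: an assumption of this sketch
}

// New congestion window: cWnd = pipe + sndcnt.
std::int64_t
PrrCwnd(const PrrInput& in, ReductionBound bound)
{
    return in.pipe + PrrSndCnt(in, bound);
}
```

As a worked case, with pipe = 8000, ssThresh = 5000, RecoverFS = 10000, prrDelivered = 3000 and prrOut = 1000, the proportional branch gives sndcnt = 1500 - 1000 = 500 and cWnd = 8500. With pipe = 2000 and the same remaining values (deliveredData = 1000, MSS = 500), SSRB yields cWnd = 4500 while CRB yields cWnd = 4000.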

More information (paper): https://dl.acm.org/citation.cfm?id=2068832

More information (RFC): https://tools.ietf.org/html/rfc6937

16.5.2.11. Adding a new loss recovery algorithm in ns-3

Writing (or porting) a loss recovery algorithm from scratch (or from other systems) is a process completely separated from the internals of TcpSocketBase.

All operations that are delegated to a loss recovery are contained in the class TcpRecoveryOps and are given below:

virtual std::string GetName() const;
virtual void EnterRecovery(Ptr<const TcpSocketState> tcb, uint32_t unAckDataCount,
                           bool isSackEnabled, uint32_t dupAckCount,
                           uint32_t bytesInFlight, uint32_t lastDeliveredBytes);
virtual void DoRecovery(Ptr<const TcpSocketState> tcb, uint32_t unAckDataCount,
                        bool isSackEnabled, uint32_t dupAckCount,
                        uint32_t bytesInFlight, uint32_t lastDeliveredBytes);
virtual void ExitRecovery(Ptr<TcpSocketState> tcb, uint32_t bytesInFlight);
virtual void UpdateBytesSent(uint32_t bytesSent);
virtual Ptr<TcpRecoveryOps> Fork();

EnterRecovery is called when packet loss is detected and recovery is triggered. While in recovery phase, each time when an ACK arrives, DoRecovery is called which performs the necessary congestion window changes as per the recovery algorithm. ExitRecovery is called just prior to exiting recovery phase in order to perform the required congestion window adjustments. UpdateBytesSent is used to keep track of bytes sent and is called whenever a data packet is sent during recovery phase.

16.5.2.12. Delivery Rate Estimation

The current TCP implementation measures the approximate value of the delivery rate of inflight data based on Delivery Rate Estimation.

As a high-level idea, keep in mind that the algorithm keeps track of two variables:

  1. delivered: Total amount of data delivered so far.

  2. deliveredStamp: Last time delivered was updated.

When a packet is transmitted, the values of delivered (d0) and deliveredStamp (t0) are stored in its respective TcpTxItem.

When an acknowledgement comes for this packet, the values of delivered and deliveredStamp are updated to d1 and t1 in the same TcpTxItem.

After processing the acknowledgement, the rate sample is calculated and then passed to a congestion avoidance algorithm:

delivery_rate = (d1 - d0)/(t1 - t0)
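
The rate sample calculation can be illustrated with a short standalone sketch. The RateSnapshot struct and DeliveryRate function are hypothetical names used only for this example (they are not part of TcpRateOps or TcpTxItem), times are assumed to be in seconds, and returning zero for a non-positive interval is a choice made here to keep the example safe.

```cpp
#include <cstdint>

// Hypothetical snapshot of the two tracked variables.
struct RateSnapshot
{
    std::uint64_t delivered; // total bytes delivered so far
    double deliveredStamp;   // time (seconds) at which delivered was last updated
};

// Rate sample in bytes per second: (d1 - d0) / (t1 - t0).
// Returns 0 when the interval is not positive.
double
DeliveryRate(const RateSnapshot& atSend, const RateSnapshot& atAck)
{
    const double interval = atAck.deliveredStamp - atSend.deliveredStamp;
    if (interval <= 0.0)
    {
        return 0.0;
    }
    return static_cast<double>(atAck.delivered - atSend.delivered) / interval;
}
```

For instance, if d0 = 1000 bytes at t0 = 1.0 s and d1 = 11000 bytes at t1 = 1.5 s, the sample is 10000 / 0.5 = 20000 bytes per second.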

The implementation to estimate delivery rate is a joint work between TcpTxBuffer and TcpRateOps. For more information, please take a look at their Doxygen documentation.

The implementation follows the Internet draft (Delivery Rate Estimation): https://tools.ietf.org/html/draft-cheng-iccrg-delivery-rate-estimation-00

16.5.2.13. Current limitations

  • TcpCongestionOps interface does not contain every possible Linux operation

16.5.2.14. Writing TCP tests

The TCP subsystem supports automated test cases on both socket functions and congestion control algorithms. To show how to write tests for TCP, here we explain the process of creating a test case that reproduces Bug #1571.

The bug concerns the zero window situation, which happens when the receiver cannot handle more data. In this case, it advertises a zero window, which causes the sender to pause transmission and wait for the receiver to increase the window.

The sender has a timer to periodically check the receiver’s window: however, in modern TCP implementations, when the receiver has freed a “significant” amount of data, the receiver itself sends an “active” window update, meaning that the transmission could be resumed. Nevertheless, the sender timer is still necessary because window updates can be lost.

Note

Throughout this text, we assume some knowledge of the general design of the TCP test infrastructure, which is explained in detail in the Doxygen documentation. As a brief summary, the strategy is to have a class that sets up a TCP connection and that calls protected members of itself. In this way, subclasses can implement the necessary members, which will be called by the main TcpGeneralTest class when events occur. For example, after processing an ACK, the method ProcessedAck will be invoked. Subclasses interested in checking particular things that must have happened during ACK processing should implement the ProcessedAck method and check the values of interest inside the method. To get a list of available methods, please check the Doxygen documentation.

We describe the writing of two test cases, covering both situations: the sender’s zero-window probing and the receiver “active” window update. Our focus will be on dealing with the reported problems, which are:

  • an ns-3 receiver does not send “active” window update when its receive buffer is being freed;

  • even if the window update is artificially crafted, the transmission does not resume.

However, other things should be checked in the test:

  • Persistent timer setup

  • Persistent timer teardown if rWnd increases

To construct the test case, one first derives from the TcpGeneralTest class:

The code is the following:

TcpZeroWindowTest::TcpZeroWindowTest(const std::string &desc)
  : TcpGeneralTest(desc)
{
}

Then, one should define the general parameters for the TCP connection, which will be one-sided (one node is acting as SENDER, while the other is acting as RECEIVER):

  • Application packet size set to 500, and 20 packets in total (meaning a stream of 10k bytes)

  • Segment size for both SENDER and RECEIVER set to 500 bytes

  • Initial slow start threshold set to UINT32_MAX

  • Initial congestion window for the SENDER set to 10 segments (5000 bytes)

  • Congestion control: NewReno

We also have to define the link properties, because the above definition does not work for every combination of propagation delay and sender application behavior.

  • Link one-way propagation delay: 50 ms

  • Application packet generation interval: 10 ms

  • Application starting time: 2 s after the starting point

To define the properties of the environment (e.g. properties which should be set before the object creation, such as propagation delay) one next implements the method ConfigureEnvironment:

void
TcpZeroWindowTest::ConfigureEnvironment()
{
    TcpGeneralTest::ConfigureEnvironment();
    SetAppPktCount(20);
    SetMTU(500);
    SetTransmitStart(Seconds(2.0));
    SetPropagationDelay(MilliSeconds(50));
}

For other properties, set after the object creation, one can use ConfigureProperties(). The difference is that some values, such as initial congestion window or initial slow start threshold, are applicable only to a single instance, not to every instance we have. Usually, methods that require an id and a value are meant to be called inside ConfigureProperties(). Please see the Doxygen documentation for an exhaustive list of the tunable properties.

void
TcpZeroWindowTest::ConfigureProperties()
{
    TcpGeneralTest::ConfigureProperties();
    SetInitialCwnd(SENDER, 10);
}

To see the default values for the experiment, please see the implementation of both methods inside the TcpGeneralTest class.

Note

If some configuration parameters are missing, add a method called “SetSomeValue” which takes as input the value only (if it is meant to be called inside ConfigureEnvironment) or the socket and the value (if it is meant to be called inside ConfigureProperties).

To define a zero-window situation, we choose (by design) to initiate the connection with a 0-byte rx buffer. This implies that the RECEIVER, in its first SYN-ACK, advertises a zero window. This can be accomplished by implementing the method CreateReceiverSocket, setting an Rx buffer value of 0 bytes (the SetAttribute call in the following code):

Ptr<TcpSocketMsgBase>
TcpZeroWindowTest::CreateReceiverSocket(Ptr<Node> node)
{
    Ptr<TcpSocketMsgBase> socket = TcpGeneralTest::CreateReceiverSocket(node);

    socket->SetAttribute("RcvBufSize", UintegerValue(0));
    Simulator::Schedule(Seconds(10.0),
                      &TcpZeroWindowTest::IncreaseBufSize, this);

    return socket;
}

Even so, to check the active window update, we should schedule an increase of the buffer size. We do this with the Simulator::Schedule call in CreateReceiverSocket, which schedules the function IncreaseBufSize.

void
TcpZeroWindowTest::IncreaseBufSize()
{
    SetRcvBufSize(RECEIVER, 2500);
}

This function uses the SetRcvBufSize method to edit the RxBuffer object of the RECEIVER. As said before, check the Doxygen documentation for class TcpGeneralTest to be aware of the various possibilities that it offers.

Note

By design, we choose to maintain a close relationship between TcpSocketBase and TcpGeneralTest: they are connected by a friendship relation. Since friendship is not passed through inheritance, if one discovers that one needs to access or to modify a private (or protected) member of TcpSocketBase, one can do so by adding a method in the class TcpGeneralTest. An example of such a method is SetRcvBufSize, which allows TcpGeneralTest subclasses to forcefully set the RxBuffer size.

void
TcpGeneralTest::SetRcvBufSize(SocketWho who, uint32_t size)
{
    if (who == SENDER)
    {
        m_senderSocket->SetRcvBufSize(size);
    }
    else if (who == RECEIVER)
    {
        m_receiverSocket->SetRcvBufSize(size);
    }
    else
    {
        NS_FATAL_ERROR("Not defined");
    }
}

Next, we can start to follow the TCP connection:

  1. At time 0.0 s the connection is opened sender side, with a SYN packet sent from SENDER to RECEIVER

  2. At time 0.05 s the RECEIVER gets the SYN and replies with a SYN-ACK

  3. At time 0.10 s the SENDER gets the SYN-ACK and replies with an ACK.

Now that the general structure is defined and the connection is started, we need a way to check the rWnd field on the segments. To this end, we can implement the methods Rx and Tx in the TcpGeneralTest subclass, checking each time the actions of the RECEIVER and the SENDER. These methods are defined in TcpGeneralTest, and they are attached to the Rx and Tx traces in the TcpSocketBase. One should write small tests for every detail that one wants to ensure during the connection (this prevents the tested behavior from changing unnoticed over time, and ensures that it stays consistent through releases). We start by ensuring that the first SYN-ACK has 0 as advertised window size:

void
TcpZeroWindowTest::Tx(const Ptr<const Packet> p, const TcpHeader &h, SocketWho who)
{
    ...
    else if (who == RECEIVER)
    {
        NS_LOG_INFO("\tRECEIVER TX " << h << " size " << p->GetSize());

        if (h.GetFlags() & TcpHeader::SYN)
        {
            NS_TEST_ASSERT_MSG_EQ(h.GetWindowSize(),
                                  0,
                                  "RECEIVER window size is not 0 in the SYN-ACK");
        }
    }
    ...
}

Practically, we are checking that every SYN packet sent by the RECEIVER has the advertised window set to 0. The same thing is done also by checking, in the Rx method, that each SYN received by SENDER has the advertised window set to 0. Thanks to the log subsystem, we can print what is happening through messages. If we run the experiment, enabling the logging, we can see the following:

./ns3 shell
gdb --args ./build/utils/ns3-dev-test-runner-debug --test-name=tcp-zero-window-test --stop-on-failure --fullness=QUICK --assert-on-failure --verbose
(gdb) run

0.00s TcpZeroWindowTestSuite:Tx(): 0.00  SENDER TX 49153 > 4477 [SYN] Seq=0 Ack=0 Win=32768 ns3::TcpOptionWinScale(2) ns3::TcpOptionTS(0;0) size 36
0.05s TcpZeroWindowTestSuite:Rx(): 0.05  RECEIVER RX 49153 > 4477 [SYN] Seq=0 Ack=0 Win=32768 ns3::TcpOptionWinScale(2) ns3::TcpOptionTS(0;0) ns3::TcpOptionEnd(EOL) size 0
0.05s TcpZeroWindowTestSuite:Tx(): 0.05  RECEIVER TX 4477 > 49153 [SYN|ACK] Seq=0 Ack=1 Win=0 ns3::TcpOptionWinScale(0) ns3::TcpOptionTS(50;0) size 36
0.10s TcpZeroWindowTestSuite:Rx(): 0.10  SENDER RX 4477 > 49153 [SYN|ACK] Seq=0 Ack=1 Win=0 ns3::TcpOptionWinScale(0) ns3::TcpOptionTS(50;0) ns3::TcpOptionEnd(EOL) size 0
0.10s TcpZeroWindowTestSuite:Tx(): 0.10  SENDER TX 49153 > 4477 [ACK] Seq=1 Ack=1 Win=32768 ns3::TcpOptionTS(100;50) size 32
0.15s TcpZeroWindowTestSuite:Rx(): 0.15  RECEIVER RX 49153 > 4477 [ACK] Seq=1 Ack=1 Win=32768 ns3::TcpOptionTS(100;50) ns3::TcpOptionEnd(EOL) size 0
(...)

The output is cut to show the three-way handshake. As we can see from the headers, the rWnd of RECEIVER is set to 0, and thankfully our tests are not failing. Now we need to test for the persistent timer, which should be started by the SENDER after it receives the SYN-ACK. Since the Rx method is called before any computation on the received packet, we should use another method, namely ProcessedAck, which is called after each processed ACK. In the following, we show how to check if the persistent event is running after the processing of the SYN-ACK:

void
TcpZeroWindowTest::ProcessedAck(const Ptr<const TcpSocketState> tcb,
                                const TcpHeader& h,
                                SocketWho who)
{
    if (who == SENDER)
    {
        if (h.GetFlags() & TcpHeader::SYN)
        {
            EventId persistentEvent = GetPersistentEvent(SENDER);
            NS_TEST_ASSERT_MSG_EQ(persistentEvent.IsPending(),
                                  true,
                                  "Persistent event not started");
        }
    }
}

Since we programmed the increase of the buffer size after 10 simulated seconds, we expect the persistent timer to fire before any rWnd changes. When it fires, the SENDER should send a window probe, and the receiver should reply by reporting a zero window again. First, we investigate what the sender sends:

if (Simulator::Now().GetSeconds() <= 6.0)
{
    NS_TEST_ASSERT_MSG_EQ(p->GetSize() - h.GetSerializedSize(),
                          0,
                          "Data packet sent anyway");
}
else if (Simulator::Now().GetSeconds() > 6.0 &&
         Simulator::Now().GetSeconds() <= 7.0)
{
    NS_TEST_ASSERT_MSG_EQ(m_zeroWindowProbe, false, "Sent another probe");

    if (!m_zeroWindowProbe)
    {
        NS_TEST_ASSERT_MSG_EQ(p->GetSize() - h.GetSerializedSize(),
                              1,
                              "Data packet sent instead of window probe");
        NS_TEST_ASSERT_MSG_EQ(h.GetSequenceNumber(),
                              SequenceNumber32(1),
                              "Data packet sent instead of window probe");
        m_zeroWindowProbe = true;
    }
}

We divide the events by simulated time. In the first branch, we check everything that happens up to the 6.0 seconds mark; for instance, that no data packets are sent, and that the state remains OPEN for both sender and receiver.

Since the persist timeout is initialized at 6 seconds (exercise left for the reader: edit the test, getting this value from the Attribute system), we need to check (in the else if branch) that the probe is sent between 6.0 and 7.0 simulated seconds. Only one probe is allowed, which is the reason for the assertion on m_zeroWindowProbe.

if (Simulator::Now().GetSeconds() > 6.0 &&
    Simulator::Now().GetSeconds() <= 7.0)
{
    NS_TEST_ASSERT_MSG_EQ(h.GetSequenceNumber(),
                          SequenceNumber32(1),
                          "Data packet sent instead of window probe");
    NS_TEST_ASSERT_MSG_EQ(h.GetWindowSize(),
                          0,
                          "No zero window advertised by RECEIVER");
}

For the RECEIVER, the interval between 6 and 7 seconds is when the zero-window segment is sent.

Other checks are redundant; the safest approach is to deny any other packet exchange between the 7 and 10 seconds mark.

else if (Simulator::Now().GetSeconds() > 7.0 &&
         Simulator::Now().GetSeconds() < 10.0)
{
    NS_FATAL_ERROR("No packets should be sent before the window update");
}

The state checks are performed at the end of the methods, since they are valid in every condition:

NS_TEST_ASSERT_MSG_EQ(GetCongStateFrom(GetTcb(SENDER)),
                      TcpSocketState::CA_OPEN,
                      "Sender State is not OPEN");
NS_TEST_ASSERT_MSG_EQ(GetCongStateFrom(GetTcb(RECEIVER)),
                      TcpSocketState::CA_OPEN,
                      "Receiver State is not OPEN");

Now, the interesting part in the Tx method is to check that after the 10.0 seconds mark (when the RECEIVER sends the active window update) the value of the window should be greater than zero (and precisely, set to 2500):

else if (Simulator::Now().GetSeconds() >= 10.0)
{
    NS_TEST_ASSERT_MSG_EQ(h.GetWindowSize(),
                          2500,
                          "Receiver window not updated");
}
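The time partition used by the Tx checks above can be summarized in a standalone sketch. The enum TxPhase and the function ClassifyTxTime are hypothetical names introduced only for illustration; the boundaries (6, 7 and 10 seconds) are the ones used in the test fragments.

```cpp
#include <cassert>

// Phases of the zero-window scenario as seen by the Tx checks.
// The names are hypothetical; the boundaries come from the test code.
enum class TxPhase
{
    NO_DATA,      // t <= 6 s: only segments without payload are expected
    WINDOW_PROBE, // 6 s < t <= 7 s: the single 1-byte probe and its reply
    SILENCE,      // 7 s < t < 10 s: no segment should be sent at all
    WINDOW_UPDATE // t >= 10 s: RECEIVER advertises a 2500-byte window
};

TxPhase
ClassifyTxTime(double seconds)
{
    assert(seconds >= 0.0);
    if (seconds <= 6.0)
    {
        return TxPhase::NO_DATA;
    }
    if (seconds <= 7.0)
    {
        return TxPhase::WINDOW_PROBE;
    }
    if (seconds < 10.0)
    {
        return TxPhase::SILENCE;
    }
    return TxPhase::WINDOW_UPDATE;
}
```

Writing the partition this way makes it easy to see that every instant falls in exactly one phase, so no transmitted segment escapes a check.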

To be sure that the sender receives the window update, we can use the Rx method:

if (Simulator::Now().GetSeconds() >= 10.0)
{
    NS_TEST_ASSERT_MSG_EQ(h.GetWindowSize(),
                          2500,
                          "Receiver window not updated");
    m_windowUpdated = true;
}

We check every packet after the 10 seconds mark to see if it has the window updated. We also set the boolean variable m_windowUpdated to true, to confirm that this check is actually reached.

Last but not least, we also implement the NormalClose() method, to check that the connection ends successfully:

void
TcpZeroWindowTest::NormalClose(SocketWho who)
{
    if (who == SENDER)
    {
        m_senderFinished = true;
    }
    else if (who == RECEIVER)
    {
        m_receiverFinished = true;
    }
}

The method is called only if all bytes are transmitted successfully. Then, in the method FinalChecks(), we check that all the flags are true, which indicates that the probe was sent, the window was updated, and the connection was closed cleanly on both sides.

void
TcpZeroWindowTest::FinalChecks()
{
    NS_TEST_ASSERT_MSG_EQ(m_zeroWindowProbe,
                          true,
                          "Zero window probe not sent");
    NS_TEST_ASSERT_MSG_EQ(m_windowUpdated,
                          true,
                          "Window has not updated during the connection");
    NS_TEST_ASSERT_MSG_EQ(m_senderFinished,
                          true,
                          "Connection not closed successfully(SENDER)");
    NS_TEST_ASSERT_MSG_EQ(m_receiverFinished,
                          true,
                          "Connection not closed successfully(RECEIVER)");
}

To run the test, the usual way is

./test.py -s tcp-zero-window-test

PASS: TestSuite tcp-zero-window-test
1 of 1 tests passed (1 passed, 0 skipped, 0 failed, 0 crashed, 0 valgrind errors)

To see INFO messages, use a combination of ./ns3 shell and gdb (really useful):

./ns3 shell
gdb --args ./build/utils/ns3-dev-test-runner-debug --test-name=tcp-zero-window-test --stop-on-failure --fullness=QUICK --assert-on-failure --verbose

and then type run at the (gdb) prompt.

Note

This code magically runs without any reported errors; however, in real cases, when you discover a bug you should expect the existing test to fail (this could indicate a well-written test and a badly written model, or a badly written test; hopefully the first situation). Correcting bugs is an iterative process. For instance, commits created to make this test case run without errors are 11633:6b74df04cf44, (others to be merged).

16.6. UDP model in ns-3

This chapter describes the UDP model available in ns-3.

16.6.1. Generic support for UDP

ns-3 supports a native implementation of UDP. It provides a connectionless, unreliable datagram packet service. Packets may be reordered or duplicated before they arrive. UDP calculates and checks checksums to catch transmission errors.

This implementation inherits from a few common header classes in the src/network directory, so that user code can swap out implementations with minimal changes to the scripts.

Here are the important abstract base classes:

  • class UdpSocket: This is defined in src/internet/model/udp-socket.{cc,h}. It is an abstract base class of all UDP sockets. This class exists solely for hosting UdpSocket attributes that can be reused across different implementations, and for declaring the UDP-specific multicast API.

  • class UdpSocketImpl: This class subclasses UdpSocket, and provides a socket interface to ns-3’s implementation of UDP.

  • class UdpSocketFactory: This is used by the layer-4 protocol instance to create UDP sockets.

  • class UdpSocketFactoryImpl: This class is derived from SocketFactory and implements the API for creating UDP sockets.

  • class UdpHeader: This class contains fields corresponding to those in a network UDP header (port numbers, payload size, checksum) as well as methods for serialization to and deserialization from a byte buffer.

  • class UdpL4Protocol: This is a subclass of IpL4Protocol and provides an implementation of the UDP protocol.
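To make the serialization step concrete, the following standalone sketch writes and reads the four 16-bit fields of a UDP header in network byte order, as laid out in RFC 768. It does not use the ns-3 UdpHeader class or its Buffer API; the struct UdpHeaderFields and both functions are hypothetical names used only for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// The four 16-bit fields of a UDP header (RFC 768), in wire order.
struct UdpHeaderFields
{
    uint16_t sourcePort;
    uint16_t destinationPort;
    uint16_t length; // header plus payload, in bytes
    uint16_t checksum;
};

// Write the fields into 8 bytes in network byte order (big endian).
std::array<uint8_t, 8>
SerializeUdpHeader(const UdpHeaderFields& h)
{
    assert(h.length >= 8); // the length field counts the 8-byte header itself
    const uint16_t fields[4] = {h.sourcePort, h.destinationPort, h.length, h.checksum};
    std::array<uint8_t, 8> out{};
    for (std::size_t i = 0; i < 4; ++i)
    {
        out[2 * i] = static_cast<uint8_t>(fields[i] >> 8);
        out[2 * i + 1] = static_cast<uint8_t>(fields[i] & 0xff);
    }
    return out;
}

// Rebuild the fields from 8 bytes in network byte order.
UdpHeaderFields
DeserializeUdpHeader(const std::array<uint8_t, 8>& in)
{
    auto read16 = [&in](std::size_t i) {
        return static_cast<uint16_t>((in[i] << 8) | in[i + 1]);
    };
    return UdpHeaderFields{read16(0), read16(2), read16(4), read16(6)};
}
```

A round trip through SerializeUdpHeader and DeserializeUdpHeader returns the original field values, which is the property a header class needs so that the receiver sees what the sender wrote.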

16.6.2. ns-3 UDP

This is an implementation of the User Datagram Protocol described in RFC 768. UDP uses a simple connectionless communication model with a minimum of protocol mechanism. The implementation provides checksums for data integrity, and port numbers for addressing different functions at the source and destination of the datagram. It has no handshaking dialogues, and thus exposes the user’s data to any unreliability of the underlying network. There is no guarantee of data delivery, ordering, or duplicate protection.
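As an illustration of the checksum mechanism, the following standalone sketch computes the 16-bit one's complement Internet checksum over a byte buffer. It is a simplification: the real UDP checksum of RFC 768 also covers a pseudo-header with the IP addresses, protocol and length, which is omitted here, and the function name InternetChecksum is hypothetical rather than part of the ns-3 API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One's complement sum of 16-bit big-endian words, then complemented.
// Simplified: the UDP pseudo-header is not included in this sketch.
uint16_t
InternetChecksum(const std::vector<uint8_t>& data)
{
    assert(data.size() <= 65535); // keeps the 32-bit accumulator from overflowing
    uint32_t sum = 0;
    for (std::size_t i = 0; i + 1 < data.size(); i += 2)
    {
        sum += (static_cast<uint32_t>(data[i]) << 8) | data[i + 1];
    }
    if (data.size() % 2 != 0)
    {
        sum += static_cast<uint32_t>(data.back()) << 8; // pad the odd byte with zero
    }
    while (sum >> 16)
    {
        sum = (sum & 0xffff) + (sum >> 16); // fold the carries back in
    }
    return static_cast<uint16_t>(~sum & 0xffff);
}
```

A receiver can verify integrity by running the same function over the data followed by the transmitted checksum: the result is 0 when nothing was corrupted.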

16.6.2.1. Usage

In many cases, usage of UDP is set at the application layer by telling the ns-3 application which kind of socket factory to use.

Using the helper functions defined in src/applications/helper, here is how one would create a UDP receiver:

// Create a packet sink on the receiver
uint16_t port = 50000;
Address sinkLocalAddress(InetSocketAddress(Ipv4Address::GetAny(), port));
PacketSinkHelper sinkHelper("ns3::UdpSocketFactory", sinkLocalAddress);
ApplicationContainer sinkApp = sinkHelper.Install(serverNode);
sinkApp.Start(Seconds(1.0));
sinkApp.Stop(Seconds(10.0));

Similarly, the snippet below configures an OnOffApplication traffic source to use UDP:

// Create the OnOff applications to send data to the UDP receiver.
// remoteAddress is assumed to be an AddressValue holding the receiver's address and port.
OnOffHelper clientHelper("ns3::UdpSocketFactory", Address());
clientHelper.SetAttribute("Remote", remoteAddress);
ApplicationContainer clientApps = clientHelper.Install(clientNode);
clientApps.Start(Seconds(2.0));
clientApps.Stop(Seconds(9.0));

For users who wish to have a pointer to the actual socket (so that socket operations like Bind(), setting socket options, etc. can be done on a per-socket basis), UDP sockets can be created by using the Socket::CreateSocket() method as given below:

Ptr<Node> node = CreateObject<Node>();
InternetStackHelper internet;
internet.Install(node);

Ptr<SocketFactory> socketFactory = node->GetObject<UdpSocketFactory>();
Ptr<Socket> socket = socketFactory->CreateSocket();
socket->Bind(InetSocketAddress(Ipv4Address::GetAny(), 80));

Once a UDP socket is created, we do not need an explicit connection setup before sending and receiving data. Being a connectionless protocol, all we need to do is to create a socket and bind it to a known port. For a client, simply create a socket and start sending data. The Bind() call allows an application to specify a port number and an address on the local machine. It allocates a local IPv4 endpoint for this socket.

At the end of data transmission, the socket is closed using Socket::Close(). It returns 0 on success and -1 on failure.

Please note that applications usually create the sockets automatically. Please refer to the source code of your preferred application to discover how and when it creates the socket.

16.6.2.1.1. UDP Socket interaction and interface with Application layer

The following is the description of the public interface of the UDP socket, and how the interface is used to interact with the socket itself.

Socket APIs for UDP connections:

Connect()

This is called when Send() is used instead of SendTo() by the user. It sets the address of the remote endpoint which is used by Send(). If the remote address is valid, this method makes a callback to ConnectionSucceeded.

Bind()

Bind the socket to an address, or to a general endpoint. A general endpoint is an endpoint with an ephemeral port allocation (that is, a random port allocation) on the 0.0.0.0 IP address. For instance, in current applications, data senders usually bind automatically after a Connect() over a random port. Consequently, the connection will start from this random port towards the well-defined port of the receiver. The IP 0.0.0.0 is then translated by lower layers into the real IP of the device.

Bind6()

Same as Bind(), but for IPv6.

BindToNetDevice()

Bind the socket to the specified NetDevice. If set on a socket, this option will force packets to leave the bound device regardless of the device that IP routing would naturally choose. In the receive direction, only packets received from the bound interface will be delivered.

ShutdownSend()

Signals the termination of send, or in other words, prevents data from being added to the buffer.

Recv()

Grabs data from the UDP socket and forwards it to the application layer. If no data is present (i.e. m_deliveryQueue.empty() returns true), an empty packet is returned.

RecvFrom()

Same as Recv(), but with the source address as parameter.

SendTo()

The SendTo() API is the UDP counterpart of the TCP API Send(). It additionally specifies the address to which the message is to be sent because no prior connection is established in UDP communication. It returns the number of bytes sent or -1 in case of failure.

Close()

The close API closes a socket and terminates the connection. This results in freeing all the data structures previously allocated.


Public callbacks

These callbacks are called by the UDP socket to notify the application of interesting events. We will refer to these with the protected name used in socket.h, but we will provide the API function to set the pointers to these callbacks as well.

NotifyConnectionSucceeded: SetConnectCallback, 1st argument

Called when the Connect() succeeds and the remote address is validated.

NotifyConnectionFailed: SetConnectCallback, 2nd argument

Called in Connect() when the remote address validation fails.

NotifyDataSent: SetDataSentCallback

The socket notifies the application that some bytes have been transmitted at the IP layer. These bytes could still be lost in the node (traffic control layer) or in the network.

NotifySend: SetSendCallback

Invoked to get the space available in the tx buffer when a packet (that carries data) is sent.

NotifyDataRecv: SetRecvCallback

Called when the socket receives a packet (that carries data) in the receiver buffer.

16.6.2.2. Validation

The following test cases have been provided for the UDP implementation in the src/internet/test/udp-test.cc file.

  • UdpSocketImplTest: Checks data received via UDP Socket over IPv4.

  • UdpSocketLoopbackTest: Checks data received via UDP Socket Loopback over IPv4.

  • Udp6SocketImplTest: Checks data received via UDP Socket over IPv6.

  • Udp6SocketLoopbackTest: Checks data received via UDP Socket Loopback over IPv6.

16.6.2.3. Limitations

  • UDP_CORK is presently not part of this implementation.

  • NotifyNormalClose, NotifyErrorClose, NotifyConnectionRequest and NotifyNewConnectionCreated socket API callbacks are not supported.

17. Internet Applications Module Documentation

The goal of this module is to hold all the Internet-specific applications, and most notably some very specific applications (e.g., ping) or daemons (e.g., radvd). Other non-Internet-specific applications such as packet generators are contained in other modules.

The source code for the new module lives in the directory src/internet-apps.

Each application has its own goals, limitations and scope, which are briefly explained in the following.

All the applications are extensively used in the top-level examples directories. The users are encouraged to check the scripts therein to have a clear overview of the various options and usage tricks.

17.1. Ping

The Ping application supports both IPv4 and IPv6 and replaces earlier ns-3 implementations called v4Ping and Ping6 that were address family dependent. Ping was introduced in the ns-3.38 release cycle.

17.1.1. Model Description

This application behaves similarly to the Unix ping application, although with fewer options supported. Ping sends ICMP Echo Request messages to a remote address, and collects statistics and reports on the ICMP Echo Reply responses that are received. The application can be used to send ICMP echo requests to unicast, broadcast, and multicast IPv4 and IPv6 addresses. The application can produce a verbose output similar to the real application, and can also export statistics and results via trace sources. The following can be controlled via attributes of this class:

  • Destination address

  • Local address (sender address)

  • Packet size (default 56 bytes)

  • Packet interval (default 1 second)

  • Timeout value (default 1 second)

  • The count, or maximum number of packets to send

  • Verbose mode

In practice, the real-world ping application behavior varies slightly depending on the operating system (Linux, macOS, Windows, etc.). Most implementations also support a very large number of options. The ns-3 model is intended to handle the most common use cases of testing for reachability.

17.1.1.1. Design

The aim of ns-3 Ping application is to mimic the built-in application found in most operating systems. In practice, ping is usually used to check reachability of a destination, but additional options have been added over time and the tool can be used in different ways to gather statistics about reachability and round trip times (RTT). Since ns-3 is mainly used for performance studies and not for operational forensics, some options of real ping implementations may not be useful for simulations. However, the ns-3 application can deliver output and RTT samples similar to how the real application operates.

Ping is usually installed on a source node and does not require any ns-3 application installation on the destination node. Ping is an Application that can be started and stopped using the base class Application APIs.

17.1.1.2. Behavior

The behavior of real ping applications varies across operating systems. For example, on Linux, the first ICMP sequence number sent is one, while on macOS, the first sequence number is zero. The behavior when pinging non-existent hosts also can differ (Linux is quiet while macOS is verbose). Windows and other operating systems like Cisco routers also can behave slightly differently.

This implementation tries to generally follow the Linux behavior, except that it will print out a verbose ‘request timed out’ message when an echo request is sent and no reply arrives in a timely manner. The timeout value (time that ping waits for a response to return) defaults to one second, but once there are RTT samples available, the timeout is set to twice the observed RTT. In contrast to Linux (but aligned with macOS), the first sequence number sent is zero.
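The timeout rule described above can be sketched in a few lines of standalone code. This is not the ns-3 Ping implementation: the function name PingTimeoutSeconds is hypothetical, and the choice of the most recent sample as "the observed RTT" is an assumption made for this sketch, since the text does not say which sample is used.

```cpp
#include <cassert>
#include <vector>

// Timeout to wait for an echo reply, in seconds.
// Assumption for this sketch: "observed RTT" means the most recent sample.
double
PingTimeoutSeconds(const std::vector<double>& rttSamplesSeconds,
                   double defaultTimeoutSeconds = 1.0)
{
    assert(defaultTimeoutSeconds > 0.0);
    if (rttSamplesSeconds.empty())
    {
        return defaultTimeoutSeconds; // no samples yet: use the configured value
    }
    return 2.0 * rttSamplesSeconds.back(); // twice the observed RTT
}
```

With no samples the function returns the one-second default; once a sample of 0.25 s is available it returns 0.5 s.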

17.1.1.3. Scope and Limitations

Real ping implementations have many command-line options. The ns-3 implementation only supports a few of the most commonly-used options; patches to add additional options would be welcome.

At the present time, fragmentation (sending an ICMP Echo Request larger than the path MTU) is not handled correctly during Echo Response reassembly.

17.1.2. Usage

Users may create and install Ping applications on nodes on a one-by-one basis using CreateObject or by using the PingHelper. For CreateObject, the following can be used:

Ptr<Node> n = ...;
Ptr<Ping> ping = CreateObject<Ping> ();
// Configure ping as needed...
n->AddApplication (ping);

Users should be aware of how this application stops. For most ns-3 applications, StopApplication() should be called before the simulation is stopped. If the Count attribute of this application is set to a positive integer, the application will stop (and a report will be printed) either when Count responses have been received or when StopApplication() is called, whichever comes first. If Count is zero, meaning infinite pings, then StopApplication() should be used to eventually stop the application and generate the report. If StopApplication() is called while a packet (echo request) is in flight, the response cannot be received and the packet will be treated as lost in the report; real ping applications work this way as well. To avoid this, it is recommended to call StopApplication() at a time when an Echo Request or Echo Response packet is not expected to be in flight.
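The "whichever comes first" rule can be expressed as a small standalone function. This is only a sketch of the rule as stated above, not the ns-3 code: the name ReportTimeSeconds is hypothetical, and reply arrival times are assumed to be given in increasing order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Time at which the application stops and the report is produced.
// count == 0 means unlimited pings; replyTimes must be sorted (assumption).
double
ReportTimeSeconds(uint32_t count, const std::vector<double>& replyTimes, double stopTime)
{
    assert(std::is_sorted(replyTimes.begin(), replyTimes.end()));
    if (count > 0 && replyTimes.size() >= count)
    {
        // The Count-th reply ends the run, unless StopApplication() comes first.
        return std::min(replyTimes[count - 1], stopTime);
    }
    return stopTime; // unlimited pings, or not enough replies before the stop
}
```

For example, with Count set to 3 and replies at 1.02, 2.02, 3.02 and 4.02 seconds, the report is produced at 3.02 s even if StopApplication() is scheduled at 10 s.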

17.1.2.1. Helpers

The PingHelper supports the typical Install usage pattern in ns-3. The following sample code is from the program examples/tcp/tcp-validation.cc.

PingHelper pingHelper(Ipv4Address("192.168.1.2"));
pingHelper.SetAttribute("Interval", TimeValue(pingInterval));
pingHelper.SetAttribute("Size", UintegerValue(pingSize));
pingHelper.SetAttribute("VerboseMode", EnumValue(Ping::VerboseMode::SILENT));
ApplicationContainer pingContainer = pingHelper.Install(pingServer);
Ptr<Ping> ping = pingContainer.Get(0)->GetObject<Ping>();
ping->TraceConnectWithoutContext("Rtt", MakeBoundCallback(&TracePingRtt, &pingOfStream));
pingContainer.Start(Seconds(1));
pingContainer.Stop(stopTime - Seconds(1));

The first statement sets the remote address (destination) for all application instances created with this helper. The second and third statements perform further configuration. The fourth statement configures the verbosity to be totally silent. The fifth statement is a typical Install() method that returns an ApplicationContainer (in this case, of size 1). The sixth and seventh statements fetch the application instance created and configure a trace sink (TracePingRtt) for the Rtt trace source. The eighth and ninth statements configure the start and stop time, respectively.

The helper is most useful when there are many similarly configured applications to install on a collection of nodes (a NodeContainer). When there is only one Ping application to configure in a program, or when the configuration between different instances is different, it may be more straightforward to directly create the Ping applications without the PingHelper.

17.1.2.2. Attributes

The following attributes can be configured:

  • Destination: The IPv4 or IPv6 address of the machine we want to ping

  • VerboseMode: Configure verbose, quiet, or silent output

  • Interval: Time interval between sending each packet

  • Size: The number of data bytes to be sent, before ICMP and IP headers are added

  • Count: The maximum number of packets the application will send

  • InterfaceAddress: Local address of the sender

  • Timeout: Time to wait for response if no RTT samples are available

17.1.2.3. Output

If VerboseMode is set to VERBOSE, ping will output the results of ICMP Echo Reply responses to the std::cout output stream. If the mode is set to QUIET, only the initial statement and summary are printed. If the mode is set to SILENT, no output will be printed to std::cout. These behavioral differences can be seen with the ping-example.cc as follows:

$ ./ns3 run --no-build 'ping-example --ns3::Ping::VerboseMode=Verbose'
$ ./ns3 run --no-build 'ping-example --ns3::Ping::VerboseMode=Quiet'
$ ./ns3 run --no-build 'ping-example --ns3::Ping::VerboseMode=Silent'

Additional output can be gathered by using the four trace sources provided by Ping:

  • Tx: This trace executes when a new packet is sent, and returns the sequence number and full packet (including ICMP header).

  • Rtt: Each time an ICMP echo reply is received, this trace is called and reports the sequence number and RTT.

  • Drop: If an ICMP error is returned instead of an echo reply, the sequence number and reason for reported drop are returned.

  • Report: When ping completes and exits, it prints output statistics to the terminal. These values are copied to a struct PingReport and returned in this trace source.
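As an illustration of the kind of statistics a final report carries, the following standalone sketch summarizes a run from the number of requests sent and the RTT samples collected (for example through the Rtt trace). The struct PingSummary and its fields are hypothetical and are not the actual members of the ns-3 PingReport struct.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical summary; not the actual layout of ns-3's PingReport.
struct PingSummary
{
    uint32_t transmitted;
    uint32_t received;
    double lossPercent;
    double rttMinMs;
    double rttAvgMs;
    double rttMaxMs;
};

PingSummary
Summarize(uint32_t transmitted, const std::vector<double>& rttSamplesMs)
{
    assert(rttSamplesMs.size() <= transmitted); // cannot receive more than was sent
    PingSummary s{};
    s.transmitted = transmitted;
    s.received = static_cast<uint32_t>(rttSamplesMs.size());
    s.lossPercent =
        (transmitted == 0) ? 0.0 : 100.0 * (transmitted - s.received) / transmitted;
    if (!rttSamplesMs.empty())
    {
        s.rttMinMs = *std::min_element(rttSamplesMs.begin(), rttSamplesMs.end());
        s.rttMaxMs = *std::max_element(rttSamplesMs.begin(), rttSamplesMs.end());
        s.rttAvgMs = std::accumulate(rttSamplesMs.begin(), rttSamplesMs.end(), 0.0) /
                     rttSamplesMs.size();
    }
    return s;
}
```

With five requests sent and four replies of 20 ms each, the summary reports 20% loss and a minimum, average and maximum RTT of 20 ms.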

17.1.2.4. Example

A basic ping-example.cc program is provided to highlight the following usage. The topology has three nodes interconnected by two point-to-point links. Each link has 5 ms one-way delay, for a round-trip propagation delay of 20 ms. The transmission rate on each link is 100 Mbps. The routing between links is enabled by ns-3’s NixVector routing.
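The 20 ms figure follows from simple arithmetic: two links, 5 ms each way, crossed once in each direction. The standalone sketch below reproduces this calculation and also shows how the per-link serialization time could be estimated; the function names are hypothetical and the frame size used in the example is an arbitrary illustrative value, not one taken from ping-example.cc.

```cpp
#include <cassert>
#include <cstdint>

// Round-trip propagation delay in milliseconds over a chain of links
// that all have the same one-way delay.
double
RoundTripPropagationMs(uint32_t links, double oneWayDelayPerLinkMs)
{
    assert(links > 0);
    return 2.0 * links * oneWayDelayPerLinkMs;
}

// Time in microseconds needed to put one frame on a link.
double
SerializationTimeUs(uint32_t frameBytes, double rateMbps)
{
    assert(rateMbps > 0.0);
    return (frameBytes * 8.0) / rateMbps; // bits / (Mbit/s) = microseconds
}
```

RoundTripPropagationMs(2, 5.0) gives 20 ms. At 100 Mbps, a 100-byte frame takes 8 microseconds to serialize on each link, so the measured RTT is expected to be only slightly above the propagation delay.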

By default, this program will send 5 pings from node A to node C. When using the default IPv6, the output will look like this:

The example program will also produce four pcap traces (one for each NetDevice in the scenario) that can be viewed using tcpdump or Wireshark.

Other program options include options to change the destination and source addresses, number of packets (count), packet size, interval, and whether to enable logging (if logging is enabled in the build). These program options will override any corresponding attribute settings.

Finally, the program has some code that can be enabled to selectively force packet drops to check such behavior.

17.1.3. Validation

The following test cases have been added for regression testing:

  1. Unlimited pings, no losses, StopApplication () with no packets in flight

  2. Unlimited pings, no losses, StopApplication () with one packet in flight

  3. Test for operation of count attribute and exit time after all pings are received, for IPv4

  4. Test the operation of interval attribute, for IPv4

  5. Test for behavior of pinging an unreachable host when the network does not send an ICMP unreachable message

  6. Test pinging to IPv4 broadcast address and IPv6 all nodes multicast address

  7. Test behavior of first reply lost in a count-limited configuration

  8. Test behavior of second reply lost in a count-limited configuration

  9. Test behavior of last reply lost in a count-limited configuration.

17.2. Radvd

This application mimics a “RADVD” daemon, i.e., the daemon responsible for IPv6 router advertisements. All the IPv6 routers should have a RADVD daemon installed.

The configuration of the Radvd application mimics that of the Linux radvd program.

17.3. DHCPv4

The ns-3 implementation of Dynamic Host Configuration Protocol (DHCP) follows the specifications of RFC 2131 and RFC 2132.

The source code for DHCP is located in src/internet-apps/model and consists of the following 6 files:

  • dhcp-server.h,

  • dhcp-server.cc,

  • dhcp-client.h,

  • dhcp-client.cc,

  • dhcp-header.h and

  • dhcp-header.cc

17.3.1. Helpers

The following two files have been added to src/internet-apps/helper for DHCP:

  • dhcp-helper.h and

  • dhcp-helper.cc

17.3.2. Tests

The tests for DHCP can be found at src/internet-apps/test/dhcp-test.cc

17.3.3. Examples

The examples for DHCP can be found at src/internet-apps/examples/dhcp-example.cc

17.3.4. Scope and Limitations

The server should be provided with a network address, a mask and a range of addresses for the pool. One client application can be installed on only one netdevice in a node, and can configure an address only for that netdevice.

The following five basic DHCP messages are supported:

  • DHCP DISCOVER

  • DHCP OFFER

  • DHCP REQUEST

  • DHCP ACK

  • DHCP NACK

Also, the following eight options of BootP are supported:

  • 1 (Mask)

  • 50 (Requested Address)

  • 51 (Address Lease Time)

  • 53 (DHCP message type)

  • 54 (DHCP server identifier)

  • 58 (Address renew time)

  • 59 (Address rebind time)

  • 255 (end)
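For readers unfamiliar with the option encoding, the following standalone sketch builds an option field in the code/length/value layout of RFC 2132. It does not use the ns-3 DhcpHeader class: the functions AppendOption and BuildAckOptions are hypothetical names, and the subnet mask used is an arbitrary example value.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Append one option as: code (1 byte), length (1 byte), value bytes.
void
AppendOption(std::vector<uint8_t>& out, uint8_t code, const std::vector<uint8_t>& value)
{
    assert(value.size() <= 255); // the length field is a single byte
    out.push_back(code);
    out.push_back(static_cast<uint8_t>(value.size()));
    out.insert(out.end(), value.begin(), value.end());
}

// Options that a DHCP ACK granting a lease might carry (illustrative values).
std::vector<uint8_t>
BuildAckOptions(uint32_t leaseSeconds)
{
    std::vector<uint8_t> out;
    AppendOption(out, 53, {5});               // message type: 5 = DHCPACK
    AppendOption(out, 1, {255, 255, 255, 0}); // subnet mask (example value)
    AppendOption(out,
                 51, // address lease time, 32 bits, big endian
                 {static_cast<uint8_t>(leaseSeconds >> 24),
                  static_cast<uint8_t>(leaseSeconds >> 16),
                  static_cast<uint8_t>(leaseSeconds >> 8),
                  static_cast<uint8_t>(leaseSeconds)});
    out.push_back(255); // end option: a code byte only, no length
    return out;
}
```

For a lease of 3600 seconds the result is 16 bytes long: three bytes for the message type, six each for the mask and the lease time, and the final end marker.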

The client identifier option (61) may be implemented in the near future.

In the current implementation, a DHCP client can obtain an IPv4 address dynamically from the DHCP server, and can renew it within a lease time period.

Multiple DHCP servers can be configured, but the implementation does not support the use of a DHCP Relay yet.

17.4. V4TraceRoute

Documentation is missing for this application.


18. Low-Rate Wireless Personal Area Network (LR-WPAN)

This chapter describes the implementation of ns-3 models for the low-rate, wireless personal area network (LR-WPAN) as specified by IEEE standard 802.15.4 (2003, 2006, 2011).

18.1. Model Description

The model is implemented in the src/lrwpan/ folder. The model design closely follows the standard from an architectural standpoint.

_images/lr-wpan-arch.png

Architecture and scope of lr-wpan models

The grey areas in the figure (adapted from Fig 3. of IEEE Std. 802.15.4-2006) show the scope of the model.

The Spectrum NetDevice from Nicola Baldo is the basis for the implementation.

The implementation also borrows some ideas from the ns-2 models developed by Zheng and Lee.

18.1.1. APIs

The APIs closely follow the standard, adapted for ns-3 naming conventions and idioms. The APIs are organized around the concept of service primitives as shown in the following figure adapted from Figure 14 of IEEE Std. 802.15.4-2006.

_images/lr-wpan-primitives.png

Service primitives

The APIs primitives are organized around four conceptual services and service access points (SAP):

  • MAC data service (MCPS)

  • MAC management service (MLME)

  • PHY data service (PD)

  • PHY management service (PLME)

In general, primitives are standardized as follows (e.g. Sec 7.1.1.1.1 of IEEE 802.15.4-2006):

MCPS-DATA.request(
                  SrcAddrMode,
                  DstAddrMode,
                  DstPANId,
                  DstAddr,
                  msduLength,
                  msdu,
                  msduHandle,
                  TxOptions,
                  SecurityLevel,
                  KeyIdMode,
                  KeySource,
                  KeyIndex
                 )

In ns-3 this maps to classes, structs and methods such as:

struct McpsDataRequestParameters
{
  uint8_t m_srcAddrMode;
  uint8_t m_dstAddrMode;
  ...
};

void
LrWpanMac::McpsDataRequest(McpsDataRequestParameters params)
{
...
}
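The request/confirm pattern behind this mapping can be illustrated with a standalone sketch. It does not use the ns-3 classes: MockMac, MockDataRequestParams, MockDataConfirmParams and MockStatus are hypothetical names introduced only for illustration, and the size check is a deliberate simplification.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-ins, not the ns-3 classes. They only mirror the pattern
// "primitive parameters -> struct, request -> method, confirm -> callback".
enum class MockStatus
{
    SUCCESS,
    FRAME_TOO_LONG
};

struct MockDataRequestParams
{
    uint8_t m_srcAddrMode{2}; // 2 = 16-bit short addressing in the standard
    uint8_t m_dstAddrMode{2};
    uint16_t m_dstPanId{0};
    uint16_t m_dstAddr{0};
    uint8_t m_msduHandle{0};
};

struct MockDataConfirmParams
{
    uint8_t m_msduHandle;
    MockStatus m_status;
};

class MockMac
{
  public:
    using DataConfirmCallback = std::function<void(MockDataConfirmParams)>;

    void SetDataConfirmCallback(DataConfirmCallback cb)
    {
        m_confirm = std::move(cb);
    }

    // Plays the role of MCPS-DATA.request. A real MAC reports the outcome
    // later (after CSMA/CA and transmission); this mock reports it at once.
    void DataRequest(const MockDataRequestParams& params, const std::vector<uint8_t>& msdu)
    {
        assert(m_confirm && "register a confirm callback first");
        const MockStatus status =
            msdu.size() > kMaxFrameSize ? MockStatus::FRAME_TOO_LONG : MockStatus::SUCCESS;
        m_confirm(MockDataConfirmParams{params.m_msduHandle, status});
    }

  private:
    // Simplification: the MSDU is compared against the 127-byte PHY packet
    // limit and MAC header overhead is ignored.
    static constexpr std::size_t kMaxFrameSize = 127;
    DataConfirmCallback m_confirm;
};
```

The higher layer registers a confirm callback once and then matches each confirm to its request through the msduHandle value.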

The MAC primitives currently supported by the ns-3 model are:

  • MCPS-DATA (Request, Confirm, Indication)

  • MLME-START (Request, Confirm)

  • MLME-SCAN (Request, Confirm)

  • MLME-BEACON-NOTIFY (Indication)

  • MLME-ASSOCIATE (Request, Confirm, Response, Indication)

  • MLME-POLL (Confirm)

  • MLME-COMM-STATUS (Indication)

  • MLME-SYNC (Request)

  • MLME-SYNC-LOSS (Indication)

  • MLME-SET (Request, Confirm)

  • MLME-GET (Request, Confirm)

The PHY primitives currently supported by the ns-3 model are:

  • PLME-CCA (Request, Confirm)

  • PD-DATA (Request, Confirm, Indication)

  • PLME-SET-TRX-STATE (Request, Confirm)

  • PLME-SET (Request, Confirm)

  • PLME-GET (Request, Confirm)

For more information on primitives, see IEEE 802.15.4-2011, Table 8.

18.1.2. The PHY layer

The physical layer components consist of a PHY model, an error rate model, and a loss model. The PHY state transitions are roughly modeled after ATMEL’s AT86RF233.

_images/lr-wpan-phy.png

Ns-3 lr-wpan PHY basic operating mode state diagram

The error rate model presently models the error rate for the IEEE 802.15.4 2.4 GHz AWGN channel with O-QPSK; the model description can be found in IEEE Std 802.15.4-2006, section E.4.1.7.

The PHY model is based on SpectrumPhy and follows the specification described in section 6 of IEEE Std 802.15.4-2006. It models PHY service specifications, PPDU formats, PHY constants and PIB attributes. It currently supports only the transmit power spectral density mask specified for 2.4 GHz in section 6.5.3.1. The noise power density assumes uniformly distributed thermal noise across the frequency bands. The loss model can fully utilize all existing simple (non-spectrum PHY) loss models. The PHY model uses the existing single spectrum channel model.

The physical layer is modeled at the packet level; that is, no preamble/SFD detection is done. Packet reception starts with the first bit of the preamble (which is not modeled) if the SNR is more than -5 dB; see IEEE Std 802.15.4-2006, appendix E, Figure E.2. Reception of the packet finishes once the packet has been completely transmitted. Other packets arriving during reception add to the interference/noise.

Rx sensitivity is defined as the weakest possible signal at which a receiver can receive and decode a packet with a high success rate. According to the standard (IEEE Std 802.15.4-2006, section 6.1.7), this corresponds to the point where the packet error rate is under 1% for 20-byte PSDU reference packets (11 bytes MAC header + 7 bytes payload (MSDU) + 2 bytes FCS). Setting a lower Rx sensitivity value (increasing the radio's hearing capability) allows more packets to be received (and at a greater distance), but it raises the probability of packets being dropped at the MAC layer or being corrupted. By default, the receiver sensitivity is set to the maximum theoretically possible value of -106.58 dBm for the supported IEEE 802.15.4 O-QPSK 250 kbps PHY. This Rx sensitivity corresponds to a “perfect radio” that considers only the noise floor; in other words, it does not include the noise factor (noise introduced by imperfections in the demodulator chip or by external factors). The receiver sensitivity can be changed using the SetRxSensitivity function in the PHY to simulate the hearing capabilities of different compliant radio transceivers (the standard minimum compliant Rx sensitivity is -85 dBm):

                                                           (defined by the standard)
NoiseFloor          Max Sensitivity                          Min Sensitivity
-106.987dBm          -106.58dBm                                   -85dBm
 |-------------------------|------------------------------------------|
                       Noise Factor = 1
                           | <--------------------------------------->|
                                 Acceptable sensitivity range
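The acceptable range in the diagram can be expressed as a small standalone check. This is a minimal sketch that does not call the ns-3 API; the function and constant names are hypothetical, and only the two limits (-106.58 dBm and -85 dBm) come from the text above.

```cpp
#include <cassert>
#include <cmath>

// Limits quoted in the text; the names are illustrative only.
constexpr double kMaxSensitivityDbm = -106.58; // most sensitive ("perfect radio")
constexpr double kMinSensitivityDbm = -85.0;   // least sensitive compliant radio

// True if a configured Rx sensitivity lies inside the acceptable range.
bool IsAcceptableRxSensitivity(double sensitivityDbm)
{
    return sensitivityDbm >= kMaxSensitivityDbm && sensitivityDbm <= kMinSensitivityDbm;
}

// Standard dBm to watts conversion: P[W] = 10^((P[dBm] - 30) / 10).
double DbmToWatts(double dbm)
{
    return std::pow(10.0, (dbm - 30.0) / 10.0);
}

// A signal is a candidate for reception only if it is at or above the
// configured sensitivity; weaker signals are dropped regardless of their
// theoretical error probability.
bool IsAboveSensitivity(double rxPowerDbm, double sensitivityDbm)
{
    assert(IsAcceptableRxSensitivity(sensitivityDbm));
    return rxPowerDbm >= sensitivityDbm;
}
```

For example, with a sensitivity of -95 dBm a signal received at -90 dBm is considered for reception, while one received at -100 dBm is not.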

The example lr-wpan-per-plot.cc shows that, at a given Rx sensitivity, packets are dropped regardless of their theoretical error probability. This program outputs a file named 802.15.4-per-vs-rxSignal.plt. Loading this file into gnuplot yields a file 802.15.4-per-vs-rxSignal.eps, which can be converted to PDF or other formats. Packet payload size, Tx power and Rx sensitivity can be configured. The point where the blue line crosses the PER curve indicates the Rx sensitivity. The default output is shown below.

_images/802-15-4-per-sens.png

Default output of the program lr-wpan-per-plot.cc

18.1.3. The MAC layer

The MAC at present implements both the unslotted CSMA/CA (non-beacon mode) and the slotted CSMA/CA (beacon-enabled mode). The beacon-enabled mode supports only direct transmissions. Indirect transmissions and Guaranteed Time Slots (GTS) are currently not supported.

The present implementation supports a single PAN coordinator; support for additional coordinators is under consideration for future releases.

The implemented MAC is similar to Contiki’s NullMAC, i.e., a MAC without sleep features. The radio is assumed to be always active (receiving or transmitting), or completely shut down. Frame reception is not disabled while performing the CCA.

The main API supported is the data transfer API (McpsDataRequest/Indication/Confirm). CSMA/CA according to Std 802.15.4-2006, section 7.5.1.4 is supported. Frame reception and rejection according to Std 802.15.4-2006, section 7.5.6.2 is supported, including acknowledgements. Both short and extended addressing are supported. Various trace sources are supported, and trace sources can be hooked to sinks.

18.1.3.1. Scan and Association

The implemented ns-3 MAC layer supports scanning. Typically, a scan request precedes an association request, but the two can be used independently. The ns-3 IEEE 802.15.4 MAC layer supports four types of scanning:

  • Energy Detection (ED) Scan: In an energy scan, a device or a coordinator scans a set number of channels looking for traces of energy. The maximum energy registered during a given amount of time is stored. An energy scan is typically used to measure the quality of a channel at a given time. For this reason, coordinators often use this scan before initiating a PAN on a channel.

  • Active Scan: A device sends beacon request commands on a set number of channels looking for a PAN coordinator. The receiving coordinator must be configured in non-beacon mode; coordinators in beacon mode ignore these requests. The coordinators that accept the request respond with a beacon. After an active scan takes place, during the association process, devices extract the information in the PAN descriptors from the collected beacons and, based on this information (e.g. channel, LQI level), choose a coordinator to associate with.

  • Passive Scan: In a passive scan, no beacon request commands are sent. Devices scan a set number of channels looking for beacons currently being transmitted (by coordinators in beacon mode). As in the active scan, the information from beacons is stored in PAN descriptors and used by the device to choose a coordinator to associate with.

  • Orphan Scan: An orphan scan is typically used by a device after repeated communication failures with a coordinator. In other words, an orphan scan represents the intent of a device to relocate its coordinator. In some situations, it can be used by a device's higher layers not only to rejoin a network but also to join a network for the first time. In an orphan scan, a device sends an orphan notification command on a given list of channels. If a coordinator receives this notification, it responds to the device with a coordinator realignment command.

In active and passive scans, the link quality indicator (LQI) is the main parameter used to determine the optimal coordinator. LQI values range from 0 to 255, where 255 is the highest link quality and 0 the lowest. Typically, a link with an LQI lower than 127 is considered to be of poor quality.
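The LQI-based choice can be sketched as follows. This is a standalone illustration, not the ns-3 implementation: PanDescriptorSketch is a hypothetical, reduced stand-in for a PAN descriptor, and the selection rule (highest LQI wins) is an assumption about what a higher layer might do with the scan results.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, reduced PAN descriptor: only the fields needed here.
struct PanDescriptorSketch
{
    uint16_t panId;
    uint8_t channel;
    uint8_t lqi; // 0 (worst) .. 255 (best)
};

constexpr uint8_t kPoorLqiThreshold = 127;

// Returns the descriptor with the highest LQI collected during a scan.
// If several share the highest LQI, the first one found is returned.
const PanDescriptorSketch& SelectBestCoordinator(const std::vector<PanDescriptorSketch>& found)
{
    assert(!found.empty()); // the scan must have collected at least one beacon
    return *std::max_element(found.begin(),
                             found.end(),
                             [](const PanDescriptorSketch& a, const PanDescriptorSketch& b) {
                                 return a.lqi < b.lqi;
                             });
}

// A link with an LQI lower than 127 is considered to be of poor quality.
bool IsPoorLink(uint8_t lqi)
{
    return lqi < kPoorLqiThreshold;
}
```

A higher layer could additionally reject the best candidate when IsPoorLink returns true and repeat the scan later.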

In LR-WPAN, association is used to join PANs. All devices in an LR-WPAN must belong to a PAN to communicate. ns-3 uses the classic association procedure described in the standard. The standard also covers a more effective association procedure known as fast association (see IEEE 802.15.4-2015, fastA), but this is currently not supported by ns-3. Alternatively, ns-3 can do a “quick and dirty” association using either `LrWpanHelper::AssociateToPan` or `LrWpanHelper::AssociateToBeaconPan`. These functions are used when a preset association can be done, for example, when the relationships between existing nodes and coordinators are known and can be set before the beginning of the simulation. In other situations, as in many real deployments or in large networks, it is desirable that devices “associate themselves” with the best available coordinator candidates. This process is known as bootstrap, and simulating it makes it possible to demonstrate the kind of situations a node would face when associating in large networks in a real environment.

Bootstrap (a.k.a. network initialization) is possible with a combination of scan and association MAC primitives. Details on the general process for this network initialization are described in the standard. Bootstrap is a complex process that requires not only scanning for networks, but also the exchange of command frames and the use of a pending transaction list (indirect transmissions) in the coordinator to store command frames. The following figure summarizes the whole process:

_images/lr-wpan-assocSequence.png

Bootstrap as a whole depends on procedures that also take place in the higher layers of devices and coordinators. These procedures are briefly described in the standard but are out of its scope (see IEEE 802.15.4-2011, Section 5.1.3.1). However, they are necessary for a “complete bootstrap” process. In the ns-3 examples, these higher-layer procedures are implemented only minimally, enough to provide a complete example that shows the use of scan and association. A full higher layer (such as those found in the Zigbee and Thread protocol stacks) should complete these procedures more robustly.

18.1.3.2. MAC transmission Queues

By default, the Tx queue and the Ind Tx queue (the pending transaction list) are not limited, but they can be configured to drop packets once they reach a maximum number of elements (transaction overflow). Additionally, the Ind Tx queue drops packets that have been stored longer than macTransactionPersistenceTime (transaction expiration). Expiration of packets in the Tx queue is not supported. Finally, packets in the Tx queue may be dropped due to excessive transmission retries or channel access failure.
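The two drop conditions of the pending transaction list (overflow and expiration) can be sketched with a standalone queue. This is not the ns-3 implementation: IndTxQueueSketch and PendingFrameSketch are hypothetical names, time is a plain number of seconds rather than the units used by macTransactionPersistenceTime, and a limit of 0 is used here to mean "unlimited".

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>

struct PendingFrameSketch
{
    uint32_t id;
    double enqueuedAt; // seconds (illustrative time unit)
};

class IndTxQueueSketch
{
  public:
    // maxSize == 0 means "not limited", mirroring the default behaviour.
    IndTxQueueSketch(std::size_t maxSize, double persistenceTime)
        : m_maxSize(maxSize),
          m_persistenceTime(persistenceTime)
    {
        assert(persistenceTime > 0.0);
    }

    // Returns false if the frame is dropped because the queue is full
    // (transaction overflow).
    bool Enqueue(uint32_t id, double now)
    {
        if (m_maxSize != 0 && m_frames.size() >= m_maxSize)
        {
            return false;
        }
        m_frames.push_back({id, now});
        return true;
    }

    // Drops frames stored longer than the persistence time (transaction
    // expiration) and returns how many were dropped. Frames are kept in
    // arrival order, so only the front of the queue needs checking.
    std::size_t PurgeExpired(double now)
    {
        std::size_t dropped = 0;
        while (!m_frames.empty() && now - m_frames.front().enqueuedAt > m_persistenceTime)
        {
            m_frames.pop_front();
            ++dropped;
        }
        return dropped;
    }

    std::size_t Size() const
    {
        return m_frames.size();
    }

  private:
    std::size_t m_maxSize;
    double m_persistenceTime;
    std::deque<PendingFrameSketch> m_frames;
};
```

A Tx queue sketch would look the same without PurgeExpired, since expiration of packets in the Tx queue is not supported.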

18.1.3.3. MAC addresses

Contrary to other technologies, an IEEE 802.15.4 device has two different kinds of addresses:

  • Long addresses (64 bits)

  • Short addresses (16 bits)

The 64-bit addresses are unique worldwide, and set by the device vendor (in a real device). The 16-bit addresses are not guaranteed to be unique, and they are typically either assigned during device deployment, or assigned dynamically during the device bootstrap.

The other relevant “address” to consider is the PanId (16 bits), which represents the PAN the device is attached to.

Due to the limited number of available bytes in a packet, IEEE 802.15.4 tries to use short addresses instead of long addresses, even though the two might be used at the same time.

For the sake of communicating with the upper layers, and in particular to generate auto-configured IPv6 addresses, each NetDevice must identify itself with a MAC address. The MAC addresses are also used during packet reception, so it is important to use them consistently.

Focusing on IPv6 Stateless address autoconfiguration (SLAAC), there are two relevant RFCs to consider: RFC 4944 and RFC 6282, and the two differ on how to build the IPv6 address given the NetDevice address.

RFC 4944 mandates that the IID part of the IPv6 address is calculated as YYYY:00ff:fe00:XXXX, while RFC 6282 mandates that it is calculated as 0000:00ff:fe00:XXXX, where XXXX is the device short address and YYYY is the PanId. In both cases the U/L bit must be set to local, so in the RFC 4944 case the PanId might have one bit flipped.
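The two IID layouts can be made concrete with a short standalone sketch. It is not the ns-3 code: BuildIidSketch and PseudoAddressStyle are hypothetical names, and "setting the U/L bit to local" is assumed here to mean clearing bit 0x02 of the first octet of the IID.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

enum class PseudoAddressStyle
{
    RFC4944, // YYYY:00ff:fe00:XXXX
    RFC6282  // 0000:00ff:fe00:XXXX
};

// Builds the 64-bit interface identifier from the PanId (YYYY) and the
// 16-bit short address (XXXX).
std::array<uint8_t, 8> BuildIidSketch(uint16_t panId, uint16_t shortAddr, PseudoAddressStyle style)
{
    // 0xfffe and 0xffff are reserved values, not assignable short addresses.
    assert(shortAddr < 0xfffe);

    const uint16_t prefix = (style == PseudoAddressStyle::RFC4944) ? panId : 0x0000;
    std::array<uint8_t, 8> iid = {static_cast<uint8_t>(prefix >> 8),
                                  static_cast<uint8_t>(prefix & 0xff),
                                  0x00,
                                  0xff,
                                  0xfe,
                                  0x00,
                                  static_cast<uint8_t>(shortAddr >> 8),
                                  static_cast<uint8_t>(shortAddr & 0xff)};
    // Assumption: "local" corresponds to a cleared U/L bit. This is the step
    // that can flip one bit of the PanId in the RFC 4944 layout.
    iid[0] &= 0xfd;
    return iid;
}
```

For instance, PanId 0xabcd and short address 0x1234 give a9cd:00ff:fe00:1234 in the RFC 4944 layout (0xab becomes 0xa9 once the U/L bit is cleared) and 0000:00ff:fe00:1234 in the RFC 6282 layout.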

In order to facilitate interoperability, and to avoid unwanted module dependencies, the ns-3 implementation moves the IID calculation into LrWpanNetDevice::GetAddress(), which returns a properly formatted Address, i.e.:

  • The Long address (a Mac64Address) if the Short address has not been set, or

  • A properly formatted 48-bit pseudo-address (a Mac48Address) if the short address has been set.

The 48-bit pseudo-address is generated according to either RFC 4944 or RFC 6282 depending on the configuration of an Attribute (PseudoMacAddressMode).

The default is to use RFC 6282 style addresses.

Note that, on reception, a packet might contain either a short or a long address. This is reflected in the upper-layer notification callback, which can contain either the pseudo-address (48 bits) or the long address (64 bits) of the sender.

Note also that RFC 4944 and RFC 6282 are the RFCs defining the IPv6 address compression formats (HC1 and IPHC respectively). It is definitely not a good idea to mix devices using different pseudo-address formats or compression types in the same network. This point is further discussed in the sixlowpan module documentation.

18.1.4. NetDevice

Although it is expected that other technology profiles (such as 6LoWPAN and ZigBee) will write their own NetDevice classes, a basic LrWpanNetDevice is provided, which encapsulates the common operations of creating a generic LrWpan device and hooking things together.

18.2. Usage

18.2.1. Enabling lr-wpan

Add lr-wpan to the list of modules built with ns-3.

18.2.2. Helpers

The helper is patterned after other device helpers. In particular, tracing (ascii and pcap) is enabled similarly, and enabling of all lr-wpan log components is performed similarly. Use of the helper is exemplified in examples/lr-wpan-data.cc. For ascii tracing, the transmit and receive traces are hooked at the Mac layer.

The default propagation loss model added to the channel, when this helper is used, is the LogDistancePropagationLossModel with default parameters.

18.2.3. Examples

The following examples have been written, which can be found in src/lr-wpan/examples/:

  • lr-wpan-data.cc: A simple example showing end-to-end data transfer.

  • lr-wpan-error-distance-plot.cc: An example to plot variations of the packet success ratio as a function of distance.

  • lr-wpan-per-plot.cc: An example to plot the theoretical and experimental packet error rate (PER) as a function of receive signal.

  • lr-wpan-error-model-plot.cc: An example to plot the bit error rate (BER) given by the error model.

  • lr-wpan-packet-print.cc: An example to print out the MAC header fields.

  • lr-wpan-phy-test.cc: An example to test the phy.

  • lr-wpan-ed-scan.cc: Simple example showing the use of energy detection (ED) scan in the MAC.

  • lr-wpan-active-scan.cc: A simple example showing the use of an active scan in the MAC.

  • lr-wpan-mlme.cc: Demonstrates the use of lr-wpan beacon mode. Nodes use a manual association (i.e. No bootstrap) in this example.

  • lr-wpan-bootstrap.cc: Demonstrates the use of scanning and association working together to initiate a PAN.

  • lr-wpan-orphan-scan.cc: Demonstrates the use of an orphan scanning in a simple network joining procedure.

In particular, the module enables a very simplified end-to-end data transfer scenario, implemented in lr-wpan-data.cc. The figure shows a sequence of events that are triggered when the MAC receives a DataRequest from the higher layer. It invokes a Clear Channel Assessment (CCA) from the PHY, and if successful, sends the frame down to the PHY where it is transmitted over the channel and results in a DataIndication on the peer node.

_images/lr-wpan-data-example.png

Data example for simple LR-WPAN data transfer end-to-end

The example lr-wpan-error-distance-plot.cc plots the packet success ratio (PSR) as a function of distance, using the default LogDistance propagation loss model and the 802.15.4 error model. The channel (default 11), packet size (default PSDU 20 bytes = 11 bytes MAC header + data payload), transmit power (default 0 dBm) and Rx sensitivity (default -106.58 dBm) can be varied by command line arguments. The program outputs a file named 802.15.4-psr-distance.plt. Loading this file into gnuplot yields a file 802.15.4-psr-distance.eps, which can be converted to PDF or other formats. The following image shows the output of multiple runs using different Rx sensitivity values. A higher Rx sensitivity (lower dBm value) results in an increased communication distance but also makes the radio more susceptible to interference from surrounding devices.

_images/802-15-4-psr-distance.png

Default output of the program lr-wpan-error-distance-plot.cc
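The relation between Rx sensitivity and communication distance can be estimated with a standalone log-distance calculation. This is a rough sketch, not the ns-3 model: LogDistanceSketch is a hypothetical struct, and its values (exponent 3, reference distance 1 m, reference loss 46.6777 dB) are assumed to match the defaults of LogDistancePropagationLossModel and should be checked against the attribute values of the ns-3 version in use. The error model is ignored, so the result is only an upper bound on the range.

```cpp
#include <cassert>
#include <cmath>

// Assumed parameter values; verify them against your ns-3 version.
struct LogDistanceSketch
{
    double exponent{3.0};
    double referenceDistanceM{1.0};
    double referenceLossDb{46.6777};
};

// Received power under the log-distance model:
//   L = L0 + 10 * n * log10(d / d0),  Prx = Ptx - L
double RxPowerDbm(double txPowerDbm,
                  double distanceM,
                  const LogDistanceSketch& m = LogDistanceSketch())
{
    assert(distanceM > 0.0);
    if (distanceM <= m.referenceDistanceM)
    {
        return txPowerDbm - m.referenceLossDb;
    }
    const double pathLossDb =
        m.referenceLossDb + 10.0 * m.exponent * std::log10(distanceM / m.referenceDistanceM);
    return txPowerDbm - pathLossDb;
}

// Distance at which the received power equals the Rx sensitivity.
double MaxRangeM(double txPowerDbm,
                 double rxSensitivityDbm,
                 const LogDistanceSketch& m = LogDistanceSketch())
{
    const double budgetDb = txPowerDbm - rxSensitivityDbm - m.referenceLossDb;
    return m.referenceDistanceM * std::pow(10.0, budgetDb / (10.0 * m.exponent));
}
```

Under these assumed values, a 0 dBm transmitter and the default sensitivity of -106.58 dBm give a range of roughly 99 m, while the minimum compliant sensitivity of -85 dBm gives a noticeably shorter one.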

18.2.4. Tests

The following tests have been written, which can be found in src/lr-wpan/tests/:

  • lr-wpan-ack-test.cc: Check that acknowledgments are being used and issued in the correct order.

  • lr-wpan-cca-test.cc: Test the behavior of CCA under specific circumstances (hidden terminal and CCA vulnerable windows).

  • lr-wpan-collision-test.cc: Test correct reception of packets with interference and collisions.

  • lr-wpan-ed-test.cc: Test the energy detection (ED) capabilities of the Lr-Wpan implementation.

  • lr-wpan-error-model-test.cc: Check that the error model gives predictable values.

  • lr-wpan-ifs-test.cc: Check that the interframe spacing (IFS) is being used and issued in the correct order.

  • lr-wpan-mac-test.cc: Test various MAC capabilities such as different types of scanning and RX_ON_WHEN_IDLE.

  • lr-wpan-packet-test.cc: Test the 802.15.4 MAC header/trailer classes.

  • lr-wpan-pd-plme-sap-test.cc: Test the PLME and PD primitives from Lr-wpan’s PHY.

  • lr-wpan-spectrum-value-helper-test.cc: Test that the conversion between power (expressed as a scalar quantity) and spectral power, and back again, falls within a 25% tolerance across the range of possible channels and input powers.

  • lr-wpan-slotted-csmaca-test.cc: Test the transmission and deferring of data packets in the Contention Access Period (CAP) for the slotted CSMA/CA (beacon-enabled mode).

The test suite lr-wpan-cca-test.cc demonstrates some known CCA behaviors. It includes a test that demonstrates the well-known hidden terminal problem. The second test in the suite shows a known vulnerability window (192 us) in CCA that can cause the channel to be falsely identified as IDLE, caused by the turnaround delay between the CCA and the actual transmission of the frame.
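The 192 us figure can be reproduced with simple arithmetic. The sketch below is standalone and assumes values from the 2.4 GHz O-QPSK PHY of the standard that are not stated in this section: 4 bits per symbol and a turnaround time of 12 symbol periods. The function names are hypothetical.

```cpp
#include <cassert>

// Assumed PHY values (2.4 GHz O-QPSK): 250 kbps, 4 bits per symbol,
// turnaround time of 12 symbols.
constexpr double kBitRateBps = 250000.0;
constexpr double kBitsPerSymbol = 4.0;
constexpr double kTurnaroundSymbols = 12.0;

// One symbol lasts bitsPerSymbol / bitRate seconds (16 us here).
constexpr double SymbolDurationUs()
{
    return 1e6 * kBitsPerSymbol / kBitRateBps;
}

// Delay between the end of CCA and the start of the transmission.
constexpr double TurnaroundUs()
{
    return kTurnaroundSymbols * SymbolDurationUs();
}

// True if another node starts transmitting after this node's CCA has ended
// but before this node's own frame is on the air; the CCA cannot see it.
bool StartsInVulnerableWindow(double ccaEndUs, double otherTxStartUs)
{
    assert(ccaEndUs >= 0.0);
    return otherTxStartUs > ccaEndUs && otherTxStartUs < ccaEndUs + TurnaroundUs();
}
```

With these values a symbol lasts 16 us, so 12 symbols of turnaround give the 192 us window during which a competing transmission goes unnoticed.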

18.3. Validation

The model has not been validated against real hardware. The error model has been validated against the data in IEEE Std 802.15.4-2006, section E.4.1.7 (Figure E.2). The MAC behavior (CSMA backoff) has been validated by hand against expected behavior. The below plot is an example of the error model validation and can be reproduced by running lr-wpan-error-model-plot.cc:

_images/802-15-4-ber.png

Default output of the program lr-wpan-error-model-plot.cc

18.4. Scope and Limitations

Future versions of this document will contain a PICS proforma similar to Appendix D of IEEE 802.15.4-2006. The current emphasis is on direct transmissions running in both slotted and unslotted modes (CSMA/CA) of 802.15.4 operation for use in Zigbee.

  • Indirect data transmissions are not supported but planned for a future update.

  • Devices are capable of associating with a single PAN coordinator.

  • Interference is modeled as AWGN but this is currently not thoroughly tested.

  • The standard describes the support of multiple PHY band-modulations but currently, only 250kbps O-QPSK (channel page 0) is supported.

  • Active and passive MAC scans are able to obtain an LQI value from a beacon frame; however, the scan primitives assume LQI is correctly implemented and do not check the validity of its value.

  • Configuration of the ED thresholds is currently not supported.

  • Coordinator realignment command is only supported in orphan scans.

  • Disassociation primitives are not supported.

  • Security is not supported.

  • Guaranteed Time Slots (GTS) are not supported.

  • Not all attributes are supported by the MLME-SET and MLME-GET primitives.

  • Indirect transmissions are only supported during the association process.

  • RSSI is not supported, as it is part of the 2015 revision and the current implementation supports revisions only up to 2011.

18.5. References

[1] Wireless Medium Access Control (MAC) and Physical Layer (PHY) Specifications for Low-Rate Wireless Personal Area Networks (WPANs), IEEE Computer Society, IEEE Std 802.15.4-2006, 8 September 2006.

[2] IEEE Standard for Local and metropolitan area networks–Part 15.4: Low-Rate Wireless Personal Area Networks (LR-WPANs), IEEE Std 802.15.4-2011 (Revision of IEEE Std 802.15.4-2006), pp. 1-314, 5 Sept. 2011, doi: 10.1109/IEEESTD.2011.6012487.

[3] J. Zheng and Myung J. Lee, “A comprehensive performance study of IEEE 802.15.4,” Sensor Network Operations, IEEE Press, Wiley Interscience, Chapter 4, pp. 218-237, 2006.

[4] Alberto Gallegos Ramonet and Taku Noguchi. 2020. LR-WPAN: Beacon Enabled Direct Transmissions on Ns-3. In 2020 the 6th International Conference on Communication and Information Processing (ICCIP 2020). Association for Computing Machinery, New York, NY, USA, 115–122. https://doi.org/10.1145/3442555.3442574.

[5] Gallegos Ramonet, A.; Noguchi, T. Performance Analysis of IEEE 802.15.4 Bootstrap Process. Electronics 2022, 11, 4090. https://doi.org/10.3390/electronics11244090.

19. LTE Module

19.1. Design Documentation

19.1.1. Overview

An overview of the LTE-EPC simulation model is depicted in the figure Overview of the LTE-EPC simulation model. There are two main components:

  • the LTE Model. This model includes the LTE Radio Protocol stack (RRC, PDCP, RLC, MAC, PHY). These entities reside entirely within the UE and the eNB nodes.

  • the EPC Model. This model includes core network interfaces, protocols and entities. These entities and protocols reside within the SGW, PGW and MME nodes, and partially within the eNB nodes.

_images/epc-topology-with-split.png

Overview of the LTE-EPC simulation model

19.1.2. Design Criteria

19.1.2.1. LTE Model

The LTE model has been designed to support the evaluation of the following aspects of LTE systems:

  • Radio Resource Management

  • QoS-aware Packet Scheduling

  • Inter-cell Interference Coordination

  • Dynamic Spectrum Access

In order to model LTE systems to a level of detail that is sufficient to allow a correct evaluation of the above mentioned aspects, the following requirements have been considered:

  1. At the radio level, the granularity of the model should be at least that of the Resource Block (RB). In fact, this is the fundamental unit being used for resource allocation. Without this minimum level of granularity, it is not possible to model accurately packet scheduling and inter-cell-interference. The reason is that, since packet scheduling is done on a per-RB basis, an eNB might transmit on a subset only of all the available RBs, hence interfering with other eNBs only on those RBs where it is transmitting. Note that this requirement rules out the adoption of a system level simulation approach, which evaluates resource allocation only at the granularity of call/bearer establishment.

  2. The simulator should scale up to tens of eNBs and hundreds of User Equipment (UEs). This rules out the use of a link level simulator, i.e., a simulator whose radio interface is modeled with a granularity up to the symbol level. This is because to have a symbol level model it is necessary to implement all the PHY layer signal processing, whose huge computational complexity severely limits simulation. In fact, link-level simulators are normally limited to a single eNB and one or a few UEs.

  3. It should be possible within the simulation to configure different cells so that they use different carrier frequencies and system bandwidths. The bandwidth used by different cells should be allowed to overlap, in order to support dynamic spectrum licensing solutions such as those described in [Ofcom2600MHz] and [RealWireless]. The calculation of interference should handle this case appropriately.

  4. To be more representative of the LTE standard, as well as to be as close as possible to real-world implementations, the simulator should support the MAC Scheduler API published by the FemtoForum [FFAPI]. This interface is expected to be used by femtocell manufacturers for the implementation of scheduling and Radio Resource Management (RRM) algorithms. By introducing support for this interface in the simulator, we make it possible for LTE equipment vendors and operators to test in a simulative environment exactly the same algorithms that would be deployed in a real system.

  5. The LTE simulation model should contain its own implementation of the API defined in [FFAPI]. Neither binary nor data structure compatibility with vendor-specific implementations of the same interface are expected; hence, a compatibility layer should be interposed whenever a vendor-specific MAC scheduler is to be used with the simulator. This requirement is necessary to allow the simulator to be independent from vendor-specific implementations of this interface specification. We note that [FFAPI] is a logical specification only, and its implementation (e.g., translation to some specific programming language) is left to the vendors.

  6. The model is to be used to simulate the transmission of IP packets by the upper layers. In this respect, it should be noted that in LTE the Scheduling and Radio Resource Management do not work with IP packets directly, but rather with RLC PDUs, which are obtained by segmentation and concatenation of IP packets done by the RLC entities. Hence, these functionalities of the RLC layer should be modeled accurately.
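The reasoning behind the first requirement (interference occurs only on the RBs where both eNBs transmit) can be illustrated with a standalone sketch. It does not use the LTE module: the names are hypothetical, and the carrier is assumed to have 25 RBs (a 5 MHz LTE carrier).

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Assumption: a 5 MHz carrier, i.e. 25 Resource Blocks.
constexpr std::size_t kNumRbs = 25;
using RbAllocation = std::bitset<kNumRbs>;

// Marks 'count' consecutive RBs starting at index 'first' as used.
RbAllocation AllocateRange(std::size_t first, std::size_t count)
{
    assert(first + count <= kNumRbs);
    RbAllocation alloc;
    for (std::size_t i = first; i < first + count; ++i)
    {
        alloc.set(i);
    }
    return alloc;
}

// Two co-channel eNBs interfere only on the RBs that both of them use.
RbAllocation InterferedRbs(const RbAllocation& enbA, const RbAllocation& enbB)
{
    return enbA & enbB;
}
```

A model that tracked only whether a cell is active, without per-RB allocations, could not distinguish two eNBs using disjoint RBs (no interference) from two eNBs using the same RBs.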

19.1.2.2. EPC Model

The main objective of the EPC model is to provide means for the simulation of end-to-end IP connectivity over the LTE model. To this aim, it supports the interconnection of multiple UEs to the Internet, via a radio access network of multiple eNBs connected to the core network, as shown in Figure Overview of the LTE-EPC simulation model.

The following design choices have been made for the EPC model:

  1. Both IPv4 and IPv6 Packet Data Network (PDN) types are supported. In other words, the end-to-end connections between the UEs and the remote hosts can be IPv4 or IPv6. However, the networks between the core network elements (MME, SGWs and PGWs) are IPv4-only.

  2. The SGW and PGW functional entities are implemented in different nodes, which are hence referred to as the SGW node and PGW node, respectively.

  3. The MME functional entity is implemented as a network node, which is hence referred to as the MME node.

  4. Scenarios with inter-SGW mobility are not of interest, but several SGW nodes may be present in simulation scenarios.

  5. A requirement for the EPC model is that it can be used to simulate the end-to-end performance of realistic applications. Hence, it should be possible to use with the EPC model any regular ns-3 application working on top of TCP or UDP.

  6. Another requirement is the possibility of simulating network topologies with the presence of multiple eNBs, some of which might be equipped with a backhaul connection with limited capabilities. In order to simulate such scenarios, the user data plane protocols being used between the eNBs and the SGW should be modeled accurately.

  7. It should be possible for a single UE to use different applications with different QoS profiles. Hence, multiple EPS bearers should be supported for each UE. This includes the necessary classification of TCP/UDP traffic over IP done at the UE in the uplink and at the PGW in the downlink.

  8. The initial focus of the EPC model is mainly on the EPC data plane. The accurate modeling of the EPC control plane is, for the time being, not a requirement; however, the necessary control plane interactions among the different network nodes of the core network are realized by implementing control protocols/messages among them. Direct interaction among the different simulation objects via the provided helper objects should be avoided as much as possible.

  9. The focus of the EPC model is on simulations of active users in ECM connected mode. Hence, all the functionality that is only relevant for ECM idle mode (in particular, tracking area update and paging) is not modeled at all.

  10. The model should allow the possibility to perform an X2-based handover between two eNBs.
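The classification of TCP/UDP traffic onto EPS bearers mentioned in design choice 7 can be sketched as a list of packet filters. This is a standalone illustration and not the classes of the LTE module: PacketFilterSketch, ClassifySketch and the default bearer id of 1 are hypothetical, and real filters match on more fields than the protocol and the remote port.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

enum class L4Protocol : uint8_t
{
    TCP = 6, // IP protocol numbers
    UDP = 17
};

// Hypothetical, reduced packet filter: protocol plus a remote port range.
struct PacketFilterSketch
{
    L4Protocol protocol;
    uint16_t remotePortStart;
    uint16_t remotePortEnd;
    uint8_t bearerId;
};

// Assumption for this sketch: unmatched traffic goes to a default bearer.
constexpr uint8_t kDefaultBearerId = 1;

// Returns the bearer of the first filter that matches the packet.
uint8_t ClassifySketch(const std::vector<PacketFilterSketch>& filters,
                       L4Protocol protocol,
                       uint16_t remotePort)
{
    for (const auto& f : filters)
    {
        assert(f.remotePortStart <= f.remotePortEnd);
        if (f.protocol == protocol && remotePort >= f.remotePortStart &&
            remotePort <= f.remotePortEnd)
        {
            return f.bearerId;
        }
    }
    return kDefaultBearerId;
}
```

The same lookup would run at the UE for uplink packets and at the PGW for downlink packets, each side holding the filters of the bearers it serves.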

19.1.3. Architecture

19.1.3.1. LTE Model
19.1.3.1.1. UE architecture

The architecture of the LTE radio protocol stack model of the UE is represented in the figures LTE radio protocol stack architecture for the UE on the data plane and LTE radio protocol stack architecture for the UE on the control plane which highlight respectively the data plane and the control plane.

_images/lte-arch-ue-data.png

LTE radio protocol stack architecture for the UE on the data plane

_images/lte-arch-ue-ctrl.png

LTE radio protocol stack architecture for the UE on the control plane

The architecture of the PHY/channel model of the UE is represented in figure PHY and channel model architecture for the UE.

_images/lte-ue-phy.png

PHY and channel model architecture for the UE

19.1.3.1.2. eNB architecture

The architecture of the LTE radio protocol stack model of the eNB is represented in the figures LTE radio protocol stack architecture for the eNB on the data plane and LTE radio protocol stack architecture for the eNB on the control plane which highlight respectively the data plane and the control plane.

_images/lte-arch-enb-data.png

LTE radio protocol stack architecture for the eNB on the data plane

_images/lte-arch-enb-ctrl.png

LTE radio protocol stack architecture for the eNB on the control plane

The architecture of the PHY/channel model of the eNB is represented in figure PHY and channel model architecture for the eNB.

_images/lte-enb-phy.png

PHY and channel model architecture for the eNB

19.1.3.2. EPC Model
19.1.3.2.1. EPC data plane

In Figure LTE-EPC data plane protocol stack, we represent the end-to-end LTE-EPC data plane protocol stack as it is modeled in the simulator. The figure shows all nodes in the data path, i.e. UE, eNB, SGW, PGW and a remote host in the Internet. All protocol stacks (S5 protocol stack, S1-U protocol stack and the LTE radio protocol stack) specified by 3GPP are present.

_images/lte-epc-e2e-data-protocol-stack-with-split.png

LTE-EPC data plane protocol stack

19.1.3.2.2. EPC control plane

The architecture of the implementation of the control plane model is shown in figure LTE-EPC control plane protocol stack. The control interfaces that are modeled explicitly are the S1-MME, the S11, and the S5 interfaces. The X2 interface is also modeled explicitly and it is described in more detail in section X2.

The S1-MME, the S11 and the S5 interfaces are modeled using protocol data units sent over their respective links. These interfaces use SCTP as the transport protocol, but SCTP is currently not modeled in the ns-3 simulator, so UDP is used instead.

_images/lte-epc-e2e-control-protocol-stack-with-split.png

LTE-EPC control plane protocol stack

19.1.4. Channel and Propagation

For channel modeling purposes, the LTE module uses the SpectrumChannel interface provided by the spectrum module. At the time of this writing, two implementations of this interface are available: SingleModelSpectrumChannel and MultiModelSpectrumChannel. The LTE module requires the MultiModelSpectrumChannel in order to work properly, because it needs to support different frequency and bandwidth configurations. All the propagation models supported by MultiModelSpectrumChannel can be used within the LTE module.

19.1.4.1. Use of the Buildings model with LTE

The recommended propagation model to be used with the LTE module is the one provided by the Buildings module, which was in fact designed specifically with LTE in mind (though it can be used with other wireless technologies as well). Please refer to the documentation of the Buildings module for generic information on the propagation model it provides.

In this section we will highlight some considerations that specifically apply when the Buildings module is used together with the LTE module.

The naming convention used in the following will be:

  • User equipment: UE

  • Macro Base Station: MBS

  • Small cell Base Station (e.g., pico/femtocell): SC

The LTE module considers FDD only, and implements downlink and uplink propagation separately. As a consequence, the following pathloss computations are performed:

  • MBS <-> UE (indoor and outdoor)

  • SC (indoor and outdoor) <-> UE (indoor and outdoor)

The LTE model does not provide the following pathloss computations:

  • UE <-> UE

  • MBS <-> MBS

  • MBS <-> SC

  • SC <-> SC

The Buildings model does not know the actual type of the node; i.e., it is not aware of whether a transmitter node is a UE, an MBS, or an SC. Rather, the Buildings model only cares about the position of the node: whether it is indoor or outdoor, and what its z-coordinate is with respect to the rooftop level. As a consequence, for an eNB node that is placed outdoor and at a z-coordinate above the rooftop level, the propagation models typical of MBS will be used by the Buildings module. Conversely, for an eNB that is placed outdoor but below the rooftop, or indoor, the propagation models typical of pico and femtocells will be used.
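The position-based selection described above can be sketched as follows. This is only an illustration and not the Buildings module code: the `NodePosition` struct, the `PropagationClass` enum and the `ClassifyEnb` function are hypothetical names introduced here, and the rooftop level is assumed to be a single known height.

```cpp
#include <cassert>

// Hypothetical categories, named after the conventions used in this section.
enum class PropagationClass
{
    MacroCell, // propagation models typical of an MBS
    SmallCell  // propagation models typical of pico and femtocells
};

// Illustrative stand-in for the position information used by the model.
struct NodePosition
{
    bool indoor; // true if the node is inside a building
    double z;    // z-coordinate of the node, in meters
};

// Select the propagation class of an eNB from its position only:
// the node type (UE, MBS, SC) is never consulted.
PropagationClass
ClassifyEnb(const NodePosition& enb, double rooftopLevel)
{
    assert(rooftopLevel >= 0.0);
    if (!enb.indoor && enb.z > rooftopLevel)
    {
        return PropagationClass::MacroCell; // outdoor and above the rooftop
    }
    return PropagationClass::SmallCell; // outdoor below the rooftop, or indoor
}
```

For example, under these assumptions an outdoor eNB at 30 m with a 20 m rooftop level is treated as a macro cell, while the same eNB placed indoor is treated as a small cell.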

For communications involving at least one indoor node, the corresponding wall penetration losses will be calculated by the Buildings model. This covers the following use cases:

  • MBS <-> indoor UE

  • outdoor SC <-> indoor UE

  • indoor SC <-> indoor UE

  • indoor SC <-> outdoor UE

Please refer to the documentation of the Buildings module for details on the actual models used in each case.

19.1.4.2. Fading Model

The LTE module includes a trace-based fading model derived from the one developed during the GSoC 2010 [Piro2011]. The main characteristic of this model is that the fading evaluation during simulation run-time is based on pre-calculated traces. This is done to limit the computational complexity of the simulator. On the other hand, it needs large structures for storing the traces; therefore, a trade-off between the number of possible parameters and the memory occupancy has to be found. The most important parameters are:

  • users’ speed: relative speed between users (affects the Doppler frequency, which in turn affects the time-variance property of the fading)

  • number of taps (and relative power): number of multiple paths considered, which affects the frequency property of the fading.

  • time granularity of the trace: sampling time of the trace.

  • frequency granularity of the trace: number of values in frequency to be evaluated.

  • length of trace: ideally as long as the simulation time; it can be reduced by means of a windowing mechanism.

  • number of users: number of independent traces to be used (ideally one trace per user).

With respect to the mathematical channel propagation model, we suggest the one provided by the rayleighchan function of Matlab, since it provides a well-accepted channel model both in the time and in the frequency domain. For more information, the reader is referred to [mathworks].

The simulator provides a Matlab script (src/lte/model/fading-traces/fading-trace-generator.m) for generating traces in the format used by the simulator. In detail, the channel object created with the rayleighchan function is used for filtering a discrete-time impulse signal in order to obtain the channel impulse response. The filtering is repeated for different TTIs, thus yielding subsequent time-correlated channel responses (one per TTI). The channel response is then processed with the pwelch function to obtain its power spectral density values, which are then saved in a file in the format expected by the simulator model.

Since the number of variables is quite high, generating traces that consider all of them might produce a large number of traces of huge size. For this reason, we made the following assumptions on the parameters, based on the 3GPP fading propagation conditions (see Annex B.2 of [TS36104]):

  • users’ speed: typically only a few discrete values are considered, i.e.:

    • 0 and 3 kmph for pedestrian scenarios

    • 30 and 60 kmph for vehicular scenarios

    • 0, 3, 30 and 60 kmph for urban scenarios

  • channel taps: only a limited number of sets of channel taps are normally considered, for example three models are mentioned in Annex B.2 of [TS36104].

  • time granularity: we need one fading value per TTI, i.e., every 1 ms (as this is the granularity in time of the ns-3 LTE PHY model).

  • frequency granularity: we need one fading value per RB (which is the frequency granularity of the spectrum model used by the ns-3 LTE model).

  • length of the trace: the simulator includes the windowing mechanism implemented during the GSoC 2011, which randomly picks a new window of the trace once every window length.

  • per-user fading process: users share the same fading trace, but for each user a different starting point in the trace is randomly picked. This choice was made to avoid the need to provide one fading trace per user.
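The per-user starting point can be sketched as follows. This is a minimal illustration and not the simulator's implementation: `DrawStartOffset` and `TraceIndex` are hypothetical helpers, the trace is assumed to hold one sample per TTI, and wrapping around at the end of the trace is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <random>

// Draw a random starting point in the shared trace for one user.
std::size_t
DrawStartOffset(std::mt19937& rng, std::size_t traceLength)
{
    assert(traceLength > 0);
    std::uniform_int_distribution<std::size_t> dist(0, traceLength - 1);
    return dist(rng);
}

// Index of the trace sample seen by a user at a given TTI.
// Assumption of this sketch: the trace wraps around when its end is reached.
std::size_t
TraceIndex(std::size_t startOffset, std::size_t tti, std::size_t traceLength)
{
    assert(traceLength > 0);
    return (startOffset + tti) % traceLength;
}
```

With a 10 s trace sampled every 1 ms (10000 samples), each user would draw its own offset once and then read consecutive samples from the shared trace, so that different users observe different portions of the same fading process.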

According to the parameters we considered, the following formula expresses in detail the total size S_{traces} of the fading traces:

S_{traces} = S_{sample} \times N_{RB} \times \frac{T_{trace}}{T_{sample}} \times N_{scenarios} \mbox{ [bytes]}

where S_{sample} is the size in bytes of the sample (e.g., 8 in case of double precision, 4 in case of float precision), N_{RB} is the number of RBs or sets of RBs to be considered, T_{trace} is the total length of the trace, T_{sample} is the time resolution of the trace (1 ms), and N_{scenarios} is the number of fading scenarios that are desired (i.e., combinations of different sets of channel taps and user speed values). We provide traces for 3 different scenarios, one for each taps configuration defined in Annex B.2 of [TS36104]:

  • Pedestrian: with nodes’ speed of 3 kmph.

  • Vehicular: with nodes’ speed of 60 kmph.

  • Urban: with nodes’ speed of 3 kmph.

hence N_{scenarios} = 3. All traces have T_{trace} = 10 s and N_{RB} = 100. Assuming 8-byte samples, this results in a total of 24 MB of traces.
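As a check of the arithmetic, the formula for S_{traces} can be evaluated directly. The function below is a hypothetical helper written for this example; it assumes double-precision samples (8 bytes), which is the value that reproduces the 24 MB total stated above.

```cpp
#include <cassert>
#include <cstdint>

// Total size in bytes of the fading traces:
// S_traces = S_sample * N_RB * (T_trace / T_sample) * N_scenarios
std::uint64_t
FadingTracesSizeBytes(std::uint64_t sampleSizeBytes,
                      std::uint64_t numRb,
                      std::uint64_t traceLengthMs,
                      std::uint64_t sampleTimeMs,
                      std::uint64_t numScenarios)
{
    assert(sampleTimeMs > 0);
    const std::uint64_t samplesPerRb = traceLengthMs / sampleTimeMs;
    return sampleSizeBytes * numRb * samplesPerRb * numScenarios;
}
```

With S_{sample} = 8, N_{RB} = 100, T_{trace} = 10000 ms, T_{sample} = 1 ms and N_{scenarios} = 3, the result is 8 x 100 x 10000 x 3 = 24000000 bytes.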

19.1.4.3. Antennas

Being based on the SpectrumPhy, the LTE PHY model supports antenna modeling via the ns-3 AntennaModel class. Hence, any model based on this class can be associated with any eNB or UE instance. For instance, associating the CosineAntennaModel with an eNB device allows modeling one sector of a macro base station. By default, the IsotropicAntennaModel is used for both eNBs and UEs.

19.1.5. PHY

19.1.5.1. Overview

The physical layer model provided in this LTE simulator is based on the one described in [Piro2011], with the following modifications. The model now includes the inter cell interference calculation and the simulation of uplink traffic, including both packet transmission and CQI generation.

19.1.5.2. Subframe Structure

The subframe is divided into a control part and a data part, as described in Figure LTE subframe division.

_images/lte-subframe-structure.png

LTE subframe division.

Since the simulator has an RB-level granularity, the control and reference signaling have to be modeled under this constraint. According to the standard [TS36211], the downlink control frame starts at the beginning of each subframe and lasts up to three symbols across the whole system bandwidth, where the actual duration is provided by the Physical Control Format Indicator Channel (PCFICH). The allocation information is then mapped onto the remaining resources, up to the duration defined by the PCFICH, in the so-called Physical Downlink Control Channel (PDCCH). A PDCCH transports a single message called Downlink Control Information (DCI) coming from the MAC layer, where the scheduler indicates the resource allocation for a specific user. The PCFICH and PDCCH are modeled with the transmission of a control frame of a fixed duration of 3/14 of a millisecond spanning the whole available bandwidth, since the scheduler does not estimate the size of the control region. This implies that a single transmission block models the entire control frame with a fixed power (i.e., the one used for the PDSCH) across all the available RBs. Thanks to this feature, this transmission also serves as a model of the Reference Signal (RS). This makes it possible to evaluate the interference scenario every TTI, since all the eNBs transmit the control frame (simultaneously) over their respective available bandwidths. We note that the model does not include power boosting, since it would not bring any improvement to the implemented model of the channel estimation.

The Sounding Reference Signal (SRS) is modeled similarly to the downlink control frame. The SRS is periodically placed in the last symbol of the subframe over the whole system bandwidth. The RRC module already includes an algorithm for dynamically assigning the periodicity as a function of the actual number of UEs attached to an eNB, according to the UE-specific procedure (see Section 8.2 of [TS36213]).

19.1.5.3. MAC to Channel delay

To model the latency of real MAC and PHY implementations, the PHY model simulates a MAC-to-channel delay in multiples of TTIs (1 ms). The transmission of both data and control packets is delayed by this amount.
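A delay expressed in whole TTIs can be pictured as a queue with one slot per TTI. The class below is a minimal sketch of that idea and not the data structure used by the LTE PHY classes; `MacToChannelDelayLine` and its methods are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical delay line: a burst pushed in TTI n reaches the channel
// in TTI n + delayTtis, provided Tick() is called once per TTI.
template <typename Burst>
class MacToChannelDelayLine
{
  public:
    explicit MacToChannelDelayLine(std::size_t delayTtis)
        : m_slots(delayTtis + 1)
    {
    }

    // Queue a burst produced by the MAC in the current TTI.
    void Push(const Burst& burst)
    {
        m_slots.back().push_back(burst);
    }

    // Return the bursts that reach the channel in the current TTI,
    // then advance the delay line by one TTI.
    std::vector<Burst> Tick()
    {
        assert(!m_slots.empty());
        std::vector<Burst> out = m_slots.front();
        m_slots.pop_front();
        m_slots.emplace_back();
        return out;
    }

  private:
    std::deque<std::vector<Burst>> m_slots; // one slot per pending TTI
};
```

With a delay of 2 TTIs, a burst pushed before the first call to Tick() is returned by the third call, i.e., two TTIs later.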

19.1.5.4. CQI feedback

The generation of CQI feedback follows what is specified in [FFAPI]. In detail, we considered the generation of periodic wideband CQI (i.e., a single value of channel state that is deemed representative of all RBs in use) and inband CQIs (i.e., a set of values representing the channel state for each RB).

The CQI index to be reported is obtained by first obtaining a SINR measurement and then passing this SINR measurement to the Adaptive Modulation and Coding module which will map it to the CQI index.

In downlink, the SINR used to generate CQI feedback can be calculated in two different ways:

  1. Ctrl method: SINR is calculated combining the signal power from the reference signals (which in the simulation is equivalent to the PDCCH) and the interference power from the PDCCH. This approach results in considering any neighboring eNB as an interferer, regardless of whether this eNB is actually performing any PDSCH transmission, and regardless of the power and RBs used for any interfering PDSCH transmissions.

  2. Mixed method: SINR is calculated combining the signal power from the reference signals (which in the simulation is equivalent to the PDCCH) and the interference power from the PDSCH. This approach results in considering as interferers only those neighboring eNBs that are actively transmitting data on the PDSCH, and allows generating inband CQIs that account for different amounts of interference on different RBs according to the actual interference level. If no PDSCH transmission is performed by any eNB, this method considers the interference to be zero, i.e., the SINR is calculated as the ratio of signal to noise only.

To switch between these two CQI generation approaches, LteHelper::UsePdschForCqiGeneration needs to be configured: false for the first approach and true for the second approach (true is the default value):

Config::SetDefault("ns3::LteHelper::UsePdschForCqiGeneration", BooleanValue(true));
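The difference between the two methods can be sketched with a small standalone function. This is an illustration only and not the code of the LTE module: `CqiSinrPerRb` is a hypothetical helper, and all inputs are assumed to be per-RB powers in linear units.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Per-RB SINR (linear) used for CQI generation in downlink.
// The signal power always comes from the RS (equivalent to the PDCCH in
// the simulation); the interference source depends on the selected method.
std::vector<double>
CqiSinrPerRb(const std::vector<double>& rsSignal,
             const std::vector<double>& pdcchInterference,
             const std::vector<double>& pdschInterference,
             const std::vector<double>& noise,
             bool usePdschForCqiGeneration)
{
    assert(rsSignal.size() == pdcchInterference.size());
    assert(rsSignal.size() == pdschInterference.size());
    assert(rsSignal.size() == noise.size());

    // Ctrl method (false): PDCCH interference; Mixed method (true): PDSCH interference.
    const std::vector<double>& interference =
        usePdschForCqiGeneration ? pdschInterference : pdcchInterference;

    std::vector<double> sinr(rsSignal.size());
    for (std::size_t k = 0; k < rsSignal.size(); ++k)
    {
        sinr[k] = rsSignal[k] / (interference[k] + noise[k]);
    }
    return sinr;
}
```

If no neighboring eNB transmits on the PDSCH, the PDSCH interference is zero and the Mixed method reduces to the signal-to-noise ratio, whereas the Ctrl method still counts the control frames of all neighbors.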

In uplink, two types of CQIs are implemented:

  • SRS based, periodically sent by the UEs.

  • PUSCH based, calculated from the actual transmitted data.

The scheduler interface includes an attribute called UlCqiFilter for managing the filtering of the CQIs according to their nature, in detail:

  • SRS_UL_CQI for storing only SRS based CQIs.

  • PUSCH_UL_CQI for storing only PUSCH based CQIs.

Note that the FfMacScheduler provides only the interface; it is up to the actual scheduler implementation to include the code for managing these attributes (see the scheduler-related section for more information on this matter).

19.1.5.5. Interference Model

The PHY model is based on the well-known Gaussian interference models, according to which the powers of interfering signals (in linear units) are summed up together to determine the overall interference power.

The sequence diagram of Figure Sequence diagram of the PHY interference calculation procedure shows how interfering signals are processed to calculate the SINR, and how SINR is then used for the generation of CQI feedback.

_images/lte-phy-interference.png

Sequence diagram of the PHY interference calculation procedure
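The Gaussian interference assumption amounts to summing the interfering powers in linear units before forming the SINR. The helpers below are a minimal sketch of this computation for a single RB; `DbmToWatt` and `SinrDb` are hypothetical names and do not correspond to classes of the LTE module.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Convert a power from dBm to watts (linear units).
double
DbmToWatt(double dbm)
{
    return std::pow(10.0, (dbm - 30.0) / 10.0);
}

// SINR in dB for one RB: interfering powers are summed in linear units,
// never in dB, and then added to the noise power.
double
SinrDb(double signalW, const std::vector<double>& interferersW, double noiseW)
{
    double interferenceW = 0.0;
    for (double p : interferersW)
    {
        interferenceW += p;
    }
    assert(interferenceW + noiseW > 0.0);
    return 10.0 * std::log10(signalW / (interferenceW + noiseW));
}
```

For example, a 1 W signal with interferers of 0.04 W and 0.05 W and a noise power of 0.01 W yields a ratio of 1 / 0.1, i.e., approximately 10 dB.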

19.1.5.6. LTE Spectrum Model

The usage of the radio spectrum by eNBs and UEs in LTE is described in [TS36101]. In the simulator, radio spectrum usage is modeled as follows. Let f_c denote the LTE Absolute Radio Frequency Channel Number, which identifies the carrier frequency on a 100 kHz raster; furthermore, let B be the Transmission Bandwidth Configuration in number of Resource Blocks. For every pair (f_c,B) used in the simulation we define a corresponding SpectrumModel using the functionality provided by the Spectrum module, i.e., the Spectrum framework described in [Baldo2009]. f_c and B can be configured for every eNB instantiated in the simulation; hence, each eNB can use a different spectrum model. Every UE will automatically use the spectrum model of the eNB it is attached to. Using the MultiModelSpectrumChannel described in [Baldo2009], the interference among eNBs that use different spectrum models is properly accounted for. This allows simulating dynamic spectrum access policies, such as the spectrum licensing policies that are discussed in [Ofcom2600MHz].

19.1.5.7. Data PHY Error Model

The simulator includes an error model of the data plane (i.e., PDSCH and PUSCH) based on standard link-to-system mapping (LSM) techniques. The choice is aligned with the standard system simulation methodology of OFDMA radio transmission technology. Thanks to LSM we are able to maintain a good level of accuracy while limiting the increase in computational complexity. LSM is based on mapping single-link performance, obtained by means of link level simulators, to system (in our case network) simulators. In particular, the link layer simulator is used for generating the performance of a single link from a PHY layer perspective, usually in terms of code block error rate (BLER), under specific static conditions. LSM allows the usage of these parameters in more complex scenarios, typical of system/network simulators, where we have more links, interference and “colored” channel propagation phenomena (e.g., frequency selective fading).

To do this, the Vienna LTE Simulator [ViennaLteSim] has been used for the extraction of the link layer performance, and the Mutual Information Based Effective SINR (MIESM) has been used as the LSM mapping function, building on part of the work published by the Signet Group of the University of Padua [PaduaPEM].

19.1.5.7.1. MIESM

The specific LSM method adopted is the one based on the usage of a mutual information metric, commonly referred to as the mutual information per coded bit (MIB, or MMIB when a mean of multiple MIBs is involved). Another option would be the Exponential ESM (EESM); however, recent studies demonstrate that MIESM outperforms EESM in terms of accuracy [LozanoCost].

_images/miesm_scheme.png

MIESM computational procedure diagram

The mutual information (MI) depends on the constellation mapping and can be calculated on a per transport block (TB) basis, by evaluating the MI over the symbols and the subcarriers. However, this would be too complex for a network simulator. Hence, in our implementation a flat channel response within the RB has been considered; therefore the overall MI of a TB is calculated by averaging the MI evaluated for each RB used in the TB. In detail, the implemented scheme is depicted in Figure MIESM computational procedure diagram, where we see that the model starts by evaluating the MI value for each RB, represented in the figure by the SINR samples. Then the equivalent MI is evaluated on a per TB basis by averaging the MI values. Finally, a further step is needed, since the link level simulator returns the performance of the link in terms of block error rate (BLER) in an additive white Gaussian noise (AWGN) channel, where the blocks are the code blocks (CBs) independently encoded/decoded by the turbo encoder. On this matter, the standard 3GPP segmentation scheme has been used for estimating the actual CB size (described in section 5.1.2 of [TS36212]). This scheme divides the TB into N_{K_-} blocks of size K_- and N_{K_+} blocks of size K_+. Therefore the overall TB BLER (TBLER) can be expressed as

TBLER = 1- \prod\limits_{i=1}^{C}(1-CBLER_i)

where C is the total number of CBs in the TB and CBLER_i is the BLER of the CB i obtained according to the link level simulator CB BLER curves. For estimating the CBLER_i, the MI evaluation has been implemented according to its numerical approximation defined in [wimaxEmd]. Moreover, to reduce the complexity of the computation, the approximation has been converted into lookup tables. In detail, a Gaussian cumulative model has been used for approximating the AWGN BLER curves with three parameters, which provides a close fit to the standard AWGN performance, in formula:

CBLER_i = \frac{1}{2}\left[1-erf\left(\frac{x-b_{ECR}}{\sqrt{2}c_{ECR}} \right) \right]

where x is the MI of the TB, b_{ECR} represents the “transition center” and c_{ECR} is related to the “transition width” of the Gaussian cumulative distribution for each Effective Code Rate (ECR), which is the actual transmission rate according to the channel coding and MCS. To limit the computational complexity of the model we considered only a subset of the possible ECRs; in fact, we would potentially have 5076 possible ECRs (i.e., 27 MCSs and 188 CB sizes). In this respect, we limit the CB sizes to some representative values (i.e., 40, 140, 160, 256, 512, 1024, 2048, 4032, 6144), while for the others the worst one approximating the real one is used (i.e., the largest available CB size that is smaller than the real one). This choice is aligned with the typical performance of turbo codes, where the CB size does not strongly impact the BLER. However, it is to be noted that for CB sizes lower than 1000 bits the effect might be relevant (i.e., up to 2 dB); therefore, we adopt this unbalanced sampling interval in order to have more precision where it is necessary. This behaviour is confirmed by the figures presented in the Annex Section.
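The chain described above (per-RB MI averaging, Gaussian cumulative model for the CB BLER, and combination into the TB BLER) can be sketched as follows. This is not the LteMiErrorModel code: the function names are hypothetical, the per-RB MI values are taken as inputs (the SINR-to-MI approximation of [wimaxEmd] is not reproduced here), and the b_{ECR} and c_{ECR} values passed by the caller are placeholders rather than the fitted parameters.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Equivalent MI of a TB: average of the MI evaluated for each RB of the TB.
double
MeanMi(const std::vector<double>& miPerRb)
{
    assert(!miPerRb.empty());
    double sum = 0.0;
    for (double mi : miPerRb)
    {
        sum += mi;
    }
    return sum / static_cast<double>(miPerRb.size());
}

// Gaussian cumulative model:
// CBLER = 1/2 * [1 - erf((x - b_ECR) / (sqrt(2) * c_ECR))]
double
Cbler(double mi, double bEcr, double cEcr)
{
    assert(cEcr > 0.0);
    return 0.5 * (1.0 - std::erf((mi - bEcr) / (std::sqrt(2.0) * cEcr)));
}

// TBLER = 1 - prod_i (1 - CBLER_i): the TB is lost if any CB is lost.
double
Tbler(const std::vector<double>& cblers)
{
    double success = 1.0;
    for (double c : cblers)
    {
        success *= (1.0 - c);
    }
    return 1.0 - success;
}
```

When the MI equals the transition center b_{ECR}, the model returns a CB BLER of 0.5; a TB made of two such CBs then has a TB BLER of 1 - 0.5 x 0.5 = 0.75.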

19.1.5.7.2. BLER Curves

In this respect, we reused part of the curves obtained within [PaduaPEM]. In detail, we introduced the CB size dependency to the CB BLER curves with the support of the developers of [PaduaPEM] and of the LTE Vienna Simulator. In fact, the released module provides the link layer performance only as a function of the MCS (i.e., with a given fixed ECR). In detail, the new error rate curves for each MCS have been evaluated with a simulation campaign with the link layer simulator for a single link with AWGN noise and for CB sizes of 104, 140, 256, 512, 1024, 2048, 4032 and 6144. These curves have been mapped with the Gaussian cumulative model formula presented above to obtain the corresponding b_{ECR} and c_{ECR} parameters.

The BLER performance of all MCSs obtained with the link level simulator is plotted in the following figures (blue lines) together with the corresponding mapping to the Gaussian cumulative distribution (red dashed lines).

_images/MCS_1_4.png

BLER for MCS 1, 2, 3 and 4.

_images/MCS_5_8.png

BLER for MCS 5, 6, 7 and 8.

_images/MCS_9_12.png

BLER for MCS 9, 10, 11 and 12.

_images/MCS_13_16.png

BLER for MCS 13, 14, 15 and 16.

_images/MCS_17_20.png

BLER for MCS 17, 18, 19 and 20.

_images/MCS_21_24.png

BLER for MCS 21, 22, 23 and 24.

_images/MCS_25_28.png

BLER for MCS 25, 26, 27 and 28.

_images/MCS_29_29.png

BLER for MCS 29.

19.1.5.7.3. Integration of the BLER curves in the ns-3 LTE module

The implemented model uses the LSM curves of the LTE PHY Error Model released to the ns-3 community by the Signet Group [PaduaPEM], together with the new ones generated for different CB sizes. The LteSpectrumPhy class is in charge of evaluating the TB BLER thanks to the methods provided by the LteMiErrorModel class, which evaluates the TB BLER according to the vector of the perceived SINR per RB, the MCS and the TB size, in order to properly model the segmentation of the TB into CBs. In order to obtain the vector of the perceived SINRs for data and control signals, two instances of LteChunkProcessor (dedicated to evaluating the SINR for obtaining the physical error performance) have been attached to the UE downlink and eNB uplink LteSpectrumPhy modules for evaluating the error model distribution of PDSCH (UE side) and ULSCH (eNB side).

The model can be disabled, in order to work with a zero-loss channel, by setting the DataErrorModelEnabled attribute of the LteSpectrumPhy class to false (it is enabled by default). This can be done according to the standard ns-3 attribute system procedure, that is:

Config::SetDefault("ns3::LteSpectrumPhy::DataErrorModelEnabled", BooleanValue(false));

19.1.5.8. Control Channels PHY Error Model

The simulator includes the error model for downlink control channels (PCFICH and PDCCH), while in uplink an ideal error-free channel is assumed. The model is based on the MIESM approach presented before, in order to consider the effects of the frequency selective channel, since most of the control channels span the whole available bandwidth.

19.1.5.8.1. PCFICH + PDCCH Error Model

The model adopted for the error distribution of these channels is based on an evaluation study carried out in the RAN4 of 3GPP, where different vendors investigated the demodulation performance of the PCFICH jointly with the PDCCH. This is due to the fact that the PCFICH is the channel in charge of communicating to the UEs the actual dimension of the PDCCH (which spans between 1 and 3 symbols); therefore the correct decoding of the DCIs depends on the correct interpretation of both channels. In 3GPP this problem has been evaluated for improving the cell-edge performance [FujitsuWhitePaper], where the interference among neighboring cells can be relatively high due to signal degradation. A similar problem has been noticed in femto-cell scenarios and, more in general, in HetNet scenarios, where the bottleneck has been identified mainly in the PCFICH channel [Bharucha2011]: when many eNBs are deployed in the same service area, this channel may collide in frequency, making the correct detection of the PDCCH channel impossible as well.

In the simulator, the SINR perceived during the reception is estimated according to the MIESM model presented above in order to evaluate the error distribution of PCFICH and PDCCH. In detail, the SINR samples of all the RBs are included in the evaluation of the MI associated to the control frame and, according to these values, the effective SINR (eSINR) is obtained by inverting the MI evaluation process. It has to be noted that, in case of MIMO transmission, both the PCFICH and the PDCCH always use the transmit diversity mode, as defined by the standard. According to the perceived eSINR, the decoding error probability can be estimated as a function of the results presented in [R4-081920]. In case an error occurs, the DCIs are discarded; therefore the UE will not be able to receive the corresponding TBs, which are consequently lost.

19.1.5.9. MIMO Model

The use of multiple antennas at both the transmitter and the receiver side, known as multiple-input and multiple-output (MIMO), is a problem that has been well studied in the literature. Most of the work concentrates on evaluating analytically the gain that the different MIMO schemes might have in terms of capacity; however, some works also provide information on the gain in terms of received power [CatreuxMIMO].

According to the considerations above, a more flexible model can be obtained by considering the gain that MIMO schemes bring to the system from a statistical point of view. As highlighted before, [CatreuxMIMO] presents the statistical gain of several MIMO solutions with respect to the SISO one in case of no correlation between the antennas. In that work the gain is presented as the cumulative distribution function (CDF) of the output SINR for the SISO, MIMO-Alamouti, MIMO-MMSE, MIMO-OSIC-MMSE and MIMO-ZF schemes. Elaborating the results, the output SINR distribution can be approximated with a log-normal one with different mean and variance as a function of the scheme considered. However, the variances are not so different and they are approximately equal to the one of the SISO mode already included in the shadowing component of the BuildingsPropagationLossModel, in detail:

  • SISO: \mu = 13.5 and \sigma = 20 [dB].

  • MIMO-Alamouti: \mu = 17.7 and \sigma = 11.1 [dB].

  • MIMO-MMSE: \mu = 10.7 and \sigma = 16.6 [dB].

  • MIMO-OSIC-MMSE: \mu = 12.6 and \sigma = 15.5 [dB].

  • MIMO-ZF: \mu = 10.3 and \sigma = 12.6 [dB].

Therefore the PHY layer implements the MIMO model as the gain perceived by the receiver when using a MIMO scheme with respect to the one obtained using the SISO one. We note that these gains refer to a case where there is no correlation between the antennas in the MIMO scheme; therefore they do not model the degradation due to path correlation.
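One way to read this gain is as the difference between the mean of the output SINR distribution of the MIMO scheme and the mean of the SISO one, both in dB. The sketch below follows that reading, which is an assumption of this example; the constants are the \mu values listed above, and `MimoGainDb` and `ApplyGainDb` are hypothetical helpers.

```cpp
#include <cassert>
#include <cmath>

// Means (in dB) of the output SINR distributions listed in this section.
constexpr double kMuSisoDb = 13.5;
constexpr double kMuAlamoutiDb = 17.7;
constexpr double kMuMmseDb = 10.7;

// Gain of a scheme with respect to SISO, assumed here to be the
// difference of the means of the two distributions (in dB).
double
MimoGainDb(double muSchemeDb)
{
    return muSchemeDb - kMuSisoDb;
}

// Apply a gain expressed in dB to a SINR expressed in linear units.
double
ApplyGainDb(double sinrLinear, double gainDb)
{
    assert(sinrLinear >= 0.0);
    return sinrLinear * std::pow(10.0, gainDb / 10.0);
}
```

Under this reading, MIMO-Alamouti yields a gain of 17.7 - 13.5 = 4.2 dB over SISO, while MIMO-MMSE yields 10.7 - 13.5 = -2.8 dB.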

19.1.5.10. UE PHY Measurements Model

According to [TS36214], the UE has to report a set of measurements of the eNBs that the device is able to perceive: the reference signal received power (RSRP) and the reference signal received quality (RSRQ). The former is a measure of the received power of a specific eNB, while the latter also includes channel interference and thermal noise. The UE has to report the measurements jointly with the physical cell identity (PCI) of the cell. Both the RSRP and RSRQ measurements are performed during the reception of the RS, while the PCI is obtained with the Primary Synchronization Signal (PSS). The PSS is sent by the eNB every 5 subframes, specifically in subframes 1 and 6. In real systems, only 504 distinct PCIs are available, and hence it could occur that two nearby eNBs use the same PCI; however, in the simulator we model PCIs using simulation metadata, and we allow up to 65535 distinct PCIs, thereby avoiding PCI collisions provided that fewer than 65535 eNBs are simulated in the same scenario.

According to [TS36133] sections 9.1.4 and 9.1.7, RSRP is reported by the PHY layer in dBm while RSRQ is reported in dB. The values of RSRP and RSRQ are provided to higher layers through the C-PHY SAP (by means of the UeMeasurementsParameters struct) every 200 ms as defined in [TS36331]. Layer 1 filtering is performed by averaging all the measurements collected during the last window slot. The periodicity of reporting can be adjusted for research purposes by means of the LteUePhy::UeMeasurementsFilterPeriod attribute.
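The layer 1 filtering and the unit conversion before reporting can be sketched as follows. This is an illustration and not the LteUePhy code: the helpers are hypothetical, and the sketch assumes that the samples of a window are averaged in linear units before being converted to dBm (RSRP) or dB (RSRQ).

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Layer 1 filtering: plain average of the samples collected in the window.
// Assumption of this sketch: samples are in linear units (W or a ratio).
double
Layer1Filter(const std::vector<double>& samplesLinear)
{
    assert(!samplesLinear.empty());
    double sum = 0.0;
    for (double s : samplesLinear)
    {
        sum += s;
    }
    return sum / static_cast<double>(samplesLinear.size());
}

// Convert a power in watts to dBm (unit used for reporting RSRP).
double
WattToDbm(double watt)
{
    assert(watt > 0.0);
    return 10.0 * std::log10(watt) + 30.0;
}

// Convert a linear ratio to dB (unit used for reporting RSRQ).
double
RatioToDb(double ratio)
{
    assert(ratio > 0.0);
    return 10.0 * std::log10(ratio);
}
```

For instance, two RSRP samples of 1e-9 W and 3e-9 W collected in the same window average to 2e-9 W, which is then reported in dBm.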

The formulas of the RSRP and RSRQ can be simplified considering the assumption of the PHY layer that the channel is flat within the RB, the finest level of accuracy. In fact, this implies that all the REs within a RB have the same power, therefore:

RSRP = \frac{\sum_{k=0}^{K-1}\frac{\sum_{m=0}^{M-1}(P(k,m))}{M}}{K}
     = \frac{\sum_{k=0}^{K-1}\frac{(M \times P(k))}{M}}{K}
     = \frac{\sum_{k=0}^{K-1}(P(k))}{K}

where P(k,m) represents the signal power of the RE m within the RB k, which, as observed before, is constant within the same RB and equal to P(k), M is the number of REs carrying the RS in a RB and K is the number of RBs. It is to be noted that P(k), and in general all the powers defined in this section, is obtained in the simulator from the PSD of the RB (which is provided by the LteInterferencePowerChunkProcessor), in detail:

P(k) = PSD_{RB}(k)*180000/12

where PSD_{RB}(k) is the power spectral density of the RB k, 180000 is the bandwidth in Hz of the RB and 12 is the number of REs per RB in an OFDM symbol. Similarly, for RSSI we have

RSSI = \sum_{k=0}^{K-1} \frac{\sum_{s=0}^{S-1} \sum_{r=0}^{R-1}( P(k,s,r) + I(k,s,r) + N(k,s,r))}{S}

where S is the number of OFDM symbols carrying RS in a RB and R is the number of REs carrying a RS in an OFDM symbol (which is fixed to 2), while P(k,s,r), I(k,s,r) and N(k,s,r) represent respectively the perceived power of the serving cell, the interference power and the noise power of the RE r in symbol s. As for RSRP, the measurements within a RB are always equal to each other according to the PHY model; therefore P(k,s,r) = P(k), I(k,s,r) = I(k) and N(k,s,r) = N(k), which implies that the RSSI can be calculated as:

RSSI = \sum_{k=0}^{K-1} \frac{S \times 2 \times ( P(k) + I(k) + N(k))}{S}
     = \sum_{k=0}^{K-1} 2 \times ( P(k) + I(k) + N (k))

Considering the constraints of the PHY reception chain implementation, and in order to keep the computational complexity low, only RSRP can be directly obtained for all the cells. This is due to the fact that LteSpectrumPhy is designed for evaluating the interference only with respect to the signal of the serving eNB. This implies that the PHY layer is optimized for managing the signal power information with the serving eNB as a reference. However, RSRP and RSRQ of a neighbor cell i can be extracted from the information currently available for the serving cell j, as detailed in the following:

RSRP_i = \frac{\sum_{k=0}^{K-1}(P_i(k))}{K}

RSSI_i = RSSI_j = \sum_{k=0}^{K-1} 2 \times ( I_j(k) + P_j(k) + N_j(k) )

RSRQ_i^j = K \times RSRP_i / RSSI_j

where RSRP_i is the RSRP of the neighbor cell i, P_i(k) is the power perceived at any RE within the RB k, K is the total number of RBs, RSSI_i is the RSSI of the neighbor cell i when the UE is attached to cell j (which, since it is the sum of all the received powers, coincides with RSSI_j), I_j(k) is the total interference perceived by the UE in any RE of RB k when attached to cell j (obtained by the LteInterferencePowerChunkProcessor), P_j(k) is the power perceived of cell j in any RE of the RB k and N_j(k) is the noise power in any RE of RB k. The sample is considered valid if the evaluated RSRQ is above the LteUePhy::RsrqUeMeasThreshold attribute.
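The simplified formulas of this section translate directly into a few lines of code. The functions below are a minimal sketch written for this example (the names are hypothetical and this is not the LteUePhy implementation); all powers are per-RE values in linear units, one entry per RB.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// P(k) = PSD_RB(k) * 180000 / 12: power of one RE from the PSD of its RB.
double
RePowerFromPsd(double psdWattPerHz)
{
    return psdWattPerHz * 180000.0 / 12.0;
}

// RSRP = sum_k P(k) / K
double
Rsrp(const std::vector<double>& p)
{
    assert(!p.empty());
    double sum = 0.0;
    for (double pk : p)
    {
        sum += pk;
    }
    return sum / static_cast<double>(p.size());
}

// RSSI = sum_k 2 * (P(k) + I(k) + N(k))
double
Rssi(const std::vector<double>& p,
     const std::vector<double>& i,
     const std::vector<double>& n)
{
    assert(p.size() == i.size() && p.size() == n.size());
    double sum = 0.0;
    for (std::size_t k = 0; k < p.size(); ++k)
    {
        sum += 2.0 * (p[k] + i[k] + n[k]);
    }
    return sum;
}

// RSRQ_i^j = K * RSRP_i / RSSI_j
double
Rsrq(std::size_t numRb, double rsrpNeighbor, double rssiServing)
{
    assert(rssiServing > 0.0);
    return static_cast<double>(numRb) * rsrpNeighbor / rssiServing;
}
```

Here the RSSI is always computed from the quantities of the serving cell j, while the RSRP passed to the RSRQ computation is that of the cell i being measured.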

19.1.6. HARQ

The HARQ scheme implemented is based on a incremental redundancy (IR) solutions combined with multiple stop-and-wait processes for enabling a continuous data flow. In detail, the solution adopted is the soft combining hybrid IR Full incremental redundancy (also called IR Type II), which implies that the retransmissions contain only new information respect to the previous ones. The resource allocation algorithm of the HARQ has been implemented within the respective scheduler classes (i.e., RrFfMacScheduler and PfFfMacScheduler, refer to their correspondent sections for more info), while the decodification part of the HARQ has been implemented in the LteSpectrumPhy and LteHarqPhy classes which will be detailed in this section.

According to the standard, the UL retransmissions are synchronous and are therefore allocated 7 ms after the original transmission. The DL retransmissions, on the other hand, are asynchronous and can therefore be allocated in a more flexible way starting from 7 ms after the original transmission; the exact timing is a matter of the specific scheduler implementation. The behavior of the HARQ processes is depicted in Figure HARQ processes behavior in LTE.

At the MAC layer, the HARQ entity residing in the scheduler is in charge of controlling the 8 HARQ processes for generating new packets and managing the retransmissions, both for the DL and the UL. The scheduler collects the HARQ feedback from the eNB and UE PHY layers (for the UL and DL connection, respectively) by means of the FF API primitives SchedUlTriggerReq and SchedDlTriggerReq. According to the HARQ feedback and the RLC buffer status, the scheduler generates a set of DCIs including both retransmissions of HARQ blocks received in error and new transmissions, in general giving priority to the former. In this regard, the scheduler has to take into consideration one constraint when allocating the resources for HARQ retransmissions: it must use the same modulation order as the first transmission attempt (i.e., QPSK for MCS \in [0..9], 16QAM for MCS \in [10..16] and 64QAM for MCS \in [17..28]). This restriction comes from the specification of the rate matcher in the 3GPP standard [TS36212], where the algorithm fixes the modulation order for generating the different blocks of the redundancy versions.
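
The modulation-order constraint can be sketched as a simple check. The function and enum names below are hypothetical (they are not part of the scheduler classes); only the MCS ranges are taken from the text above.

```cpp
#include <stdexcept>

// Hypothetical names, used only for this illustration.
enum class Modulation
{
    QPSK,
    QAM16,
    QAM64
};

// Maps an MCS index to its modulation order, using the ranges given in the text.
Modulation
ModulationForMcs(int mcs)
{
    if (mcs < 0 || mcs > 28)
    {
        throw std::out_of_range("MCS index outside [0..28]");
    }
    if (mcs <= 9)
    {
        return Modulation::QPSK;
    }
    if (mcs <= 16)
    {
        return Modulation::QAM16;
    }
    return Modulation::QAM64;
}

// A retransmission is acceptable only if it keeps the modulation order
// of the first transmission attempt.
bool
IsValidRetxMcs(int firstMcs, int retxMcs)
{
    return ModulationForMcs(firstMcs) == ModulationForMcs(retxMcs);
}
```

For instance, a block first sent with MCS 12 could be retransmitted with MCS 10 (both 16QAM) but not with MCS 9 (QPSK).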

The PHY error model (i.e., the LteMiErrorModel class already presented before) has been extended to consider IR HARQ according to [wimaxEmd], where the parameters for mapping the AWGN curves in MIESM in case of retransmissions are given by:

R_{eff} = \frac{X}{\sum\limits_{i=1}^q C_i}

M_{I eff} = \frac{\sum\limits_{i=1}^q C_i M_i}{\sum\limits_{i=1}^q C_i}

where X is the number of original information bits, C_i is the number of coded bits and M_i is the mutual information of each HARQ block received, over a total of q retransmissions. To return the error probability, the error model implemented in the simulator evaluates R_{eff} and M_{I eff} and returns the error probability of the ECR of the same modulation with the closest lower rate with respect to R_{eff}. In order to consider the effect of HARQ retransmissions, a new set of curves has been integrated in addition to the standard ones used for the original MCSs. The new curves are intended to cover the cases in which the most conservative MCS of a modulation is used, which yields an R_{eff} lower than that of the standard MCSs. In this regard, the curves for 1, 2 and 3 retransmissions have been evaluated for MCS 10 and 17. For MCS 0 we considered only the first retransmission, since the resulting code rate is already very conservative (i.e., 0.04) and gives an error rate robust enough for reception (i.e., the downturn of the BLER is centered around -18 dB). Note that the first TB transmission is assumed to contain all the information bits to be coded; therefore X is equal to the size of the first TB sent by a HARQ process. The model assumes that the possible presence of parity bits in the codewords is already considered in the link-level curves. This implies that, as soon as the minimum R_{eff} is reached, the model does not include the gain due to the transmission of further parity bits.

_images/lte-harq-processes-scheme.png

HARQ processes behavior in LTE

The part of HARQ devoted to the decoding of the HARQ blocks has been implemented in the LteHarqPhy and LteSpectrumPhy classes. The former is in charge of maintaining the HARQ information for each active process. The latter interacts with the LteMiErrorModel class to evaluate the correctness of the blocks received, and includes the messaging algorithm in charge of communicating the result of the decoding to the HARQ entity in the scheduler. These messages are encapsulated in the dlInfoListElement for DL and the ulInfoListElement for UL, and are sent through the PUCCH and the PHICH respectively, with an ideal error-free model according to the assumptions in their implementation. A sketch of the interaction between HARQ and the LTE protocol stack is represented in Figure Interaction between HARQ and LTE protocol stack.

Finally, the HARQ engine is always active at both the MAC and PHY layers; however, if the scheduler does not support HARQ, the system will continue to work with the HARQ functions inhibited (i.e., buffers are filled but not used). This implementation characteristic provides backward compatibility with schedulers implemented before the HARQ integration.

_images/lte-harq-architecture.png

Interaction between HARQ and LTE protocol stack

19.1.7. MAC

19.1.7.1. Resource Allocation Model

We now briefly describe how resource allocation is handled in LTE, clarifying how it is modeled in the simulator. The scheduler is in charge of generating specific structures called Downlink Control Information (DCI), which are then transmitted by the PHY of the eNB to the connected UEs in order to inform them of the resource allocation on a per-subframe basis. To do this in the downlink direction, the scheduler has to fill specific fields of the DCI structure with the required information, such as the Modulation and Coding Scheme (MCS) to be used, the MAC Transport Block (TB) size, and the allocation bitmap, which identifies which RBs will contain the data transmitted by the eNB to each user.

For the mapping of resources to physical RBs, we adopt a localized mapping approach (see [Sesia2009], Section 9.2.2.1); hence in a given subframe each RB is always allocated to the same user in both slots. The allocation bitmap can be coded in different formats; in this implementation, we considered the Allocation Type 0 defined in [TS36213], according to which the RBs are grouped in Resource Block Groups (RBG) of different size determined as a function of the Transmission Bandwidth Configuration in use.

For certain bandwidth values not all the RBs are usable, since the RBG size is not a divisor of the number of RBs. This is for instance the case when the bandwidth is equal to 25 RBs, which results in an RBG size of 2 RBs, and therefore 1 RB is not addressable. In uplink the format of the DCIs is different, since only adjacent RBs can be used because of the SC-FDMA modulation. As a consequence, all RBs can be allocated by the eNB regardless of the bandwidth configuration.
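
The 25 RB example can be reproduced with a short sketch. The RBG sizes below follow the Allocation Type 0 table of [TS36213] (Table 7.1.6.1-1); the function names are hypothetical, and the simplification that any RBs left over after grouping are not addressable follows the description above rather than the specification.

```cpp
// RBG size P as a function of the downlink bandwidth in RBs
// (values from TS 36.213, Table 7.1.6.1-1).
int
RbgSizeForBandwidth(int nRb)
{
    if (nRb <= 10)
    {
        return 1;
    }
    if (nRb <= 26)
    {
        return 2;
    }
    if (nRb <= 63)
    {
        return 3;
    }
    return 4;
}

// Number of complete RBGs that the allocation bitmap can address.
int
NumberOfRbgs(int nRb)
{
    return nRb / RbgSizeForBandwidth(nRb);
}

// RBs left over when the RBG size does not divide the bandwidth.
int
UnaddressableRbs(int nRb)
{
    return nRb % RbgSizeForBandwidth(nRb);
}
```

With 25 RBs this gives 12 RBGs of 2 RBs and 1 RB left over, whereas 100 RBs are fully covered by 25 RBGs of 4 RBs.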

19.1.7.2. Adaptive Modulation and Coding

The simulator provides two Adaptive Modulation and Coding (AMC) models: one based on the GSoC model [Piro2011] and one based on the physical error model (described in the following sections).

The former model is a modified version of the model described in [Piro2011], which in turn is inspired from [Seo2004]. Our version is described in the following. Let i denote the generic user, and let \gamma_i be its SINR. We get the spectral efficiency \eta_i of user i using the following equations:

\mathrm{BER} = 0.00005

\Gamma = \frac{ -\ln{ (5 * \mathrm{BER}) } }{ 1.5}

\eta_i = \log_2 { \left( 1 + \frac{ {\gamma}_i }{ \Gamma } \right)}

The procedure described in [R1-081483] is used to get the corresponding MCS scheme. The spectral efficiency is quantized based on the channel quality indicator (CQI), rounding to the lowest value, and is mapped to the corresponding MCS scheme.

Finally, we note that there are some discrepancies between the MCS index in [R1-081483] and that indicated by the standard: [TS36213] Table 7.1.7.1-1 says that the MCS index goes from 0 to 31, and 0 appears to be a valid MCS scheme (TB size is not 0) but in [R1-081483] the first useful MCS index is 1. Hence to get the value as intended by the standard we need to subtract 1 from the index reported in [R1-081483].
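
The spectral efficiency computation of the first AMC model can be written directly from the three equations above. This is a minimal sketch: the function names are hypothetical, the SINR is assumed to be given in linear units, and the subsequent quantization to a CQI and mapping to an MCS are not shown.

```cpp
#include <cmath>

// Gamma = -ln(5 * BER) / 1.5, with BER = 0.00005 as in the text.
double
SnrGap(double ber = 0.00005)
{
    return -std::log(5.0 * ber) / 1.5;
}

// eta_i = log2(1 + gamma_i / Gamma), with gamma_i the linear SINR of user i.
double
SpectralEfficiency(double sinrLinear, double ber = 0.00005)
{
    return std::log2(1.0 + sinrLinear / SnrGap(ber));
}
```

With the target BER used in the text, \Gamma evaluates to roughly 5.53, so a user whose SINR equals \Gamma obtains a spectral efficiency of 1 bit/s/Hz.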

The alternative model is based on the physical error model developed for this simulator and explained in the following subsections. This scheme is able to adapt the MCS selection to the actual PHY layer performance according to the specific CQI report. By definition, a CQI index is assigned when a single PDSCH TB with the modulation and coding scheme and code rate corresponding to that CQI index in Table 7.2.3-1 of [TS36213] can be received with an error probability lower than 0.1. In the case of wideband CQIs, the reference TB includes all the available RBGs, in order to have a reference based on the whole available resources; for subband CQIs, the reference TB has the size of one RBG.

19.1.7.3. Transport Block model

The model of the MAC Transport Blocks (TBs) provided by the simulator is simplified with respect to the 3GPP specifications. In particular, a simulator-specific class (PacketBurst) is used to aggregate MAC SDUs in order to achieve the simulator’s equivalent of a TB, without the corresponding implementation complexity. The multiplexing of different logical channels to and from the RLC layer is performed using a dedicated packet tag (LteRadioBearerTag), which performs a functionality which is partially equivalent to that of the MAC headers specified by 3GPP.

19.1.7.4. The FemtoForum MAC Scheduler Interface

This section describes the ns-3 specific version of the LTE MAC Scheduler Interface Specification published by the FemtoForum [FFAPI].

We implemented the ns-3 specific version of the FemtoForum MAC Scheduler Interface [FFAPI] as a set of C++ abstract classes; in particular, each primitive is translated to a C++ method of a given class. The term implemented here is used with the same meaning adopted in [FFAPI], and hence refers to the process of translating the logical interface specification to a particular programming language. The primitives in [FFAPI] are grouped in two groups: the CSCHED primitives, which deal with scheduler configuration, and the SCHED primitives, which deal with the execution of the scheduler. Furthermore, [FFAPI] defines primitives of two different kinds: those of type REQ go from the MAC to the Scheduler, and those of type IND/CNF go from the scheduler to the MAC. To translate these characteristics into C++, we define the following abstract classes that implement Service Access Points (SAPs) to be used to issue the primitives:

  • the FfMacSchedSapProvider class defines all the C++ methods that correspond to SCHED primitives of type REQ;

  • the FfMacSchedSapUser class defines all the C++ methods that correspond to SCHED primitives of type CNF/IND;

  • the FfMacCschedSapProvider class defines all the C++ methods that correspond to CSCHED primitives of type REQ;

  • the FfMacCschedSapUser class defines all the C++ methods that correspond to CSCHED primitives of type CNF/IND;

There are 3 blocks involved in the MAC Scheduler interface: Control block, Subframe block and Scheduler block. Each of these blocks provides one part of the MAC Scheduler interface. The figure below shows the relationship between the blocks and the SAPs defined in our implementation of the MAC Scheduler Interface.

_images/ff-mac-saps.png

In addition to the above principles, the following design choices have been taken:

  • The definition of the MAC Scheduler interface classes follows the naming conventions of the ns-3 Coding Style. In particular, we follow the CamelCase convention for the primitive names. For example, the primitive CSCHED_CELL_CONFIG_REQ is translated to CschedCellConfigReq in the ns-3 code.

  • The same naming conventions are followed for the primitive parameters. As the primitive parameters are member variables of classes, they are also prefixed with m_.

  • Regarding the use of vectors and lists in data structures, we note that [FFAPI] is largely a C-oriented API. However, considering that C++ is used in ns-3, and that the use of C arrays is discouraged, we used STL vectors (std::vector) for the implementation of the MAC Scheduler Interface, instead of using C arrays as implicitly suggested by the way [FFAPI] is written.

  • In C++, members with constructors and destructors are not allowed in unions. Hence all those data structures that are said to be unions in [FFAPI] have been defined as structs in our code.

The figure below shows how the MAC Scheduler Interface is used within the eNB.

_images/ff-example.png

The User side of both the CSCHED SAP and the SCHED SAP is implemented within the eNB MAC, i.e., in the file lte-enb-mac.cc. The eNB MAC can be used with different scheduler implementations without modifications. The same figure also shows, as an example, how the Round Robin Scheduler is implemented: to interact with the MAC of the eNB, the Round Robin scheduler implements the Provider side of the SCHED SAP and CSCHED SAP interfaces. A similar approach can be used to implement other schedulers as well. Each of the scheduler implementations that we provide as part of our LTE simulation module is described in the following subsections.
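
The Provider/User split can be illustrated with a heavily simplified sketch of one REQ primitive and one IND primitive. The method names mirror the FF primitives SCHED_DL_TRIGGER_REQ and SCHED_DL_CONFIG_IND after CamelCase conversion, but the class names, the parameter types (an int and a vector of UE indices) and the ToyScheduler and ToyMac classes are assumptions made for this example; they are not the signatures used in the ns-3 code.

```cpp
#include <vector>

// User side: implemented by the MAC, called by the scheduler (CNF/IND primitives).
class ToySchedSapUser
{
  public:
    virtual ~ToySchedSapUser() = default;
    // Simplified stand-in for SCHED_DL_CONFIG_IND: owner UE index of each RBG.
    virtual void SchedDlConfigInd(const std::vector<int>& rbgOwners) = 0;
};

// Provider side: implemented by the scheduler, called by the MAC (REQ primitives).
class ToySchedSapProvider
{
  public:
    virtual ~ToySchedSapProvider() = default;
    // Simplified stand-in for SCHED_DL_TRIGGER_REQ.
    virtual void SchedDlTriggerReq(int subframe) = 0;
};

// A scheduler implements the Provider side and answers through the User side.
class ToyScheduler : public ToySchedSapProvider
{
  public:
    explicit ToyScheduler(ToySchedSapUser& user)
        : m_user(user)
    {
    }

    void SchedDlTriggerReq(int subframe) override
    {
        // Trivial decision for illustration: 4 RBGs, all given to UE (subframe % 2).
        m_user.SchedDlConfigInd(std::vector<int>(4, subframe % 2));
    }

  private:
    ToySchedSapUser& m_user;
};

// The MAC implements the User side and stores the last allocation it was given.
class ToyMac : public ToySchedSapUser
{
  public:
    void SchedDlConfigInd(const std::vector<int>& rbgOwners) override
    {
        m_lastAllocation = rbgOwners;
    }

    std::vector<int> m_lastAllocation;
};
```

Because the MAC only depends on the Provider interface, a different scheduler can be plugged in without touching the MAC, which is the property the text attributes to the eNB MAC.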

19.1.7.4.1. Round Robin (RR) Scheduler

The Round Robin (RR) scheduler is probably the simplest scheduler found in the literature. It works by dividing the available resources among the active flows, i.e., those logical channels which have a non-empty RLC queue. If the number of RBGs is greater than the number of active flows, all the flows can be allocated in the same subframe. Otherwise, if the number of active flows is greater than the number of RBGs, not all the flows can be scheduled in a given subframe; then, in the next subframe the allocation will start from the last flow that was not allocated. The MCS to be adopted for each user is selected according to the received wideband CQIs.
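
The allocation rule can be sketched as follows. This is not the RrFfMacScheduler code: the function name is hypothetical, flows are identified by their index in the list of active flows, an equal share of RBGs per flow is assumed when all flows fit in the subframe, and "the last flow that was not allocated" is interpreted as the first flow left unserved in the previous subframe.

```cpp
#include <cstddef>
#include <vector>

// Returns the flow index that owns each allocated RBG.
// nextFlow is the persistent round-robin pointer, updated across subframes.
std::vector<std::size_t>
RoundRobinAllocate(std::size_t nRbg, std::size_t nActiveFlows, std::size_t& nextFlow)
{
    std::vector<std::size_t> owner;
    if (nActiveFlows == 0)
    {
        return owner;
    }
    if (nActiveFlows <= nRbg)
    {
        // Every flow is served: each one gets floor(nRbg / nActiveFlows) RBGs.
        // Any remaining RBGs are left unassigned in this sketch.
        const std::size_t perFlow = nRbg / nActiveFlows;
        for (std::size_t f = 0; f < nActiveFlows; ++f)
        {
            for (std::size_t r = 0; r < perFlow; ++r)
            {
                owner.push_back(f);
            }
        }
        return owner;
    }
    // More flows than RBGs: one RBG per flow, resuming where the last subframe stopped.
    nextFlow %= nActiveFlows;
    for (std::size_t r = 0; r < nRbg; ++r)
    {
        owner.push_back(nextFlow);
        nextFlow = (nextFlow + 1) % nActiveFlows;
    }
    return owner;
}
```

With 2 RBGs and 5 active flows, three consecutive subframes serve flows (0, 1), (2, 3) and (4, 0).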

As far as HARQ is concerned, RR implements the non-adaptive version, which implies that, when allocating the retransmission attempts, RR uses the same allocation configuration as the original block, i.e., it maintains the same RBGs and MCS. UEs that are allocated for HARQ retransmissions are not considered for the transmission of new data in case they have a transmission opportunity available in the same TTI. Finally, HARQ can be disabled through the ns-3 attribute system in order to maintain backward compatibility with old test cases and code, as follows:

Config::SetDefault("ns3::RrFfMacScheduler::HarqEnabled", BooleanValue(false));

The scheduler filters the uplink CQIs according to their nature by means of the UlCqiFilter attribute, in detail:

  • SRS_UL_CQI: only SRS based CQI are stored in the internal attributes.

  • PUSCH_UL_CQI: only PUSCH based CQI are stored in the internal attributes.

19.1.7.4.2. Proportional Fair (PF) Scheduler

The Proportional Fair (PF) scheduler [Sesia2009] works by scheduling a user when its instantaneous channel quality is high relative to its own average channel condition over time. Let i,j denote generic users; let t be the subframe index, and k be the resource block index; let M_{i,k}(t) be the MCS usable by user i on resource block k as reported by the AMC model (see Adaptive Modulation and Coding); finally, let S(M, B) be the TB size in bits as defined in [TS36213] for the case where a number B of resource blocks is used. The achievable rate R_{i}(k,t) in bit/s for user i on resource block group k at subframe t is defined as

R_{i}(k,t) =  \frac{S\left( M_{i,k}(t), 1\right)}{\tau}

where \tau is the TTI duration. At the start of each subframe t, each RBG is assigned to a certain user. In detail, the index \widehat{i}_{k}(t) to which RBG k is assigned at time t is determined as

\widehat{i}_{k}(t) = \underset{j=1,...,N}{\operatorname{argmax}}
 \left( \frac{ R_{j}(k,t) }{ T_\mathrm{j}(t) } \right)

where T_{j}(t) is the past throughput performance perceived by the user j. According to the above scheduling algorithm, a user can be allocated to different RBGs, which can be either adjacent or not, depending on the current condition of the channel and the past throughput performance T_{j}(t). The latter is determined at the end of the subframe t using the following exponential moving average approach:

T_{j}(t) =
(1-\frac{1}{\alpha})T_{j}(t-1)
+\frac{1}{\alpha} \widehat{T}_{j}(t)

where \alpha is the time constant (in number of subframes) of the exponential moving average, and \widehat{T}_{j}(t) is the actual throughput achieved by the user j in the subframe t. \widehat{T}_{j}(t) is measured according to the following procedure. First, we determine the MCS \widehat{M}_j(t) actually used by user j:

\widehat{M}_j(t) = \min_{k: \widehat{i}_{k}(t) = j}{M_{j,k}(t)}

then we determine the total number \widehat{B}_j(t) of RBGs allocated to user j:

\widehat{B}_j(t) = \left| \{ k :  \widehat{i}_{k}(t) = j \} \right|

where |\cdot| indicates the cardinality of the set; finally,

\widehat{T}_{j}(t) = \frac{S\left( \widehat{M}_j(t), \widehat{B}_j(t)
\right)}{\tau}

As far as HARQ is concerned, PF implements the non-adaptive version, which implies that, when allocating the retransmission attempts, the scheduler uses the same allocation configuration as the original block, i.e., it maintains the same RBGs and MCS. UEs that are allocated for HARQ retransmissions are not considered for the transmission of new data in case they have a transmission opportunity available in the same TTI. Finally, HARQ can be disabled through the ns-3 attribute system in order to maintain backward compatibility with old test cases and code, as follows:

Config::SetDefault("ns3::PfFfMacScheduler::HarqEnabled", BooleanValue(false));

19.1.7.4.3. Maximum Throughput (MT) Scheduler

The Maximum Throughput (MT) scheduler [FCapo2012] aims to maximize the overall throughput of the eNB. It allocates each RB to the user that can achieve the maximum rate in the current TTI. Currently, the MT scheduler in ns-3 has two versions: frequency domain (FDMT) and time domain (TDMT). In FDMT, in every TTI the MAC scheduler allocates RBGs to the UE that has the highest achievable rate calculated from the subband CQI. In TDMT, in every TTI the MAC scheduler selects the one UE that has the highest achievable rate calculated from the wideband CQI, and then allocates all RBGs to this UE in the current TTI. The achievable rate in FDMT and TDMT is calculated in the same way as in PF. Let i,j denote generic users; let t be the subframe index, and k be the resource block index; let M_{i,k}(t) be the MCS usable by user i on resource block k as reported by the AMC model (see Adaptive Modulation and Coding); finally, let S(M, B) be the TB size in bits as defined in [TS36213] for the case where a number B of resource blocks is used. The achievable rate R_{i}(k,t) in bit/s for user i on resource block k at subframe t is defined as

R_{i}(k,t) =  \frac{S\left( M_{i,k}(t), 1\right)}{\tau}

where \tau is the TTI duration. At the start of each subframe t, each RB is assigned to a certain user. In detail, the index \widehat{i}_{k}(t) to which RB k is assigned at time t is determined as

\widehat{i}_{k}(t) = \underset{j=1,...,N}{\operatorname{argmax}}
    \left( { R_{j}(k,t) } \right)

When several UEs have the same achievable rate, the current implementation always selects the first UE created in the script. Although MT can maximize the cell throughput, it cannot provide fairness to UEs in poor channel conditions.
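
The MT decision for one RB, including the tie-breaking rule, reduces to an argmax that keeps the earliest candidate. In this sketch the function name is hypothetical, the rate vector is assumed to be non-empty, and the UEs are assumed to be indexed in the order in which they were created.

```cpp
#include <cstddef>
#include <vector>

// rate[j] = R_j(k,t) for a fixed resource block k.
// Returns the index of the UE with the highest achievable rate.
std::size_t
MaxThroughputPick(const std::vector<double>& rate)
{
    std::size_t best = 0;
    for (std::size_t j = 1; j < rate.size(); ++j)
    {
        // Strict comparison: on a tie the earlier (first created) UE is kept.
        if (rate[j] > rate[best])
        {
            best = j;
        }
    }
    return best;
}
```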

19.1.7.4.4. Throughput to Average (TTA) Scheduler

The Throughput to Average (TTA) scheduler [FCapo2012] can be considered as an intermediate between MT and PF. The metric used in TTA is calculated as follows:

\widehat{i}_{k}(t) = \underset{j=1,...,N}{\operatorname{argmax}}
 \left( \frac{ R_{j}(k,t) }{ R_{j}(t) } \right)

Here, R_{i}(k,t) in bit/s represents the achievable rate for user i on resource block k at subframe t; its calculation has already been shown for MT and PF. R_{i}(t) in bit/s, in turn, stands for the achievable rate for user i at subframe t. The difference between these two achievable rates lies in how the MCS is obtained: for R_{i}(k,t) the MCS is calculated from the subband CQI, while for R_{i}(t) it is calculated from the wideband CQI. The TTA scheduler can only be implemented in the frequency domain (FD), because the achievable rate of a particular RBG is only related to FD scheduling.

19.1.7.4.5. Blind Average Throughput Scheduler

The Blind Average Throughput scheduler [FCapo2012] aims to provide equal throughput to all UEs under the eNB. The metric used in BET is calculated as follows:

\widehat{i}_{k}(t) = \underset{j=1,...,N}{\operatorname{argmax}}
 \left( \frac{ 1 }{ T_\mathrm{j}(t) } \right)

where T_{j}(t) is the past throughput performance perceived by user j, which can be calculated with the same method used in the PF scheduler. In the time domain blind average throughput (TD-BET), the scheduler selects the UE with the largest priority metric and allocates all RBGs to this UE. In the frequency domain blind average throughput (FD-BET), on the other hand, in every TTI the scheduler first selects the UE with the lowest pastAverageThroughput (i.e., the largest priority metric). The scheduler then assigns one RBG to this UE, calculates the expected throughput of this UE and compares it with the past average throughput T_{j}(t) of the other UEs. The scheduler continues to allocate RBGs to this UE until its expected throughput is no longer the smallest among the past average throughputs T_{j}(t) of all UEs. The scheduler then proceeds in the same way to allocate RBGs to a new UE with the lowest past average throughput T_{j}(t), until all RBGs are allocated to UEs. The principle behind this is that, in every TTI, the scheduler tries its best to achieve equal throughput among all UEs.
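
The FD-BET loop can be sketched as below, under several simplifying assumptions that are not taken from the scheduler code: each UE j carries a fixed number of bits per RBG (bitsPerRbg[j]), the expected throughput is obtained with the same exponential moving average used by PF, and a UE that has already received RBGs in the current TTI is compared through its expected throughput rather than its past one. The function name is hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Returns the UE index that owns each RBG of the TTI.
// pastThr[j] = T_j(t) in bit/s, bitsPerRbg[j] = bits UE j can send in one RBG,
// alpha = moving average time constant, tau = TTI duration in seconds.
std::vector<std::size_t>
FdBetAllocate(std::size_t nRbg,
              const std::vector<double>& pastThr,
              const std::vector<double>& bitsPerRbg,
              double alpha,
              double tau)
{
    const std::size_t nUes = pastThr.size();
    std::vector<double> thr = pastThr; // becomes the expected throughput once a UE is served
    std::vector<std::size_t> nAlloc(nUes, 0);
    std::vector<std::size_t> owner;
    if (nUes == 0)
    {
        return owner;
    }
    for (std::size_t r = 0; r < nRbg; ++r)
    {
        // UE with the lowest (expected) throughput, first index on ties.
        std::size_t j = 0;
        for (std::size_t u = 1; u < nUes; ++u)
        {
            if (thr[u] < thr[j])
            {
                j = u;
            }
        }
        owner.push_back(j);
        ++nAlloc[j];
        // Expected throughput of UE j if it keeps the RBGs assigned so far.
        const double achieved = nAlloc[j] * bitsPerRbg[j] / tau;
        thr[j] = (1.0 - 1.0 / alpha) * pastThr[j] + (1.0 / alpha) * achieved;
    }
    return owner;
}
```

A UE keeps receiving RBGs only while its expected throughput remains the lowest, which is what drives the allocation towards equal throughput.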

19.1.7.4.6. Token Bank Fair Queue Scheduler

Token Bank Fair Queue (TBFQ) is a QoS-aware scheduler derived from the leaky-bucket mechanism. In TBFQ, the traffic flow of user i is characterized by the following parameters:

  • t_{i}: packet arrival rate (byte/sec)

  • r_{i}: token generation rate (byte/sec)

  • p_{i}: token pool size (byte)

  • E_{i}: counter that records the number of tokens borrowed from or given to the token bank by flow i; E_{i} can be smaller than zero

Each K bytes of data consume k tokens. TBFQ also maintains a shared token bank (B) in order to balance the traffic between different flows. If the token generation rate r_{i} is larger than the packet arrival rate t_{i}, the tokens overflowing from the token pool are added to the token bank, and E_{i} is increased by the same amount. Otherwise, flow i needs to withdraw tokens from the token bank based on the priority metric \frac{E_{i}}{r_{i}}, and E_{i} is decreased. Consequently, a user that has contributed more to the token bank has a higher priority to borrow tokens, while a user that has borrowed more tokens from the bank has a lower priority to continue withdrawing tokens. Therefore, when several users have the same token generation rate, traffic rate and token pool size, a user suffering from higher interference has more opportunity to borrow tokens from the bank. In addition, TBFQ can police the traffic by setting the token generation rate so as to limit the throughput. TBFQ also maintains the following three parameters for each flow:

  • Debt limit d_{i}: if E_{i} falls below this threshold, user i cannot borrow further tokens from the bank. This prevents a malicious UE from borrowing too many tokens.

  • Credit limit c_{i}: the maximum number of tokens that UE i can borrow from the bank at one time.

  • Credit threshold C: once E_{i} reaches the debt limit, UE i must store C tokens in the bank before it can borrow tokens from the bank again.

LTE in ns-3 has two versions of the TBFQ scheduler: frequency domain TBFQ (FD-TBFQ) and time domain TBFQ (TD-TBFQ). In FD-TBFQ, the scheduler always selects the UE with the highest metric and allocates the RBG with the highest subband CQI, until there are no packets left in the UE’s RLC buffer or all RBGs are allocated [FABokhari2009]. In TD-TBFQ, after selecting the UE with the maximum metric, the scheduler allocates all RBGs to this UE using the wideband CQI [WKWong2004].

19.1.7.4.7. Priority Set Scheduler

Priority set scheduler (PSS) is a QoS aware scheduler which combines time domain (TD) and frequency domain (FD) packet scheduling operations into one scheduler [GMonghal2008]. It controls the fairness among UEs by a specified Target Bit Rate (TBR).

In the TD scheduler part, PSS first selects the UEs with a non-empty RLC buffer and then divides them into two sets based on the TBR:

  • set 1: UE whose past average throughput is smaller than TBR; TD scheduler calculates their priority metric in Blind Equal Throughput (BET) style:

\widehat{i}_{k}(t) = \underset{j=1,...,N}{\operatorname{argmax}}
 \left( \frac{ 1 }{ T_\mathrm{j}(t) } \right)

  • set 2: UE whose past average throughput is larger than or equal to TBR; TD scheduler calculates their priority metric in Proportional Fair (PF) style:

\widehat{i}_{k}(t) = \underset{j=1,...,N}{\operatorname{argmax}}
 \left( \frac{ R_{j}(k,t) }{ T_\mathrm{j}(t) } \right)

UEs belonging to set 1 have higher priority than those in set 2. PSS then selects the N_{mux} UEs with the highest metric from the two sets and forwards them to the FD scheduler. In PSS, the FD scheduler allocates RBG k to the UE n that maximizes the chosen metric. Two metrics are available in the FD scheduler:

  • Proportional Fair scheduled (PFsch)

\widehat{Msch}_{k}(t) = \underset{j=1,...,N}{\operatorname{argmax}}
 \left( \frac{ R_{j}(k,t) }{ Tsch_\mathrm{j}(t) } \right)

  • Carrier over Interference to Average (CoIta)

\widehat{Mcoi}_{k}(t) = \underset{j=1,...,N}{\operatorname{argmax}}
 \left( \frac{ CoI[j,k] }{ \sum_{k'=0}^{N_{RBG}} CoI[j,k'] } \right)

where Tsch_{j}(t) is similar to the past throughput performance perceived by user j, with the difference that it is updated only when the j-th user is actually served. CoI[j,k] is an estimate of the SINR on RBG k of UE j. Both PFsch and CoIta serve to decouple the FD metric from the TD scheduler. In addition, the PSS FD scheduler provides a weight metric W[n] to help control fairness when the number of UEs is low.

W[n] =  max (1, \frac{TBR}{ T_{j}(t) })

where T_{j}(t) is the past throughput performance perceived by user j. Therefore, on RBG k, the FD scheduler selects the UE j that maximizes the product of the frequency domain metric (Msch, MCoI) and the weight W[n]. This strategy ensures that the throughput of lower-quality UEs tends towards the TBR.

Config::SetDefault("ns3::PfFfMacScheduler::HarqEnabled", BooleanValue(false));

The scheduler filters the uplink CQIs according to their nature by means of the UlCqiFilter attribute, in detail:

  • SRS_UL_CQI: only SRS based CQI are stored in the internal attributes.

  • PUSCH_UL_CQI: only PUSCH based CQI are stored in the internal attributes.
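
The weighting step of the PSS FD scheduler described above can be sketched as follows. The function names are hypothetical; fdMetric[j] is assumed to already hold the chosen frequency domain metric (PFsch or CoIta) of UE j on the RBG under consideration, and pastThr[j] is assumed to be strictly positive.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// W = max(1, TBR / T_j(t)): UEs below the target bit rate get a weight above 1.
double
PssWeight(double tbr, double pastThr)
{
    return std::max(1.0, tbr / pastThr);
}

// Returns the UE maximizing (FD metric) x (weight) on one RBG.
// Ties are resolved in favor of the lowest index (an assumption of this sketch).
std::size_t
PssPickUe(const std::vector<double>& fdMetric, const std::vector<double>& pastThr, double tbr)
{
    std::size_t best = 0;
    double bestValue = -1.0;
    for (std::size_t j = 0; j < fdMetric.size(); ++j)
    {
        const double value = fdMetric[j] * PssWeight(tbr, pastThr[j]);
        if (value > bestValue)
        {
            bestValue = value;
            best = j;
        }
    }
    return best;
}
```

In the example used for the check below, a UE with a lower FD metric (0.6 against 1.0) still wins the RBG because its past throughput is four times lower than the TBR.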

19.1.7.4.8. Channel and QoS Aware Scheduler

The Channel and QoS Aware (CQA) Scheduler [Bbojovic2014] is an LTE MAC downlink scheduling algorithm that considers the head of line (HOL) delay, the GBR parameters and channel quality over different subbands. The CQA scheduler is based on joint TD and FD scheduling.

In the TD (at each TTI) the CQA scheduler groups users by priority. The purpose of the grouping is to force the FD scheduling to consider first the flows with the highest HOL delay. The grouping metric m_{td} for user j=1,...,N is defined in the following way:

m_{td}^{j}(t) = \lceil\frac{d_{hol}^{j}(t)}{g}\rceil \;,

where d_{hol}^{j}(t) is the current value of the HOL delay of flow j, and g is a grouping parameter that determines the granularity of the groups, i.e., the number of flows that will be considered in the FD scheduling iteration.

The groups of flows selected in the TD iteration are forwarded to the FD scheduling starting from the flows with the highest value of the m_{td} metric until all RBGs are assigned in the corresponding TTI. In the FD, for each RBG k=1,...,K, the CQA scheduler assigns the current RBG to the user j that has the maximum value of the FD metric which we define in the following way:

m_{fd}^{(k,j)}(t) = d_{hol}^{j}(t) \cdot m_{GBR}^j(t) \cdot m_{ca}^{(k,j)}(t) \;,

where m_{GBR}^j(t) is calculated as follows:

m_{GBR}^j(t)=\frac{GBR^j}{\overline{R^j}(t)}=\frac{GBR^j}{(1-\alpha)\cdot\overline{R^j}(t-1)+\alpha \cdot r^j(t)} \;,

where GBR^j is the bit rate specified in the EPS bearer of flow j, \overline{R^j}(t) is the past averaged throughput that is calculated with a moving average, r^{j}(t) is the throughput achieved at the time t, and \alpha is a coefficient such that 0 \le \alpha \le 1.

For m_{ca}^{(k,j)}(t) we consider two different metrics: m_{pf}^{(k,j)}(t) and m_{ff}^{(k,j)}(t). m_{pf} is the Proportional Fair metric which is defined as follows:

m_{pf}^{(k,j)}(t) = \frac{R_e^{(k,j)}(t)}{\overline{R^j}(t)} \;,

where R_e^{(k,j)}(t) is the estimated achievable throughput of user j over RBG k calculated by the Adaptive Modulation and Coding (AMC) scheme that maps the channel quality indicator (CQI) value to the transport block size in bits.

The other channel awareness metric that we consider is m_{ff}, which is proposed in [GMonghal2008]; it represents the frequency selective fading gains over RBG k for user j and is calculated in the following way:

m_{ff}^{(k,j)}(t) = \frac{CQI^{(k,j)}(t)}{\sum_{k'=1}^{K}CQI^{(k',j)}(t)} \;,

where CQI^{(k,j)}(t) is the last reported CQI value from user j for the k-th RBG.

The user can select whether m_{pf} or m_{ff} is used by setting the attribute ns3::CqaFfMacScheduler::CqaMetric respectively to "CqaPf" or "CqaFf".
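
The CQA metrics can be computed with a few small functions, shown here as a sketch. The function names are hypothetical, the inputs (HOL delay, GBR, throughputs, CQI values) are assumed to be already available, and the grouping and per-RBG selection loops of the scheduler are not reproduced.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// m_td = ceil(d_hol / g): TD grouping metric.
double
CqaTdMetric(double holDelay, double g)
{
    return std::ceil(holDelay / g);
}

// m_GBR = GBR / ((1 - alpha) * Rbar(t-1) + alpha * r(t))
double
CqaGbrMetric(double gbr, double prevAvgRate, double lastRate, double alpha)
{
    const double avgRate = (1.0 - alpha) * prevAvgRate + alpha * lastRate;
    return gbr / avgRate;
}

// m_ff for RBG k: reported CQI on k divided by the sum of the CQIs over all RBGs.
double
CqaFfMetric(const std::vector<double>& cqiPerRbg, std::size_t k)
{
    double sum = 0.0;
    for (double c : cqiPerRbg)
    {
        sum += c;
    }
    return cqiPerRbg[k] / sum;
}

// m_fd = d_hol * m_GBR * m_ca, where m_ca is either m_pf or m_ff.
double
CqaFdMetric(double holDelay, double mGbr, double mCa)
{
    return holDelay * mGbr * mCa;
}
```

For each RBG, the scheduler would evaluate CqaFdMetric for every candidate user and assign the RBG to the one with the largest value.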

19.1.7.5. Random Access

The LTE model includes a model of the Random Access procedure based on some simplifying assumptions, which are detailed in the following for each of the messages and signals described in the specs [TS36321].

  • Random Access (RA) preamble: in real LTE systems this corresponds to a Zadoff-Chu (ZC) sequence using one of several formats available and sent in the PRACH slots, which could in principle overlap with PUSCH. PRACH Configuration Index 14 is assumed, i.e., preambles can be sent on any system frame number and subframe number. The RA preamble is modeled using the LteControlMessage class, i.e., as an ideal message that does not consume any radio resources. Collisions of preamble transmissions by multiple UEs in the same cell are modeled using a protocol interference model, i.e., whenever two or more identical preambles are transmitted in the same cell at the same TTI, none of these identical preambles will be received by the eNB. Other than this collision model, no error model is associated with the reception of a RA preamble.

  • Random Access Response (RAR): in real LTE systems, this is a special MAC PDU sent on the DL-SCH. Since MAC control elements are not accurately modeled in the simulator (only RLC and above PDUs are), the RAR is modeled as an LteControlMessage that does not consume any radio resources. Still, during the RA procedure, the LteEnbMac will request from the scheduler the allocation of resources for the RAR using the FF MAC Scheduler primitive SCHED_DL_RACH_INFO_REQ. Hence, an enhanced scheduler implementation (not available at the moment) could allocate radio resources for the RAR, thus modeling the consumption of radio resources for the transmission of the RAR.

  • Message 3: in real LTE systems, this is an RLC TM SDU sent over resources specified in the UL Grant in the RAR. In the simulator, this is modeled as a real RLC TM PDU whose UL resources are allocated by the scheduler upon a call to SCHED_DL_RACH_INFO_REQ.

  • Contention Resolution (CR): in real LTE systems, the CR phase is needed to address the case where two or more UEs sent the same RA preamble in the same TTI, and the eNB was able to detect this preamble in spite of the collision. Since this event does not occur due to the protocol interference model used for the reception of RA preambles, the CR phase is not modeled in the simulator, i.e., the CR MAC CE is never sent by the eNB and the UEs consider the RA to be successful upon reception of the RAR. As a consequence, the radio resources consumed for the transmission of the CR MAC CE are not modeled.
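
The protocol interference model used for the RA preambles can be sketched as a simple filter over the preambles sent in one cell during one TTI. The function name and the use of integer UE and preamble identifiers are assumptions made for this example.

```cpp
#include <map>
#include <utility>
#include <vector>

// tx holds (ueId, preambleId) pairs transmitted in the same cell at the same TTI.
// Returns the UE ids whose preamble is received by the eNB, in transmission order.
std::vector<int>
ReceivedPreambles(const std::vector<std::pair<int, int>>& tx)
{
    // Count how many UEs picked each preamble id.
    std::map<int, int> count;
    for (const auto& t : tx)
    {
        ++count[t.second];
    }
    // A preamble is received only if exactly one UE transmitted it.
    std::vector<int> received;
    for (const auto& t : tx)
    {
        if (count[t.second] == 1)
        {
            received.push_back(t.first);
        }
    }
    return received;
}
```

If UEs 1 and 2 both pick preamble 7 while UE 3 picks preamble 9, only the preamble of UE 3 is received, which is why no contention resolution phase is needed in the model.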

Figures Sequence diagram of the Contention-based MAC Random Access procedure and Sequence diagram of the Non-contention-based MAC Random Access procedure show the sequence diagrams of the contention-based and non-contention-based MAC random access procedures, respectively, highlighting the interactions between the MAC and the other entities.

_images/mac-random-access-contention.png

Sequence diagram of the Contention-based MAC Random Access procedure

_images/mac-random-access-noncontention.png

Sequence diagram of the Non-contention-based MAC Random Access procedure

19.1.8. RLC

19.1.8.1. Overview

The RLC entity is specified in the 3GPP technical specification [TS36322], and comprises three different types of RLC: Transparent Mode (TM), Unacknowledged Mode (UM) and Acknowledged Mode (AM). The simulator includes one model for each of these entities.

The RLC entities provide the RLC service interface to the upper PDCP layer and the MAC service interface to the lower MAC layer. The RLC entities use the PDCP service interface from the upper PDCP layer and the MAC service interface from the lower MAC layer.

Figure Implementation Model of PDCP, RLC and MAC entities and SAPs shows the implementation model of the RLC entities and their relationship with all the other entities and services in the protocol stack.

_images/lte-rlc-implementation-model.png

Implementation Model of PDCP, RLC and MAC entities and SAPs

19.1.8.2. Service Interfaces
19.1.8.2.1. RLC Service Interface

The RLC service interface is divided into two parts:

  • the RlcSapProvider part is provided by the RLC layer and used by the upper PDCP layer and

  • the RlcSapUser part is provided by the upper PDCP layer and used by the RLC layer.

Both the UM and the AM RLC entities provide the same RLC service interface to the upper PDCP layer.

19.1.8.2.1.1. RLC Service Primitives

The following list specifies which service primitives are provided by the RLC service interfaces:

  • RlcSapProvider::TransmitPdcpPdu

    • The PDCP entity uses this primitive to send a PDCP PDU to the lower RLC entity in the transmitter peer

  • RlcSapUser::ReceivePdcpPdu

    • The RLC entity uses this primitive to send a PDCP PDU to the upper PDCP entity in the receiver peer

19.1.8.2.2. MAC Service Interface

The MAC service interface is divided into two parts:

  • the MacSapProvider part is provided by the MAC layer and used by the upper RLC layer and

  • the MacSapUser part is provided by the upper RLC layer and used by the MAC layer.

19.1.8.2.2.1. MAC Service Primitives

The following list specifies which service primitives are provided by the MAC service interfaces:

  • MacSapProvider::TransmitPdu

    • The RLC entity uses this primitive to send an RLC PDU to the lower MAC entity in the transmitter peer

  • MacSapProvider::ReportBufferStatus

    • The RLC entity uses this primitive to report to the MAC entity the size of pending buffers in the transmitter peer

  • MacSapUser::NotifyTxOpportunity

    • The MAC entity uses this primitive to notify the RLC entity of a transmission opportunity

  • MacSapUser::ReceivePdu

    • The MAC entity uses this primitive to send an RLC PDU to the upper RLC entity in the receiver peer

19.1.8.3. AM RLC

The processing of the data transfer in the Acknowledged Mode (AM) RLC entity is explained in section 5.1.3 of [TS36322]. In this section we describe some details of the implementation of the RLC entity.

19.1.8.3.1. Buffers for the transmit operations

Our implementation of the AM RLC entity maintains 3 buffers for the transmit operations:

  • Transmission Buffer: it is the RLC SDU queue. When the AM RLC entity receives an SDU in the TransmitPdcpPdu service primitive from the upper PDCP entity, it enqueues it in the Transmission Buffer. We put a limit on the RLC buffer size, and the LteRlc TxDrop trace source is fired when a drop due to a full buffer occurs.

  • Transmitted PDUs Buffer: it is the queue of transmitted RLC PDUs for which an ACK/NACK has not been received yet. When the AM RLC entity sends a PDU to the MAC entity, it also puts a copy of the transmitted PDU in the Transmitted PDUs Buffer.

  • Retransmission Buffer: it is the queue of RLC PDUs which are considered for retransmission (i.e., they have been NACKed). When a PDU from the Transmitted PDUs Buffer is to be retransmitted, the AM RLC entity moves it to the Retransmission Buffer.
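As a rough illustration of how PDUs and SDUs move between these three buffers, consider the following standalone sketch. It is not the ns-3 implementation: the class and method names are hypothetical, and measuring the buffer limit in bytes is an assumption of this example.

```python
from collections import deque

class AmTxBuffers:
    """Toy model of the three AM RLC transmit-side buffers (hypothetical names)."""

    def __init__(self, max_sdu_bytes):
        self.max_sdu_bytes = max_sdu_bytes  # assumed limit, in bytes
        self.tx_buffer = deque()            # Transmission Buffer: SDU sizes
        self.txed_buffer = {}               # Transmitted PDUs Buffer: sn -> PDU
        self.retx_buffer = {}               # Retransmission Buffer: sn -> PDU
        self.dropped = 0                    # stand-in for the TxDrop trace source

    def enqueue_sdu(self, size):
        """Queue an SDU, or count a drop if the buffer limit would be exceeded."""
        if sum(self.tx_buffer) + size > self.max_sdu_bytes:
            self.dropped += 1
            return False
        self.tx_buffer.append(size)
        return True

    def on_pdu_sent(self, sn, pdu):
        """Keep a copy of every transmitted PDU until it is ACKed or NACKed."""
        self.txed_buffer[sn] = pdu

    def on_ack(self, sn):
        """An ACKed PDU no longer needs to be kept."""
        self.txed_buffer.pop(sn, None)

    def on_nack(self, sn):
        """A NACKed PDU moves to the Retransmission Buffer."""
        if sn in self.txed_buffer:
            self.retx_buffer[sn] = self.txed_buffer.pop(sn)

buffers = AmTxBuffers(max_sdu_bytes=100)
buffers.enqueue_sdu(60)         # accepted
buffers.enqueue_sdu(50)         # rejected: 110 bytes would exceed the limit
buffers.on_pdu_sent(0, "pdu-0")
buffers.on_nack(0)              # pdu-0 is now waiting for retransmission
```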

19.1.8.3.6. Calculation of the buffer size

The Transmission Buffer contains RLC SDUs. An RLC PDU is one or more SDU segments plus an RLC header. The size of the RLC header of one RLC PDU depends on the number of SDU segments the PDU contains.

The 3GPP standard (section 6.1.3.1 of [TS36321]) says clearly that, for the uplink, the RLC and MAC headers are not considered in the buffer size that is to be reported as part of the Buffer Status Report. For the downlink, the behavior is not specified, and [FFAPI] does not specify how to do it either. Our RLC model works by assuming that the calculation of the buffer size in the downlink is done exactly as in the uplink, i.e., not considering the RLC and MAC header size.

We note that this choice affects the interoperation with the MAC scheduler, since, in response to the Notify_Tx_Opportunity service primitive, the RLC is expected to create a PDU of no more than the size requested by the MAC, including RLC overhead. Hence, unneeded fragmentation can occur if (for example) the MAC notifies a transmission exactly equal to the buffer size previously reported by the RLC. We assume that it is left to the Scheduler to implement smart strategies for the selection of the size of the transmission opportunity, in order to eventually avoid the inefficiency of unneeded fragmentation.
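The fragmentation effect described above can be shown with a small worked example. This is a sketch only: the function names are hypothetical, and the 2-byte RLC header is an assumed value chosen for illustration (the actual header size depends on the number of SDU segments in the PDU).

```python
def reported_buffer_size(sdu_sizes):
    """Buffer size reported to the MAC: SDU bytes only, no RLC or MAC headers."""
    return sum(sdu_sizes)

def sdu_bytes_that_fit(tx_opportunity, rlc_header_bytes):
    """SDU payload bytes that fit in one PDU of at most tx_opportunity bytes."""
    return max(tx_opportunity - rlc_header_bytes, 0)

sdus = [100]                                   # one 100-byte SDU is queued
report = reported_buffer_size(sdus)            # the RLC reports 100 bytes
# The scheduler grants exactly the reported size; the header uses part of it.
payload = sdu_bytes_that_fit(report, rlc_header_bytes=2)
leftover = report - payload                    # bytes left for a second PDU
```

In this example the SDU is split into a 98-byte segment and a 2-byte segment, even though a slightly larger transmission opportunity would have carried it in a single PDU.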

19.1.8.3.7. Concatenation and Segmentation

The AM RLC entity generates and sends exactly one RLC PDU for each transmission opportunity even if it is smaller than the size reported by the transmission opportunity. So for instance, if a STATUS PDU is to be sent, then only this PDU will be sent in that transmission opportunity.

The segmentation and concatenation for the SDU queue of the AM RLC entity follows the same philosophy as the same procedures of the UM RLC entity but there are new state variables (see [TS36322] section 7.1) only present in the AM RLC entity.

It is noted that, according to the 3GPP specs, there is no concatenation for the Retransmission Buffer.

19.1.8.3.8. Re-segmentation

The current model of the AM RLC entity does not support the re-segmentation of the retransmission buffer. Rather, the AM RLC entity just waits to receive a big enough transmission opportunity.

19.1.8.3.9. Unsupported features

We do not support the following procedures of [TS36322] :

  • “Send an indication of successful delivery of RLC SDU” (See section 5.1.3.1.1)

  • “Indicate to upper layers that max retransmission has been reached” (See section 5.2.1)

  • “SDU discard procedures” (See section 5.3)

  • “Re-establishment procedure” (See section 5.4)

We do not support any of the additional primitives of RLC SAP for AM RLC entity. In particular:

  • no SDU discard notified by PDCP

  • no notification of successful / failed delivery by AM RLC entity to PDCP entity

19.1.8.4. UM RLC

In this section we describe the implementation of the Unacknowledged Mode (UM) RLC entity.

19.1.8.4.1. Transmit operations in downlink

The transmit operations of the UM RLC are similar to those of the AM RLC previously described in Section Transmit operations in downlink, with the difference that, following the specifications of [TS36322], retransmissions are not performed, and there are no STATUS PDUs.

19.1.8.4.2. Transmit operations in uplink

The transmit operations in the uplink are similar to those of the downlink, with the main difference that the Report_Buffer_Status is sent from the UE MAC to the MAC Scheduler in the eNB over the air using the control channel.

19.1.8.4.3. Calculation of the buffer size

The calculation of the buffer size for the UM RLC is done using the same approach as for the AM RLC; please refer to section Calculation of the buffer size for the corresponding description.

19.1.8.5. TM RLC

In this section we describe the implementation of the Transparent Mode (TM) RLC entity.

19.1.8.5.1. Transmit operations in downlink

In the simulator, the TM RLC still provides to the upper layers the same service interface provided by the AM and UM RLC entities to the PDCP layer; in practice, this interface is used by an RRC entity (not a PDCP entity) for the transmission of RLC SDUs. This choice is motivated by the fact that the services provided by the TM RLC to the upper layers, according to [TS36322], are a subset of those provided by the UM and AM RLC entities to the PDCP layer; hence, we reused the same interface for simplicity.

The transmit operations in the downlink are performed as follows. When the Transmit_PDCP_PDU service primitive is called by the upper layers, the TM RLC does the following:

  • put the SDU in the Transmission Buffer

  • compute the size of the Transmission Buffer

  • call the Report_Buffer_Status service primitive of the eNB MAC entity

Afterwards, when the MAC scheduler decides that some data can be sent by the logical channel to which the TM RLC entity belongs, the MAC entity notifies it to the TM RLC entity by calling the Notify_Tx_Opportunity service primitive. Upon reception of this primitive, the TM RLC entity does the following:

  • if the TX opportunity has a size that is greater than or equal to the size of the head-of-line SDU in the Transmission Buffer

    • dequeue the head-of-line SDU from the Transmission Buffer

    • create one RLC PDU that contains entirely that SDU, without any RLC header

    • call the Transmit_PDU primitive in order to send the RLC PDU to the MAC entity.
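The two sequences of steps above can be sketched as follows. This is a standalone illustration, not the ns-3 class: the class name, method names and callback parameters are hypothetical stand-ins for the Transmit_PDCP_PDU, Report_Buffer_Status, Notify_Tx_Opportunity and Transmit_PDU primitives.

```python
from collections import deque

class TmRlcSketch:
    """Toy TM RLC transmitter following the steps listed above."""

    def __init__(self, report_buffer_status, transmit_pdu):
        self.tx_buffer = deque()                          # Transmission Buffer
        self.report_buffer_status = report_buffer_status  # towards the MAC
        self.transmit_pdu = transmit_pdu                  # towards the MAC

    def transmit_pdcp_pdu(self, sdu):
        """Queue the SDU, then report the total buffer size to the MAC."""
        self.tx_buffer.append(sdu)
        self.report_buffer_status(sum(len(s) for s in self.tx_buffer))

    def notify_tx_opportunity(self, nbytes):
        """Send the head-of-line SDU only if it fits entirely."""
        if self.tx_buffer and nbytes >= len(self.tx_buffer[0]):
            sdu = self.tx_buffer.popleft()
            self.transmit_pdu(sdu)  # the PDU is the SDU itself: no RLC header

reports, sent = [], []
rlc = TmRlcSketch(reports.append, sent.append)
rlc.transmit_pdcp_pdu(b"abcd")   # reports a buffer size of 4 bytes
rlc.notify_tx_opportunity(3)     # too small: nothing is sent
rlc.notify_tx_opportunity(4)     # large enough: the whole SDU is sent
```

Since no segmentation is performed, a transmission opportunity smaller than the head-of-line SDU is simply left unused in this sketch.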

19.1.8.5.2. Transmit operations in uplink

The transmit operations in the uplink are similar to those of the downlink, with the main difference that a transmission opportunity can also arise from the assignment of the UL GRANT as part of the Random Access procedure, without an explicit Buffer Status Report issued by the TM RLC entity.

19.1.8.5.3. Calculation of the buffer size

As per the specifications [TS36322], the TM RLC does not add any RLC header to the PDUs being transmitted. Because of this, the buffer size reported to the MAC layer is calculated simply by summing the size of all packets in the transmission buffer, thus notifying to the MAC the exact buffer size.

19.1.8.6. SM RLC

In addition to the AM, UM and TM implementations that are modeled after the 3GPP specifications, a simplified RLC model is provided, which is called Saturation Mode (SM) RLC. This RLC model does not accept PDUs from any upper layer (such as PDCP); rather, the SM RLC takes care of the generation of RLC PDUs in response to the transmission opportunities notified by the MAC. In other words, the SM RLC simulates saturation conditions, i.e., it assumes that the RLC buffer is always full and can generate a new PDU whenever notified by the scheduler.
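A minimal sketch of the saturation behavior is given below. The class and attribute names are hypothetical, and the example assumes that each generated PDU fills the whole transmission opportunity.

```python
class SmRlcSketch:
    """Toy saturation-mode RLC: it always has data to send."""

    def __init__(self):
        self.pdus_sent = 0
        self.bytes_sent = 0

    def notify_tx_opportunity(self, nbytes):
        """Generate a dummy PDU of nbytes (assumed to fill the opportunity)."""
        pdu = bytes(nbytes)       # zero-filled placeholder payload
        self.pdus_sent += 1
        self.bytes_sent += nbytes
        return pdu

sm = SmRlcSketch()
first = sm.notify_tx_opportunity(50)
second = sm.notify_tx_opportunity(120)
```

Note that the sketch has no SDU queue at all, so the throughput it produces is determined entirely by the scheduler's allocations.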

The SM RLC is used for simplified simulation scenarios in which only the LTE Radio model is used, without the EPC and hence without any IP networking support. We note that, although the SM RLC is an unrealistic traffic model, it still allows for the correct simulation of scenarios with multiple flows belonging to different (non real-time) QoS classes, in order to test the QoS performance obtained by different schedulers. This can be done since it is the task of the Scheduler to assign transmission resources based on the characteristics (e.g., Guaranteed Bit Rate) of each Radio Bearer, which are specified upon the definition of each Bearer within the simulation program.

As for schedulers designed to work with real-time QoS traffic that has delay constraints, the SM RLC is probably not an appropriate choice. This is because the absence of actual RLC SDUs (replaced by the artificial generation of Buffer Status Reports) makes it impossible to provide the Scheduler with meaningful head-of-line-delay information, which is often the metric of choice for the implementation of scheduling policies for real-time traffic flows. For the simulation and testing of such schedulers, it is advisable to use either the UM or the AM RLC models instead.

19.1.9. PDCP

19.1.9.1. PDCP Model Overview

The reference document for the specification of the PDCP entity is [TS36323]. With respect to this specification, the PDCP model implemented in the simulator supports only the following features:

  • transfer of data (user plane or control plane);

  • maintenance of PDCP SNs;

  • transfer of SN status (for use upon handover);

The following features are currently not supported:

  • header compression and decompression of IP data flows using the ROHC protocol;

  • in-sequence delivery of upper layer PDUs at re-establishment of lower layers;

  • duplicate elimination of lower layer SDUs at re-establishment of lower layers for radio bearers mapped on RLC AM;

  • ciphering and deciphering of user plane data and control plane data;

  • integrity protection and integrity verification of control plane data;

  • timer based discard;

  • duplicate discarding.

19.1.9.2. PDCP Service Interface

The PDCP service interface is divided into two parts:

  • the PdcpSapProvider part is provided by the PDCP layer and used by the upper layer and

  • the PdcpSapUser part is provided by the upper layer and used by the PDCP layer.

19.1.9.2.1. PDCP Service Primitives

The following list specifies which service primitives are provided by the PDCP service interfaces:

  • PdcpSapProvider::TransmitPdcpSdu

    • The RRC entity uses this primitive to send an RRC PDU to the lower PDCP entity in the transmitter peer

  • PdcpSapUser::ReceivePdcpSdu

    • The PDCP entity uses this primitive to send an RRC PDU to the upper RRC entity in the receiver peer

19.1.10. RRC

19.1.10.1. Features

The RRC model implemented in the simulator provides the following functionality:

  • generation (at the eNB) and interpretation (at the UE) of System Information (in particular the Master Information Block and, at the time of this writing, only System Information Block Type 1 and 2)

  • initial cell selection

  • RRC connection establishment procedure

  • RRC reconfiguration procedure, supporting the following use cases:

    • reconfiguration of the SRS configuration index

    • reconfiguration of the PHY TX mode (MIMO)

    • reconfiguration of UE measurements

    • data radio bearer setup

    • handover

  • RRC connection re-establishment, supporting the following use cases:

    • handover

19.1.10.2. Architecture

The RRC model is divided into the following components:

  • the RRC entities LteUeRrc and LteEnbRrc, which implement the state machines of the RRC entities respectively at the UE and the eNB;

  • the RRC SAPs LteUeRrcSapProvider, LteUeRrcSapUser, LteEnbRrcSapProvider, LteEnbRrcSapUser, which allow the RRC entities to send and receive RRC messages and information elements;

  • the RRC protocol classes LteUeRrcProtocolIdeal, LteEnbRrcProtocolIdeal, LteUeRrcProtocolReal, LteEnbRrcProtocolReal, which implement two different models for the transmission of RRC messages.

Additionally, the RRC components use various other SAPs in order to interact with the rest of the protocol stack. A representation of all the SAPs that are used is provided in the figures LTE radio protocol stack architecture for the UE on the data plane, LTE radio protocol stack architecture for the UE on the control plane, LTE radio protocol stack architecture for the eNB on the data plane and LTE radio protocol stack architecture for the eNB on the control plane.

19.1.10.3. UE RRC State Machine

In Figure UE RRC State Machine we represent the state machine as implemented in the RRC UE entity.

_images/lte-ue-rrc-states.png

UE RRC State Machine

All the states are transient; however, a UE in the “CONNECTED_NORMALLY” state will only switch to the IDLE state if the downlink SINR is below a defined threshold, which would lead to radio link failure (see Radio Link Failure). On the other hand, the UE would not be able to switch to IDLE mode due to a handover failure, as mentioned in X2.

19.1.10.4. ENB RRC State Machine

The eNB RRC maintains the state for each UE that is attached to the cell. From an implementation point of view, the state of each UE is contained in an instance of the UeManager class. The state machine is represented in Figure ENB RRC State Machine for each UE.

_images/lte-enb-rrc-states.png

ENB RRC State Machine for each UE

19.1.10.5. Initial Cell Selection

Initial cell selection is an IDLE mode procedure, performed by the UE when it has not yet camped on or attached to an eNodeB. The objective of the procedure is to find a suitable cell and attach to it to gain access to the cellular network.

It is typically done at the beginning of the simulation, as depicted in Figure Sample runs of initial cell selection in UE and timing of related events below. The time diagram on the left side illustrates the case where initial cell selection succeeds on the first try, while the diagram on the right side is for the case where it fails on the first try and succeeds on the second try. The timing assumes the use of the real RRC protocol model (see RRC protocol models) and no transmission error.

_images/lte-cell-selection-timeline.png

Sample runs of initial cell selection in UE and timing of related events

The functionality is based on 3GPP IDLE mode specifications, such as in [TS36300], [TS36304], and [TS36331]. However, a proper implementation of IDLE mode is still missing in the simulator, so we make several simplifying assumptions:

  • multiple carrier frequencies are not supported;

  • multiple Public Land Mobile Network (PLMN) identities (i.e. multiple network operators) are not supported;

  • RSRQ measurements are not utilized;

  • stored information cell selection is not supported;

  • “Any Cell Selection” state and camping to an acceptable cell are not supported;

  • marking a cell as barred or reserved is not supported;

  • Idle cell reselection is not supported, hence it is not possible for UE to camp to a different cell after the initial camp has been placed; and

  • UE’s Closed Subscriber Group (CSG) white list contains only one CSG identity.

Also note that initial cell selection is only available for EPC-enabled simulations. LTE-only simulations must use the manual attachment method. See section Network Attachment of the User Documentation for more information on their differences in usage.

The next subsections cover different parts of initial cell selection, namely cell search, broadcast of system information, and cell selection evaluation.

19.1.10.5.2. Broadcast of System Information

System information blocks are broadcast by the eNodeB to UEs at predefined time intervals, adapted from Section 5.2.1.2 of [TS36331]. The supported system information blocks are:

  • Master Information Block (MIB)

    Contains parameters related to the PHY layer, generated during cell configuration and broadcast every 10 ms at the beginning of the radio frame as a control message.

  • System Information Block Type 1 (SIB1)

    Contains information regarding network access, broadcast every 20 ms at the middle of the radio frame as a control message. Not used in the manual attachment method. The UE must have decoded the MIB before it can receive SIB1.

  • System Information Block Type 2 (SIB2)

    Contains UL- and RACH-related settings, scheduled to transmit via the RRC protocol at 16 ms after cell configuration, and then repeated every 80 ms (configurable through the LteEnbRrc::SystemInformationPeriodicity attribute). The UE must be camped to a cell in order to be able to receive its SIB2.

Reception of system information is fundamental for UE to advance in its lifecycle. MIB enables the UE to increase the initial DL bandwidth of 6 RBs to the actual operating bandwidth of the network. SIB1 provides information necessary for cell selection evaluation (explained in the next section). And finally SIB2 is required before the UE is allowed to switch to CONNECTED state.
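The broadcast periods listed above can be turned into a simple timeline, sketched below. Several details are assumptions of this example rather than statements about the simulator: the cell is taken to be configured at t = 0, "the middle of the radio frame" is interpreted as a 5 ms offset within a 10 ms frame, and the first SIB1 is placed in the first frame. The helper name is hypothetical.

```python
def broadcast_times(first_ms, period_ms, until_ms):
    """Transmission instants (in ms) of a periodic broadcast before until_ms."""
    return list(range(first_ms, until_ms, period_ms))

# Assumed timeline with cell configuration at t = 0 ms.
mib_times = broadcast_times(first_ms=0, period_ms=10, until_ms=100)
sib1_times = broadcast_times(first_ms=5, period_ms=20, until_ms=100)
sib2_times = broadcast_times(first_ms=16, period_ms=80, until_ms=200)
```

Under these assumptions a UE that decodes the first MIB can receive SIB1 within the same frame, while SIB2 only becomes available from 16 ms onwards.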

19.1.10.5.3. Cell Selection Evaluation

UE RRC reviews the measurement report produced in Cell Search and the cell access information provided by SIB1. Once both pieces of information are available for a specific cell, the UE triggers the evaluation process. The purpose of this process is to determine whether the cell is a suitable cell to camp to.

The evaluation process is a slightly simplified version of Section 5.2.3.2 of [TS36304]. It consists of the following criteria:

  • Rx level criterion; and

  • closed subscriber group (CSG) criterion.

The first criterion, Rx level, is based on the cell’s measured RSRP Q_{rxlevmeas}, which has to be higher than a required minimum Q_{rxlevmin} in order to pass the criterion:

Q_{rxlevmeas} - Q_{rxlevmin} > 0

where Q_{rxlevmin} is determined by each eNodeB and is obtainable by UE from SIB1.

The last criterion, CSG, is a combination of a true-or-false parameter called CSG indication and a simple number CSG identity. The basic rule is that UE shall not camp to eNodeB with a different CSG identity. But this rule is only enforced when CSG indication is valued as true. More details are provided in Section Network Attachment of the User Documentation.

When the cell passes all the above criteria, the cell is deemed as suitable. Then UE camps to it (IDLE_CAMPED_NORMALLY state).
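The two criteria can be expressed compactly as in the following sketch. The function and parameter names are hypothetical and the code is independent of the ns-3 classes; it only restates the Rx level inequality and the CSG rule given above.

```python
def rx_level_ok(q_rxlevmeas, q_rxlevmin):
    """Rx level criterion: measured RSRP must exceed the required minimum."""
    return q_rxlevmeas - q_rxlevmin > 0

def csg_ok(csg_indication, cell_csg_id, ue_csg_id):
    """CSG criterion: identities must match only when CSG indication is true."""
    return (not csg_indication) or cell_csg_id == ue_csg_id

def is_suitable(q_rxlevmeas, q_rxlevmin, csg_indication, cell_csg_id, ue_csg_id):
    """A cell is suitable when it passes both criteria."""
    return rx_level_ok(q_rxlevmeas, q_rxlevmin) and csg_ok(
        csg_indication, cell_csg_id, ue_csg_id
    )

# RSRP of -90 dBm against a minimum of -110 dBm, in an open (non-CSG) cell.
open_cell = is_suitable(-90.0, -110.0, False, cell_csg_id=1, ue_csg_id=0)
# Same signal level, but a closed cell with a different CSG identity.
closed_cell = is_suitable(-90.0, -110.0, True, cell_csg_id=1, ue_csg_id=0)
```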

After this, upper layer may request UE to enter CONNECTED mode. Please refer to section RRC connection establishment for details on this.

On the other hand, when the cell does not pass the CSG criterion, then the cell is labeled as acceptable (Section 10.1.1.1 [TS36300]). In this case, the RRC entity will tell the PHY entity to synchronize to the second strongest cell and repeat the initial cell selection procedure using that cell. As long as no suitable cell is found, the UE will repeat these steps while avoiding cells that have been identified as acceptable.

19.1.10.6. Radio Admission Control

Radio Admission Control is supported by having the eNB RRC reply to an RRC CONNECTION REQUEST message sent by the UE with either an RRC CONNECTION SETUP message or an RRC CONNECTION REJECT message, depending on whether the new UE is to be admitted or not. In the current implementation, the behavior is determined by the boolean attribute ns3::LteEnbRrc::AdmitRrcConnectionRequest. There is currently no Radio Admission Control algorithm that dynamically decides whether a new connection shall be admitted or not.

19.1.10.7. Radio Bearer Configuration

Some implementation choices have been made in the RRC regarding the setup of radio bearers:

  • three Logical Channel Groups (out of four available) are configured for uplink buffer status report purposes, according to the following policy:

    • LCG 0 is for signaling radio bearers

    • LCG 1 is for GBR data radio bearers

    • LCG 2 is for Non-GBR data radio bearers

19.1.10.9. UE RRC Measurements Model
19.1.10.9.1. UE RRC measurements support

The UE RRC entity provides support for UE measurements; in particular, it implements the procedures described in Section 5.5 of [TS36331], with the following simplifying assumptions:

  • only E-UTRA intra-frequency measurements are supported, which implies:

    • only one measurement object is used during the simulation;

    • measurement gaps are not needed to perform the measurements;

    • Event B1 and B2 are not implemented;

  • only reportStrongestCells purpose is supported, while reportCGI and reportStrongestCellsForSON purposes are not supported;

  • s-Measure is not supported;

  • although carrier aggregation is now supported in the LTE module, Event A6 is not implemented;

  • speed dependent scaling of time-to-trigger (Section 5.5.6.2 of [TS36331]) is not supported.

19.1.10.9.2. Overall design

The model is based on the concept of a UE measurements consumer, which is an entity that may request an eNodeB RRC entity to provide UE measurement reports. Consumers are, for example, handover algorithms, which compute handover decisions based on UE measurement reports. Test cases and users' programs may also become consumers. Figure Relationship between UE measurements and its consumers depicts the relationship between these entities.

_images/ue-meas-consumer.png

Relationship between UE measurements and its consumers

The whole UE measurements function at the RRC level is divided into 4 major parts:

  1. Measurement configuration (handled by LteUeRrc::ApplyMeasConfig)

  2. Performing measurements (handled by LteUeRrc::DoReportUeMeasurements)

  3. Measurement report triggering (handled by LteUeRrc::MeasurementReportTriggering)

  4. Measurement reporting (handled by LteUeRrc::SendMeasurementReport)

The following sections will describe each of the parts above.

19.1.10.9.3. Measurement configuration

An eNodeB RRC entity configures UE measurements by sending the configuration parameters to the UE RRC entity. This set of parameters is defined within the MeasConfig Information Element (IE) of the RRC Connection Reconfiguration message (RRC connection reconfiguration).

The eNodeB RRC entity implements the configuration parameters and procedures described in Section 5.5.2 of [TS36331], with the following simplifying assumptions:

  • configuration (i.e. addition, modification, and removal) can only be done before the simulation begins;

  • all UEs attached to the eNodeB will be configured the same way, i.e. there is no support for configuring specific measurement for specific UE; and

  • it is assumed that there is a one-to-one mapping between the PCI and the E-UTRAN Global Cell Identifier (EGCI). This is consistent with the PCI modeling assumptions described in UE PHY Measurements Model.

The eNodeB RRC instance here acts as an intermediary between the consumers and the attached UEs. At the beginning of simulation, each consumer provides the eNodeB RRC instance with the UE measurements configuration that it requires. After that, the eNodeB RRC distributes the configuration to attached UEs.

Users may customize the measurement configuration using several methods. Please refer to Section Configure UE measurements of the User Documentation for the description of these methods.

19.1.10.9.4. Performing measurements

UE RRC receives both RSRP and RSRQ measurements on a periodic basis from UE PHY, as described in UE PHY Measurements Model. Layer 3 filtering is applied to these received measurements. The implementation of the filtering follows Section 5.5.3.2 of [TS36331]:

F_n = (1 - a) \times F_{n-1} + a \times M_n

where:

  • M_n is the latest received measurement result from the physical layer;

  • F_n is the updated filtered measurement result;

  • F_{n-1} is the old filtered measurement result, where F_0 = M_1 (i.e. the first measurement is not filtered); and

  • a = (\frac{1}{2})^{\frac{k}{4}}, where k is the configurable filterCoefficient provided by the QuantityConfig;

k = 4 is the default value, but it can be configured by setting the RsrpFilterCoefficient and RsrqFilterCoefficient attributes in LteEnbRrc.

Setting k = 0 gives a = 1 and hence F_n = M_n, which disables Layer 3 filtering. On the other hand, past measurements can be granted more influence on the filtering results by using a larger value of k.
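The filter formula above can be tried out with a short standalone sketch. The function name is hypothetical; the input is simply a list of measurement values in the unit in which they are reported.

```python
def l3_filter(measurements, k=4):
    """Apply F_n = (1 - a) * F_{n-1} + a * M_n with a = (1/2)^(k/4).

    The first measurement is passed through unfiltered (F_0 = M_1).
    """
    a = 0.5 ** (k / 4)
    filtered = []
    previous = None
    for m in measurements:
        previous = m if previous is None else (1 - a) * previous + a * m
        filtered.append(previous)
    return filtered

samples = [-100.0, -90.0, -90.0]
default_k = l3_filter(samples, k=4)   # a = 0.5
no_filter = l3_filter(samples, k=0)   # a = 1, output equals input
```

With k = 4 the filtered value moves halfway towards each new measurement, whereas k = 8 (a = 0.25) would make it move only a quarter of the way.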

19.1.10.9.5. Measurement reporting triggering

In this part, UE RRC goes through the list of active measurement configurations and checks whether the triggering condition is fulfilled in accordance with Section 5.5.4 of [TS36331]. When at least one triggering condition among all the active measurement configurations is fulfilled, the measurement reporting procedure (described in the next subsection) is initiated.

3GPP defines two kinds of triggerType: periodical and event-based. At the moment, only the event-based criterion is supported. There are various events that can be selected, which are briefly described in the table below:

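List of supported event-based triggering criteria

========  ==================================================================================
Name      Description
========  ==================================================================================
Event A1  Serving cell becomes better than threshold
Event A2  Serving cell becomes worse than threshold
Event A3  Neighbour becomes offset dB better than serving cell
Event A4  Neighbour becomes better than threshold
Event A5  Serving becomes worse than threshold1 AND neighbour becomes better than threshold2
========  ==================================================================================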

Two main conditions to be checked in an event-based trigger are the entering condition and the leaving condition. More details on these two can be found in Section 5.5.4 of [TS36331].

An event-based trigger can be further configured by introducing hysteresis and time-to-trigger. Hysteresis (Hys) defines the distance between the entering and leaving conditions in dB. Similarly, time-to-trigger introduces delay to both entering and leaving conditions, but as a unit of time.

The periodical type of reporting trigger is not supported, but its behavior can be easily obtained by using an event-based trigger. This can be done by configuring the measurement in such a way that the entering condition is always fulfilled, for example, by setting the threshold of Event A1 to zero (the minimum level). As a result, the measurement reports will always be triggered at a fixed interval, as determined by the reportInterval field within LteRrcSap::ReportConfigEutra, therefore producing the same behaviour as periodical reporting.

As a limitation with respect to 3GPP specifications, the current model does not support any cell-specific configuration. These configuration parameters are defined in the measurement object. As a consequence, incorporating a list of black cells into the triggering process is not supported. Moreover, cell-specific offsets (i.e., O_{cn} and O_{cp} in Event A3, A4, and A5) are not supported either. A value of zero is always assumed in their place.
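To make the role of hysteresis concrete, the sketch below evaluates the entering and leaving conditions of Events A1 and A3. The inequalities are written as recalled from Section 5.5.4 of [TS36331] and should be checked against the specification; all measurement-object offsets are assumed to be zero, and the function names are hypothetical.

```python
def a1_entering(ms, thresh, hys):
    """Event A1 entering: serving cell is better than threshold (with margin)."""
    return ms - hys > thresh

def a1_leaving(ms, thresh, hys):
    """Event A1 leaving: serving cell is worse than threshold (with margin)."""
    return ms + hys < thresh

def a3_entering(mn, mp, off, hys):
    """Event A3 entering: neighbour is offset dB better than serving cell."""
    return mn - hys > mp + off

def a3_leaving(mn, mp, off, hys):
    """Event A3 leaving: neighbour is no longer offset dB better."""
    return mn + hys < mp + off

# Serving cell at -90 dBm, threshold -95 dBm, hysteresis 2 dB.
enter_a1 = a1_entering(ms=-90.0, thresh=-95.0, hys=2.0)
# At -96 dBm the serving cell is below threshold but inside the hysteresis band.
leave_a1 = a1_leaving(ms=-96.0, thresh=-95.0, hys=2.0)
```

The gap between the two conditions is twice the hysteresis value, which prevents a measurement that fluctuates around the threshold from repeatedly entering and leaving the event.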

19.1.10.9.6. Measurement reporting

This part handles the submission of measurement report from the UE RRC entity to the serving eNodeB entity via RRC protocol. Several simplifying assumptions have been adopted:

  • reportAmount is not applicable (i.e. always assumed to be infinite);

  • in measurement reports, the reportQuantity is always assumed to be BOTH, i.e., both RSRP and RSRQ are always reported, regardless of the triggerQuantity.

19.1.10.10. Handover

The RRC model supports UE mobility in CONNECTED mode by invoking the X2-based handover procedure. The model is intra-EUTRAN and intra-frequency, as based on Section 10.1.2.1 of [TS36300].

This section focuses on the process of triggering a handover. The handover execution procedure itself is covered in Section X2.

There are two ways to trigger the handover procedure:

  • explicitly (or manually) triggered by the simulation program by scheduling an execution of the method LteEnbRrc::SendHandoverRequest; or

  • automatically triggered by the eNodeB RRC entity based on UE measurements and according to the selected handover algorithm.

Section X2-based handover of the User Documentation provides some examples on using both explicit and automatic handover triggers in simulation. The next subsection will take a closer look at the automatic method, by describing the design aspects of the handover algorithm interface and the available handover algorithms.

19.1.10.10.1. Handover algorithm

Handover in 3GPP LTE has the following properties:

  • UE-assisted

    The UE provides input to the network in the form of measurement reports. This is handled by the UE RRC Measurements Model.

  • Network-controlled

    The network (i.e. the source eNodeB and the target eNodeB) decides when to trigger the handover and oversees its execution.

The handover algorithm operates at the source eNodeB and is responsible for making handover decisions in an “automatic” manner. It interacts with an eNodeB RRC instance via the Handover Management SAP interface. These relationships are illustrated in Figure Relationship between UE measurements and its consumers from the previous section.

The handover algorithm interface consists of the following methods:

  • AddUeMeasReportConfigForHandover

    (Handover Algorithm -> eNodeB RRC) Used by the handover algorithm to request measurement reports from the eNodeB RRC entity, by passing the desired reporting configuration. The configuration will be applied to all future attached UEs.

  • ReportUeMeas

    (eNodeB RRC -> Handover Algorithm) Based on the UE measurements configured earlier in AddUeMeasReportConfigForHandover, a UE may submit measurement reports to the eNodeB. The eNodeB RRC entity uses the ReportUeMeas interface to forward these measurement reports to the handover algorithm.

  • TriggerHandover

    (Handover Algorithm -> eNodeB RRC) The handover algorithm may decide to trigger a handover, typically (but not necessarily) after examining the measurement reports. This method notifies the eNodeB RRC entity of the decision, and the eNodeB RRC entity then commences the handover procedure.

A note on AddUeMeasReportConfigForHandover: the method returns the measId (measurement identity) of the newly created measurement configuration. A handover algorithm would typically store this unique number. It is useful in the ReportUeMeas method, for example when more than one configuration has been requested and the handover algorithm needs to differentiate incoming reports based on the configuration that triggered them.

A handover algorithm is implemented by writing a subclass of the LteHandoverAlgorithm abstract superclass and implementing each of the above mentioned SAP interface methods. Users may develop their own handover algorithm this way, and then use it in any simulation by following the steps outlined in Section X2-based handover of the User Documentation.
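The shape of such a subclass can be illustrated with a standalone sketch. It deliberately does not use the ns-3 headers: the types ReportConfig, MeasReport and RrcServices, and the classes HandoverAlgorithmSketch and AlwaysHandover, are hypothetical stand-ins invented for this example, and the real LteHandoverAlgorithm and SAP declarations differ in detail. Only the three interface methods and the stored measId follow the description above.

```cpp
#include <cstdint>
#include <functional>
#include <utility>

// Hypothetical, simplified stand-ins for the SAP types (not the ns-3 declarations).
struct ReportConfig
{
    int eventId;       // e.g. 4 for Event A4
    uint8_t threshold; // threshold in RSRQ range units
};

struct MeasReport
{
    uint8_t measId;               // identity of the configuration that triggered the report
    uint16_t rnti;                // reporting UE
    uint16_t bestNeighbourCellId; // strongest neighbour in the report
};

// Services offered by the eNodeB RRC (Handover Algorithm -> eNodeB RRC direction).
struct RrcServices
{
    std::function<uint8_t(const ReportConfig&)> addUeMeasReportConfigForHandover;
    std::function<void(uint16_t rnti, uint16_t targetCellId)> triggerHandover;
};

// Abstract base: the eNodeB RRC forwards reports through ReportUeMeas.
class HandoverAlgorithmSketch
{
  public:
    explicit HandoverAlgorithmSketch(RrcServices rrc)
        : m_rrc(std::move(rrc))
    {
    }
    virtual ~HandoverAlgorithmSketch() = default;
    virtual void Initialize() = 0;
    virtual void ReportUeMeas(const MeasReport& report) = 0;

  protected:
    RrcServices m_rrc;
};

// Toy algorithm: hands over to the reported neighbour whenever its own configuration fires.
class AlwaysHandover : public HandoverAlgorithmSketch
{
  public:
    using HandoverAlgorithmSketch::HandoverAlgorithmSketch;

    void Initialize() override
    {
        // Store the returned measId so that later reports can be matched to this request.
        m_measId = m_rrc.addUeMeasReportConfigForHandover(ReportConfig{4, 0});
    }

    void ReportUeMeas(const MeasReport& report) override
    {
        if (report.measId != m_measId)
        {
            return; // report was triggered by a different configuration
        }
        m_rrc.triggerHandover(report.rnti, report.bestNeighbourCellId);
    }

  private:
    uint8_t m_measId = 0;
};
```

In a test harness the two RrcServices callbacks can be plain lambdas, which makes it easy to check that a report carrying a foreign measId is ignored.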

Alternatively, users may choose one of the three built-in handover algorithms provided by the LTE module: the no-op, A2-A4-RSRQ, and strongest cell handover algorithms. They are ready to be used in simulations or can be taken as examples of how to implement a handover algorithm. Each of these built-in algorithms is covered in one of the following subsections.

19.1.10.10.2. No-op handover algorithm

The no-op handover algorithm (NoOpHandoverAlgorithm class) is the simplest possible implementation of a handover algorithm. It does nothing, i.e., it does not call any of the Handover Management SAP interface methods. Users may choose this handover algorithm if they wish to disable automatic handover triggering in their simulation.

19.1.10.10.3. A2-A4-RSRQ handover algorithm

The A2-A4-RSRQ handover algorithm provides the functionality of the default handover algorithm originally included in LENA M6 (ns-3.18), ported to the Handover Management SAP interface as the A2A4RsrqHandoverAlgorithm class.

As the name implies, the algorithm utilizes the Reference Signal Received Quality (RSRQ) measurements acquired from Event A2 and Event A4. Thus, the algorithm adds two measurement configurations to the corresponding eNodeB RRC instance. Their intended uses are as follows:

  • Event A2 (serving cell’s RSRQ becomes worse than threshold) is leveraged to indicate that the UE is experiencing poor signal quality and may benefit from a handover.

  • Event A4 (neighbour cell’s RSRQ becomes better than threshold) is used to detect neighbouring cells and acquire their corresponding RSRQ from every attached UE, which are then stored internally by the algorithm. By default, the algorithm configures Event A4 with a very low threshold, so that the trigger criteria are always true.

Figure A2-A4-RSRQ handover algorithm below summarizes this procedure.

_images/lte-legacy-handover-algorithm.png

A2-A4-RSRQ handover algorithm

Two attributes can be set to tune the algorithm behaviour:

  • ServingCellThreshold

    The threshold for Event A2, i.e. a UE must have an RSRQ lower than this threshold to be considered for a handover.

  • NeighbourCellOffset

    The offset that aims to ensure that the UE would receive better signal quality after the handover. A neighbouring cell is considered as a target cell for the handover only if its RSRQ is higher than the serving cell’s RSRQ by the amount of this offset.

The values of both attributes are expressed as RSRQ range (Section 9.1.7 of [TS36133]), which is an integer between 0 and 34, with 0 as the lowest RSRQ.
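The decision rule described by the two attributes can be sketched as a small standalone function. This is an illustration under stated assumptions, not the code of the A2A4RsrqHandoverAlgorithm class: the names HandoverDecision and EvaluateA2A4 are hypothetical, the threshold comparison is taken as strict ("lower than"), and the offset comparison is assumed to be inclusive. All values are RSRQ range integers.

```cpp
#include <cstdint>
#include <map>

struct HandoverDecision
{
    bool trigger;          // true if a handover should be requested
    uint16_t targetCellId; // meaningful only when trigger is true
};

// neighbourRsrq maps cell ID -> latest RSRQ range value collected via Event A4 reports.
HandoverDecision EvaluateA2A4(uint8_t servingRsrq,
                              const std::map<uint16_t, uint8_t>& neighbourRsrq,
                              uint8_t servingCellThreshold,
                              uint8_t neighbourCellOffset)
{
    HandoverDecision decision{false, 0};

    // Event A2 gate: only a UE whose serving RSRQ is below the threshold is considered.
    if (servingRsrq >= servingCellThreshold)
    {
        return decision;
    }

    // Find the neighbour with the highest stored RSRQ.
    bool found = false;
    uint16_t bestCell = 0;
    uint8_t bestRsrq = 0;
    for (const auto& entry : neighbourRsrq)
    {
        if (!found || entry.second > bestRsrq)
        {
            found = true;
            bestCell = entry.first;
            bestRsrq = entry.second;
        }
    }

    // Offset check (assumed inclusive): the target must beat the serving cell by the offset.
    if (found && bestRsrq >= servingRsrq + neighbourCellOffset)
    {
        decision.trigger = true;
        decision.targetCellId = bestCell;
    }
    return decision;
}
```

For example, with a threshold of 30 and an offset of 1, a UE with serving RSRQ 8 and neighbours at 10 and 20 would be handed over to the neighbour at 20, while a UE with serving RSRQ 31 would not be considered at all.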

19.1.10.10.4. Strongest cell handover algorithm

The strongest cell handover algorithm, sometimes also known as the traditional power budget (PBGT) algorithm, was developed using [Dimou2009] as reference. The idea is to provide each UE with the best possible Reference Signal Received Power (RSRP). This is done by performing a handover as soon as a better cell (i.e. with stronger RSRP) is detected.

Event A3 (neighbour cell’s RSRP becomes better than serving cell’s RSRP) is chosen to realize this concept. The A3RsrpHandoverAlgorithm class is the result of the implementation. Handover is triggered for the UE to the best cell in the measurement report.

A simulation which uses this algorithm is usually more vulnerable to ping-pong handover (a consecutive handover back to the previous source eNodeB within a short period of time), especially when the Fading Model is enabled. This problem is typically tackled by introducing a certain delay to the handover. The algorithm does this by including hysteresis and time-to-trigger parameters (Section 6.3.5 of [TS36331]) in the UE measurements configuration.

Hysteresis (a.k.a. handover margin) delays the handover with regard to RSRP. The value is expressed in dB, ranges from 0 to 15 dB, and has a 0.5 dB resolution, e.g., an input value of 2.7 dB is rounded to 2.5 dB.
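The 0.5 dB resolution can be pictured as a mapping to an integer number of half-dB steps. The following is a minimal sketch; the function names are hypothetical, and both the round-to-nearest rule and the clamping of out-of-range inputs are assumptions of this example rather than documented behaviour.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Maps a hysteresis in dB to an integer count of 0.5 dB steps (0..30).
// Round-to-nearest and clamping to [0, 15] dB are assumptions of this sketch.
uint8_t HysteresisDbToSteps(double hysteresisDb)
{
    const double clamped = std::min(15.0, std::max(0.0, hysteresisDb));
    return static_cast<uint8_t>(std::lround(clamped * 2.0));
}

// Converts the step count back to the value in dB that is actually applied.
double StepsToHysteresisDb(uint8_t steps)
{
    return steps * 0.5;
}
```

With these helpers, an input of 2.7 dB maps to 5 steps, i.e. 2.5 dB, which matches the example in the text.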

On the other hand, time-to-trigger delays the handover with regard to time. 3GPP defines 16 valid values for time-to-trigger (all in milliseconds): 0, 40, 64, 80, 100, 128, 160, 256, 320, 480, 512, 640, 1024, 1280, 2560, and 5120.

The difference between hysteresis and time-to-trigger is illustrated in Figure Effect of hysteresis and time-to-trigger in strongest cell handover algorithm below, which is taken from the lena-x2-handover-measures example. It depicts the RSRP of the serving cell and of a neighbouring cell as perceived by a UE that moves past the border between the cells.

_images/lte-strongest-cell-handover-algorithm.png

Effect of hysteresis and time-to-trigger in strongest cell handover algorithm

By default, the algorithm uses a hysteresis of 3.0 dB and time-to-trigger of 256 ms. These values can be tuned through the Hysteresis and TimeToTrigger attributes of the A3RsrpHandoverAlgorithm class.
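The combined effect of the two parameters can be sketched on a sampled RSRP trace. This is a simplified illustration, not the UE measurement code: RsrpSample and FirstA3ReportTimeMs are hypothetical names, the separate Event A3 leaving condition and the A3 offset are omitted, and the report time is only resolved to the sampling instants of the trace.

```cpp
#include <vector>

struct RsrpSample
{
    int timeMs;          // sampling instant
    double servingDbm;   // RSRP of the serving cell
    double neighbourDbm; // RSRP of the neighbouring cell
};

// Returns the sampling instant (ms) at which the report would be sent, or -1 if never.
// The entering condition must hold continuously for at least timeToTriggerMs.
int FirstA3ReportTimeMs(const std::vector<RsrpSample>& samples,
                        double hysteresisDb,
                        int timeToTriggerMs)
{
    bool entered = false;
    int enteredAtMs = 0;
    for (const RsrpSample& s : samples)
    {
        // Hysteresis: the neighbour must exceed the serving cell by a margin.
        const bool condition = s.neighbourDbm > s.servingDbm + hysteresisDb;
        if (!condition)
        {
            entered = false; // any interruption restarts the time-to-trigger window
            continue;
        }
        if (!entered)
        {
            entered = true;
            enteredAtMs = s.timeMs;
        }
        // Time-to-trigger: the condition must have held long enough.
        if (s.timeMs - enteredAtMs >= timeToTriggerMs)
        {
            return s.timeMs;
        }
    }
    return -1;
}
```

Raising the hysteresis shifts the point at which the condition first becomes true, whereas raising the time-to-trigger postpones the report after that point; a brief dip of the neighbour below the margin cancels a pending report.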

19.1.10.11. Neighbour Relation

The LTE module supports a simplified Automatic Neighbour Relation (ANR) function. This is handled by the LteAnr class, which interacts with an eNodeB RRC instance through the ANR SAP interface.

19.1.10.11.1. Neighbour Relation Table

The ANR holds a Neighbour Relation Table (NRT), similar to the description in Section 22.3.2a of [TS36300]. Each entry in the table is called a Neighbour Relation (NR) and represents a detected neighbouring cell, which contains the following boolean fields:

  • No Remove

    Indicates that the NR shall not be removed from the NRT. This is true by default for user-provided NR and false otherwise.

  • No X2

    Indicates that the NR shall not use an X2 interface in order to initiate procedures towards the eNodeB parenting the target cell. This is false by default for user-provided NR, and true otherwise.

  • No HO

    Indicates that the NR shall not be used by the eNodeB for handover reasons. This is true in most cases, except when the NR is both user-provided and network-detected.

Each NR entry may have at least one of the following properties:

  • User-provided

    This type of NR is created as instructed by the simulation user. For example, an NR is created automatically upon a user-initiated establishment of an X2 connection between two eNodeBs, e.g. as described in Section X2-based handover. Another way to create a user-provided NR is to call the AddNeighbourRelation function explicitly.

  • Network-detected

    This type of NR is automatically created during the simulation as a result of the discovery of a nearby cell.

In order to automatically create network-detected NRs, ANR utilizes UE measurements. In other words, ANR is a consumer of UE measurements, as depicted in Figure Relationship between UE measurements and its consumers. RSRQ and Event A4 (neighbour becomes better than threshold) are used for the reporting configuration. The default Event A4 threshold is set to the lowest possible value, i.e., maximum detection capability, but can be changed by setting the Threshold attribute of the LteAnr class. Note that the A2-A4-RSRQ handover algorithm also utilizes a similar reporting configuration. Despite the similarity, when both ANR and this handover algorithm are active in the eNodeB, they use separate reporting configurations.

Also note that automatic setup of the X2 interface is not supported. This is the reason why the No X2 and No HO fields are true in an NR that is network-detected but not user-provided.

19.1.10.11.2. Role of ANR in Simulation

The ANR SAP interface provides the means of communication between ANR and eNodeB RRC. Some interface functions are used by eNodeB RRC to interact with the NRT, as shown below:

  • AddNeighbourRelation

    (eNodeB RRC -> ANR) Add a new user-provided NR entry into the NRT.

  • GetNoRemove

    (eNodeB RRC -> ANR) Get the value of No Remove field of an NR entry of the given cell ID.

  • GetNoHo

    (eNodeB RRC -> ANR) Get the value of No HO field of an NR entry of the given cell ID.

  • GetNoX2

    (eNodeB RRC -> ANR) Get the value of No X2 field of an NR entry of the given cell ID.

Other interface functions exist to support the role of ANR as a UE measurements consumer, as listed below:

  • AddUeMeasReportConfigForAnr

    (ANR -> eNodeB RRC) Used by the ANR to request measurement reports from the eNodeB RRC entity, by passing the desired reporting configuration. The configuration will be applied to all future attached UEs.

  • ReportUeMeas

    (eNodeB RRC -> ANR) Based on the UE measurements configured earlier in AddUeMeasReportConfigForAnr, UE may submit measurement reports to the eNodeB. The eNodeB RRC entity uses the ReportUeMeas interface to forward these measurement reports to the ANR.

Please refer to the corresponding API documentation for LteAnrSap class for more details on the usage and the required parameters.

The ANR is utilized by the eNodeB RRC instance as a data structure to keep track of the situation of nearby neighbouring cells. The ANR also helps the eNodeB RRC instance to determine whether it is possible to execute a handover procedure to a neighbouring cell. This is realized by the fact that eNodeB RRC will only allow a handover procedure to happen if the NR entry of the target cell has both No HO and No X2 fields set to false.
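The field defaults from the Neighbour Relation Table subsection and the handover check described above can be put together in a standalone sketch. It is not the LteAnr implementation: the names NeighbourRelation, NeighbourRelationTableSketch and their methods are hypothetical, and the handling of an NR that is first detected and only later provided by the user is an assumption of this example.

```cpp
#include <cstdint>
#include <map>

struct NeighbourRelation
{
    bool userProvided = false;
    bool networkDetected = false;
    bool noRemove = false;
    bool noX2 = false;
    bool noHo = false;
};

class NeighbourRelationTableSketch
{
  public:
    // User-provided NR, e.g. created when the user sets up an X2 link to the cell.
    void AddUserProvided(uint16_t cellId)
    {
        NeighbourRelation& nr = m_table[cellId]; // new entries start with all fields false
        nr.userProvided = true;
        nr.noRemove = true;              // true by default for user-provided NRs
        nr.noX2 = false;                 // an X2 interface is available
        nr.noHo = !nr.networkDetected;   // usable for handover only once also detected
    }

    // Network-detected NR, created or updated when a UE reports the cell.
    void OnCellDetected(uint16_t cellId)
    {
        const bool isNew = m_table.find(cellId) == m_table.end();
        NeighbourRelation& nr = m_table[cellId];
        nr.networkDetected = true;
        if (isNew)
        {
            nr.noRemove = false;
            nr.noX2 = true; // X2 is never set up automatically
        }
        nr.noHo = !nr.userProvided;
    }

    // A handover is allowed only if both No HO and No X2 are false.
    bool HandoverAllowed(uint16_t cellId) const
    {
        const auto it = m_table.find(cellId);
        return it != m_table.end() && !it->second.noHo && !it->second.noX2;
    }

  private:
    std::map<uint16_t, NeighbourRelation> m_table;
};
```

In this sketch a cell becomes a valid handover target only after it has been both provided by the user and reported by a UE, which mirrors the rule that No HO is true in all other cases.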

ANR is enabled by default in every eNodeB instance in the simulation. It can be disabled by setting the AnrEnabled attribute in LteHelper class to false.

19.1.10.12. RRC sequence diagrams

In this section we provide some sequence diagrams that explain the most important RRC procedures being modeled.

19.1.10.12.1. RRC connection establishment

Figure Sequence diagram of the RRC Connection Establishment procedure shows how the RRC Connection Establishment procedure is modeled, highlighting the role of the RRC layer at both the UE and the eNB, as well as the interaction with the other layers.

_images/rrc-connection-establishment.png

Sequence diagram of the RRC Connection Establishment procedure

There are several timeouts related to this procedure, which are listed in Table Timers in RRC connection establishment procedure. If any of these timers expires, the RRC connection establishment procedure is terminated in failure. At the UE side, if the T300 timer has expired connEstFailCount consecutive times on the same cell, the UE performs cell selection again [TS36331]. Otherwise, the upper layer (UE NAS) immediately retries the procedure.

Timers in RRC connection establishment procedure

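+---------------------------------+------------+-----------------------------+----------------------------------------+------------------+--------------------+
| Name                            | Location   | Timer starts                | Timer stops                            | Default duration | When timer expired |
+=================================+============+=============================+========================================+==================+====================+
| Connection request timeout      | eNodeB RRC | New UE context added        | Receive RRC CONNECTION REQUEST         | 15 ms (Max)      | Remove UE context  |
+---------------------------------+------------+-----------------------------+----------------------------------------+------------------+--------------------+
| Connection timeout (T300 timer) | UE RRC     | Send RRC CONNECTION REQUEST | Receive RRC CONNECTION SETUP or REJECT | 100 ms           | Reset UE MAC       |
+---------------------------------+------------+-----------------------------+----------------------------------------+------------------+--------------------+
| Connection setup timeout        | eNodeB RRC | Send RRC CONNECTION SETUP   | Receive RRC CONNECTION SETUP COMPLETE  | 100 ms           | Remove UE context  |
+---------------------------------+------------+-----------------------------+----------------------------------------+------------------+--------------------+
| Connection rejected timeout     | eNodeB RRC | Send RRC CONNECTION REJECT  | Never                                  | 30 ms            | Remove UE context  |
+---------------------------------+------------+-----------------------------+----------------------------------------+------------------+--------------------+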

Note: The value of the connection request timeout timer at the eNB RRC should not be higher than the T300 timer at the UE RRC. This ensures that the UE context has already been removed at the eNB by the time the UE performs cell selection upon reaching the connEstFailCount limit. Moreover, at the time of writing this document, the Cell Selection Evaluation does not include the Qoffset_{temp} parameter; thus, it is not applied when the same cell is selected again.

Counters in RRC connection establishment procedure

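+------------------+----------+---------------------+--------------+---------------+------------------------------+-----------------------------+
| Name             | Location | Msg                 | Monitored by | Default value | Limit not reached            | Limit reached               |
+==================+==========+=====================+==============+===============+==============================+=============================+
| ConnEstFailCount | eNB MAC  | RachConfigCommon in | UE RRC       | 1             | Increment the local counter. | Reset the local counter and |
|                  |          | SIB2, HO REQ and    |              |               | Invalidate the previous SIB2 | perform cell selection.     |
|                  |          | HO Ack              |              |               | message and try random       |                             |
|                  |          |                     |              |               | access with the same cell.   |                             |
+------------------+----------+---------------------+--------------+---------------+------------------------------+-----------------------------+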
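The counter behaviour in the table can be sketched as a small state holder. The names NextAction and T300ExpiryCounter are hypothetical and this is not the UE RRC code; in particular, resetting the counter after a successful connection is an assumption made here to reflect the word "consecutive".

```cpp
#include <cstdint>

enum class NextAction
{
    RetryRandomAccessSameCell, // limit not reached
    PerformCellSelection       // limit reached
};

class T300ExpiryCounter
{
  public:
    explicit T300ExpiryCounter(uint8_t connEstFailCount)
        : m_limit(connEstFailCount)
    {
    }

    // Called each time the T300 timer expires on the current cell.
    NextAction OnT300Expired()
    {
        ++m_count;
        if (m_count < m_limit)
        {
            // Limit not reached: the caller invalidates the previous SIB2
            // and retries random access with the same cell.
            return NextAction::RetryRandomAccessSameCell;
        }
        // Limit reached: reset the local counter and perform cell selection.
        m_count = 0;
        return NextAction::PerformCellSelection;
    }

    // Assumption of this sketch: a successful connection ends the run of failures.
    void OnConnectionEstablished()
    {
        m_count = 0;
    }

  private:
    uint8_t m_limit;
    uint8_t m_count = 0;
};
```

With the default value of 1 from the table, the first expiry already leads to cell selection; with a limit of 3, the UE retries the same cell twice before selecting again.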

19.1.10.12.2. RRC connection reconfiguration

Figure Sequence diagram of the RRC Connection Reconfiguration procedure shows how the RRC Connection Reconfiguration procedure is modeled for the case where MobilityControlInfo is not provided, i.e., handover is not performed.

_images/rrc-connection-reconfiguration.png

Sequence diagram of the RRC Connection Reconfiguration procedure

Figure Sequence diagram of the RRC Connection Reconfiguration procedure for the handover case shows how the RRC Connection Reconfiguration procedure is modeled for the case where MobilityControlInfo is provided, i.e., handover is to be performed. As specified in [TS36331], after receiving the handover message, the UE attempts to access the target cell at the first available RACH occasion according to the Random Access resource selection defined in [TS36321], i.e., the handover is asynchronous. Consequently, when allocating a dedicated preamble for the random access in the target cell, E-UTRA shall ensure it is available from the first RACH occasion the UE may use. Upon successful completion of the handover, the UE sends a message used to confirm the handover. Note that the random access procedure in this case is non-contention based; hence, in a real LTE system it differs slightly from the one used in RRC connection establishment. Also note that the RA Preamble ID is signaled via the Handover Command included in the X2 Handover Request ACK message sent from the target eNB to the source eNB; in particular, the preamble is included in the RACH-ConfigDedicated IE, which is part of MobilityControlInfo.

_images/rrc-connection-reconfiguration-handover.png

Sequence diagram of the RRC Connection Reconfiguration procedure for the handover case

19.1.10.13. RRC protocol models

As previously anticipated, we provide two different models for the transmission and reception of RRC messages: Ideal and Real. Each of them is described in one of the following subsections.

19.1.10.13.1. Ideal RRC protocol model

According to this model, implemented in the classes LteUeRrcProtocolIdeal and LteEnbRrcProtocolIdeal, all RRC messages and information elements are transmitted between the eNB and the UE in an ideal fashion, without consuming radio resources and without errors. From an implementation point of view, this is achieved by passing the RRC data structure directly between the UE and eNB RRC entities, without involving the lower layers (PDCP, RLC, MAC, scheduler).

19.1.10.13.2. Real RRC protocol model

This model is implemented in the classes LteUeRrcProtocolReal and LteEnbRrcProtocolReal and aims at modeling the transmission of RRC PDUs as commonly performed in real LTE systems. In particular:

  • for every RRC message being sent, a real RRC PDU is created following the ASN.1 encoding of RRC PDUs and information elements (IEs) specified in [TS36331]. Some simplifications are made with respect to the IEs included in the PDU, i.e., only those IEs that are useful for simulation purposes are included. For a detailed list, please see the IEs defined in lte-rrc-sap.h and compare with [TS36331].

  • the encoded RRC PDUs are sent on Signaling Radio Bearers and are subject to the same transmission modeling used for data communications, thus including scheduling, radio resource consumption, channel errors, delays, retransmissions, etc.

19.1.10.13.2.1. Signaling Radio Bearer model

We now describe the Signaling Radio Bearer model that is used for the Real RRC protocol model.

  • SRB0 messages (over CCCH):

    • RrcConnectionRequest: in real LTE systems, this is an RLC TM SDU sent over resources specified in the UL Grant in the RAR (not in UL DCIs); the reason is that the C-RNTI is not known yet at this stage. In the simulator, this is modeled as a real RLC TM PDU whose UL resources are allocated by the scheduler upon a call to SCHED_DL_RACH_INFO_REQ.

    • RrcConnectionSetup: in the simulator this is implemented as in real LTE systems, i.e., with an RLC TM SDU sent over resources indicated by a regular DL DCI, allocated with SCHED_DL_RLC_BUFFER_REQ triggered by the RLC TM instance that is mapped to LCID 0 (the CCCH).

  • SRB1 messages (over DCCH):

    • All the SRB1 messages modeled in the simulator (e.g., RrcConnectionSetupCompleted) are implemented as in real LTE systems, i.e., with a real RLC SDU sent over RLC AM using DL resources allocated via Buffer Status Reports. See the RLC model documentation for details.

  • SRB2 messages (over DCCH):

    • According to [TS36331], “SRB1 is for RRC messages (which may include a piggybacked NAS message) as well as for NAS messages prior to the establishment of SRB2, all using DCCH logical channel”, whereas “SRB2 is for NAS messages, using DCCH logical channel” and “SRB2 has a lower-priority than SRB1 and is always configured by E-UTRAN after security activation”. Modeling security-related aspects is not a requirement of the LTE simulation model, hence we always use SRB1 and never activate SRB2.

19.1.10.13.2.2. ASN.1 encoding of RRC IE’s

The messages defined in the RRC SAP, common to all Ue/Enb SAP Users/Providers, are transported in a transparent container to/from a Ue/Enb. The encoding format for the different Information Elements is specified in [TS36331], using ASN.1 rules in the unaligned variant. The implementation in the ns-3 LTE module is divided into the following classes:

  • Asn1Header : Contains the encoding / decoding of basic ASN types

  • RrcAsn1Header : Inherits Asn1Header and contains the encoding / decoding of common IE’s defined in [TS36331]

  • Rrc specific messages/IEs classes : A class for each of the messages defined in RRC SAP header

19.1.10.13.2.3. Asn1Header class - Implementation of base ASN.1 types

This class implements the methods to Serialize / Deserialize the ASN.1 types being used in [TS36331], according to the packed encoding rules in ITU-T X.691. The types considered are:

  • Boolean : a boolean value uses a single bit (1=true, 0=false).

  • Integer : a constrained integer (with min and max values defined) uses the minimum amount of bits to encode its range (max-min+1).

  • Bitstring : a bitstring will be copied bit by bit to the serialization buffer.

  • Octetstring : not being currently used.

  • Sequence : the sequence generates a preamble indicating the presence of optional and default fields. It also adds a bit indicating the presence of extension marker.

  • Sequence…Of : the sequence…of type encodes the number of elements of the sequence as an integer (the subsequent elements will need to be encoded afterwards).

  • Choice : indicates which element among the ones in the choice set is being encoded.

  • Enumeration : is serialized as an integer indicating which value is used, among the ones in the enumeration, with the number of elements in the enumeration as upper bound.

  • Null : the null value is not encoded, although its serialization function is defined to provide a clearer map between specification and implementation.
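The bit count used for a constrained integer follows directly from its range (max-min+1). A minimal sketch of this arithmetic is given below; the function names are hypothetical and the code is independent of the Asn1Header class.

```cpp
#include <cstdint>

// Smallest number of bits able to distinguish `range` different values.
// A range of 1 needs no bits at all, because the only possible value is implied.
unsigned BitsForRange(uint64_t range)
{
    unsigned bits = 0;
    while ((uint64_t{1} << bits) < range)
    {
        ++bits;
    }
    return bits;
}

// Bits needed for a constrained integer in [min, max], with min <= max.
// 32-bit bounds keep the range at or below 2^32, so the shift above stays well defined.
unsigned BitsForConstrainedInteger(int32_t min, int32_t max)
{
    const uint64_t range =
        static_cast<uint64_t>(static_cast<int64_t>(max) - static_cast<int64_t>(min)) + 1;
    return BitsForRange(range);
}
```

For instance, an integer constrained to 0..34 has a range of 35 values and therefore needs 6 bits, whereas an integer constrained to 1..16 needs only 4 bits. The value actually written is the offset from min.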

The class inherits from ns-3 Header, but the Deserialize() function is declared pure virtual, so derived classes have to implement it. The reason is that deserialization retrieves the elements of RRC messages, each of which contains different information elements.

Additionally, it has to be noted that the resulting byte length of a specific type/message can vary, according to the presence of optional fields, and due to the optimized encoding. Hence, the serialized bits are processed using the PreSerialize() function, which saves the result in the m_serializationResult Buffer. As the methods to read/write in an ns-3 buffer are defined on a per-byte basis, the serialized bits are stored in the m_serializationPendingBits attribute until 8 bits are set and can be written to the buffer iterator. Finally, when invoking Serialize(), the contents of the m_serializationResult attribute are copied to the Buffer::Iterator parameter.
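The idea of collecting pending bits until a whole byte can be written can be sketched as follows. BitWriterSketch is a hypothetical standalone class, not the Asn1Header code: it uses a std::vector instead of an ns-3 Buffer, and padding the final partial byte with zero bits is an assumption of this example.

```cpp
#include <cstdint>
#include <vector>

class BitWriterSketch
{
  public:
    // Appends the `count` least significant bits of `value` (count <= 32),
    // most significant bit first.
    void WriteBits(uint32_t value, unsigned count)
    {
        for (unsigned i = count; i-- > 0;)
        {
            m_pending = static_cast<uint8_t>((m_pending << 1) | ((value >> i) & 1u));
            if (++m_pendingCount == 8)
            {
                // A full byte is available: move it to the byte-oriented result.
                m_bytes.push_back(m_pending);
                m_pending = 0;
                m_pendingCount = 0;
            }
        }
    }

    // Pads the last partial byte with zero bits and returns the byte sequence.
    std::vector<uint8_t> Finalize()
    {
        while (m_pendingCount != 0)
        {
            WriteBits(0, 1);
        }
        return m_bytes;
    }

  private:
    std::vector<uint8_t> m_bytes; // completed bytes
    uint8_t m_pending = 0;        // bits not yet forming a whole byte
    unsigned m_pendingCount = 0;  // number of valid bits in m_pending
};
```

Writing a boolean (1 bit) followed by the value 5 in a 6-bit field produces the bit string 1000101, which is flushed as the single byte 0x8A once the padding bit is added.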

19.1.10.13.2.4. RrcAsn1Header : Common IEs

As some Information Elements are being used for several RRC messages, this class implements the following common IE’s:

  • SrbToAddModList

  • DrbToAddModList

  • LogicalChannelConfig

  • RadioResourceConfigDedicated

  • PhysicalConfigDedicated

  • SystemInformationBlockType1

  • SystemInformationBlockType2

  • RadioResourceConfigCommonSIB

19.1.10.13.2.5. Rrc specific messages/IEs classes

The following RRC SAP messages have been implemented:

  • RrcConnectionRequest

  • RrcConnectionSetup

  • RrcConnectionSetupCompleted

  • RrcConnectionReconfiguration

  • RrcConnectionReconfigurationCompleted

  • HandoverPreparationInfo

  • RrcConnectionReestablishmentRequest

  • RrcConnectionReestablishment

  • RrcConnectionReestablishmentComplete

  • RrcConnectionReestablishmentReject

  • RrcConnectionRelease

19.1.11. NAS

The focus of the LTE-EPC model is on the NAS Active state, which corresponds to EMM Registered, ECM connected, and RRC connected. Because of this, the following simplifications are made:

  • EMM and ECM are not modeled explicitly; instead, the NAS entity at the UE will interact directly with the MME to perform actions that are equivalent (with gross simplifications) to taking the UE to the states EMM Connected and ECM Connected;

  • the NAS also takes care of multiplexing uplink data packets coming from the upper layers into the appropriate EPS bearer by using the Traffic Flow Template classifier (TftClassifier).

  • the NAS does not support PLMN and CSG selection

  • the NAS does not support any location update/paging procedure in idle mode

Figure Sequence diagram of the attach procedure shows how the simplified NAS model implements the attach procedure. Note that both the default EPS bearer and any dedicated EPS bearers are activated as part of this procedure.

_images/nas-attach.png

Sequence diagram of the attach procedure

19.1.12. S1, S5 and S11

19.1.12.1. S1-U and S5 (user plane)

The S1-U and S5 interfaces are modeled in a realistic way by encapsulating data packets over GTP/UDP/IP, as done in real LTE-EPC systems. The corresponding protocol stack is shown in Figure LTE-EPC data plane protocol stack. As shown in the figure, there are two different layers of IP networking. The first one is the end-to-end layer, which provides end-to-end connectivity to the users; this layer involves the UEs, the PGW and the remote host (including any internet routers and hosts in between), but does not involve the eNB and the SGW. In this version of LTE, the EPC supports both IPv4 and IPv6 type users. The 3GPP unique 64-bit IPv6 prefix allocation process for each UE and PGW is followed here. Each EPC is assigned a unique 16-bit IPv4 and a 48-bit IPv6 network address from the pools 7.0.0.0/8 and 7777:f00d::/32, respectively. In the end-to-end IP connection between UE and PGW, all addresses are configured using these prefixes. The PGW’s address is used by all UEs as the gateway to reach the internet.

The second layer of IP networking is the EPC local area network. This involves all eNB nodes, SGW nodes and PGW nodes. This network is implemented as a set of point-to-point links which connect each eNB with its corresponding SGW node, and a point-to-point link which connects each SGW node with its corresponding PGW node; thus, each SGW has a set of point-to-point devices, each providing connectivity to a different eNB. By default, a 10.x.y.z/30 subnet is assigned to each point-to-point link (a /30 subnet is the smallest subnet that allows for two distinct host addresses).
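The remark about /30 subnets follows from simple address arithmetic, sketched below. UsableHosts is a hypothetical helper; it uses the classic convention in which the network and broadcast addresses of a subnet are not assignable, and it does not cover the /31 and /32 special cases.

```cpp
#include <cstdint>

// Usable host addresses in an IPv4 subnet of the given prefix length (0..32),
// excluding the network and broadcast addresses.
uint32_t UsableHosts(unsigned prefixLength)
{
    if (prefixLength >= 31)
    {
        return 0; // /31 and /32 are special cases not covered by this sketch
    }
    const uint64_t total = uint64_t{1} << (32 - prefixLength);
    return static_cast<uint32_t>(total - 2);
}
```

A /30 subnet contains four addresses, of which two remain for hosts, which is exactly what a point-to-point link with one device at each end requires.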

As specified by 3GPP, the end-to-end IP communication is tunneled over the local EPC IP network using GTP/UDP/IP. In the following, we explain how this tunneling is implemented in the EPC model by discussing the end-to-end flow of data packets.

_images/epc-data-flow-dl-with-split.png

Data flow in the downlink between the internet and the UE

To begin with, we consider the case of the downlink, which is depicted in Figure Data flow in the downlink between the internet and the UE. Downlink IPv4/IPv6 packets are generated from a generic remote host and addressed to one of the UE devices. Internet routing will take care of forwarding the packet to the generic NetDevice of the PGW node which is connected to the internet (this is the Gi interface according to 3GPP terminology). The PGW has a VirtualNetDevice which is assigned the base IPv4 address of the EPC network; hence, static routing rules will cause the incoming packet from the internet to be routed through this VirtualNetDevice. When the destination is an IPv6 address, a manual route towards the VirtualNetDevice is inserted in the routing table, containing the 48-bit IPv6 prefix from which all the IPv6 addresses of the UEs and PGW are configured. This device starts the GTP/UDP/IP tunneling procedure by forwarding the packet to a dedicated application in the PGW node called EpcPgwApplication. This application performs the following operations:

  1. it determines the SGW node to which it must route the traffic for this UE, by looking at the IP destination address (which is the address of the UE);

  2. it classifies the packet using Traffic Flow Templates (TFTs) to identify to which EPS Bearer it belongs. EPS bearers have a one-to-one mapping to S5 Bearers, so this operation returns the GTP-U Tunnel Endpoint Identifier (TEID) to which the packet belongs;

  3. it adds the corresponding GTP-U protocol header to the packet;

  4. finally, it sends the packet over a UDP socket to the S5 point-to-point NetDevice, addressed to the appropriate SGW.
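The four PGW steps can be illustrated with a standalone sketch. It is not the EpcPgwApplication code: UeTunnel, EncapsulateGtpu and PrepareDownlink are hypothetical names, the sketch assumes a single bearer per UE (so the TFT classification of step 2 reduces to a lookup), and it writes a minimal 8-byte GTPv1-U header without optional fields, which may differ from the header layout the simulator actually serializes.

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical per-UE state: the serving SGW and the S5 TEID of the UE's single bearer.
struct UeTunnel
{
    uint32_t sgwAddress;
    uint32_t s5Teid;
};

// Step 3: prepend a minimal 8-byte GTPv1-U header (no optional fields) to the packet.
std::vector<uint8_t> EncapsulateGtpu(uint32_t teid, const std::vector<uint8_t>& ipPacket)
{
    // Length field: number of bytes that follow the mandatory 8-byte header.
    const uint16_t length = static_cast<uint16_t>(ipPacket.size());
    std::vector<uint8_t> out = {
        0x30, // version 1, protocol type GTP, no extension/sequence/N-PDU flags
        0xFF, // message type 255 (G-PDU)
        static_cast<uint8_t>(length >> 8),
        static_cast<uint8_t>(length & 0xFF),
        static_cast<uint8_t>(teid >> 24),
        static_cast<uint8_t>((teid >> 16) & 0xFF),
        static_cast<uint8_t>((teid >> 8) & 0xFF),
        static_cast<uint8_t>(teid & 0xFF),
    };
    out.insert(out.end(), ipPacket.begin(), ipPacket.end());
    return out;
}

// Steps 1, 2 and 4: find the tunnel from the destination (UE) address and build the
// UDP payload; the caller would then send it to `sgwAddress` over the S5 link.
bool PrepareDownlink(const std::map<uint32_t, UeTunnel>& tunnels,
                     uint32_t ueAddress,
                     const std::vector<uint8_t>& ipPacket,
                     uint32_t& sgwAddress,
                     std::vector<uint8_t>& gtpuPacket)
{
    const auto it = tunnels.find(ueAddress);
    if (it == tunnels.end())
    {
        return false; // unknown UE address: nothing to tunnel
    }
    sgwAddress = it->second.sgwAddress;
    gtpuPacket = EncapsulateGtpu(it->second.s5Teid, ipPacket);
    return true;
}
```

The SGW and eNB stages described next follow the same pattern, except that the lookup key is the TEID found in the received GTP-U header rather than the destination IP address.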

As a consequence, the end-to-end IP packet with newly added IP, UDP and GTP headers is sent through one of the S5 links to the SGW, where it is received and delivered locally (as the destination address of the outermost IP header matches the SGW IP address). The local delivery process will forward the packet, via a UDP socket, to a dedicated application called EpcSgwApplication. This application then performs the following operations:

  1. it determines the eNB node to which the UE is attached, by looking at the S5 TEID;

  2. it maps the S5 TEID to get the S1 TEID. EPS bearers have a one-to-one mapping to S1-U Bearers, so this operation returns the S1 GTP-U Tunnel Endpoint Identifier (TEID) to which the packet belongs;

  3. it adds a new GTP-U protocol header to the packet;

  4. finally, it sends the packet over a UDP socket to the S1-U point-to-point NetDevice, addressed to the eNB to which the UE is attached.

Finally, the end-to-end IP packet with newly added IP, UDP and GTP headers is sent through one of the S1 links to the eNB, where it is received and delivered locally (as the destination address of the outermost IP header matches the eNB IP address). The local delivery process will forward the packet, via a UDP socket, to a dedicated application called EpcEnbApplication. This application then performs the following operations:

  1. it removes the GTP header and retrieves the S1 TEID which is contained in it;

  2. leveraging on the one-to-one mapping between S1-U bearers and Radio Bearers (which is a 3GPP requirement), it determines the Bearer ID (BID) to which the packet belongs;

  3. it records the BID in a dedicated tag called EpsBearerTag, which is added to the packet;

  4. it forwards the packet to the LteEnbNetDevice of the eNB node via a raw packet socket.

Note that, at this point, the outermost header of the packet is the end-to-end IP header, since the IP/UDP/GTP headers of the S1 protocol stack have already been stripped. Upon reception of the packet from the EpcEnbApplication, the LteEnbNetDevice will retrieve the BID from the EpsBearerTag, and based on the BID will determine the Radio Bearer instance (and the corresponding PDCP and RLC protocol instances) which is then used to forward the packet to the UE over the LTE radio interface. Finally, the LteUeNetDevice of the UE will receive the packet and deliver it locally to the IP protocol stack, which will in turn deliver it to the application of the UE, which is the end point of the downlink communication.

_images/epc-data-flow-ul-with-split.png

Data flow in the uplink between the UE and the internet

The case of the uplink is depicted in Figure Data flow in the uplink between the UE and the internet. Uplink IP packets are generated by a generic application inside the UE, and forwarded by the local TCP/IP stack to the LteUeNetDevice of the UE. The LteUeNetDevice then performs the following operations:

  1. it classifies the packet using TFTs and determines the Radio Bearer to which the packet belongs (and the corresponding RBID);

  2. it identifies the corresponding PDCP protocol instance, which is the entry point of the LTE Radio Protocol stack for this packet;

  3. it sends the packet to the eNB over the LTE Radio Protocol stack.
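Step 1, the TFT-based classification, can be sketched with a deliberately reduced filter. PacketFilterSketch, TftSketch and ClassifyUplink are hypothetical names; the sketch matches only on a remote port range, whereas a real packet filter has more fields, and the first-match-wins order with a fall-back to a default bearer is an assumption of this example.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, reduced packet filter: matches only on a remote port range.
struct PacketFilterSketch
{
    uint16_t remotePortStart;
    uint16_t remotePortEnd;
};

// A traffic flow template: a set of filters associated with one bearer.
struct TftSketch
{
    uint8_t bearerId;
    std::vector<PacketFilterSketch> filters;
};

// Returns the bearer of the first TFT that has a matching filter,
// or defaultBearerId if no filter matches (assumed fall-back behaviour).
uint8_t ClassifyUplink(const std::vector<TftSketch>& tfts,
                       uint16_t remotePort,
                       uint8_t defaultBearerId)
{
    for (const TftSketch& tft : tfts)
    {
        for (const PacketFilterSketch& f : tft.filters)
        {
            if (remotePort >= f.remotePortStart && remotePort <= f.remotePortEnd)
            {
                return tft.bearerId;
            }
        }
    }
    return defaultBearerId;
}
```

Once the bearer is known, the corresponding PDCP instance is simply the one associated with that bearer, which is what step 2 relies on.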

The eNB receives the packet via its LteEnbNetDevice. Since there is a single PDCP and RLC protocol instance for each Radio Bearer, the LteEnbNetDevice is able to determine the BID of the packet. This BID is then recorded onto an EpsBearerTag, which is added to the packet. The LteEnbNetDevice then forwards the packet to the EpcEnbApplication via a raw packet socket.

Upon receiving the packet, the EpcEnbApplication performs the following operations:

  1. it retrieves the BID from the EpsBearerTag in the packet;

  2. it determines the corresponding EPS Bearer instance and GTP-U TEID by leveraging on the one-to-one mapping between S1-U bearers and Radio Bearers;

  3. it adds a GTP-U header on the packet, including the TEID determined previously;

  4. it sends the packet to the SGW node via the UDP socket connected to the S1-U point-to-point net device.

At this point, the packet contains the S1-U IP, UDP and GTP headers in addition to the original end-to-end IP header. When the packet is received by the corresponding S1-U point-to-point NetDevice of the SGW node, it is delivered locally (as the destination address of the outermost IP header matches the address of the point-to-point net device). The local delivery process will forward the packet to the EpcSgwApplication via the corresponding UDP socket. The EpcSgwApplication then performs the following operations:

  1. it removes the GTP header and retrieves the S1-U TEID;

  2. it maps the S1-U TEID to get the S5 TEID to which the packet belongs;

  3. it determines the PGW to which it must send the packet from the TEID mapping;

  4. it adds a new GTP-U protocol header to the packet;

  5. finally, it sends the packet over a UDP socket to the S5 point-to-point NetDevice, addressed to the corresponding PGW.

At this point, the packet contains the S5 IP, UDP and GTP headers in addition to the original end-to-end IP header. When the packet is received by the corresponding S5 point-to-point NetDevice of the PGW node, it is delivered locally (as the destination address of the outermost IP header matches the address of the point-to-point net device). The local delivery process will forward the packet to the EpcPgwApplication via the corresponding UDP socket. The EpcPgwApplication then removes the GTP header and forwards the packet to the VirtualNetDevice. At this point, the outermost header of the packet is the end-to-end IP header. Hence, if the destination address within this header is a remote host on the internet, the packet is sent to the internet via the corresponding NetDevice of the PGW. In the event that the packet is addressed to another UE, the IP stack of the PGW will redirect the packet again to the VirtualNetDevice, and the packet will go through the downlink delivery process in order to reach its destination UE.
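The final routing decision at the PGW amounts to a prefix match on the destination address. The sketch below illustrates this for IPv4 only, using the 7.0.0.0/8 pool mentioned earlier as the UE address space; Egress, Ipv4, InPrefix and SelectEgress are hypothetical names, and in the simulator the decision is made by the routing table of the PGW's IP stack rather than by a function like this.

```cpp
#include <cstdint>

enum class Egress
{
    Internet,        // forward via the NetDevice connected to the internet
    VirtualNetDevice // destination is a UE: re-enter the downlink path
};

// Builds a host-order IPv4 address from its four octets.
constexpr uint32_t Ipv4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return (uint32_t{a} << 24) | (uint32_t{b} << 16) | (uint32_t{c} << 8) | uint32_t{d};
}

// True if `addr` lies inside network/prefixLength (prefixLength in 0..32).
bool InPrefix(uint32_t addr, uint32_t network, unsigned prefixLength)
{
    const uint32_t mask = prefixLength == 0 ? 0u : ~uint32_t{0} << (32 - prefixLength);
    return (addr & mask) == (network & mask);
}

// UE addresses are assumed to come from 7.0.0.0/8, as stated for the IPv4 pool.
Egress SelectEgress(uint32_t destination)
{
    return InPrefix(destination, Ipv4(7, 0, 0, 0), 8) ? Egress::VirtualNetDevice
                                                      : Egress::Internet;
}
```

A packet addressed to 7.0.0.2 would thus be sent back through the VirtualNetDevice and tunneled towards the destination UE, while a packet addressed to 1.0.0.2 would leave towards the internet.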

Note that the EPS Bearer QoS is not enforced on the S1-U and S5 links; it is assumed that the overprovisioning of the link bandwidth is sufficient to meet the QoS requirements of all bearers.

19.1.12.2. S1AP

The S1-AP interface provides control plane interaction between the eNB and the MME. In the simulator, this interface is modeled in a realistic fashion by transmitting the encoded S1AP messages and information elements specified in [TS36413] on the S1-MME link.

The S1-AP primitives that are modeled are:

  • INITIAL UE MESSAGE

  • INITIAL CONTEXT SETUP REQUEST

  • INITIAL CONTEXT SETUP RESPONSE

  • PATH SWITCH REQUEST

  • PATH SWITCH REQUEST ACKNOWLEDGE

19.1.12.3. S5 and S11

The S5 interface provides control plane interaction between the SGW and the PGW. The S11 interface provides control plane interaction between the SGW and the MME. Both interfaces use the GPRS Tunneling Protocol (GTPv2-C) to tunnel signalling messages [TS29274] and use UDP as the transport protocol. In the simulator, these interfaces and the protocol are modeled in a realistic fashion by transmitting the encoded GTP-C messages.

The GTPv2-C primitives that are modeled are:

  • CREATE SESSION REQUEST

  • CREATE SESSION RESPONSE

  • MODIFY BEARER REQUEST

  • MODIFY BEARER RESPONSE

  • DELETE SESSION REQUEST

  • DELETE SESSION RESPONSE

  • DELETE BEARER COMMAND

  • DELETE BEARER REQUEST

  • DELETE BEARER RESPONSE

Of these primitives, the first two are used upon initial UE attachment for the establishment of the S1-U and S5 bearers. Section NAS shows the implementation of the attach procedure. The other primitives are used during the handover to switch the S1-U bearers from the source eNB to the target eNB as a consequence of the reception by the MME of a PATH SWITCH REQUEST S1-AP message.

19.1.13. X2

The X2 interface interconnects two eNBs [TS36420]. From a logical point of view, the X2 interface is a point-to-point interface between the two eNBs. In a real E-UTRAN, the logical point-to-point interface should be feasible even in the absence of a physical direct connection between the two eNBs. In the X2 model implemented in the simulator, the X2 interface is a point-to-point link between the two eNBs. A point-to-point device is created in both eNBs and the two point-to-point devices are attached to the point-to-point link.

For a representation of how the X2 interface fits in the overall architecture of the LENA simulation model, the reader is referred to the figure Overview of the LTE-EPC simulation model.

The X2 interface implemented in the simulator provides detailed implementation of the following elementary procedures of the Mobility Management functionality [TS36423]:

  • Handover Request procedure

  • Handover Request Acknowledgement procedure

  • SN Status Transfer procedure

  • UE Context Release procedure

These procedures are involved in the X2-based handover. You can find the detailed description of the handover in section 10.1.2.1 of [TS36300]. We note that the simulator model currently supports only the seamless handover as defined in Section 2.6.3.1 of [Sesia2009]; in particular, lossless handover as described in Section 2.6.3.2 of [Sesia2009] is not supported at the time of this writing.

Figure Sequence diagram of the X2-based handover below shows the interaction of the entities of the X2 model in the simulator. The shaded labels indicate the moments when the UE or eNodeB transitions to another RRC state.

_images/lte-epc-x2-handover-seq-diagram.png

Sequence diagram of the X2-based handover

The figure also shows two timers within the handover procedure: the handover leaving timer is maintained by the source eNodeB, while the handover joining timer is maintained by the target eNodeB. The duration of the timers can be configured through the HandoverLeavingTimeoutDuration and HandoverJoiningTimeoutDuration attributes of the respective LteEnbRrc instances. When one of these timers expires, the handover procedure is considered to have failed.

However, there is no proper handling of handover failure in the current version of the LTE module. Users should tune the simulation properly in order to avoid handover failure; otherwise, unexpected behaviour may occur. Please refer to Section Tuning simulation with handover of the User Documentation for some tips regarding this matter.
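The expiry rule of the leaving and joining timers can be sketched in isolation as follows. This is a standalone illustration, not the LteEnbRrc implementation: the class HandoverTimer, the HandoverState type, and the use of integer milliseconds instead of simulator time are all assumptions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical outcome of one side of a handover.
enum class HandoverState
{
    InProgress,
    Completed,
    Failed
};

// Hypothetical model of one handover timer (leaving or joining).
class HandoverTimer
{
  public:
    HandoverTimer(uint64_t startMs, uint64_t timeoutMs)
        : m_deadlineMs(startMs + timeoutMs),
          m_state(HandoverState::InProgress)
    {
        assert(timeoutMs > 0);
    }

    // Advance time; the handover fails once the deadline is reached.
    void Advance(uint64_t nowMs)
    {
        if (m_state == HandoverState::InProgress && nowMs >= m_deadlineMs)
        {
            m_state = HandoverState::Failed;
        }
    }

    // Report that the awaited event occurred at nowMs.
    // A completion that arrives after the deadline does not undo the failure.
    void NotifyCompleted(uint64_t nowMs)
    {
        Advance(nowMs);
        if (m_state == HandoverState::InProgress)
        {
            m_state = HandoverState::Completed;
        }
    }

    HandoverState GetState() const
    {
        return m_state;
    }

  private:
    uint64_t m_deadlineMs;
    HandoverState m_state;
};
```

With an illustrative timeout of 200 ms, a completion notified at 180 ms leaves the timer in the Completed state, whereas a completion notified after the deadline leaves it in the Failed state.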

The X2 model is an entity that uses services from:

  • the X2 interfaces,

    • They are implemented as Sockets on top of the point-to-point devices.

    • They are used to send/receive X2 messages through the X2-C and X2-U interfaces (i.e. the point-to-point device attached to the point-to-point link) towards the peer eNB.

  • the S1 application.

    • Currently, it is the EpcEnbApplication.

    • It is used to get some information needed for the Elementary Procedures of the X2 messages.

and it provides services to:

  • the RRC entity (X2 SAP)

    • to send/receive RRC messages. The X2 entity sends the RRC message as a transparent container in the X2 message. This RRC message is sent to the UE.

Figure Implementation Model of X2 entity and SAPs shows the implementation model of the X2 entity and its relationship with all the other entities and services in the protocol stack.

_images/lte-epc-x2-entity-saps.png

Implementation Model of X2 entity and SAPs

The RRC entity manages the initiation of the handover procedure. This is done in the Handover Management submodule of the eNB RRC entity. The target eNB may perform some Admission Control procedures. This is done in the Admission Control submodule. Initially, this submodule will accept any handover request.
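To make the role of the Admission Control submodule concrete, the sketch below contrasts the accept-all behaviour described above with one possible stricter policy. It is a standalone illustration under stated assumptions: the interface AdmissionControl, the classes AcceptAll and CapacityLimited, and the HandoverRequest structure are hypothetical and do not correspond to classes of the LTE module.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical summary of an incoming handover request.
struct HandoverRequest
{
    uint16_t sourceCellId;
    uint16_t rnti;
};

// Hypothetical admission control interface at the target eNB.
class AdmissionControl
{
  public:
    virtual ~AdmissionControl() = default;
    virtual bool Admit(const HandoverRequest& request) = 0;
};

// Behaviour described in the text: every request is accepted.
class AcceptAll : public AdmissionControl
{
  public:
    bool Admit(const HandoverRequest& /*request*/) override
    {
        return true;
    }
};

// One possible customization: accept only up to a fixed number of UEs.
class CapacityLimited : public AdmissionControl
{
  public:
    explicit CapacityLimited(uint32_t maxUes)
        : m_maxUes(maxUes),
          m_admitted(0)
    {
        assert(maxUes > 0);
    }

    bool Admit(const HandoverRequest& /*request*/) override
    {
        if (m_admitted >= m_maxUes)
        {
            return false;
        }
        ++m_admitted;
        return true;
    }

  private:
    uint32_t m_maxUes;
    uint32_t m_admitted;
};
```

In such a design, a rejected request would lead the target eNB to answer with a HANDOVER PREPARATION FAILURE instead of a HANDOVER REQUEST ACK.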

19.1.13.1. X2 interfaces

The X2 model contains two interfaces:

  • the X2-C interface. It is the control interface and it is used to send the X2-AP PDUs (i.e. the elementary procedures).

  • the X2-U interface. It is used to send the bearer data when there is DL forwarding.

Figure X2 interface protocol stacks shows the protocol stacks of the X2-U interface and X2-C interface modeled in the simulator.

_images/lte-epc-x2-interface.png

X2 interface protocol stacks

19.1.13.1.1. X2-C

The X2-C interface is the control part of the X2 interface and it is used to send the X2-AP PDUs (i.e. the elementary procedures).

In the standard X2 interface control plane protocol stack, SCTP is used as the transport protocol. SCTP is currently not modeled in the ns-3 simulator, and its implementation is out of the scope of the project. UDP is therefore used as the datagram-oriented transport protocol in place of SCTP.

19.1.13.1.2. X2-U

The X2-U interface is used to send the bearer data when there is DL forwarding during the execution of the X2-based handover procedure. Similarly to what is done for the S1-U interface, data packets are encapsulated over GTP/UDP/IP when being sent over this interface. Note that the EPS Bearer QoS is not enforced on the X2-U links; it is assumed that the overprovisioning of the link bandwidth is sufficient to meet the QoS requirements of all bearers.
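As an illustration of the GTP part of this encapsulation, the sketch below serializes the 8-byte mandatory GTPv1-U header of a G-PDU (flags, message type, length, TEID). It is a generic, standalone example: the function names are assumptions, and the header class actually used by the simulator may lay out or include additional fields.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Build the mandatory GTPv1-U header for a G-PDU carrying payloadLength bytes.
inline std::array<uint8_t, 8> BuildGtpuHeader(uint32_t teid, uint16_t payloadLength)
{
    std::array<uint8_t, 8> h{};
    h[0] = 0x30; // version 1, protocol type GTP, no optional fields
    h[1] = 0xFF; // message type 255: G-PDU (encapsulated user data)
    h[2] = static_cast<uint8_t>(payloadLength >> 8);   // length, high byte
    h[3] = static_cast<uint8_t>(payloadLength & 0xFF); // length, low byte
    h[4] = static_cast<uint8_t>(teid >> 24);           // TEID, network byte order
    h[5] = static_cast<uint8_t>((teid >> 16) & 0xFF);
    h[6] = static_cast<uint8_t>((teid >> 8) & 0xFF);
    h[7] = static_cast<uint8_t>(teid & 0xFF);
    return h;
}

// Recover the TEID from a header built as above.
inline uint32_t ReadTeid(const std::array<uint8_t, 8>& h)
{
    assert((h[0] >> 5) == 1); // only GTP version 1 is handled here
    return (static_cast<uint32_t>(h[4]) << 24) | (static_cast<uint32_t>(h[5]) << 16) |
           (static_cast<uint32_t>(h[6]) << 8) | static_cast<uint32_t>(h[7]);
}
```

The receiving side reads the TEID to identify the bearer to which the forwarded packet belongs, then strips the header before handing the inner IP packet on.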

19.1.13.2. X2 Service Interface

The X2 service interface is used by the RRC entity to send and receive messages of the X2 procedures. It is divided into two parts:

  • the EpcX2SapProvider part is provided by the X2 entity and used by the RRC entity and

  • the EpcX2SapUser part is provided by the RRC entity and used by the X2 entity.

The primitives that are supported in our X2-C model are described in the following subsections.
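The division of the service interface into a provider part and a user part can be sketched as two abstract classes, each implemented by one entity and called by the other. The code below is a simplified, standalone illustration: X2SapProvider, X2SapUser, ToyX2, ToyRrc and the reduced HandoverRequestParams structure are hypothetical stand-ins, and the direct call from ToyX2 to the peer replaces the actual transmission over the X2 link.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, reduced parameter set of a handover request.
struct HandoverRequestParams
{
    uint16_t sourceCellId;
    uint16_t targetCellId;
    uint16_t rnti;
};

// Provider part: implemented by the X2 entity, called by the RRC entity.
class X2SapProvider
{
  public:
    virtual ~X2SapProvider() = default;
    virtual void SendHandoverRequest(const HandoverRequestParams& params) = 0;
};

// User part: implemented by the RRC entity, called by the X2 entity.
class X2SapUser
{
  public:
    virtual ~X2SapUser() = default;
    virtual void RecvHandoverRequest(const HandoverRequestParams& params) = 0;
};

// Toy X2 entity: delivers each request directly to the peer RRC.
class ToyX2 : public X2SapProvider
{
  public:
    explicit ToyX2(X2SapUser* peerUser)
        : m_peerUser(peerUser)
    {
        assert(peerUser != nullptr);
    }

    void SendHandoverRequest(const HandoverRequestParams& params) override
    {
        m_peerUser->RecvHandoverRequest(params);
    }

  private:
    X2SapUser* m_peerUser;
};

// Toy RRC entity: records the requests it receives.
class ToyRrc : public X2SapUser
{
  public:
    void RecvHandoverRequest(const HandoverRequestParams& params) override
    {
        received.push_back(params);
    }

    std::vector<HandoverRequestParams> received;
};
```

The source RRC would hold a pointer to the provider part and call SendHandoverRequest on it, while the X2 entity at the target would invoke RecvHandoverRequest on the user part implemented by the target RRC.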

19.1.13.2.1. X2-C primitives for handover execution

The following primitives are used for the X2-based handover:

  • HANDOVER REQUEST

  • HANDOVER REQUEST ACK

  • HANDOVER PREPARATION FAILURE

  • SN STATUS TRANSFER

  • UE CONTEXT RELEASE

All the above primitives are used by the currently implemented RRC model during the preparation and execution of the handover procedure. Their usage interacts with the RRC state machine; therefore, they are not meant to be used for code customization, unless the intention is to modify the RRC state machine.

19.1.13.2.2. X2-C SON primitives

The following primitives can be used to implement Self-Organized Network (SON) functionalities:

  • LOAD INFORMATION

  • RESOURCE STATUS UPDATE

Note that the current RRC model does not actually use these primitives; they are included in the model only to make it possible to develop SON algorithms, included in the RRC logic, that make use of them.

As a first example, we show here how the load information primitive can be used. We assume that the LteEnbRrc has been modified to include the following new member variables:

std::vector<EpcX2Sap::UlInterferenceOverloadIndicationItem>
  m_currentUlInterferenceOverloadIndicationList;
std::vector<EpcX2Sap::UlHighInterferenceInformationItem>
  m_currentUlHighInterferenceInformationList;
EpcX2Sap::RelativeNarrowbandTxBand m_currentRelativeNarrowbandTxBand;

For a detailed description of the types of these variables, we suggest consulting the file epc-x2-sap.h, the corresponding doxygen documentation, and the references therein to the relevant sections of 3GPP TS 36.423. Now, assume that at run time these variables have been set to meaningful values following the specifications just mentioned. Then, you can add the following code in the LteEnbRrc class implementation in order to send a load information primitive:

// Describe the current load of the source cell.
EpcX2Sap::CellInformationItem cii;
cii.sourceCellId = m_cellId;
cii.ulInterferenceOverloadIndicationList = m_currentUlInterferenceOverloadIndicationList;
cii.ulHighInterferenceInformationList = m_currentUlHighInterferenceInformationList;
cii.relativeNarrowbandTxBand = m_currentRelativeNarrowbandTxBand;

// cellId is assumed to hold the ID of the target cell.
EpcX2Sap::LoadInformationParams params;
params.targetCellId = cellId;
params.cellInformationList.push_back(cii);
m_x2SapProvider->SendLoadInformation(params);

The above code allows the source eNB to send the message. The method LteEnbRrc::DoRecvLoadInformation will be called when the target eNB receives the message. The desired processing of the load information should therefore be implemented within that method.

As a second example, we show how the resource status update primitive can be used. We assume that the LteEnbRrc has been modified to include the following new member variable:

EpcX2Sap::CellMeasurementResultItem m_cmri;

As before, we refer to epc-x2-sap.h and the references therein for detailed information about this variable type. Again, we assume that the variable has already been set to a meaningful value. Then, you can add the following code in order to send a resource status update:

EpcX2Sap::ResourceStatusUpdateParams params;
params.targetCellId = cellId;
params.cellMeasurementResultList.push_back(m_cmri);
m_x2SapProvider->SendResourceStatusUpdate(params);

The method LteEnbRrc::DoRecvResourceStatusUpdate will be called when the target eNB receives the resource status update message. The desired processing of this message should therefore be implemented within that method.

Finally, we note that the setting and processing of the appropriate values for the variables passed to the primitives described above is specific to the SON algorithm being implemented, and hence is not covered by this documentation.

19.1.13.2.3. Unsupported primitives

Mobility Robustness Optimization primitives such as Radio Link Failure indication and Handover Report are not supported at this stage.

19.1.14. S11

The S11 interface provides control plane interaction between the SGW and the MME using the GTPv2-C protocol specified in [TS29274]. In the simulator, this interface is modeled in an ideal fashion, with direct interaction between the SGW and the MME objects, without actually implementing the encoding of the messages and without actually transmitting any PDU on any link.

The S11 primitives that are modeled are:

  • CREATE SESSION REQUEST

  • CREATE SESSION RESPONSE

  • MODIFY BEARER REQUEST

  • MODIFY BEARER RESPONSE

Of these primitives, the first two are used upon initial UE attachment for the establishment of the S1-U bearers; the other two are used during handover to switch the S1-U bearers from the source eNB to the target eNB as a consequence of the reception by the MME of a PATH SWITCH REQUEST S1-AP message.

19.1.15. Power Control

This section describes the ns-3 implementation of Downlink and Uplink Power Control.