Statistical Framework for Network Simulation
Revision as of 13:49, 31 May 2008
This page outlines work on simulation data collection and a statistical framework for ns-3.
Goals
Primary objectives for this effort are the following:
- Provide functionality to record, calculate, and present data and statistics for analysis of network simulations.
- Boost simulation performance by reducing the need to generate extensive trace logs in order to collect data.
- Enable simulation control via online statistics, e.g. terminating simulations or repeating trials.
Derived sub-goals and other target features include the following:
- Integration with the existing ns-3 tracing system as the basic instrumentation framework of the internal simulation engine, e.g. network stacks, net devices, and channels.
- Enabling users to utilize the statistics framework without requiring use of the tracing system.
- Helping users create, aggregate, and analyze data over multiple trials.
- Support for user-created instrumentation, e.g. of application-specific events and measures.
- Low memory and CPU overhead when the package is not in use.
- Leveraging existing analysis and output tools as much as possible. The framework may provide some basic statistics, but the focus is on collecting data and making it accessible for manipulation in established tools.
- Eventual support for distributing independent replications is important but not included in the first round of features.
Milestones
- 2008/05/30: A draft framework has been made available at [1]. It includes the following features:
- The core framework and two basic data collectors: a counter and a min/max/avg/total observer.
- Extensions of those to easily work with times and packets.
- Plaintext output formatted for omnetpp.
- Database output using sqlite3, a standalone, lightweight, high performance SQL engine.
- Mandatory and open ended metadata for describing and working with runs.
- An example based on a notional experiment examining the performance of ns-3's default ad hoc WiFi model. It incorporates the following:
- Construction of a two-node ad hoc WiFi network, with the nodes a parameterized distance apart.
- UDP traffic source and sink applications with slightly different behavior and measurement hooks than the stock classes.
- Data collection from the ns-3 core via existing trace signals, in particular data on frames transmitted and received by the WiFi MAC objects.
- Instrumentation of custom applications by connecting new trace signals to the stat framework, as well as via direct updates. Information is recorded about total packets sent and received, bytes transmitted, and end-to-end delay.
- An example of using packet tags to track end-to-end delay.
- A simple control script which runs a number of trials of the experiment at varying distances and queries the resulting database to produce a graph using GNUPlot.
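As a rough illustration of the min/max/avg/total observer listed among the features above, the following standalone C++ sketch keeps a running summary of observed samples. The class name MinMaxAvgTotalSketch and its methods are hypothetical and do not depend on ns-3; the draft framework's own calculators may be structured differently.

```cpp
#include <algorithm>
#include <limits>

// Hypothetical stand-in for a min/max/avg/total observer; not the draft's class.
class MinMaxAvgTotalSketch {
public:
  // Fold one observed sample into the running summary.
  void Update(double sample) {
    ++m_count;
    m_total += sample;
    m_min = std::min(m_min, sample);
    m_max = std::max(m_max, sample);
  }
  unsigned long Count() const { return m_count; }
  double Total() const { return m_total; }
  double Min() const { return m_min; }
  double Max() const { return m_max; }
  // The average is derived on demand; it is defined as 0 before any sample.
  double Average() const { return m_count == 0 ? 0.0 : m_total / m_count; }

private:
  unsigned long m_count = 0;
  double m_total = 0.0;
  double m_min = std::numeric_limits<double>::infinity();
  double m_max = -std::numeric_limits<double>::infinity();
};
```

Only four numbers are stored regardless of how many samples arrive, which is in keeping with the goal of avoiding extensive trace logs.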
 
 
To-Do
High priority items include:
- Inclusion of Vincent's online statistics code, e.g. for confidence intervals.
- Provisions in the data collectors for terminating runs, i.e. when a threshold or confidence is met.
- Data collectors for logging samples over time, and output to the various formats.
Each of those is straightforward to incorporate in the current framework.
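To make the first two to-do items more concrete, here is a minimal standalone sketch of an online accumulator with a confidence-based stopping rule. It uses Welford's algorithm for the running mean and variance, and a normal-approximation 95% interval (quantile 1.96). This is an assumption for illustration, not Vincent's code; for small sample counts a Student t quantile would be more appropriate. The class and method names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical online accumulator (Welford's algorithm); not part of the draft.
class OnlineStatsSketch {
public:
  // Incorporate one sample without storing the sample history.
  void Update(double x) {
    ++m_n;
    double delta = x - m_mean;
    m_mean += delta / m_n;
    m_m2 += delta * (x - m_mean);
  }
  unsigned long Count() const { return m_n; }
  double Mean() const { return m_mean; }
  // Unbiased sample variance; reported as 0 with fewer than two samples.
  double Variance() const { return m_n < 2 ? 0.0 : m_m2 / (m_n - 1); }
  // Half-width of an approximate 95% interval (normal quantile 1.96).
  double HalfWidth95() const {
    assert(m_n >= 2);
    return 1.96 * std::sqrt(Variance() / m_n);
  }
  // Stopping rule: true once the interval is at least as tight as the target.
  bool Converged(double targetHalfWidth) const {
    return m_n >= 2 && HalfWidth95() <= targetHalfWidth;
  }

private:
  unsigned long m_n = 0;
  double m_mean = 0.0;
  double m_m2 = 0.0;  // running sum of squared deviations from the mean
};
```

A control loop could call Converged() after each trial and stop launching further trials once it returns true.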
Approach
The framework is based around the following core principles:
- One experiment trial is conducted by one instance of a simulation program, whether in parallel or serially.
- A control script executes instances of the simulation, varying parameters as necessary.
- Data is collected and stored for plotting and analysis using external scripts and existing tools.
- Measures within the ns-3 core are taken by connecting the stat framework to existing trace signals.
- Trace signals or direct manipulation of the framework may be used to instrument custom simulation code.
Those basic components of the framework and their interactions are depicted in the following figure.
Example
This section goes through the process of constructing an experiment in the framework and producing data for analysis (graphs) from it, demonstrating the structure and API along the way.
Question
What is the (simulated) performance of ns-3's WiFi NetDevices (using the default settings)? How far apart can wireless nodes be in a simulation before they cannot communicate reliably?
- Hypothesis: Based on knowledge of real life performance, the nodes should communicate reasonably well to at least 100m apart. Communication beyond 200m shouldn't be feasible.
Although not a very common question in simulation contexts, this is an important property of which simulation developers should have a basic understanding. It is also a common study done on live hardware.
Simulation Program
The first step in implementing this experiment is developing the simulation program. An example is available in src/test/test02.cc in the draft code. This has the following main steps:
- Declaring parameters and parsing the command line using ns3::CommandLine.
  CommandLine cmd;
  cmd.AddValue("distance", "Distance apart to place nodes (in meters).",
               distance);
  cmd.AddValue("format", "Format to use for data output.",
               format);
  cmd.AddValue("experiment", "Identifier for experiment.",
               experiment);
  cmd.AddValue("strategy", "Identifier for strategy.",
               strategy);
  cmd.AddValue("run", "Identifier for run.",
               runID);
  cmd.Parse (argc, argv);
- Creating nodes and network stacks using ns3::NodeContainer, ns3::WifiHelper, and ns3::InternetStackHelper.
  NodeContainer nodes;
  nodes.Create(2);
  WifiHelper wifi;
  wifi.SetMac("ns3::AdhocWifiMac");
  wifi.SetPhy("ns3::WifiPhy");
  NetDeviceContainer nodeDevices = wifi.Install(nodes);
  InternetStackHelper internet;
  internet.Install(nodes);
  Ipv4AddressHelper ipAddrs;
  ipAddrs.SetBase("192.168.0.0", "255.255.255.0");
  ipAddrs.Assign(nodeDevices);
- Positioning the nodes using ns3::MobilityHelper. By default the nodes have static mobility and won't move, but we need to set their positions the given distance apart. There are several ways to do this; it is done here using ns3::ListPositionAllocator, which draws positions from a given list.
  MobilityHelper mobility;
  Ptr<ListPositionAllocator> positionAlloc =
    CreateObject<ListPositionAllocator>();
  positionAlloc->Add(Vector(0.0, 0.0, 0.0));
  positionAlloc->Add(Vector(0.0, distance, 0.0));
  mobility.SetPositionAllocator(positionAlloc);
  mobility.Install(nodes);
- Installing a traffic generator and a traffic sink. The stock Applications could be used, but the example includes custom objects in src/test/test02-apps.(cc|h). These have a simple model, generating a given number of packets spaced at a given interval. As there is only one of each they are installed manually; for a larger set the ns3::ApplicationHelper could be used. The commented-out Config::Set line changes the destination of the packets, set to broadcast by default in this example. Note that in general WiFi may have different performance for broadcast and unicast frames due to different rate control and MAC retransmission policies.
  Ptr<Node> appSource = NodeList::GetNode(0);  
  Ptr<Sender> sender = CreateObject<Sender>();
  appSource->AddApplication(sender);
  sender->Start(Seconds(1));
  Ptr<Node> appSink = NodeList::GetNode(1);  
  Ptr<Receiver> receiver = CreateObject<Receiver>();
  appSink->AddApplication(receiver);
  receiver->Start(Seconds(0));
  //  Config::Set("/NodeList/*/ApplicationList/*/$Sender/Destination",
  //              Ipv4AddressValue("192.168.0.2"));
- Configuring the data and statistics to be collected.  The basic paradigm is that an ns3::DataCollector object is created to hold information about this particular run.  Importantly, this includes labels for the experiment, strategy, input, and run.  These are used to identify and easily group data from multiple trials later:
- The experiment is the study of which this trial is a member. Here it is on WiFi performance and distance.
- The strategy is the code or parameters being examined in this trial. In this example it is fixed, but an obvious extension would be to investigate different WiFi bit rates, each of which would be a different strategy.
- The input is the particular problem given to this trial. Here it is simply the distance between the two nodes.
- The runID is a unique identifier for this trial, with which its information is tagged for identification in later analysis. If no run ID is given, the example program makes a (weak) run ID using the current time.
 
- Those four pieces of metadata are required, but more may be desired. They may be added to the record using the ns3::DataCollector::AddMetadata() method.
  DataCollector data;
  data.DescribeRun(experiment,
                   strategy,
                   input,
                   runID);
  data.AddMetadata("author", "tjkopena");
- Actual observation and calculation are done by ns3::DataCalculator objects, of which several different types exist. These are created by the simulation program, attached to reporting code, and then registered with the ns3::DataCollector so they will be queried later for their output. One easy observation mechanism is to use existing trace sources, for example to instrument objects in the ns-3 core without changing their code. Here a counter is attached directly to a trace signal in the WiFi MAC layer on the target node.
  Ptr<PacketCounterCalculator> totalRx =
    CreateObject<PacketCounterCalculator>();
  totalRx->SetKey("wifi-rx-frames");
  Config::Connect("/NodeList/1/DeviceList/*/$ns3::WifiNetDevice/Rx",
                  MakeCallback(&PacketCounterCalculator::FrameUpdate,
                                    totalRx));
  data.AddDataCalculator(totalRx);
- Calculators may also be manipulated directly. In this example, a counter is created and passed to the traffic sink application to be updated when packets are received.
  Ptr<CounterCalculator<> > appRx =
    CreateObject<CounterCalculator<> >();
  appRx->SetKey("receiver-rx-packets");
  receiver->SetCounter(appRx);
  data.AddDataCalculator(appRx);
- To increment the count, the sink's packet processing code then calls one of the ns3::CounterCalculator::Update() methods.
      m_calc->Update();
- The program includes several other examples as well, using both the primitive calculators such as ns3::CounterCalculator and those adapted for observing packets and times. In src/test/test02-apps.(cc|h) it also creates a simple custom tag which it uses to track end-to-end delay for generated packets, reporting results to an ns3::TimeMinMaxAvgTotalCalculator data calculator.
- Running and then destroying the simulation, which is very straightforward once constructed.
  Simulator::Run();    
  Simulator::Destroy();
- Generating either omnetpp or sqlite output, depending on the command line arguments. To do this an ns3::DataOutputInterface object is created and configured. The specific type of this object determines the output format. It is then given the ns3::DataCollector object, which it interrogates to produce the output.
  Ptr<DataOutputInterface> output;
  if (format == "omnet") {
    NS_LOG_INFO("Creating omnet formatted data output.");
    output = CreateObject<OmnetDataOutput>();
  } else {
    #ifdef STAT_USE_DB
      NS_LOG_INFO("Creating sqlite formatted data output.");
      output = CreateObject<SqliteDataOutput>();
    #endif
  }
  // output remains null if sqlite support was not compiled in.
  if (output)
    output->Output(data);
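The division of labor described in the steps above (calculators observe, the collector holds run metadata and registered calculators, and an output object interrogates the collector) can be sketched without ns-3 as follows. All type and function names here are hypothetical stand-ins chosen for illustration; they are not the framework's API, and the rendered text only resembles the omnet-style scalar output.

```cpp
#include <memory>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the calculator, collector, and output roles.
struct CalculatorSketch {
  virtual ~CalculatorSketch() = default;
  virtual unsigned long Value() const = 0;
  std::string key;  // label under which the value is reported
};

struct CounterSketch : CalculatorSketch {
  void Update() { ++count; }
  unsigned long Value() const override { return count; }
  unsigned long count = 0;
};

// Holds the four required labels plus every registered calculator.
struct CollectorSketch {
  std::string experiment, strategy, input, runId;
  std::vector<std::shared_ptr<CalculatorSketch>> calculators;
  void Add(std::shared_ptr<CalculatorSketch> calc) {
    calculators.push_back(std::move(calc));
  }
};

// One possible output back end: walks the collector and renders text.
std::string RenderScalars(const CollectorSketch& data) {
  std::ostringstream out;
  out << "run " << data.runId << "\n"
      << "attr experiment \"" << data.experiment << "\"\n"
      << "attr strategy \"" << data.strategy << "\"\n"
      << "attr input \"" << data.input << "\"\n";
  for (const auto& calc : data.calculators)
    out << "scalar " << calc->key << " count " << calc->Value() << "\n";
  return out.str();
}
```

Because the collector stores shared pointers, a calculator can be registered before the simulation runs and still report the updates it receives afterwards, which mirrors the register-then-query flow in the example program.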
Logging
To see what the example program, applications, and stat framework are doing in detail, set the NS_LOG variable appropriately. The following will provide copious output from all three.
export NS_LOG=StatFramework:WiFiDistanceExperiment:WiFiDistanceApps
Note that this slows down the simulation extraordinarily.
Sample Output
Compiling and simply running the test program will append omnet++ formatted output such as the following to data.sca.
run run-1212239121
attr experiment "wifi-distance-test"
attr strategy "wifi-default"
attr input "50"
attr description ""
attr "author" "tjkopena"
scalar wifi-tx-frames count 30
scalar wifi-rx-frames count 30
scalar sender-tx-packets count 30
scalar receiver-rx-packets count 30
scalar tx-pkt-size count 30
scalar tx-pkt-size total 1920
scalar tx-pkt-size average 64
scalar tx-pkt-size max 64
scalar tx-pkt-size min 64
scalar delay count 30
scalar delay total 5884980ns
scalar delay average 196166ns
scalar delay max 196166ns
scalar delay min 196166ns
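For quick sanity checks on output like the above, a small parser can pull the scalar lines into a map. The sketch below is a hypothetical helper, not part of the draft package; it assumes each scalar line has the four whitespace-separated fields shown in the sample and simply drops unit suffixes such as "ns".

```cpp
#include <exception>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: collects "scalar <name> <field> <value>" lines into a
// map keyed by "<name> <field>". Trailing unit suffixes such as "ns" are dropped.
std::map<std::string, double> ParseScalars(const std::string& text) {
  std::map<std::string, double> values;
  std::istringstream lines(text);
  std::string line;
  while (std::getline(lines, line)) {
    std::istringstream fields(line);
    std::string tag, name, field, value;
    if (!(fields >> tag >> name >> field >> value) || tag != "scalar")
      continue;  // skip run/attr lines and anything with too few fields
    try {
      double parsed = std::stod(value);  // stops at the first non-numeric char
      values[name + " " + field] = parsed;
    } catch (const std::exception&) {
      // value did not start with a number; ignore the line
    }
  }
  return values;
}
```

With the sample above, dividing the parsed "delay total" by "delay count" reproduces the reported average of 196166 ns.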
Control Script
In order to automate data collection at a variety of inputs (distances), a simple shell script is used to execute a series of simulations. It is available in scripts/wifi-example in the draft package. The script runs through a set of distances, collecting the results into an sqlite3 database. At each distance five trials are conducted to give a better picture of expected performance. The entire experiment takes a few seconds to run on a low-end machine, as there is no output during the simulation and little traffic is generated.
#!/bin/sh
DISTANCES="25 50 75 100 125 145 147 150 152 155 157 160 162 165 167 170 172 175 177 180"
TRIALS="1 2 3 4 5"
echo WiFi Experiment Example
if [ -e data.db ]
then
  echo Kill data.db?
  read ANS
  if [ "$ANS" = "yes" -o "$ANS" = "y" ]
  then
    echo Deleting database
    rm data.db
  fi
fi
for trial in $TRIALS
do
  for distance in $DISTANCES
  do
    echo Trial $trial, distance $distance
    ./bin/test02 --format=db --distance=$distance --run=run-$distance-$trial
  done
done
Analysis and Conclusion
Once all trials have been conducted, the script executes a simple SQL query over the database using the sqlite3 command line program. The query computes the average packet loss over the set of trials at each distance. It does not distinguish between strategies, but the database holds the information needed to extend the query to do so. The collected data is then passed to GNUPlot for graphing.
CMD="select exp.input,avg(100-((rx.value*100)/tx.value)) \
    from Singletons rx, Singletons tx, Experiments exp \
    where rx.run = tx.run AND \
          rx.run = exp.run AND \
          rx.name='receiver-rx-packets' AND \
          tx.name='sender-tx-packets' \
    group by exp.input \
    order by abs(exp.input) ASC;"
sqlite3 -echo -noheader data.db "$CMD" > wifi-default.data
sed -i "s/|/   /" wifi-default.data
gnuplot scripts/wifi-example.gnuplot
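The aggregation performed by the query, a per-trial loss percentage of 100 - (rx * 100) / tx averaged over the trials sharing an input, can be written out in C++ as a sketch. The TrialSketch record and function name are hypothetical. The sketch uses floating-point arithmetic throughout, whereas the query's result depends on how the values are stored in the database.

```cpp
#include <map>
#include <vector>

// Hypothetical record for one trial: the input (distance) and two counters.
struct TrialSketch {
  int distance;
  long txPackets;
  long rxPackets;
};

// Mean percentage loss per distance, mirroring the GROUP BY and AVG in the
// query. std::map iterates in ascending key order, matching the ORDER BY.
std::map<int, double> AverageLossByDistance(
    const std::vector<TrialSketch>& trials) {
  std::map<int, double> sum;
  std::map<int, int> count;
  for (const TrialSketch& t : trials) {
    if (t.txPackets <= 0) continue;  // nothing sent: loss is undefined
    sum[t.distance] += 100.0 - (100.0 * t.rxPackets) / t.txPackets;
    ++count[t.distance];
  }
  for (auto& entry : sum) entry.second /= count[entry.first];
  return sum;
}
```

For example, two trials at 150 m that receive 15 and 0 of 30 packets average to 75% loss.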
The GNUPlot script simply defines the output format and some basic formatting for the graph.
set terminal postscript portrait enhanced lw 2 "Helvetica" 14
set size 1.0, 0.66
#-------------------------------------------------------
set out "wifi-default.eps"
#set title "Packet Loss Over Distance"
set xlabel "Distance (m) --- average of 5 trials per point"
set xrange [0:200]
set ylabel "% Packet Loss"
set yrange [0:110]
plot "wifi-default.data" with lines title "WiFi Defaults"
End Result
The resulting graph gives no indication that the default WiFi model's performance is unreasonable, and it lends some confidence that the model is at least roughly faithful to reality. More importantly, this simple investigation has been carried all the way through using the statistical framework. Success!
[Figure: Output from the WiFi distance simulation study.]
Previous Work
Several components and packages have been made for ns-2 to collect and manage data and statistics. A variety of these are listed in the ns-2 wiki. The following are notes on particular efforts.
- ns2measure provides a data collection framework for ns-2 and support for calculating statistics over that data, including multiple runs. The main component is a global observer object incorporated into ns-2. Several generic types of measures are supported, e.g. time averaged and discrete rate. Observed samples are recorded via an explicit call to the observer object, identified by a measure label and particular identifier such as a flow or host. Post-simulation scripts provide for analyzing collected data and generating statistics. A control script is provided such that runs may be repeated until a statistical goal such as a confidence level is met. Data from independent runs may be incorporated in generation of the statistics.
- simd executes distributed, independent simulation runs and collects data from them. A set of Python scripts is used to push simulations out to client nodes, with a standardized set of scripts used to parameterize runs. Scripts are expected to produce output as comma-separated values, which are collected and concatenated by the master control script.
- ns-2/akaroa-2 provides support for executing distributed, independent replications, with significant statistical support for working with collected data and managing the runs. A master program runs on one computer and a set of clients on other machines that execute received simulations. Within each ns-2 instance a global observer is created. Samples are reported to that observer, which forwards them to the master computer. Measures are identified in simulation scripts by numeric identifiers and consist of particular observations, e.g. delay or packet size. The master program receives these observations and calculates statistics such as the mean and confidence interval over them. That data is used both for final output, and to conduct more simulations at the client machines if confidence is low. Another addition to ns-2 is incorporation of a different random number generator with better guarantees for independent streams.
- tracegraph is a MATLAB-based package for producing a wide variety of plots from trace files.
- rpi ns2graph provides several observation objects, some for generating traces to be used in graphing, others producing only summary statistics. A number of classes are provided for collecting data on common network statistics, such as round trip time. An API is also given for controlling graph output to a variety of tools, such as GNUPlot.
- ns2 jtrana parses an ns-2 trace file into MySQL, and provides an interface to interrogate the database and produce graphs and other output in several formats. The DB schema is largely a straightforward encoding of trace data.
- Samer Bali's scripts generate statistics from traces, including averages over multiple runs, for a number of measures.
Summary
The following table charts five features of these packages:
- Run Mgmt: Whether or not the package provides support for conducting multiple trials and varying parameters.
- Data Mgmt: Whether or not the package helps manage data generated from multiple trials.
- Replication: Whether or not the package supports distributing trials across multiple hosts.
- Trace Analysis: Whether or not the package supports producing statistics from recorded trace logs.
- Runtime Obsv: Whether or not the package provides hooks to observe data and generate statistics during a trial.
| Package | Run Mgmt | Data Mgmt | Replication | Trace Analysis | Runtime Obsv | 
|---|---|---|---|---|---|
| ns2measure | Yes | Yes | No | No | Yes | 
| simd | Yes | No | Yes | No | No | 
| ns-2/akaroa-2 | Yes | Yes | Yes | No | Yes | 
| tracegraph | No | No | No | Yes | No | 
| rpi ns2graph | No | No | No | No | Yes | 
| jtrana | No | No | No | Yes | No | 
| bali scripts | No | No | No | Yes | No | 
Raw Notes
An important benefit of independent replications, particularly distributed setups, is fault tolerance, e.g. not losing all your data when someone kicks the cord out of the machine...

