Emulation and Realtime Scheduler

From Nsnam
Revision as of 14:30, 21 May 2008 by Craigdo (Talk | contribs) (Current Status)

There are two large pieces to the emulation and realtime scheduler project. Not too surprisingly, they are the emulation part and the realtime scheduler part.

Realtime Scheduler

The purpose of the realtime scheduler is to cause the progression of the simulation clock to occur synchronously with respect to some external time base. Without the presence of an external time base (wall clock), simulation time jumps instantly from one simulated time to the next.

To create a realtime scheduler, to a first approximation you just want to cause simulation time jumps to consume real time. We propose doing this using a combination of sleep-waits and busy-waits. A sleep-wait causes the calling process (thread) to yield the processor for some amount of time. Even though this amount of time can be specified to nanosecond resolution, it is actually converted to an OS-specific granularity. In Linux, the unit of granularity is called a jiffy. Typically this resolution (on the order of ten milliseconds) is insufficient for our needs, so we round down and sleep for some smaller number of jiffies. The process is then awakened after the specified number of jiffies has passed. At this point, we have some residual time to wait. This time is generally smaller than the minimum sleep time, so we busy-wait for the remainder. This means that the thread just sits in a loop consuming cycles until the desired time arrives. After the combination of sleep-wait and busy-wait, the elapsed realtime (wall) clock should agree with the simulation time of the next event, and the simulation proceeds.
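The sleep-then-spin idea can be sketched in a few lines of standard C++. This is only an illustration, not the ns-3-emu synchronizer code: the function name WaitUntil and the default 10 ms granularity are assumptions made for the example, and std::chrono::steady_clock stands in for whatever time source the real synchronizer uses.

```cpp
#include <chrono>
#include <thread>

using Clock = std::chrono::steady_clock;

// Hypothetical helper (not part of ns-3): block until `deadline` using a
// coarse sleep-wait followed by a busy-wait for the residual time.
// `granularity` is an assumed stand-in for the OS sleep granularity
// (the jiffy on Linux); 10 ms is an illustrative default only.
inline void WaitUntil(Clock::time_point deadline,
                      std::chrono::nanoseconds granularity =
                          std::chrono::milliseconds(10))
{
    const auto now = Clock::now();
    if (deadline <= now) {
        return;  // already late, nothing to wait for
    }

    // Round down to a whole number of granules so the sleep itself
    // does not carry us past the deadline by design.
    const auto remaining = deadline - now;
    const auto granules = remaining / granularity;  // integer division
    if (granules > 0) {
        std::this_thread::sleep_for(granularity * granules);
    }

    // Busy-wait for whatever is left. If the OS overslept, the loop
    // body never runs and we simply return late.
    while (Clock::now() < deadline) {
        // spin
    }
}
```

A scheduler loop would call something like WaitUntil(startOfRun + timeOfNextEvent) before executing each event. Note that the sleep can still overshoot under load, which is one reason the residual is handled by spinning rather than by a second, shorter sleep.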

Emulation

Using the realtime scheduler you can create an N second simulation that will consume N seconds of real time. This is not a terrifically useful feature by itself. To make it useful you need to connect the simulation to the real world, which is proceeding according to that realtime clock. In the case of network emulation, this means connecting ns-3 to at least one real device.

We are planning on using pcap and packet sockets to create an ns-3 NetDevice that will look like a usual ns-3 simulated device from the top, but will connect to a real physical network underneath. By itself this presents little challenge, but the interaction with the simulator scheduler is a bit tricky.

Emulation plus Realtime Scheduler

If you consider how an ns-3 simulation might drive a physical net device, it is straightforward. The other direction is harder: when you want the physical net device to drive the simulation, you need to allow the net device to schedule events in the simulator. Since the basic model for pcap is synchronous, we will need to have a thread reading data from a physical net device in real time. We also want to have at least one thread executing the simulation. It seems clear that we are going to have a multithreaded solution. This means we are going to need to serialize access to the scheduler.

Also, since the net device thread will need to affect the main scheduler thread, we are going to need some form of inter-thread communication. Typically, the scheduler will be sleeping, waiting to synchronize with real time. If a net device receives a packet, we need to be able to schedule a new packet reception event, wake the scheduler, and force it to re-evaluate what it needs to do. This will most likely mean the scheduler will have to execute the newly scheduled event (and derived events) and then go back to sleep waiting for the next event time.

A useful OS construct for doing exactly this kind of wait is a timed conditional wait. POSIX Threads (pthreads) implements this functionality as pthread_cond_timedwait. We use the condition as a request for interruption by a realtime device. If a pthread_cond_timedwait is requested, a sleep-wait is performed for a specified time. If the condition does not become true during that time (a timeout), then the function call will return and we will have accomplished our needed sleep-wait. If an external (to the simulation proper) device sets the condition, then the sleep-wait will return early. The scheduler then wakes up and re-evaluates its next action.
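As a rough sketch of the interruptible sleep-wait, the following wraps pthread_cond_timedwait in a small class. The class name InterruptibleSleeper and its methods are hypothetical and are not the condition classes found in ns-3-emu; the example also assumes the default condition attributes, under which the timeout is an absolute CLOCK_REALTIME time.

```cpp
#include <pthread.h>
#include <time.h>

// Hypothetical wrapper (illustration only): a sleep-wait that another
// thread, such as a net device reader, can cut short.
class InterruptibleSleeper {
public:
    InterruptibleSleeper() {
        pthread_mutex_init(&m_mutex, nullptr);
        pthread_cond_init(&m_cond, nullptr);
    }
    ~InterruptibleSleeper() {
        pthread_cond_destroy(&m_cond);
        pthread_mutex_destroy(&m_mutex);
    }

    // Sleep for up to `nanoseconds`. Returns true if the full time
    // elapsed (timeout), false if Interrupt() ended the wait early.
    bool SleepFor(long nanoseconds) {
        // pthread_cond_timedwait takes an absolute deadline.
        timespec deadline;
        clock_gettime(CLOCK_REALTIME, &deadline);
        deadline.tv_sec += nanoseconds / 1000000000L;
        deadline.tv_nsec += nanoseconds % 1000000000L;
        if (deadline.tv_nsec >= 1000000000L) {
            deadline.tv_sec += 1;
            deadline.tv_nsec -= 1000000000L;
        }

        pthread_mutex_lock(&m_mutex);
        int rc = 0;
        // Loop to absorb spurious wakeups; stop on interrupt,
        // timeout (ETIMEDOUT), or any other error.
        while (!m_interrupted && rc == 0) {
            rc = pthread_cond_timedwait(&m_cond, &m_mutex, &deadline);
        }
        const bool completed = !m_interrupted;
        m_interrupted = false;  // consume the request
        pthread_mutex_unlock(&m_mutex);
        return completed;
    }

    // Called from a device thread to wake the sleeping scheduler.
    void Interrupt() {
        pthread_mutex_lock(&m_mutex);
        m_interrupted = true;
        pthread_cond_signal(&m_cond);
        pthread_mutex_unlock(&m_mutex);
    }

private:
    pthread_mutex_t m_mutex;
    pthread_cond_t m_cond;
    bool m_interrupted = false;
};
```

In this sketch the scheduler thread would call SleepFor with the time remaining until the next event; a false return means a device thread scheduled something new, so the scheduler should re-read the head of its event list before deciding how long to sleep again. An interrupt that arrives just before the sleep starts is not lost, because the flag is checked under the mutex before waiting.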

Since multiple threads are wandering around in the scheduler, we need to be careful about protecting shared data structures of course.
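One simple way to protect such a structure is to put a mutex around every access to the event list. The sketch below is an assumption-laden illustration, not the ns-3 scheduler: LockedEventQueue, Schedule and PopNext are hypothetical names, and timestamps are plain nanosecond counts rather than ns-3 Time values.

```cpp
#include <cstdint>
#include <functional>
#include <map>
#include <mutex>
#include <utility>

// Hypothetical event list (illustration only) that can be shared by the
// simulation thread and by threads servicing real net devices.
class LockedEventQueue {
public:
    using Event = std::function<void()>;

    // Safe to call from any thread.
    void Schedule(std::uint64_t timeNs, Event ev) {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_events.emplace(timeNs, std::move(ev));
    }

    // Remove the earliest event. Returns false if the queue is empty.
    // The event is handed back to the caller so that it executes
    // outside the lock and may itself call Schedule().
    bool PopNext(std::uint64_t& timeNs, Event& ev) {
        std::lock_guard<std::mutex> lock(m_mutex);
        if (m_events.empty()) {
            return false;
        }
        auto it = m_events.begin();
        timeNs = it->first;
        ev = std::move(it->second);
        m_events.erase(it);
        return true;
    }

private:
    std::mutex m_mutex;
    // Ordered by timestamp; equal timestamps keep insertion order.
    std::multimap<std::uint64_t, Event> m_events;
};
```

Running events outside the lock is a deliberate choice here: an event handler that schedules follow-up events would otherwise deadlock on a non-recursive mutex.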

Higher Order Effects

Since we are attempting to run our simulations at realtime, and have already committed to a multithreaded environment, we have plenty of need and new opportunity for performance optimizations. Others (Mahrenholz et al.) have considered the kinds of optimizations that can be done so I won't duplicate this here, but at this stage of the game we consider them higher order effects that we'll worry about when the time comes.

Current Status

This has been a high priority background task for me. It continues to be swapped in and out as other things heat up and cool off. I spent a few days up until May 20, 2008 abusing the multithreaded simulator code and it is looking pretty solid. As of today, May 21, 2008, I'm off working on other things again.

I have prototyped the synchronizer, which is the part of the realtime scheduler that synchronizes the simulation time with wall clock time. I have done this with three different time sources and I believe the basic idea has proved itself. The code to implement this prototype has been commented so someone else could pick up this work. See the ns-3-emu/src/simulator directory and grep for synchronizer. The simulator has been made multithread safe so schedule requests can come in from threads managing real net devices or sockets to real networks.

The basic kernel primitives required for the emulator have been coded and fairly extensively tested. These include various condition, mutual exclusion, thread and critical section classes. You can find these in the ns-3-emu/src/core subdirectory.

I wrote an "abuse the simulator" program that schedules many functions then spins up a worker thread and runs the simulation. The thread tosses in many schedule-now events while the simulation is running as if it were a net device. The program loops forever looking for problems caused by the multithreading. This has successfully run overnight.

The pcap trace and packet socket functionality was prototyped separately and is basically throw-away code. There are two use cases we've defined so far: using ns-3 nodes and stacks but substituting real net devices and real networks in a simulation; and the converse, which is using an ns-3 simulated network between real nodes and stacks. I will be working to get the first scenario working next. Tom Henderson has said he might do some prototyping to enable the second.

Feel free to take a look at the ns-3-emu repository and let me know what you think.