Parallel Simulations: Difference between revisions

Revision as of 14:56, 24 March 2008

Main Page - Roadmap - Summer Projects - Project Ideas - Developer FAQ - Tools - Related Projects

HOWTOs - Installation - Troubleshooting - User FAQ - Samples - Models - Education - Contributed Code - Papers

Suggestions from Mathieu Lacage

These were posted on the ns-developers mailing-list: http://mailman.isi.edu/pipermail/ns-developers/2008-March/003829.html

@@ Line 5: / Line 5: @@
 == Suggestions from Mathieu Lacage ==
-The first task is to decide what the scope of the project should be.
+These were posted on the ns-developers mailing-list:
-There are really two possible options:
+http://mailman.isi.edu/pipermail/ns-developers/2008-March/003829.html
-* provide a parallel tool for clusters, and, more generally, for
-physically distributed systems
-* provide a parallel tool for multicore systems (numa/smp)
-Since I doubt very much that the same solution can be applied with
-useful performance to both options, I really would urge interested
-students to focus on one of the above use-cases.
-Now, since I was asked for what what I would recommend, the following is
-really a summary of one of the many implementation options you could
-chose to pursue so, don't take this is as any kind of mandatory path.
-This is merely a hint as to what might be implementable within the
-requested timeframe and what might bring interesting and useful output
-to users.
-) The basic idea is to implement a conservative distributed time
-synchronization algorithm. The classic algorithm is described in
-"Parallel simulation: parallel and distributed simulation systems" by
-Richard M. Fujimoto. The fundamental assumption here is that the
-simulation models make no use of global variables and that simulated
-nodes are connected through channels which have a non-null minimum
-propagation delay. This allows you to parallelize two events e1 and e2
-generated on two simulated nodes (n1 and n2) if e2 > e1 + min_delay
-(n1,n2) or if e1 > e2 + min_delay (n1,n2).
-) We could, of course, implement the classic algorithm in the classic
-way to allow a user to build very large scale simulations by
-distributing the simulated nodes on a large set of simulation machines.
-This kind of parallelization is useful but it has a number of drawbacks,
-the main one being that it requires the user to describe a static
-partitioning of the simulated nodes: the total set of nodes is split in
-a set of partitions, each of which contains a subset of the total set of
-nodes. Then, each partition is scheduled to run on a single machine.
-Once you have made this decision, you cannot undo it since you are not
-using a language like java which allows code migration. In practice,
-finding a partitioning which does not generate slowdowns is non-trivial.
-) So, from my perspective, a desirable solution would require that the
-user does not have to describe a partition set and it would work really
-hard to get automatically _some_ speedup from a multicore system.
-A simple solution would be to re-use a classic synchronization algorithm
-with a very fine-grained partitioning (i.e., each partition contains
-exactly one node). Then, you could schedule each partition on one of the
-cpus of the multicore machine until the synchronization algorithm tells
-you that this partition has no events to schedule anymore. At this
-point, you can pick another available partition to schedule it on one of
-the available cpus.
-Pluses:
-* no user-visible partitioning on a single multicore machine.
-* can be re-used if you have multiple multicore machines to schedule
-events within each multicore machine. So, you can reuse this work if you
-want to provide a more generic distributed simulation solution.
-Minuses:
-* have no idea if you will get any kind of measurable speedup and you
-will need to work hard to optimize local cache thrashing, etc.
-) How do you implement this ?
-The core synchronization algorithm should live in the 'simulator'
-module: src/simulator/simulator.cc should basically be replaced by a new
-version which performs the synchronization/partition scheduling. It is
-trivial to make this code dynamically replaceable by exporting something
-like SimulatorImpl defined src/simulator/simulator.cc so, I will skip
-this detail.
-The biggest problem from an API point of view is that you need to be
-able to tell which partition an event belongs to. Traditionally, this is
-not a problem because there is a single partition per physical
-machine/address-space. However, here, we have multiple partitions within
-a single address-space (physical machine). Since every partition is made
-of a single Node, we just need to tell which node an event is coming
-from. There are many options, but here are two:
-* Simulator::ScheduleWithObject (Ptr<Object> node, Time delay,
-function, ...);
-* Node::Schedule (Time delay, function, ...);
-And, then, you need to make sure that all your models use these
-scheduling functions rather than the functions they are now using.
-) A small tricky detail to get right from an algorithmic perspective is
-how you will handle events which expire at the same time. There are two
-issues to deal with here:
-* you need to ensure that the parallel simulations provide a
-deterministic ordering of these events.
-* you need to ensure that the deterministic ordering of these events
-is the same in both non-parallel and parallel simulations to ensure that
-models are trivially portable from a non-parallel simulation to a
-parallel one.
-I suspect that there are multiple solutions to this problem, but here is
-an ordering algorithm which should solve this problem:
-* all same-time events within a single node should be scheduled FIFO.
-* if you have two same-time events on node A and B, and if
-A->GetNodeId () < B->GetNodeId (), then, the event on A should be
-scheduled first.
-This means that you need to add to the event ordering key the nodeid as
-a secondary ordering key (in fact, there are already 2 keys in there,
-so, this would be a third ordering key)

Parallel Simulations: Difference between revisions

Revision as of 14:56, 24 March 2008

Suggestions from Mathieu Lacage

Navigation menu

Search