Bug 631 - RealtimeSimulatorImpl not compatible with python bindings
: RealtimeSimulatorImpl not compatible with python bindings
Status: NEW
: ns-3
python bindings
: ns-3-dev
: PC Linux
: P2 normal
Assigned To:
:
:
:
: 603
  Show dependency treegraph
 
Reported: 2009-07-08 16:34 EDT by
Modified: 2009-11-24 06:23 EDT (History)


Attachments


Note

You need to log in before you can comment on or make changes to this bug.


Description From 2009-07-08 16:34:54 EDT
Realtime simulations do not presently work from a python program.  The reason
seems to be that in ns3module_helpers.cc, the function _wrap_Simulator_Run()
calls IsFinished()/RunOneEvent(), which is different from C++ Simulator::Run(). 

I understand that RunOneEvent() is there to allow pyviz and other GUIs to step
through simulation execution, but if someone is to use RunOneEvent() with
realtime, the responsibility lies in the caller of RunOneEvent() to maintain a
realtime clock and perform synchronization.  This is documented in
RealtimeSimulatorImpl::RunOneEvent().

A possible solution is to detect the presence of this implementation type and
behave like Simulator::Run() when RealtimeSimulatorImpl() is selected.
------- Comment #1 From 2009-07-08 17:21:52 EDT -------
The reason why RunOneEvent is used is to allow signals to be detected by python
during a simulation.  Otherwise when you hit Ctrl-C Python will not notice the
interrupt probably until the simulation ends.
------- Comment #2 From 2009-07-08 17:25:50 EDT -------
Calling ns3.Simulator.Run(signal_check_frequency=-1) will cause the wrapper to
call ns3::Simulator::Run() thereby avoiding the problem.
------- Comment #3 From 2009-07-13 00:18:55 EDT -------
(In reply to comment #2)
> Calling ns3.Simulator.Run(signal_check_frequency=-1) will cause the wrapper to
> call ns3::Simulator::Run() thereby avoiding the problem.
> 

This does remedy the problem, at the expense of disabling Control-C.  Is there
any hope of a special control-c handler for this case?
------- Comment #4 From 2009-07-13 03:26:08 EDT -------
I think that a better solution here is simply to declare Simulator::RunOneEvent
+ Simulator::IsFinished dead and force GUIs to use a thread to run ns-3:
RunOneEvent is extremly complex to implement for anything but the default
simulator implementation (i.e., parallel simulators are stuck).

If we required the use of threads, we need to figure out how to allow users to
still interact with the simulation from another thread. i.e., we need to have
some amount of thread-safety built-in. One possible commitment we could make
would be to make the simulator.h header fully thread-safe (or, mostly
thread-safe) to allow a GUI thread to insert events in the simulator to perform
queries in the simulation objects from the simulation thread and then return
them to the GUI through a user-specific queue.

The only issue I feel concerned about is that of performance. What is the cost
of locking a simulator-wide mutex on event scheduling micro-benchmarks ?
------- Comment #5 From 2009-07-13 06:56:11 EDT -------
IMHO we should avoid threads in non-realtime simulator due to performance
issues and portability issues.  I think the RunOne solution is so simple, why
don't you like it?  It is surely a lot simpler than any threaded solution you
may want to throw at it.

The realtime scheduler problem is a different one.  But a solution to it is not
trivial.  I think we should just add some way to find out whether the
implementation supports RunOne and avoid calling it if not.  Naturally if we do
not call RunOne then we'll not catch Ctrl-C, but at least the simulator will
work from python automagically.

After that we can try to get Ctrl-C to work again, maybe using threads as you
suggest, but to be honest I think it is more work than the feature is worth...
------- Comment #6 From 2009-07-13 07:23:16 EDT -------
(In reply to comment #5)
> IMHO we should avoid threads in non-realtime simulator due to performance
> issues and portability issues.  I think the RunOne solution is so simple, why
> don't you like it?  It is surely a lot simpler than any threaded solution you
> may want to throw at it.

Because it makes it impossible to support the realtime scheduler, the mpi
parallel scheduler, and, the threaded parallel scheduler. To summarize, nothing
but the simplest default simulator works with RunOne. So, yes, it's simple,
but, it kills our ability to do fancy things underneath: it is exposing too
much from the event loop so, I would rather kill it.

> The realtime scheduler problem is a different one.  But a solution to it is not
> trivial.  I think we should just add some way to find out whether the
> implementation supports RunOne and avoid calling it if not.  Naturally if we do
> not call RunOne then we'll not catch Ctrl-C, but at least the simulator will
> work from python automagically.
> 
> After that we can try to get Ctrl-C to work again, maybe using threads as you
> suggest, but to be honest I think it is more work than the feature is worth...

I have other features in mind.
------- Comment #7 From 2009-07-13 07:31:41 EDT -------
(In reply to comment #6)
> > The realtime scheduler problem is a different one.  But a solution to it is not
> > trivial.  I think we should just add some way to find out whether the
> > implementation supports RunOne and avoid calling it if not.  Naturally if we do
> > not call RunOne then we'll not catch Ctrl-C, but at least the simulator will
> > work from python automagically.
> > 
> > After that we can try to get Ctrl-C to work again, maybe using threads as you
> > suggest, but to be honest I think it is more work than the feature is worth...

what I wanted to point out but forgot to is that it should be perfectly
possible to add a Simulator::SetMonitoringCallback (Callback<void> cb);
which is called regularly by the simulation code and you could call
PyErr_CheckSignals from that callback. [other variants on this theme are easy
to code]

The key here is that it allows the simulator code much more leeway in doing
what it needs to do.
------- Comment #8 From 2009-07-13 07:32:05 EDT -------
(In reply to comment #6)
> (In reply to comment #5)
> > IMHO we should avoid threads in non-realtime simulator due to performance
> > issues and portability issues.  I think the RunOne solution is so simple, why
> > don't you like it?  It is surely a lot simpler than any threaded solution you
> > may want to throw at it.
> 
> Because it makes it impossible to support the realtime scheduler, the mpi
> parallel scheduler, and, the threaded parallel scheduler. To summarize, nothing
> but the simplest default simulator works with RunOne. So, yes, it's simple,
> but, it kills our ability to do fancy things underneath: it is exposing too
> much from the event loop so, I would rather kill it.

It does not make it impossible to do anything, you are exaggerating.  For the
other schedulers we can use Simulator::Run and not catch signals.  Not being
able to catching signals in Python is not a great loss IMHO.

Maybe what we want is to remove RunOne as a generic API and make it specific to
the default simulator implementation.  That's what I would do, if not for your
mania of having huge ABCs with many pure virtual methods that have to be
implemented even if they make no sense for some subclasses (see also class
NetDevice).
------- Comment #9 From 2009-07-13 08:49:16 EDT -------
(In reply to comment #7)
> (In reply to comment #6)
> > > The realtime scheduler problem is a different one.  But a solution to it is not
> > > trivial.  I think we should just add some way to find out whether the
> > > implementation supports RunOne and avoid calling it if not.  Naturally if we do
> > > not call RunOne then we'll not catch Ctrl-C, but at least the simulator will
> > > work from python automagically.
> > > 
> > > After that we can try to get Ctrl-C to work again, maybe using threads as you
> > > suggest, but to be honest I think it is more work than the feature is worth...
> 
> what I wanted to point out but forgot to is that it should be perfectly
> possible to add a Simulator::SetMonitoringCallback (Callback<void> cb);
> which is called regularly by the simulation code and you could call
> PyErr_CheckSignals from that callback. [other variants on this theme are easy
> to code]

Calling PyErr_CheckSignals for _every_ scheduler iteration is bad for
performance, because you also have to acquire the Python GIL (Global
Interpreter Lock).  The current code does this every 100 events by default.

Also without RunOne you will make me jump through hoops in order to get
real-time visualization to work (assuming I will have time one day to port
ns-3-pyviz to ns-3-dev).  Not nice.

> 
> The key here is that it allows the simulator code much more leeway in doing
> what it needs to do.
> 

All I ask is to keep the default (and most important one!) scheduler to work as
before.  Forget about parallel/realtime/whatever, those are cool but not
exactly what most people are going to use.  I think you are giving it excessive
importance.  No offense intended to whoever is putting hard work into
implementing them! :)
------- Comment #10 From 2009-07-13 09:02:38 EDT -------
(In reply to comment #9)
> > what I wanted to point out but forgot to is that it should be perfectly
> > possible to add a Simulator::SetMonitoringCallback (Callback<void> cb);
> > which is called regularly by the simulation code and you could call
> > PyErr_CheckSignals from that callback. [other variants on this theme are easy
> > to code]
> 
> Calling PyErr_CheckSignals for _every_ scheduler iteration is bad for
> performance, because you also have to acquire the Python GIL (Global
> Interpreter Lock).  The current code does this every 100 events by default.

I would expect you (callee) to implement a counter and avoid costly checks
based on the value of this counter.

> Also without RunOne you will make me jump through hoops in order to get
> real-time visualization to work (assuming I will have time one day to port
> ns-3-pyviz to ns-3-dev).  Not nice.

Ok, so, what are the requirements of ns-3-pyviz ? What would be the problem if
you used the above-mentioned callback to perform regular runs of the GTK/QT
mainloop ?

> > The key here is that it allows the simulator code much more leeway in doing
> > what it needs to do.
> > 
> 
> All I ask is to keep the default (and most important one!) scheduler to work as
> before.  Forget about parallel/realtime/whatever, those are cool but not
> exactly what most people are going to use.  I think you are giving it excessive
> importance.  No offense intended to whoever is putting hard work into
> implementing them! :)

If these simulators can't implement all the methods from
src/simulator/simulator.h, then, yes, people will not use them very much but
this is a self-fulfilling profecy. 

What I am trying to do is to make sure _every_ simulator can replace the other
simulators (it's not the case, already). And I am not talking about removing
functionality to do this. I am merely trying to find out what is the proper API
we can expose which gives you the functionality you need and which is
implementable correctly by every simulator.
------- Comment #11 From 2009-07-13 15:02:17 EDT -------
I agree with Gustavo in that there a simple case that we should make very
simple to use.  Using the default simulator implementation and calling in to
run one event seems to be a nice approach -- it seems to be to be such a nice
mechanism for GUIs that I think we should resist taking it away.  I agree with
Mathieu wholeheartedly in that RunOneEvent just isn't reasonable for anthing
but the simplest case, though.  There are alternatives like scheduling periodic
callbacks, but then you immediately run into complication either at the low
level since you lose control for arbitrary lengths of time, or at hgih level
since the callback comes back in the context of the calling thread. 

I hate to bring it up (okay, I don't), but I think we are really swerving into
the question of how to deal with capabilities.  For the NetDevice we have
degenerated into a bunch of pure virtual methods asking capabilities questions
like IsBroadcast, IsMulticast, IsBridge, IsPointToPoint (that are sometimes
mixed in with state questions IsLinkUp).  Using this approach you would do
something like

  if (nd->IsBlah ())
    {
      nd->DoSomething ();
    }

I hope we don't start adding similar things to Simulator.  I originally wanted
to give QueryObject powers to the simulator to resolve some of this kind of
issue.  A similar thing came up in the context of getting the actual underlying
real time clock and scheduling real time events.  A compromise was reached and
we gave clients the ability to get and cast a pointer to the underlying
implementation.  You can see this in the (not very robust code from) emu
device,

  DynamicCast<RealtimeSimulatorImpl> (
    Simulator::GetImplementation ())->ScheduleRealtimeNow (
      MakeEvent (&EmuNetDevice::ForwardUp, this, buf, len));

One way to deal with this problem is to QueryObject for a single event
"Interface" and use it if you need it.  The default simulator implementation
could aggregate an interface for single stepping the simulation, while the more
complex implementations could decide based on their own issues whether or not
to provide it.  

  pi = nd->QueryObject<SimulatorSingleStepper> ();

  ...

  if (pi)
    {
      pi->RunOneEvent ();
    }

If we do it this way, we should think about making all of our similar
capabilities checks work the same way (i.e. net device)

Another way to deal with this issue is through some form of traits, templated
or inherited.  Perhaps

  if (ns3::simulator_capabilities<RealTimeSimulatorTraits>::CanRunOneEvent ())
    {
       Simulator::RunOneEvent ();
    }

This requires the client to know the concrete type of the class, and keep the
trait queries coherent with the current implementation.

If we inherit capabilities classes (ADT) then we end up with something like

  if (ns3::Simulator::Capabilities::CanRunOneEvent ())
    {
       Simulator::RunOneEvent ();
    }

But then you have to deal with plumbing the capabilities from the Simulator
wrapper down into the implementation, which I hate.  We just got done pulling a
lot of that nonsense out of Ipv4, so it should really be more like the emu
example above

  pi = Simulator::GetImplementation ();
  if (pi->Capabilities::CanRunOneEvent ())
    {
       Simulator::RunOneEvent ();
    }

But if you think about it, this is quite similar to a QI but lacks the
clean-ness of the interface separation.

Anyway, we are at the point again, where contemplating this stuff always seems
like too much work to get feature X alone working, but if we don't deal with
it, we end up with loads of silly and annoying IsBlah () and IsYadda ()
functions that need to be plumbed and implemented constantly.

It might not surprise most of you, but I like the QueryObject option.
------- Comment #12 From 2009-07-13 15:29:39 EDT -------
(In reply to comment #11)
> I agree with Gustavo in that there a simple case that we should make very
> simple to use.  Using the default simulator implementation and calling in to

Writing a GUI is not what I could call a "simple case". The common case is
writing a simulation script, not an interactive simulation GUI. I think that it
is very reasonable to ask a GUI developer to use a special API which is not
_the most simple API_ to use but which is still ok. A callback seems really
very reasonable, unless, of course, someone can point out a fundamental flaw
with that idea.

> run one event seems to be a nice approach -- it seems to be to be such a nice
> mechanism for GUIs that I think we should resist taking it away.  I agree with
> Mathieu wholeheartedly in that RunOneEvent just isn't reasonable for anthing
> but the simplest case, though.  There are alternatives like scheduling periodic
> callbacks, but then you immediately run into complication either at the low
> level since you lose control for arbitrary lengths of time, or at hgih level
> since the callback comes back in the context of the calling thread. 

Potentially failing to work in some cases where the user has abused the
monitoring callback seems much less problematic than having ::RunOneEvent not
work at all.

[IsBlah ()]

> I hope we don't start adding similar things to Simulator.  I originally wanted

*shudder*. I feel confident: "over my dead body" :)

> to give QueryObject powers to the simulator to resolve some of this kind of
> issue.  A similar thing came up in the context of getting the actual underlying
> real time clock and scheduling real time events.  A compromise was reached and
> we gave clients the ability to get and cast a pointer to the underlying
> implementation.  You can see this in the (not very robust code from) emu
> device,
> 
>   DynamicCast<RealtimeSimulatorImpl> (
>     Simulator::GetImplementation ())->ScheduleRealtimeNow (
>       MakeEvent (&EmuNetDevice::ForwardUp, this, buf, len));

Simulator::Schedule* needs to be thread-safe to get rid of this, I think, right
?. i.e., we need to profile whether the cost of taking a lock in each method of
DefaultSimulatorImpl and making EventImpl::Ref/Unref threadsafe is not a
performance problem (I have local patches for the latter). If not, then, we
should kill this ugly code (which I am responsible for, I know).

> One way to deal with this problem is to QueryObject for a single event
> "Interface" and use it if you need it.  The default simulator implementation
> could aggregate an interface for single stepping the simulation, while the more
> complex implementations could decide based on their own issues whether or not
> to provide it.  

[capabilities]

> > But if you think about it, this is quite similar to a QI but lacks the
> clean-ness of the interface separation.

I have a very hard time understanding how this is any different from using
Simulator::GetImplementation ()->GetObject<MyCapability> () which I could live
with but which, really, I would like to avoid.
------- Comment #13 From 2009-07-13 17:02:28 EDT -------
> Writing a GUI is not what I could call a "simple case".

Not that it's super-important, but I'm not suggesting that writing a GUI is
simple.  I'm suggesting that we should continue to support a simple way to
single step the simulation, which in turn, makes this kind of application
easier to write.  We have a user suggesting that this makes his life easier,
and I can see his point clearly.

[ ... ]

>>   DynamicCast<RealtimeSimulatorImpl> (
>>     Simulator::GetImplementation ())->ScheduleRealtimeNow (
>>       MakeEvent (&EmuNetDevice::ForwardUp, this, buf, len));
>
> Simulator::Schedule* needs to be thread-safe to get rid of this,
> I think, right?. 

Yes.  These methods are thread-safe in the realtime simulator, but not in the
default case.  However, IMO, having ScheduleReal method along side of Schedule
methods is dangerous since there are side-effects that may bite users (cf,
non-discrete-event simulator anymore makes certain models stop working).

>i.e., we need to profile whether the cost of taking a lock 
> in each method of DefaultSimulatorImpl and making 
> EventImpl::Ref/Unref threadsafe is not a performance problem
> (I have local patches for the latter). If not, then, we
> should kill this ugly code (which I am responsible for, I know).

Having thread safe methods seems like a good idea, but it probably means a
kernel call in the MP case.  We might be able to get away with a TAS
instruction optimization, but this is pretty much the definition of
non-portable :-)

I still think the realtime versions should be broken out so that non-real-time
simulations can't be lulled into thinking they are or doing something possibly
completely unexpected.  It makes sense to me to break out the single step
stuff.

[ ... ]

> A callback seems really very reasonable, unless, of course, someone 
> can point out a fundamental flaw with that idea.

I always thought that these callbacks could be scheduled just like any other
event, BTW.  With the realtime scheduler, you could just arrange to have
yourself called back every 100ms, for example.

If you write an event loop with a message pump, your callback can just send a
message to the event loop, but this implies a thread for running the message
pump and a thread for calling into the simulator.  So you begin talking
multithreaded pretty quickly.

If you just have a single step method, you can call it out of your event loop
and don't have to deal with multithreading.

It's just easier to deal with, IMO.

> > > But if you think about it, this is quite similar to a QI but lacks the
> > clean-ness of the interface separation.

> I have a very hard time understanding how this is any different 
> from using Simulator::GetImplementation ()->GetObject<MyCapability> () 
> which I could live with but which, really, I would like to avoid.

Like I said, it's "quite similar."  The difference is that you can share one
interface class across simulator implelemtations and then don't have a
dependency on the concrete class of the implementation in the client.  More
than one concrete implementation class can provide a single SingleStep
interface and the client doesn't have to know every possible concrete class to
work with multiple of them.