Object Start Stop specification
Note: Initial specification taken mostly from here: http://mailman.isi.edu/pipermail/ns-developers/2012-September/010658.html
Features
Note: this page is not internally consistent at the moment. It will be updated once the outcomes of the May 2014 developer meeting have been discussed further.
Mostly agreed upon (prior to May 2014)
- Object::Start and Object::Stop each can be called multiple times (no need for separate Reset).
- If Object::Start() is called on a started object, it will perform an implicit Stop() then Start(). This breaks a lot of existing code. Better: If Object::Start() is called on a started object, it is a no-op.
- If Object::Stop() is called on a stopped object, it is a no-op.
- Note, though, that it is important to call Start (not Stop) from the right context. Right now, this is ensured by NodeListPriv::Add when a node is created, which propagates to all Start methods. So, a user who wants to 'restart' a node should not call Node::Start directly. Instead, they should call ScheduleWithContext(Node::GetId(), &Node::Start).
- Maybe we need to find a way to make this less error-prone by defining Node-specific Start/Stop methods and making it illegal for users to directly call the Start/Stop methods of other objects.
- For both Start and Stop, an Object subclass will call the corresponding method on all of its aggregates, and call up to the parent class.
- A question here is whether or not you should Stop a Channel if you Stop one of the attached NetDevices. Mathieu Lacage
- No: It doesn't make much sense to do it in general; IMHO it should be model-dependent. Vedran Miletić
- For member variables (including attributes), there is no automation; the subclass must explicitly decide what to do with its members.
- It should be possible for a NetDevice to asynchronously learn of a stopped Channel, depending on the needs of the model.
- It would be better to have something more general, i.e. it should be possible to query the status of any Object with Object::IsActive() or a similar method. Vedran Miletić
- What is a 'stopped' channel? Mathieu Lacage
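The semantics listed above (repeated Start/Stop calls being no-ops, propagation to aggregates, and a queryable status) can be illustrated with a minimal sketch. SketchObject, Aggregate(), DoStart(), DoStop() and the CountingObject subclass are hypothetical names used only for illustration; this is not the ns3::Object API, and the ScheduleWithContext concern is deliberately left out.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for ns3::Object; none of these names are real ns-3 API.
class SketchObject
{
  public:
    virtual ~SketchObject() = default;

    // Link two objects so that Start/Stop on one also reaches the other.
    void Aggregate(SketchObject* other)
    {
        assert(other != nullptr && other != this);
        m_aggregates.push_back(other);
        other->m_aggregates.push_back(this);
    }

    // Start on an already started object is a no-op.
    void Start()
    {
        if (m_active)
        {
            return;
        }
        m_active = true; // set before propagating so mutual aggregation terminates
        DoStart();
        for (SketchObject* peer : m_aggregates)
        {
            peer->Start();
        }
    }

    // Stop on an already stopped object is a no-op.
    void Stop()
    {
        if (!m_active)
        {
            return;
        }
        m_active = false;
        DoStop();
        for (SketchObject* peer : m_aggregates)
        {
            peer->Stop();
        }
    }

    bool IsActive() const
    {
        return m_active;
    }

  protected:
    // Subclasses decide what to do with their own members here.
    virtual void DoStart() {}
    virtual void DoStop() {}

  private:
    bool m_active = false;
    std::vector<SketchObject*> m_aggregates;
};

// Assumed example subclass that just counts how often its hooks run.
class CountingObject : public SketchObject
{
  public:
    int starts = 0;
    int stops = 0;

  protected:
    void DoStart() override { ++starts; }
    void DoStop() override { ++stops; }
};
```

With two CountingObject instances aggregated together, calling Start() twice on one of them runs DoStart() exactly once on each, and a later Stop() on either one stops both.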
Under discussion (prior to May 2014)
- Events are (are not?) removed from the scheduler pertaining to stopped objects.
- Comment: I'm not sure how to decide what to remove and what not; an object might be restarted and ready by the time of a scheduled event, and I believe it depends on the kind of object whether it should or should not process those events. Soft "are not" on this one; it should be up to the model to decide what an object does or doesn't do when stopped. Vedran Miletić
- Are not: not automatically. I would say that it is the responsibility of each simulation object to cancel (or not) relevant events upon Stop in its DoStop method. Mathieu Lacage
- Stopped netdevices are (are not?) removed from the channel
- Are not: stopped netdevices should be kept on the channel, just not Tx/Rx. Vedran Miletić
- Are not: It seems a lot of work to make sure that all NetDevice objects do this properly.
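As a rough illustration of the "not automatically" position, the sketch below lets each object cancel its own pending event in its DoStop method, while the scheduler itself knows nothing about stopped objects. SketchScheduler and BeaconSender are hypothetical stand-ins invented for this example; they are not the ns-3 Simulator or EventId classes.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <utility>

// Hypothetical minimal scheduler; it never inspects whether an object is stopped.
class SketchScheduler
{
  public:
    using EventId = std::uint64_t;

    EventId Schedule(double time, std::function<void()> fn)
    {
        assert(fn); // an empty callback would throw when invoked
        EventId id = m_nextId++;
        m_events.emplace(std::make_pair(time, id), std::move(fn));
        return id;
    }

    // Remove a pending event; cancelling an unknown id is a no-op.
    void Cancel(EventId id)
    {
        for (auto it = m_events.begin(); it != m_events.end(); ++it)
        {
            if (it->first.second == id)
            {
                m_events.erase(it);
                return;
            }
        }
    }

    // Run all pending events in (time, id) order.
    void Run()
    {
        while (!m_events.empty())
        {
            auto it = m_events.begin();
            std::function<void()> fn = std::move(it->second);
            m_events.erase(it);
            fn();
        }
    }

  private:
    std::map<std::pair<double, EventId>, std::function<void()>> m_events;
    EventId m_nextId = 0;
};

// Assumed model object that chooses to cancel its pending event when stopped.
class BeaconSender
{
  public:
    explicit BeaconSender(SketchScheduler& sched)
        : m_sched(sched)
    {
    }

    void Start()
    {
        if (m_active)
        {
            return;
        }
        m_active = true;
        m_pending = m_sched.Schedule(1.0, [this] {
            ++sent;
            m_hasPending = false;
        });
        m_hasPending = true;
    }

    void Stop()
    {
        if (!m_active)
        {
            return;
        }
        m_active = false;
        DoStop();
    }

    int sent = 0;

  private:
    // The model, not the scheduler, decides which events to cancel.
    void DoStop()
    {
        if (m_hasPending)
        {
            m_sched.Cancel(m_pending);
            m_hasPending = false;
        }
    }

    SketchScheduler& m_sched;
    bool m_active = false;
    bool m_hasPending = false;
    SketchScheduler::EventId m_pending = 0;
};
```

A different model could leave its events in place and simply ignore them while stopped; the point of the sketch is only that the choice sits in DoStop.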
Example use cases
- An energy model aggregated to a Node depletes all of its energy. It then stops itself, which (by virtue of aggregation) will call Node::Stop, which will stop everything on the node. If the energy model later obtains more energy, it could restart, which should restart the whole Node.
- If it does this, it needs to be careful to call ScheduleWithContext for the restart. Mathieu Lacage
- User calls Node::Stop() which stops all applications and NetDevices on the node, as well as any objects that have been aggregated to the node. User can later call Node::Start() to restart everything.
- It seems to me from this discussion that there are three, maybe four, kinds of objects on which users might be tempted to call Stop or Start: Node, NetDevice, Channel, and maybe Application. All other objects should _never_ receive a call to Start or Stop directly by a user. Now, the question is: what are the expected user-visible semantics of calling Start/Stop on each of these objects? Mathieu Lacage
- User calls NetDevice::Stop()/Start() which just manipulates that device and any objects associated to that device, but does not impact other devices or the Node itself.
- User calls Channel::Stop() which causes channel to stop transmitting packets, and makes NetDevices able to find out about the channel status.
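One possible reading of the user-visible semantics in these use cases is sketched below: stopping a node stops its devices, stopping a device leaves the node and its other devices alone, and a device can find out that its channel is stopped. SketchChannel, SketchDevice and SketchNode are hypothetical classes written for this illustration, and a stopped device stays attached to its channel as suggested earlier on this page.

```cpp
#include <cassert>
#include <vector>

// Hypothetical channel whose status can be queried by attached devices.
class SketchChannel
{
  public:
    void Start() { m_active = true; }
    void Stop() { m_active = false; }
    bool IsActive() const { return m_active; }

  private:
    bool m_active = true;
};

// Hypothetical device; it stays attached to its channel while stopped.
class SketchDevice
{
  public:
    explicit SketchDevice(const SketchChannel& channel)
        : m_channel(channel)
    {
    }

    void Start() { m_active = true; }
    void Stop() { m_active = false; }
    bool IsActive() const { return m_active; }

    // Transmission succeeds only if both the device and its channel are active.
    bool Send() const { return m_active && m_channel.IsActive(); }

  private:
    const SketchChannel& m_channel;
    bool m_active = true;
};

// Hypothetical node; Start/Stop propagate down to devices, never upwards.
class SketchNode
{
  public:
    void AddDevice(SketchDevice* device)
    {
        assert(device != nullptr);
        m_devices.push_back(device);
    }

    void Start()
    {
        m_active = true;
        for (SketchDevice* device : m_devices)
        {
            device->Start();
        }
    }

    void Stop()
    {
        m_active = false;
        for (SketchDevice* device : m_devices)
        {
            device->Stop();
        }
    }

    bool IsActive() const { return m_active; }

  private:
    bool m_active = true;
    std::vector<SketchDevice*> m_devices;
};
```

Whether a device should learn about a stopped channel by polling (as here) or by notification is left open; the sketch only shows the polling variant.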
May 2014 developer discussions
We reviewed Vedran Miletić's latest patch for providing Failure/Repair capabilities to Objects, and the wiki page concerning the design goals and previous agreements.
Questions again arose as to what we are trying to model. We list below a few illustrative use-cases that were discussed:
- rover vehicle stops when batteries drained, reboots (or wakes) when solar kicks in: mobility model stops in this case
- satellite sleeps/wakes: mobility continues while asleep
- excluded volume mobility models need to know there is a dead object, so you can't just delete a non-restarting Node and all its contents. It needs to preserve its position.
Peter emphasized the clarity that a state diagram might lend to the discussion. We reached consensus on one such diagram (attached). It seems like we need to distinguish between Stop/Restart and Sleep/Wake, and probably provide these as separate capabilities (that may be combined).
In this approach, Sleep/Wake would probably be used in energy-aware contexts, and would mainly correspond to the behavior that event processing (packet reception and generation) would be suspended while in a sleep state, but most object state would be preserved in the transition. However, Stop/Restart would restore the object to its original post-construction state, and some objects may just be stoppable and not restartable. If an object were to provide Restart(), it would need to do some additional work to make sure that all post-construction state was cached upon first initialization.
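The distinction between Sleep/Wake and Stop/Restart can be sketched as a small state machine. The attached diagram is not reproduced here, so the transitions below are an assumption based only on the paragraph above: Sleep suspends event processing but keeps state, while Restart returns the object to a cached post-construction state. LifeState and SketchCounter are hypothetical names.

```cpp
#include <cassert>

// Assumed set of states; the agreed diagram may contain others.
enum class LifeState
{
    Active,
    Asleep,
    Stopped
};

// Hypothetical object whose only state is a counter of processed events.
class SketchCounter
{
  public:
    explicit SketchCounter(int initial)
        : m_initial(initial),
          m_value(initial)
    {
    }

    // Event processing is suspended unless the object is active.
    void Receive()
    {
        if (m_state == LifeState::Active)
        {
            ++m_value;
        }
    }

    // Each transition returns false if it is not allowed from the current state.
    bool Sleep()
    {
        if (m_state != LifeState::Active)
        {
            return false;
        }
        m_state = LifeState::Asleep; // m_value is preserved
        return true;
    }

    bool Wake()
    {
        if (m_state != LifeState::Asleep)
        {
            return false;
        }
        m_state = LifeState::Active;
        return true;
    }

    bool Stop()
    {
        if (m_state == LifeState::Stopped)
        {
            return false;
        }
        m_state = LifeState::Stopped;
        return true;
    }

    // Restart restores the cached post-construction state.
    bool Restart()
    {
        if (m_state != LifeState::Stopped)
        {
            return false;
        }
        m_value = m_initial;
        m_state = LifeState::Active;
        return true;
    }

    int Value() const { return m_value; }
    LifeState State() const { return m_state; }

  private:
    const int m_initial; // cached at construction so Restart can restore it
    int m_value;
    LifeState m_state = LifeState::Active;
};
```

An object that is stoppable but not restartable would simply omit Restart() and would not need to cache its initial state.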
We discussed the implementation approach. In order to optionally provide this such that the trait is queryable via GetObject(), there seem to be two approaches:
- use the mixin approach but extend the Object::DoGetObject to search a directed graph structure and not a tree structure when it searches for TypeId matches
- else provide a set of new base class objects such as “StoppableObject”, “ResumableObject”, and “StoppableAndResumableObject” that mix in the appropriate interfaces
We suggest that the second approach would be preferable to avoid TypeId changes. These classes StartStopFunctionality and SleepWakeFunctionality could also be used (mixed in) for non-ns3::Object classes if desired.
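The idea of optional, queryable traits can be sketched in plain C++ with interface classes that are mixed in through multiple inheritance. This is only an analogy: QueryTrait() uses dynamic_cast as a stand-in for GetObject(), and StoppableSketch, ResumableSketch, SketchBase, SketchRadio and SketchSensor are hypothetical names, not the classes proposed in the patch.

```cpp
#include <cassert>

// Hypothetical trait interfaces; independent of any particular base class.
class StoppableSketch
{
  public:
    virtual ~StoppableSketch() = default;
    virtual void Stop() = 0;
};

class ResumableSketch
{
  public:
    virtual ~ResumableSketch() = default;
    virtual void Sleep() = 0;
    virtual void Wake() = 0;
};

// Hypothetical common base, standing in for ns3::Object.
class SketchBase
{
  public:
    virtual ~SketchBase() = default;
};

// Stand-in for a GetObject()-style query: returns nullptr if the trait is absent.
template <typename Trait>
Trait* QueryTrait(SketchBase* object)
{
    return dynamic_cast<Trait*>(object);
}

// An object that can be stopped and can also sleep and wake.
class SketchRadio : public SketchBase, public StoppableSketch, public ResumableSketch
{
  public:
    void Stop() override { stopped = true; }
    void Sleep() override { asleep = true; }
    void Wake() override { asleep = false; }

    bool stopped = false;
    bool asleep = false;
};

// An object that can only be stopped.
class SketchSensor : public SketchBase, public StoppableSketch
{
  public:
    void Stop() override { stopped = true; }

    bool stopped = false;
};
```

Because the trait interfaces do not derive from SketchBase, the same interfaces could be mixed into classes outside the object hierarchy, which matches the remark about non-ns3::Object classes.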
We had some discussion about the difficulty in automating the stopping of associated objects, such as whether calling stop on a Node should stop its motion or not (it may depend). We initially considered that we ought to focus on just stopping objects individually and not try to have Stop() propagate to other aggregated objects. However, in the example use case of a Node running out of energy then later awakening, we may in fact need to provide a Node-level Sleep/Wake that calls into the constituent objects. So we concluded that we ought to try a node-level example.
We discussed specifically that Sleep and Stop events may need to be handled in different ways by different mobility models; there may be some scenarios where energy depletion results in motion pausing or continuing, depending on the desired effect. So, it is not necessarily the case that Node::Stop() or Node::Sleep() should cause those operations to occur on the mobility model; we'll need to be able to handle different policies. This affects the previously-held assumption that Node::Stop() should stop all aggregated objects.
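One way to express such policies is to make the effect of Sleep on the mobility model a per-node setting. The sketch below contrasts the rover case (motion pauses) with the satellite case (motion continues); MobilityPolicy, SketchMobility and PolicyNode are hypothetical names, and a one-dimensional constant-velocity model is assumed purely to keep the example short.

```cpp
#include <cassert>

// Assumed policy: what happens to motion when the node goes to sleep.
enum class MobilityPolicy
{
    Continue,
    Pause
};

// Hypothetical one-dimensional constant-velocity mobility model.
class SketchMobility
{
  public:
    explicit SketchMobility(double velocity)
        : m_velocity(velocity)
    {
    }

    void Pause() { m_paused = true; }
    void Resume() { m_paused = false; }

    // Advance time by dt; the position is preserved while paused.
    void Advance(double dt)
    {
        assert(dt >= 0.0);
        if (!m_paused)
        {
            m_position += m_velocity * dt;
        }
    }

    double Position() const { return m_position; }

  private:
    double m_velocity;
    double m_position = 0.0;
    bool m_paused = false;
};

// Hypothetical node that applies the configured policy on Sleep.
class PolicyNode
{
  public:
    PolicyNode(double velocity, MobilityPolicy onSleep)
        : m_mobility(velocity),
          m_onSleep(onSleep)
    {
    }

    void Sleep()
    {
        m_asleep = true;
        if (m_onSleep == MobilityPolicy::Pause)
        {
            m_mobility.Pause();
        }
    }

    void Wake()
    {
        m_asleep = false;
        m_mobility.Resume();
    }

    void Advance(double dt) { m_mobility.Advance(dt); }
    double Position() const { return m_mobility.Position(); }
    bool IsAsleep() const { return m_asleep; }

  private:
    SketchMobility m_mobility;
    MobilityPolicy m_onSleep;
    bool m_asleep = false;
};
```

A paused node keeps its last position, which is what an excluded volume mobility model would need in order to see a dead or sleeping object.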
Suggested next steps:
- implement test objects to instantiate
- try to implement Sleep/Wake example of a Node running out of energy, then later getting more energy, and doing something sensible. Either WiFi or LrWpan devices are probably the best ones to attempt this with.
A final note was that there were some comments in the wiki noting the need to call "ScheduleWithContext()" upon restart events, and people were unaware of the status of the threaded scheduler, so we agreed to look into this.