Bug 903

Summary: Tap Bridge and Emu Net Devices Do Not Shut Down Properly
Product: ns-3 Reporter: Craig Dowell <craigdo>
Component: emulationAssignee: ns-bugs <ns-bugs>
Status: RESOLVED FIXED    
Severity: normal CC: mathieu.lacage, sebastien.deronne, tgoff, tomh, tommaso.pecorella
Priority: P5    
Version: ns-3.7   
Hardware: All   
OS: All   
Attachments: Patch to Gracefully Shut Down Read Threads in Tap Bridge
Possible alternate fix
patch to extend previous patch

Description Craig Dowell 2010-04-30 14:05:25 EDT
Created attachment 859 [details]
Patch to Gracefully Shut Down Read Threads in Tap Bridge

Tap Bridge and Emu NetDevices don't schedule a shutdown event to gracefully exit their read threads when the simulator is destroyed.  This can result in segfaults when the simulator is destroyed, but the read threads are still running, and try to schedule a packet reception.

This is not noticed in most scripts since the last line of main() is typically Simulator::Destroy() and the process exits before the bug manifests.

Patch attached.

Should probably be applied after ns-3.8 ships since nobody has noticed this for the past year or so.

Similar treatment required for emu net device.
Comment 1 Tom Goff 2010-12-22 20:21:38 EST
Created attachment 1019 [details]
Possible alternate fix

This patch extends the earlier patch and reorganizes the code some.

A new class is added to asynchronously read from a file descriptor.
An open file descriptor and callback are passed to the Start() method
which creates a new system thread; the Stop() method cancels the read
thread and cleans up.

This is essentially a combination of the the previous patch and the
fix proposed in http://www.nsnam.org/bugzilla/show_bug.cgi?id=975.

The TapBridge device is adapted to use this approach.  Something
similar could be done for the EmuNetDevice although this code does not
currently include a way to limit the amount of pending data.
Comment 2 Mathieu Lacage 2010-12-29 04:43:21 EST
(In reply to comment #1)

> A new class is added to asynchronously read from a file descriptor.
> An open file descriptor and callback are passed to the Start() method
> which creates a new system thread; the Stop() method cancels the read
> thread and cleans up.

I like the patch very much. If you commit it, I would like to see the SystemThread::Shutdown and SystemThread::Break methods go away because they are not needed anymore (this would mean killing the UnixSystemThread signal handling code too).

> The TapBridge device is adapted to use this approach.  Something
> similar could be done for the EmuNetDevice although this code does not
> currently include a way to limit the amount of pending data.

pending on the rx or tx path ? (sorry if this is obvious but I have not been tracking bugzilla stuff recently)
Comment 3 Tom Henderson 2010-12-29 09:36:41 EST
> > The TapBridge device is adapted to use this approach.  Something
> > similar could be done for the EmuNetDevice although this code does not
> > currently include a way to limit the amount of pending data.
> 
> pending on the rx or tx path ? (sorry if this is obvious but I have not been
> tracking bugzilla stuff recently)

I believe the comment refers to the fix of bug 939 in changeset 29512368dd2e
Comment 4 Tom Henderson 2010-12-29 16:59:10 EST
Created attachment 1020 [details]
patch to extend previous patch


> If you commit it, I would like to see the
> SystemThread::Shutdown and SystemThread::Break methods go away because they are
> not needed anymore (this would mean killing the UnixSystemThread signal
> handling code too).

Is the attached what you are asking for?  

Do you think it is safe to commit now with the previous patch, even though EmuNetDevice is not being patched at the moment?  Or, should this second patch wait until EmuNetDevice is similarly patched?
Comment 5 Mathieu Lacage 2010-12-30 03:27:48 EST
(In reply to comment #4)
> > If you commit it, I would like to see the
> > SystemThread::Shutdown and SystemThread::Break methods go away because they are
> > not needed anymore (this would mean killing the UnixSystemThread signal
> > handling code too).
> 
> Is the attached what you are asking for?  

yes.

> Do you think it is safe to commit now with the previous patch, even though
> EmuNetDevice is not being patched at the moment?  Or, should this second patch
> wait until EmuNetDevice is similarly patched?

Ideally, it would be similarly patched now, yes (because now if when the patch submitter has incentive to fix it). If you don't do it yourself, I can do this later tonight and commit the cumulative patches in ns-3-dev.
Comment 6 Tom Henderson 2010-12-31 00:33:44 EST
> > Do you think it is safe to commit now with the previous patch, even though
> > EmuNetDevice is not being patched at the moment?  Or, should this second patch
> > wait until EmuNetDevice is similarly patched?
> 
> Ideally, it would be similarly patched now, yes (because now if when the patch
> submitter has incentive to fix it). If you don't do it yourself, I can do this
> later tonight and commit the cumulative patches in ns-3-dev.

I had a look at this tonight; it seems that something similar (unix-socket-reader) could be built but I am leery of doing a rush job on it and destabilizing EmuNetDevice right before the release.  Mainly, I don't have any good Emu test scripts to make sure that it doesn't regress.

I'd like to propose merging Tom's TapBridge patch (which has been tested in the Linux namespaces context this month) and delay EmuNetDevice changes and SystemThread cleanup until post-release.  I could also be convinced to hold all of these till after release, if you want.
Comment 7 sebastien.deronne 2013-09-16 05:34:09 EDT
Is this bug resolved? 

I get the same error on the last ns-3 version: my simulation doesn't stop since EmuNetDevice is waiting for reception.

Does someone has a patch to resolve this please?
Comment 8 Tom Goff 2013-09-24 11:23:24 EDT
(In reply to comment #7)
> Is this bug resolved? 
> 
> I get the same error on the last ns-3 version: my simulation doesn't stop since
> EmuNetDevice is waiting for reception.
> 
> Does someone has a patch to resolve this please?

I'm not aware of any change in status with respect to EmuNetDevice; I
believe the issue has been resolved for the TapBridge case.

It could also help to post a simple example that demonstrates the
problem.
Comment 9 Tommaso Pecorella 2017-01-31 21:22:20 EST
The "emu" module is no more. Should we close this bug ?
Comment 10 Tom Henderson 2017-01-31 23:41:47 EST
Marking resolved fix since the patch made its way into the FdNetDevice/FdReader, and emu device is removed.