Bugzilla – Bug 903
Tap Bridge and Emu Net Devices Do Not Shut Down Properly
Last modified: 2017-01-31 23:41:47 UTC
Created attachment 859 [details] Patch to Gracefully Shut Down Read Threads in Tap Bridge Tap Bridge and Emu NetDevices don't schedule a shutdown event to gracefully exit their read threads when the simulator is destroyed. This can result in segfaults when the simulator is destroyed, but the read threads are still running, and try to schedule a packet reception. This is not noticed in most scripts since the last line of main() is typically Simulator::Destroy() and the process exits before the bug manifests. Patch attached. Should probably be applied after ns-3.8 ships since nobody has noticed this for the past year or so. Similar treatment required for emu net device.
Created attachment 1019 [details] Possible alternate fix This patch extends the earlier patch and reorganizes the code some. A new class is added to asynchronously read from a file descriptor. An open file descriptor and callback are passed to the Start() method which creates a new system thread; the Stop() method cancels the read thread and cleans up. This is essentially a combination of the the previous patch and the fix proposed in http://www.nsnam.org/bugzilla/0. The TapBridge device is adapted to use this approach. Something similar could be done for the EmuNetDevice although this code does not currently include a way to limit the amount of pending data.
(In reply to comment #1) > A new class is added to asynchronously read from a file descriptor. > An open file descriptor and callback are passed to the Start() method > which creates a new system thread; the Stop() method cancels the read > thread and cleans up. I like the patch very much. If you commit it, I would like to see the SystemThread::Shutdown and SystemThread::Break methods go away because they are not needed anymore (this would mean killing the UnixSystemThread signal handling code too). > The TapBridge device is adapted to use this approach. Something > similar could be done for the EmuNetDevice although this code does not > currently include a way to limit the amount of pending data. pending on the rx or tx path ? (sorry if this is obvious but I have not been tracking bugzilla stuff recently)
> > The TapBridge device is adapted to use this approach. Something > > similar could be done for the EmuNetDevice although this code does not > > currently include a way to limit the amount of pending data. > > pending on the rx or tx path ? (sorry if this is obvious but I have not been > tracking bugzilla stuff recently) I believe the comment refers to the fix of bug 939 in changeset 29512368dd2e
Created attachment 1020 [details] patch to extend previous patch > If you commit it, I would like to see the > SystemThread::Shutdown and SystemThread::Break methods go away because they are > not needed anymore (this would mean killing the UnixSystemThread signal > handling code too). Is the attached what you are asking for? Do you think it is safe to commit now with the previous patch, even though EmuNetDevice is not being patched at the moment? Or, should this second patch wait until EmuNetDevice is similarly patched?
(In reply to comment #4) > > If you commit it, I would like to see the > > SystemThread::Shutdown and SystemThread::Break methods go away because they are > > not needed anymore (this would mean killing the UnixSystemThread signal > > handling code too). > > Is the attached what you are asking for? yes. > Do you think it is safe to commit now with the previous patch, even though > EmuNetDevice is not being patched at the moment? Or, should this second patch > wait until EmuNetDevice is similarly patched? Ideally, it would be similarly patched now, yes (because now if when the patch submitter has incentive to fix it). If you don't do it yourself, I can do this later tonight and commit the cumulative patches in ns-3-dev.
> > Do you think it is safe to commit now with the previous patch, even though > > EmuNetDevice is not being patched at the moment? Or, should this second patch > > wait until EmuNetDevice is similarly patched? > > Ideally, it would be similarly patched now, yes (because now if when the patch > submitter has incentive to fix it). If you don't do it yourself, I can do this > later tonight and commit the cumulative patches in ns-3-dev. I had a look at this tonight; it seems that something similar (unix-socket-reader) could be built but I am leery of doing a rush job on it and destabilizing EmuNetDevice right before the release. Mainly, I don't have any good Emu test scripts to make sure that it doesn't regress. I'd like to propose merging Tom's TapBridge patch (which has been tested in the Linux namespaces context this month) and delay EmuNetDevice changes and SystemThread cleanup until post-release. I could also be convinced to hold all of these till after release, if you want.
Is this bug resolved? I get the same error on the last ns-3 version: my simulation doesn't stop since EmuNetDevice is waiting for reception. Does someone has a patch to resolve this please?
(In reply to comment #7) > Is this bug resolved? > > I get the same error on the last ns-3 version: my simulation doesn't stop since > EmuNetDevice is waiting for reception. > > Does someone has a patch to resolve this please? I'm not aware of any change in status with respect to EmuNetDevice; I believe the issue has been resolved for the TapBridge case. It could also help to post a simple example that demonstrates the problem.
The "emu" module is no more. Should we close this bug ?
Marking resolved fix since the patch made its way into the FdNetDevice/FdReader, and emu device is removed.