Bug 711 - example mesh/mesh fails valgrind
Status: RESOLVED FIXED
Product: ns-3
Classification: Unclassified
Component: samples
Version: pre-release
Hardware/OS: All All
Priority/Severity: P1 blocker
Assigned To: ns-bugs
Depends on:
Blocks:
Reported: 2009-10-09 02:22 EDT by Craig Dowell
Modified: 2009-10-17 00:29 EDT (History)


Attachments
fix several possible leaks (5.82 KB, patch)
2009-10-14 05:18 EDT, Andrey Mazo
fix some more leaks (7.75 KB, patch)
2009-10-14 07:01 EDT, Andrey Mazo
and even more fixes (8.99 KB, patch)
2009-10-14 13:01 EDT, Andrey Mazo
valgrind happy!! (10.21 KB, patch)
2009-10-15 05:07 EDT, Andrey Mazo
add NS_LOG_FUNCTION to constructors/destructors/DoDisposes (2.27 KB, patch)
2009-10-15 05:54 EDT, Andrey Mazo
Fix the bug itself (3.33 KB, patch)
2009-10-15 06:18 EDT, Andrey Mazo
remove some redundant cleanups, includes, etc (2.90 KB, patch)
2009-10-15 06:33 EDT, Andrey Mazo
more correct fix for the bug (1.85 KB, patch)
2009-10-16 06:47 EDT, Andrey Mazo

Description Craig Dowell 2009-10-09 02:22:53 EDT
Happens on ns-regression, configured for debug; gcc version 4.2.4 (Ubuntu
4.2.4-1ubuntu4); Linux version 2.6.24-19-server.

Reproduce with:

  ./waf --run mesh --valgrind
Comment 1 Andrey Mazo 2009-10-09 05:53:30 EDT
(In reply to comment #0)
> happens on ns-regression:  configured for debug; gcc version 4.2.4 (Ubuntu
> 4.2.4-1ubuntu4); Linux version 2.6.24-19-server
> 
> reproduce with:
> 
>   ./waf --run mesh --valgrind

A brief investigation showed that valgrind is happy on an x86 machine with:
1) gcc-4.2.4 and gcc-4.3.2
2) valgrind-3.3.1 and valgrind-3.4.1

It seems that the problem is x86-64 specific.
Comment 2 Craig Dowell 2009-10-13 02:35:34 EDT
The valgrind invalid reads are apparently due to valgrind having a problem with a particular flavor of anonymous temporary created by dereferencing an iterator.

It doesn't like

  tag.SetAddress (*i);

but does like

  Mac48Address address = *i;
  tag.SetAddress (address);

Most of the memory leaks are due to the old problem of trying to use an STL container's erase() to release a Ptr<x> without explicitly zeroing the Ptr first.

There is still one more leak left.
Comment 3 Craig Dowell 2009-10-13 03:20:41 EDT
The last problem is the following if someone wants to take a crack at it:

==3120==
==3120== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 8 from 1)
==3120== malloc/free: in use at exit: 60,468 bytes in 701 blocks.
==3120== malloc/free: 1,909,817 allocs, 1,909,116 frees, 110,667,282 bytes allocated.
==3120== For counts of detected errors, rerun with: -v
==3120== searching for pointers to 701 not-freed blocks.
==3120== checked 1,112,200 bytes.
==3120==
==3120== 60,468 (56 direct, 60,412 indirect) bytes in 1 blocks are definitely lost in loss record 8 of 75
==3120==    at 0x4C23809: operator new(unsigned long) (vg_replace_malloc.c:230)
==3120==    by 0x58D7ECE: ns3::EventImpl* ns3::MakeEvent<void (ns3::YansWifiPhy::*)(ns3::Ptr<ns3::Packet>, ns3::Ptr<ns3::InterferenceHelper::Event>), ns3::YansWifiPhy*, ns3::Ptr<ns3::Packet>, ns3::Ptr<ns3::InterferenceHelper::Event> >(void (ns3::YansWifiPhy::*)(ns3::Ptr<ns3::Packet>, ns3::Ptr<ns3::InterferenceHelper::Event>), ns3::YansWifiPhy*, ns3::Ptr<ns3::Packet>, ns3::Ptr<ns3::InterferenceHelper::Event>) (make-event.h:145)
==3120==    by 0x58D7FCC: ns3::EventId ns3::Simulator::Schedule<void (ns3::YansWifiPhy::*)(ns3::Ptr<ns3::Packet>, ns3::Ptr<ns3::InterferenceHelper::Event>), ns3::YansWifiPhy*, ns3::Ptr<ns3::Packet>, ns3::Ptr<ns3::InterferenceHelper::Event> >(ns3::TimeUnit<1> const&, void (ns3::YansWifiPhy::*)(ns3::Ptr<ns3::Packet>, ns3::Ptr<ns3::InterferenceHelper::Event>), ns3::YansWifiPhy*, ns3::Ptr<ns3::Packet>, ns3::Ptr<ns3::InterferenceHelper::Event>) (simulator.h:661)
==3120==    by 0x58D1AE2: ns3::YansWifiPhy::StartReceivePacket(ns3::Ptr<ns3::Packet>, double, ns3::WifiMode, ns3::WifiPreamble) (yans-wifi-phy.cc:441)
==3120==    by 0x58D9197: ns3::YansWifiChannel::Receive(unsigned, ns3::Ptr<ns3::Packet>, double, ns3::WifiMode, ns3::WifiPreamble) const (yans-wifi-channel.cc:104)
==3120==    by 0x58D9282: _ZZN3ns39MakeEventIMNS_15YansWifiChannelEKFvjNS_3PtrINS_6PacketEEEdNS_8WifiModeENS_12WifiPreambleEEPKS1_jS4_dS5_S6_EEPNS_9EventImplET_T0_T1_T2_T3_T4_T5_EN16EventMemberImpl56NotifyEv (make-event.h:230)
==3120==    by 0x552ADAE: ns3::EventImpl::Invoke() (event-impl.cc:39)
==3120==    by 0x5546592: ns3::DefaultSimulatorImpl::ProcessOneEvent() (default-simulator-impl.cc:113)
==3120==    by 0x55465DA: ns3::DefaultSimulatorImpl::Run() (default-simulator-impl.cc:143)
==3120==    by 0x5532DA6: ns3::Simulator::Run() (simulator.cc:160)
==3120==    by 0x40DE49: MeshTest::Run() (mesh.cc:221)
==3120==    by 0x40F045: main (mesh.cc:250)
==3120==
==3120== LEAK SUMMARY:
==3120==    definitely lost: 56 bytes in 1 blocks.
==3120==    indirectly lost: 60,412 bytes in 700 blocks.
==3120==      possibly lost: 0 bytes in 0 blocks.
==3120==    still reachable: 0 bytes in 0 blocks.
==3120==         suppressed: 0 bytes in 0 blocks.
Comment 4 Andrey Mazo 2009-10-13 04:21:03 EDT
(In reply to comment #2)
> Most memory leaks due to the old problem of trying to use stl::container.erase
> to release Ptr<x> without explicit zero of the Ptr.
Is there a FAQ entry about this old problem?
It's not obvious, because erase() must call Ptr::~Ptr(), which in turn must call Unref() and then free the allocated memory.
I don't yet understand why it is required to call Unref() manually by setting the Ptr to zero.
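
The expectation about erase() can be checked in isolation. The following is only a minimal sketch, not ns-3 code: it assumes that std::shared_ptr is an acceptable stand-in for ns3::Ptr as far as reference counting goes, and Item, EraseReleasesSoleOwner and EraseWithSecondOwnerKeepsObject are hypothetical names. It suggests that erase() alone frees an object when the container holds the only reference, and that the object survives only when some other owner still holds one.

```cpp
#include <cassert>
#include <list>
#include <memory>

// Hypothetical reference-counted object; not a real ns-3 class.
struct Item
{
  int id = 0;
};

// The list holds the only owning pointer. erase() destroys the element,
// which runs the smart pointer destructor and drops the last reference.
bool EraseReleasesSoleOwner ()
{
  std::list<std::shared_ptr<Item>> items;
  items.push_back (std::make_shared<Item> ());
  std::weak_ptr<Item> observer = items.front (); // does not own the object
  items.erase (items.begin ());
  return observer.expired (); // true if the object was freed
}

// Same erase(), but a second owner exists outside the container,
// so the reference count does not reach zero and the object stays alive.
bool EraseWithSecondOwnerKeepsObject ()
{
  std::list<std::shared_ptr<Item>> items;
  items.push_back (std::make_shared<Item> ());
  std::shared_ptr<Item> other = items.front (); // second reference
  std::weak_ptr<Item> observer = other;
  items.erase (items.begin ());
  return !observer.expired (); // true if the object is still alive
}
```

Under these assumptions, a leak that remains after erase() points to another reference being held somewhere else rather than to erase() itself.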
Comment 5 Andrey Mazo 2009-10-13 15:10:46 EDT
(In reply to comment #3)
Well, I'm getting a completely different valgrind report on i686:

==31146==
==31146== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 7 from 1)
==31146== malloc/free: in use at exit: 39,444 bytes in 701 blocks.
==31146== malloc/free: 1,892,925 allocs, 1,892,224 frees, 77,338,893 bytes allocated.
==31146== For counts of detected errors, rerun with: -v
==31146== searching for pointers to 701 not-freed blocks.
==31146== checked 389,836 bytes.
==31146==
==31146== 39,444 (60 direct, 39,384 indirect) bytes in 1 blocks are definitely lost in loss record 9 of 75
==31146==    at 0x40256F3: operator new(unsigned int) (in /usr/lib/valgrind/x86-linux/vgpreload_memcheck.so)
==31146==    by 0x833C902: ns3::Ptr<ns3::Node> ns3::CreateObject<ns3::Node>() (object.h:484)
==31146==    by 0x8630953: ns3::NodeContainer::Create(unsigned int) (node-container.cc:96)
==31146==    by 0x804F568: MeshTest::CreateNodes() (mesh.cc:150)
==31146==    by 0x805200A: MeshTest::Run() (mesh.cc:216)
==31146==    by 0x80539E5: main (mesh.cc:250)
==31146==
==31146== LEAK SUMMARY:
==31146==    definitely lost: 60 bytes in 1 blocks.
==31146==    indirectly lost: 39,384 bytes in 700 blocks.
==31146==      possibly lost: 0 bytes in 0 blocks.
==31146==    still reachable: 0 bytes in 0 blocks.
==31146==         suppressed: 0 bytes in 0 blocks.
Comment 6 Andrey Mazo 2009-10-14 05:18:08 EDT
Created attachment 624 [details]
fix several possible leaks
Comment 7 Andrey Mazo 2009-10-14 07:01:37 EDT
Created attachment 625 [details]
fix some more leaks

This patch makes the amd64 valgrind output similar to the i686 one.
Comment 8 Andrey Mazo 2009-10-14 13:01:24 EDT
Created attachment 626 [details]
and even more fixes
Comment 9 Andrey Mazo 2009-10-15 05:07:11 EDT
Created attachment 627 [details]
valgrind happy!!
Comment 10 Andrey Mazo 2009-10-15 05:51:40 EDT
(In reply to comment #9)
> Created an attachment (id=627) [details]
> valgrind happy!!
I'll now split this patch into several patches and remove some redundancy.
Comment 11 Andrey Mazo 2009-10-15 05:54:11 EDT
Created attachment 628 [details]
add NS_LOG_FUNCTION to constructors/destructors/DoDisposes
Comment 12 Andrey Mazo 2009-10-15 06:18:49 EDT
Created attachment 629 [details]
Fix the bug itself
Comment 13 Andrey Mazo 2009-10-15 06:33:30 EDT
Created attachment 630 [details]
remove some redundant cleanups, includes, etc
Comment 14 Andrey Mazo 2009-10-15 06:43:50 EDT
(In reply to comment #13)
> Created an attachment (id=630) [details]
> remove some redundant cleanups, includes, etc
Split complete.
Apply in posted order.
Comment 15 Andrey Mazo 2009-10-15 16:16:29 EDT
Changeset 21a4f34518ff
Comment 16 Andrey Mazo 2009-10-16 06:29:44 EDT
(In reply to comment #15)
> Changeset 21a4f34518ff

Well, the more I think about this fix, the more I believe that it's a workaround:
1) someone may not use mesh-helper and thus introduce the same circular references again
2) someone may create circular references through callbacks involving three or more objects, which will be much harder to detect
3) a callback may be invoked when the object it references is already destroyed, thus leading to memory corruption or a segfault (though I still cannot imagine such a situation)
I think a better solution is to assign NullCallbacks to all callbacks in the DoDispose () methods.
I'm going to attach a patch reverting this changeset and applying the correct fix.
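
To illustrate the circular-reference problem and the proposed cleanup, here is a minimal sketch that does not use ns-3 at all. It assumes std::shared_ptr as a stand-in for ns3::Ptr and std::function as a stand-in for an ns-3 Callback; Station, g_alive and LiveAfterScope are hypothetical names. Each object's callback owns a reference to the other object, so neither is freed unless the callbacks are cleared first.

```cpp
#include <cassert>
#include <functional>
#include <memory>

static int g_alive = 0; // number of Station objects currently alive

// Hypothetical object holding a callback; not a real ns-3 class.
struct Station
{
  Station () { ++g_alive; }
  ~Station () { --g_alive; }
  void Ping () {}
  // Clearing the callback releases the references it captured.
  void DoDispose () { m_callback = nullptr; }
  std::function<void ()> m_callback;
};

// Creates two stations whose callbacks own each other, optionally disposes
// them, and returns how many of the two are still alive after the local
// pointers have gone out of scope.
int LiveAfterScope (bool dispose)
{
  const int before = g_alive;
  {
    auto a = std::make_shared<Station> ();
    auto b = std::make_shared<Station> ();
    a->m_callback = [b] () { b->Ping (); }; // a keeps b alive
    b->m_callback = [a] () { a->Ping (); }; // b keeps a alive: a cycle
    if (dispose)
      {
        a->DoDispose ();
        b->DoDispose ();
      }
  }
  return g_alive - before;
}
```

In this sketch, LiveAfterScope (false) leaves both objects allocated, while LiveAfterScope (true) frees them, which is the effect expected from nulling the callbacks in DoDispose ().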
Comment 17 Andrey Mazo 2009-10-16 06:47:32 EDT
Created attachment 631 [details]
more correct fix for the bug

I currently can't test it on ns-regression.
Comment 18 Andrey Mazo 2009-10-17 00:29:29 EDT
(In reply to comment #17)
> Created an attachment (id=631) [details]
> more correct fix for the bug
> 
> currently can't test it on ns-regressions
valgrind on ns-regression also seems to be silent.