Bug 711 - example mesh/mesh fails valgrind
Status: RESOLVED FIXED
Product: ns-3
Component: samples
Version: pre-release
Platform: All All
Importance: P1 blocker
Reported: 2009-10-09 02:22 EDT by
Modified: 2009-10-17 00:29 EDT (History)


Attachments
fix several possible leaks (5.82 KB, patch)
2009-10-14 05:18 EDT, Andrey Mazo
fix some more leaks (7.75 KB, patch)
2009-10-14 07:01 EDT, Andrey Mazo
and even more fixes (8.99 KB, patch)
2009-10-14 13:01 EDT, Andrey Mazo
valgrind happy!! (10.21 KB, patch)
2009-10-15 05:07 EDT, Andrey Mazo
add NS_LOG_FUNCTION to constructors/destructors/DoDisposes (2.27 KB, patch)
2009-10-15 05:54 EDT, Andrey Mazo
Fix the bug itself (3.33 KB, patch)
2009-10-15 06:18 EDT, Andrey Mazo
remove some redundant cleanups, includes, etc (2.90 KB, patch)
2009-10-15 06:33 EDT, Andrey Mazo
more correct fix for the bug (1.85 KB, patch)
2009-10-16 06:47 EDT, Andrey Mazo




Description From 2009-10-09 02:22:53 EDT
happens on ns-regression:  configured for debug; gcc version 4.2.4 (Ubuntu
4.2.4-1ubuntu4); Linux version 2.6.24-19-server

reproduce with:

  ./waf --run mesh --valgrind
------- Comment #1 From 2009-10-09 05:53:30 EDT -------
(In reply to comment #0)
> happens on ns-regression:  configured for debug; gcc version 4.2.4 (Ubuntu
> 4.2.4-1ubuntu4); Linux version 2.6.24-19-server
> 
> reproduce with:
> 
>   ./waf --run mesh --valgrind

A brief investigation showed that valgrind is happy on an x86 machine with:
1) gcc-4.2.4 and gcc-4.3.2
2) valgrind-3.3.1 and valgrind-3.4.1

It seems that the problem is x86-64 specific.
------- Comment #2 From 2009-10-13 02:35:34 EDT -------
The valgrind invalid reads are apparently due to valgrind having a problem with
a particular flavor of anonymous temporary obtained via an iterator.

It doesn't like

  tag.SetAddress (*i);

but does like

  Mac48Address address = *i;
  tag.SetAddress (address);

Most of the memory leaks are due to the old problem of trying to use
stl::container.erase to release a Ptr<x> without explicitly setting the Ptr to
zero.

One more leak is still left.
------- Comment #3 From 2009-10-13 03:20:41 EDT -------
The last problem is the following if someone wants to take a crack at it:

==3120==
==3120== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 8 from 1)
==3120== malloc/free: in use at exit: 60,468 bytes in 701 blocks.
==3120== malloc/free: 1,909,817 allocs, 1,909,116 frees, 110,667,282 bytes allocated.
==3120== For counts of detected errors, rerun with: -v
==3120== searching for pointers to 701 not-freed blocks.
==3120== checked 1,112,200 bytes.
==3120==
==3120== 60,468 (56 direct, 60,412 indirect) bytes in 1 blocks are definitely lost in loss record 8 of 75
==3120==    at 0x4C23809: operator new(unsigned long) (vg_replace_malloc.c:230)
==3120==    by 0x58D7ECE: ns3::EventImpl* ns3::MakeEvent<void (ns3::YansWifiPhy::*)(ns3::Ptr<ns3::Packet>, ns3::Ptr<ns3::InterferenceHelper::Event>), ns3::YansWifiPhy*, ns3::Ptr<ns3::Packet>, ns3::Ptr<ns3::InterferenceHelper::Event> >(void (ns3::YansWifiPhy::*)(ns3::Ptr<ns3::Packet>, ns3::Ptr<ns3::InterferenceHelper::Event>), ns3::YansWifiPhy*, ns3::Ptr<ns3::Packet>, ns3::Ptr<ns3::InterferenceHelper::Event>) (make-event.h:145)
==3120==    by 0x58D7FCC: ns3::EventId ns3::Simulator::Schedule<void (ns3::YansWifiPhy::*)(ns3::Ptr<ns3::Packet>, ns3::Ptr<ns3::InterferenceHelper::Event>), ns3::YansWifiPhy*, ns3::Ptr<ns3::Packet>, ns3::Ptr<ns3::InterferenceHelper::Event> >(ns3::TimeUnit<1> const&, void (ns3::YansWifiPhy::*)(ns3::Ptr<ns3::Packet>, ns3::Ptr<ns3::InterferenceHelper::Event>), ns3::YansWifiPhy*, ns3::Ptr<ns3::Packet>, ns3::Ptr<ns3::InterferenceHelper::Event>) (simulator.h:661)
==3120==    by 0x58D1AE2: ns3::YansWifiPhy::StartReceivePacket(ns3::Ptr<ns3::Packet>, double, ns3::WifiMode, ns3::WifiPreamble) (yans-wifi-phy.cc:441)
==3120==    by 0x58D9197: ns3::YansWifiChannel::Receive(unsigned, ns3::Ptr<ns3::Packet>, double, ns3::WifiMode, ns3::WifiPreamble) const (yans-wifi-channel.cc:104)
==3120==    by 0x58D9282: _ZZN3ns39MakeEventIMNS_15YansWifiChannelEKFvjNS_3PtrINS_6PacketEEEdNS_8WifiModeENS_12WifiPreambleEEPKS1_jS4_dS5_S6_EEPNS_9EventImplET_T0_T1_T2_T3_T4_T5_EN16EventMemberImpl56NotifyEv (make-event.h:230)
==3120==    by 0x552ADAE: ns3::EventImpl::Invoke() (event-impl.cc:39)
==3120==    by 0x5546592: ns3::DefaultSimulatorImpl::ProcessOneEvent() (default-simulator-impl.cc:113)
==3120==    by 0x55465DA: ns3::DefaultSimulatorImpl::Run() (default-simulator-impl.cc:143)
==3120==    by 0x5532DA6: ns3::Simulator::Run() (simulator.cc:160)
==3120==    by 0x40DE49: MeshTest::Run() (mesh.cc:221)
==3120==    by 0x40F045: main (mesh.cc:250)
==3120==
==3120== LEAK SUMMARY:
==3120==    definitely lost: 56 bytes in 1 blocks.
==3120==    indirectly lost: 60,412 bytes in 700 blocks.
==3120==      possibly lost: 0 bytes in 0 blocks.
==3120==    still reachable: 0 bytes in 0 blocks.
==3120==         suppressed: 0 bytes in 0 blocks.
------- Comment #4 From 2009-10-13 04:21:03 EDT -------
(In reply to comment #2)
> Most memory leaks due to the old problem of trying to use stl::container.erase
> to release Ptr<x> without explicit zero of the Ptr.
Is there any FAQ entry about this old problem?
It's not an obvious thing, because erase() must call Ptr::~Ptr(), which in turn
must call Unref() and then free the allocated memory.
I don't yet understand why it is required to call Unref() manually by setting
the Ptr to zero.
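
A minimal sketch of one way erase() can fail to free an object even though it
does run the smart pointer's destructor: some other strong reference (here, a
reference cycle) still holds the object. This is an illustration only, not
ns-3 code. std::shared_ptr stands in for ns3::Ptr, and Station, peer, and
LeakedAfterErase are hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-in for a reference-counted ns-3 object.
// std::shared_ptr plays the role of ns3::Ptr in this sketch.
struct Station
{
  std::shared_ptr<Station> peer; // strong reference to another Station
  static int alive;              // number of Station objects currently alive
  Station () { ++alive; }
  ~Station () { --alive; }
};
int Station::alive = 0;

// Builds two Stations that reference each other, erases them from the
// container, and returns how many of them are still alive afterwards.
// If breakCycle is true, the peer pointers are zeroed before the erase.
int LeakedAfterErase (bool breakCycle)
{
  const int before = Station::alive;
  std::vector<std::shared_ptr<Station>> stations;
  stations.push_back (std::make_shared<Station> ());
  stations.push_back (std::make_shared<Station> ());
  stations[0]->peer = stations[1]; // first -> second
  stations[1]->peer = stations[0]; // second -> first, closing the cycle

  std::weak_ptr<Station> watch = stations[0]; // non-owning handle for cleanup

  if (breakCycle)
    {
      for (auto &s : stations)
        {
          s->peer.reset (); // the "explicit zero" step
        }
    }

  // erase() destroys each shared_ptr element, dropping one reference apiece.
  stations.erase (stations.begin (), stations.end ());
  const int leaked = Station::alive - before;
  assert (leaked >= 0);

  // Break any surviving cycle so that the sketch itself does not leak.
  if (auto s = watch.lock ())
    {
      s->peer.reset ();
    }
  return leaked;
}
```

Under these assumptions, LeakedAfterErase (false) reports two objects that
survive the erase, while LeakedAfterErase (true) reports none: erase() does
drop its reference, but that is not enough while the objects keep each other
alive.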
------- Comment #5 From 2009-10-13 15:10:46 EDT -------
(In reply to comment #3)
Well, I'm getting a completely different valgrind report on i686:

==31146==
==31146== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 7 from 1)
==31146== malloc/free: in use at exit: 39,444 bytes in 701 blocks.
==31146== malloc/free: 1,892,925 allocs, 1,892,224 frees, 77,338,893 bytes allocated.
==31146== For counts of detected errors, rerun with: -v
==31146== searching for pointers to 701 not-freed blocks.
==31146== checked 389,836 bytes.
==31146==
==31146== 39,444 (60 direct, 39,384 indirect) bytes in 1 blocks are definitely lost in loss record 9 of 75
==31146==    at 0x40256F3: operator new(unsigned int) (in /usr/lib/valgrind/x86-linux/vgpreload_memcheck.so)
==31146==    by 0x833C902: ns3::Ptr<ns3::Node> ns3::CreateObject<ns3::Node>() (object.h:484)
==31146==    by 0x8630953: ns3::NodeContainer::Create(unsigned int) (node-container.cc:96)
==31146==    by 0x804F568: MeshTest::CreateNodes() (mesh.cc:150)
==31146==    by 0x805200A: MeshTest::Run() (mesh.cc:216)
==31146==    by 0x80539E5: main (mesh.cc:250)
==31146==
==31146== LEAK SUMMARY:
==31146==    definitely lost: 60 bytes in 1 blocks.
==31146==    indirectly lost: 39,384 bytes in 700 blocks.
==31146==      possibly lost: 0 bytes in 0 blocks.
==31146==    still reachable: 0 bytes in 0 blocks.
==31146==         suppressed: 0 bytes in 0 blocks.
------- Comment #6 From 2009-10-14 05:18:08 EDT -------
Created an attachment (id=624) [details]
fix several possible leaks
------- Comment #7 From 2009-10-14 07:01:37 EDT -------
Created an attachment (id=625) [details]
fix some more leaks

This patch makes amd64 valgrind output similar to i686.
------- Comment #8 From 2009-10-14 13:01:24 EDT -------
Created an attachment (id=626) [details]
and even more fixes
------- Comment #9 From 2009-10-15 05:07:11 EDT -------
Created an attachment (id=627) [details]
valgrind happy!!
------- Comment #10 From 2009-10-15 05:51:40 EDT -------
(In reply to comment #9)
> Created an attachment (id=627) [details]
> valgrind happy!!
I'll now split this patch into several patches and remove some redundancy.
------- Comment #11 From 2009-10-15 05:54:11 EDT -------
Created an attachment (id=628) [details]
add NS_LOG_FUNCTION to constructors/destructors/DoDisposes
------- Comment #12 From 2009-10-15 06:18:49 EDT -------
Created an attachment (id=629) [details]
Fix the bug itself
------- Comment #13 From 2009-10-15 06:33:30 EDT -------
Created an attachment (id=630) [details]
remove some redundant cleanups, includes, etc
------- Comment #14 From 2009-10-15 06:43:50 EDT -------
(In reply to comment #13)
> Created an attachment (id=630) [details]
> remove some redundant cleanups, includes, etc
Split complete.
Apply in posted order.
------- Comment #15 From 2009-10-15 16:16:29 EDT -------
Changeset 21a4f34518ff
------- Comment #16 From 2009-10-16 06:29:44 EDT -------
(In reply to comment #15)
> Changeset 21a4f34518ff

Well, the more I think about this fix, the more I believe that it's a
workaround:
1) someone may not use mesh-helper and thus introduce the same circular
references again;
2) someone may create circular references through callbacks involving three or
more objects, which will be much harder to detect;
3) a callback may be invoked when the object it references is already
destroyed, leading to memory corruption or a segfault (though I still cannot
imagine such a situation).
I think a better solution is to assign NullCallbacks to all callbacks in the
DoDispose () methods.
I'm going to attach a patch reverting this changeset and applying the correct
fix.
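
The idea can be sketched outside ns-3 as follows. This is only an
illustration under stated assumptions, not the attached patch:
std::shared_ptr stands in for ns3::Ptr, std::function stands in for an ns-3
callback, and Component, Hit, and AliveAfterTeardown are hypothetical names.
Two objects hold callbacks that capture strong references to each other, and
clearing the callbacks in DoDispose () breaks the cycle.

```cpp
#include <cassert>
#include <functional>
#include <memory>

// Hypothetical stand-in for an ns-3 object that stores a callback.
struct Component
{
  std::function<void ()> m_callback; // may capture a strong reference to a peer
  int m_hits = 0;
  static int alive;                  // number of Component objects currently alive
  Component () { ++alive; }
  ~Component () { --alive; }
  void Hit () { ++m_hits; }
  // Analogous to assigning a null callback in DoDispose ().
  void DoDispose () { m_callback = nullptr; }
};
int Component::alive = 0;

// Wires two Components together through callbacks, drops the local
// references, and returns how many Components are still alive afterwards.
int AliveAfterTeardown (bool dispose)
{
  const int before = Component::alive;
  auto a = std::make_shared<Component> ();
  auto b = std::make_shared<Component> ();
  a->m_callback = [b] () { b->Hit (); }; // a keeps b alive
  b->m_callback = [a] () { a->Hit (); }; // b keeps a alive: a cycle
  a->m_callback ();                      // callbacks work while both exist
  assert (b->m_hits == 1);

  std::weak_ptr<Component> watch = a;    // non-owning handle for cleanup

  if (dispose)
    {
      a->DoDispose ();
      b->DoDispose ();
    }
  a.reset ();
  b.reset ();
  const int leaked = Component::alive - before;

  // Break any surviving cycle so that the sketch itself does not leak.
  if (auto c = watch.lock ())
    {
      c->DoDispose ();
    }
  return leaked;
}
```

With these assumptions, AliveAfterTeardown (false) reports two surviving
objects and AliveAfterTeardown (true) reports none. The same reasoning applies
to cycles through three or more objects, since each DoDispose () releases
whatever its own callback was holding.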
------- Comment #17 From 2009-10-16 06:47:32 EDT -------
Created an attachment (id=631) [details]
more correct fix for the bug

currently can't test it on ns-regressions
------- Comment #18 From 2009-10-17 00:29:29 EDT -------
(In reply to comment #17)
> Created an attachment (id=631) [details]
> more correct fix for the bug
> 
> currently can't test it on ns-regressions
valgrind on ns-regressions also seems to be silent.