Difference between revisions of "HOWTO use Valgrind to debug memory problems"

From Nsnam
(edit valgrind hints)
(update for ns3)
 
Latest revision as of 20:16, 20 June 2022


Memory leaks or errors can be found with Valgrind. Support for valgrind is built into the ./ns3 script; to run a program under valgrind, type:

 ./ns3 run --command-template="valgrind [options] %s" ns-3-program-name

for example:

 ./ns3 run --command-template="valgrind --leak-check=full --show-reachable=yes %s" main-propagation-loss

Supported platforms and configurations

Valgrind should preferably be used on debug, dynamically linked builds (the default in ns-3) rather than on statically linked or optimized builds; it has been reported to sometimes give false positives on optimized code.

Valgrind for ns-3 is known to work on recent Linux systems that do not have gtk enabled. In particular, there seems to be a leak in some libraries related to gtk, which is used in ns-3 for the GTK Config Store component. It may be possible to add some suppressions, but it is likely easier to try to disable gtk when you want to use valgrind.

To disable gtk (even if it is on your system), provide the "--disable-gtk" option at configure time; e.g.:

 ./ns3 configure --disable-gtk --enable-examples --enable-tests

An example of clean valgrind output for the "first" tutorial example is below:

 ./ns3 run --valgrind first
 ==154077== Memcheck, a memory error detector
 ==154077== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
 ==154077== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
 ==154077== Command: /home/tomh/temp/ns-3-dev/build/examples/tutorial/ns3-dev-first-debug
 ==154077== 
 At time +2s client sent 1024 bytes to 10.1.1.2 port 9
 At time +2.00369s server received 1024 bytes from 10.1.1.1 port 49153
 At time +2.00369s server sent 1024 bytes to 10.1.1.1 port 49153
 At time +2.00737s client received 1024 bytes from 10.1.1.2 port 9
 ==154077== 
 ==154077== HEAP SUMMARY:
 ==154077==     in use at exit: 0 bytes in 0 blocks
 ==154077==   total heap usage: 14,980 allocs, 14,980 frees, 1,224,372 bytes allocated
 ==154077== 
 ==154077== All heap blocks were freed -- no leaks are possible
 ==154077== 
 ==154077== For lists of detected and suppressed errors, rerun with: -s
 ==154077== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
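
When valgrind output is saved to a file, the two lines that make a run "clean" (zero errors, nothing in use at exit) can be checked mechanically. The following is only a sketch: the function name and the log file name are made up for illustration, and the patterns assume the summary wording shown in the output above.

```shell
#!/bin/sh
# valgrind_log_is_clean LOGFILE
# Succeeds (exit status 0) only if the log reports zero errors
# and no heap blocks in use at exit.
valgrind_log_is_clean() {
    grep -q 'ERROR SUMMARY: 0 errors from 0 contexts' "$1" &&
        grep -q 'in use at exit: 0 bytes in 0 blocks' "$1"
}

# Example use (valgrind-first.log is a hypothetical file name):
#   ./ns3 run --command-template="valgrind --leak-check=full --log-file=valgrind-first.log %s" first
#   valgrind_log_is_clean valgrind-first.log && echo "clean" || echo "needs a look"
```

The check is deliberately strict: a run whose only findings are "still reachable" blocks still fails it, because the "in use at exit" line is then non-zero.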

If gtk is not disabled, you may get notifications of leaks that exist in third-party libraries, such as:

 ==891== 2,072 bytes in 1 blocks are still reachable in loss record 242 of 246
 ==891==    at 0x4C28409: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
 ==891==    by 0xE010EFA: ??? (in /usr/lib64/libpixman-1.so.0.30.0)
 ==891==    by 0xE029ABF: ??? (in /usr/lib64/libpixman-1.so.0.30.0)
 ==891==    by 0xE0001F5: ??? (in /usr/lib64/libpixman-1.so.0.30.0)
 ==891==    by 0xE01160B: ??? (in /usr/lib64/libpixman-1.so.0.30.0)
 ==891==    by 0xDFC3508: ??? (in /usr/lib64/libpixman-1.so.0.30.0)
 ==891==    by 0x400F4F2: _dl_init (in /usr/lib64/ld-2.17.so)
 ==891==    by 0x4001459: ??? (in /usr/lib64/ld-2.17.so)
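
If disabling gtk is not an option, a valgrind suppression file is the other route mentioned above. The sketch below writes one entry modeled on the libpixman trace and passes it to valgrind with --suppressions. The file name and the entry name are hypothetical, and the entry has not been checked against other gtk-related traces; running valgrind with --gen-suppressions=all prints ready-made entries for whatever is actually reported on your system.

```shell
#!/bin/sh
# Hypothetical file name; any path that valgrind can read will do.
supp_file="$PWD/ns3-gtk.supp"

# One suppression entry modeled on the libpixman trace above:
# a still-reachable block allocated by malloc from libpixman during _dl_init.
cat > "$supp_file" <<'EOF'
{
   pixman-init-still-reachable
   Memcheck:Leak
   match-leak-kinds: reachable
   fun:malloc
   obj:*/libpixman-1.so*
   ...
   fun:_dl_init
}
EOF

# Pass the file to valgrind through the command template
# (skipped when this is not run from an ns-3 source directory).
if [ -x ./ns3 ]; then
    ./ns3 run --command-template="valgrind --leak-check=full --show-reachable=yes --suppressions=$supp_file %s" first
fi
```

The "..." line in the entry matches any number of intermediate frames, so the entry does not depend on the exact number of libpixman frames in the trace.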

Common errors

Please list hints here about the kinds of errors that have known resolutions.

Failure to call Simulator::Destroy()

Simulator::Destroy() frees memory that was created with the ns-3 object system. Forgetting to call Simulator::Destroy() when you are done will lead to reachable memory being reported by valgrind, as in this trace of the main-propagation-loss example (from before it was fixed):


 ==16325== 88 bytes in 1 blocks are still reachable in loss record 4 of 4
 ==16325==    at 0x4A069D5: operator new(unsigned long) (vg_replace_malloc.c:261)
 ==16325==    by 0x4C87A6B: ns3::TypeId ns3::TypeId::AddConstructor<ns3::DefaultSimulatorImpl>()::Maker::Create() (type-id.h:429)
 ==16325==    by 0x4C778A3: ns3::FunctorCallbackImpl<ns3::ObjectBase* (*)(), ns3::ObjectBase*, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty>::operator()() (callback.h:166)
 ==16325==    by 0x4CBF22C: ns3::Callback<ns3::ObjectBase*, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty>::operator()() const (callback.h:407)
 ==16325==    by 0x4CBE05E: ns3::ObjectFactory::Create() const (object-factory.cc:69)
 ==16325==    by 0x4C8569E: ns3::Ptr<ns3::SimulatorImpl> ns3::ObjectFactory::Create<ns3::SimulatorImpl>() const (object-factory.h:110)
 ==16325==    by 0x4C823FB: ns3::GetImpl() (simulator.cc:93)
 ==16325==    by 0x4C831B1: ns3::Simulator::Stop(ns3::Time const&) (simulator.cc:184)
 ==16325==    by 0x404157: TestDeterministic(ns3::Ptr<ns3::PropagationLossModel>) (main-propagation-loss.cc:83)
 ==16325==    by 0x405CC1: main (main-propagation-loss.cc:225)
 ==16325== 
 ==16325== LEAK SUMMARY:
 ==16325==    definitely lost: 0 bytes in 0 blocks
 ==16325==    indirectly lost: 0 bytes in 0 blocks
 ==16325==      possibly lost: 0 bytes in 0 blocks
 ==16325==    still reachable: 200 bytes in 4 blocks
 ==16325==         suppressed: 0 bytes in 0 blocks
 ==16325== 
 ==16325== For counts of detected and suppressed errors, rerun with: -v
 ==16325== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 7 from 7)

Here, the clue is the line that says "ns3::Ptr<ns3::SimulatorImpl> ns3::ObjectFactory::Create<ns3::SimulatorImpl>() const (object-factory.h:110)" and the lines above it. If you see a report like this one with "still reachable" blocks, it is often because you forgot to call Simulator::Destroy() to free objects that were created through the object factory.
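
The same clue can be searched for in a saved valgrind log. This is a rough heuristic sketch, not a definitive test: the function name is invented for illustration, and it only looks for the two markers discussed above ("still reachable" and the SimulatorImpl factory frame).

```shell
#!/bin/sh
# looks_like_missing_destroy LOGFILE
# Succeeds if the log contains "still reachable" blocks together with the
# ObjectFactory::Create<ns3::SimulatorImpl> frame described above.
looks_like_missing_destroy() {
    grep -qF 'still reachable' "$1" &&
        grep -qF 'ObjectFactory::Create<ns3::SimulatorImpl>' "$1"
}

# Example use (valgrind.log is a hypothetical file written with --log-file):
#   looks_like_missing_destroy valgrind.log && echo "check for a missing Simulator::Destroy()"
```

A match is only a hint to go and look at the end of main(); other still-reachable reports will not contain the factory frame and are not flagged.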

test.py memory leaks

If a memory leak pops up in a test program (one of those enabled by "configure --enable-tests" and launched with "./test.py -g something"), it can be debugged with valgrind by running the test-runner program.

The syntax is:

 ./ns3 run --command-template="valgrind [options] %s [test-runner options]" test-runner

For example, to isolate and check the 'isotropic-antenna-model' test suite, one can invoke:

 ./ns3 run --command-template="valgrind --leak-check=full --show-reachable=yes --track-origins=yes %s --suite=isotropic-antenna-model" test-runner