Bug 2788 - test-runner crashes running multiple tests
Status: RESOLVED WONTFIX
Product: ns-3
Classification: Unclassified
Component: core
Version: ns-3-dev
Hardware: All  OS: All
Priority: P3  Severity: normal
Assigned To: Peter Barnes
Reported: 2017-10-02 17:47 EDT by Peter Barnes
Modified: 2019-11-05 14:22 EST

Description Peter Barnes 2017-10-02 17:47:56 EDT
On Sep 4, 2017, at 10:37 AM, Ammon, Robert (ammo6818@vandals.uidaho.edu) 
<ammo6818@vandals.uidaho.edu> wrote:

I am working with Tom on a project to set up an NS-3 Development Environment on 
Visual Studio.  I have created the Visual Studio environment and made all code 
changes necessary to build the NS-3 development branch under Visual Studio.  I 
started from John Abraham's previous work, but it needed a lot of work (the 
details of which are in a 10-page e-mail that I previously sent Tom; I have 
left them out here for brevity and because I don't think the details are 
germane to the following discussion).

I have now started executing TestRunner to verify all of the tests and have 
encountered an issue.  The issue appears to be in the design of the Time class 
and by my review should be platform neutral (we should see the same issue 
under Linux), even though it does not appear there.  Since the issue only 
appears under Windows, I am looking for your help in understanding the 
difference between Windows and Linux and why we don't see the same issue under 
Linux (since it appears that we should).

The following is the output from testrunner that is produced in the command 
window for the application when run under Windows.  The issue is the assert on 
the last line that is being triggered.  The interesting thing is that the 
assert is valid and matches the current implementation of the Time class and 
the affected test case (AttributeTestCase).

PASS angles 0.000 s
PASS cosine-antenna-model 0.013 s
PASS degrees-radians 0.000 s
PASS isotropic-antenna-model 0.000 s
PASS parabolic-antenna-model 0.011 s
PASS aodv-routing-id-cache 1.217 s
FAIL routing-aodv-regression 3.085 s
PASS routing-aodv 2.176 s
PASS routing-aodv-loopback 0.783 s
PASS udp-client-server 3.085 s
assert failed. cond="g_markingTimes != 0", msg="No MarkedTimes registry. 
Time::SetResolution () called more than once?", 
file=c:\ns3\ns-3-win-dev\src\core\model\time.cc, line=351

I have attached a PDF file that shows screen shots of the particular execution. 
 As can be seen in the PDF, the first breakpoint is from the CORE.DLL 
initialization of the static elements.  Next the simulation initialization 
executes SetDefaultNsResolution which calls SetResolution to initialize the 
time resolution for the simulation.

Then, ClearMarkedTimes is called when the first test case (AODV) is executed.  
I have left them out but ClearMarkedTimes is called for each test case shown in 
the output window.

Finally, the AttributeValue test case is executed.  In that test case, it calls 
Time::SetResolution to ensure the time units are nanoseconds.  Eventually that 
call to Time::SetResolution triggers the assertion.

In reviewing the current implementation, I can't see any way that this should 
behave differently under Linux (the same assertion should be thrown).  The 
affected member of the Time class is g_markingTimes, and this member only gets 
set during static-variable initialization at program load; it gets cleared (set 
to null) by Time::ClearMarkedTimes and by one execution path of 
Time::ConvertTimes.  Once cleared by either of these two functions, there is no 
way to set the value again, and since ClearMarkedTimes is called by 
Simulator::Run for each test case executed, I can't see how 
Time::SetResolution could ever be successfully used in any of the test cases.

I suspect there is some little thing some place that I am missing here, and 
since the current implementation was implemented by you as part of Bug 954 
updates, I am hoping that you can point out the little something that I might 
be missing.

Any help or ideas you can provide would be greatly appreciated.
Comment 1 Peter Barnes 2017-10-02 17:48:51 EDT
This is indeed a puzzle.

After a quick look I agree with most of your summary of the execution paths, 
except for one point:  while Simulator::Run *does* clear the marked times, I 
don’t see where *tests* call Simulator::Run.  If that is indeed the case, then 
there shouldn’t be an issue with a test setting the resolution. 

I think we should modify execution so calling SetResolution a second time with 
the current resolution *does not* trigger the assert; a no-op is a no-op, after 
all.  

But this will hide your assert, which indicates a real issue.

I’m not sure what to suggest, other than watching g_markingTimes, and/or 
setting breakpoints on all the relevant functions and examining the call stack 
on each hit.
Comment 2 Peter Barnes 2017-10-02 17:49:08 EDT
On Sep 5, 2017, at 6:18 PM, Ammon, Robert (ammo6818@vandals.uidaho.edu) 
<ammo6818@vandals.uidaho.edu> wrote:

I explored this some more and here is the issue.

Under Linux, ./test.py runs a new instance of test-runner for each test being 
executed.

What I was doing on Windows was executing test-runner.exe with no parameters to 
execute every test.  In this situation, SetResolution gets called by multiple 
test cases, which looks to the Time class like multiple calls to 
SetResolution within a single program (which it is).

If I create a batch file to execute test-runner.exe once for each test case, I 
get the same behavior as under Linux.

My follow-on question is: why does the implementation of the Time class set the 
value of g_markingTimes to null in ConvertTimes and ClearMarkedTimes?  I assume 
that is to prevent SetResolution from being called more than once, but I am not 
sure of the reason for that implementation.
Comment 3 Peter Barnes 2017-10-02 17:50:21 EDT
On Sep 5, 2017, at 6:18 PM, Ammon, Robert (ammo6818@vandals.uidaho.edu) 
<ammo6818@vandals.uidaho.edu> wrote:

> I explored this some more and here is the issue.
> 
> Under Linux, ./test.py runs a new instance of test-runner for each test 
> being executed.
> 
> What I was doing on Windows was executing test-runner.exe with no 
> parameters to execute every test. 

Ah, that explains it.

> In this situation, SetResolution gets called by multiple test cases, which 
> looks to the Time class like multiple calls to SetResolution within a 
> single program (which it is).

It looks like accepting a repeat SetResolution to the current resolution will 
only fix a few of those cases.  I found Time::SetResolution called here:

attribute-test-suite.cc:    Time::SetResolution (Time::NS);
first.cc:                   Time::SetResolution (Time::NS);
sample-log-time-format.cc:  Time::SetResolution (search->second);
time-test-suite.cc:         Time::SetResolution (Time::PS);

So only the first two would be safe.

> If I create a batch file to execute test-runner.exe once for each test case, 
> I get the same behavior as under Linux.

Yes, that will have to be the real solution.  

It would be nice if TestRunnerImpl could do something clever when running all 
tests: print a warning, omit the tests with SetResolution, or only run one of 
them (I’d vote for time-test-suite).  It's not obvious to me how to do any of 
these.  Perhaps add TestCase::SetsTimeResolution(void), to mark those tests?

> My follow-on question is: why does the implementation of the Time class 
> set the value of g_markingTimes to null in ConvertTimes and 
> ClearMarkedTimes?  I assume that is to prevent SetResolution from being 
> called more than once, but I am not sure of the reason for that 
> implementation.

Since Time is used everywhere, we didn’t want the resolution option to slow 
things down while running.  Therefore we only support changing the resolution 
during setup, and freeze the resolution during execution.  The implementation 
mechanics are to set a flag, which happens to be a pointer, to indicate whether 
we’re recording Time instances or not.  Notice that the Time constructors all 
check that flag.  Simulator::Run and Time::SetResolution clear the flag.
Comment 4 Peter Barnes 2017-10-02 17:50:37 EDT
On Sep 28, 2017, at 10:21 AM, Tom Henderson <tomh@tomh.org> wrote:

I'm not sure this was driven to resolution; do we need to open a Bugzilla issue 
to not forget about the details?
Comment 5 Peter Barnes 2017-10-02 17:51:12 EDT
On Sep 28, 2017, at 5:40 PM, Ammon, Robert (ammo6818@vandals.uidaho.edu) 
<ammo6818@vandals.uidaho.edu> wrote:

This originally got written because when I tried to execute test-runner w/o any 
parameters in the Windows environment, it started throwing assertion failures.

Turns out that even though test-runner is designed to automatically execute 
all test cases if you execute the program without any command-line parameters, 
that won't work because of the limitations on changing the time resolution.  
Without significantly relaxing this restriction, test-runner can't be executed 
w/o command-line parameters on either Linux or Windows.

So my plan is to use the same approach to execute tests on Windows as on 
Linux, namely ./test.py.  I need to do this anyway in order to pick up the 
Examples that get executed by ./test.py, so the same set of tests and examples 
gets executed on both Linux and Windows.

Bottom line is that I don't ever see this behavior changing, so I don't see the 
need for a Bugzilla issue.
Comment 6 Peter Barnes 2017-10-02 17:59:14 EDT
Yes, I don't think we want to change the behavior of Simulator::Run.

We *could* change Time::SetResolution not to dump the g_markingTimes list, but 
those tests which then call Simulator::Run will still fail when run together.  
It looks like first.cc and sample-log-time-format.cc are the only ones which 
call both SetResolution and Simulator::Run.  

We could hard-code an exclusion list in test-runner so that only one of those 
runs at a time.  Seems a little hokey, but it would prevent this error.  Which 
one should be enabled?
Comment 7 Peter Barnes 2019-11-04 19:10:53 EST
Underlying issue is running each test suite in a separate process (Linux and Mac), vs. running all in the same process (Windows port).

There didn't appear to be any good alternatives, other than run tests in Windows as in the other systems.

Can we close this?
Comment 8 Tom Henderson 2019-11-04 20:19:52 EST
I am OK to close as resolved/wontfix, and if the Windows VS port is revived someday in a similar fashion, it can be revisited then.
Comment 9 Peter Barnes 2019-11-05 14:22:53 EST
Resolving as WONTFIX.  Could also be considered as INVALID, since the Windows test framework was running all tests in a single process context, not as designed.