Bug 2091

Summary: race condition in LteUePhy::GenerateCqiRsrpRsrq
Product: ns-3 Reporter: Luciano Chaves <ljerezchaves>
Component: lte Assignee: Nicola Baldo <nicola>
Status: PATCH WANTED ---    
Severity: minor CC: ns-bugs
Priority: P4    
Version: ns-3.22   
Hardware: PC   
OS: Linux   
Attachments: Change the assert to an if statement.

Description Luciano Chaves 2015-04-08 19:08:14 EDT
Created attachment 2012
Change the assert to an if statement.

When executing long LTE simulations, I found a bug at LteUePhy::GenerateCqiRsrpRsrq, line 561:  NS_ASSERT (itMeasMap != m_ueMeasurementsMap.end ());

At a specific moment (after more than 50000 simulated seconds), m_ueMeasurementsMap reaches this function with no elements, even though m_pssReceived is true and there is information in m_pssList. When this happens, the assert at line 561 fails, as there are no elements in the map.

The function LteUePhy::ReceivePss is responsible for populating both m_pssList and m_ueMeasurementsMap, which are then used together in LteUePhy::GenerateCqiRsrpRsrq. These functions are callbacks fired by LteSpectrumPhy and LteChunkProcessor, respectively. A third function, LteUePhy::ReportUeMeasurements, is executed periodically; it reports the measured values stored in m_ueMeasurementsMap and then erases this map (but doesn't change m_pssList or the m_pssReceived variable).

However, there is no guarantee that LteUePhy::ReportUeMeasurements will not be scheduled between the LteUePhy::ReceivePss and LteUePhy::GenerateCqiRsrpRsrq calls. If this happens, the map is already empty when LteUePhy::GenerateCqiRsrpRsrq runs, and the assert fails.

As I'm not an expert in the LTE PHY implementation, I'm not sure how to handle this. But, to avoid an assert failure, I'm attaching a patch that replaces this assert with a simple if. In this way, the simulation can continue despite some wrong reported measurement values.
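
For illustration, here is a minimal, self-contained sketch of the ordering described above and of a guarded lookup. It is neither the attached patch nor the ns-3 source: ToyUePhy, its nested types, the m_skipped counter and ReplayRaceOrdering are hypothetical names, and only m_pssReceived, m_pssList, m_ueMeasurementsMap and itMeasMap are taken from the report. The resetting of m_pssList and m_pssReceived at the end of the toy GenerateCqiRsrpRsrq is an assumption made to keep the model small.

```cpp
#include <cassert>
#include <cstdint>
#include <list>
#include <map>

// Toy stand-in for the UE PHY state described in the report (NOT the ns-3 class).
struct ToyUePhy
{
  struct Pss { uint16_t cellId; double power; };
  struct Meas { double rsrpSum = 0.0; unsigned rsrpNum = 0; };

  bool m_pssReceived = false;
  std::list<Pss> m_pssList;
  std::map<uint16_t, Meas> m_ueMeasurementsMap;
  unsigned m_skipped = 0; // hypothetical counter, for illustration only

  // Populates both containers, as the report says ReceivePss does.
  void ReceivePss (uint16_t cellId, double power)
  {
    m_pssReceived = true;
    m_pssList.push_back ({cellId, power});
    m_ueMeasurementsMap[cellId]; // creates a default entry if absent
  }

  // Periodic report: erases the map but leaves m_pssList and m_pssReceived.
  void ReportUeMeasurements ()
  {
    m_ueMeasurementsMap.clear ();
  }

  // Guarded lookup: skip the measurement instead of asserting.
  void GenerateCqiRsrpRsrq ()
  {
    if (!m_pssReceived)
      {
        return;
      }
    for (const Pss &pss : m_pssList)
      {
        auto itMeasMap = m_ueMeasurementsMap.find (pss.cellId);
        if (itMeasMap == m_ueMeasurementsMap.end ())
          {
            ++m_skipped; // entry was erased between ReceivePss and this call
            continue;
          }
        itMeasMap->second.rsrpSum += pss.power;
        itMeasMap->second.rsrpNum++;
      }
    m_pssList.clear ();    // assumption of this toy model
    m_pssReceived = false; // assumption of this toy model
  }
};

// Replays the problematic ordering and returns how many measurements were skipped.
unsigned ReplayRaceOrdering ()
{
  ToyUePhy phy;
  phy.ReceivePss (1, -90.0);
  phy.ReportUeMeasurements ();            // periodic report fires in between
  assert (phy.m_ueMeasurementsMap.empty ());
  phy.GenerateCqiRsrpRsrq ();             // an unguarded assert would fail here
  return phy.m_skipped;
}
```

In this toy model, ReplayRaceOrdering reports one skipped measurement, whereas calling ReceivePss directly followed by GenerateCqiRsrpRsrq skips none.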
Comment 1 Nicola Baldo 2015-04-10 07:06:25 EDT
Thanks Luciano for the detailed bug report and the patch!

After reading your description, I agree that there is a race condition among the three function calls that you mentioned. I just pushed a slightly modified version of your patch as a temporary workaround (changeset 8c68d7368185).
A final fix should remove the race condition, which would require some non-trivial refactoring of the code. I am hence leaving this bug open until someone can come up with a patch for this. 

It's not urgent to fix as the buggy behaviour only happens rarely. The effect is that a Layer 1 UE measurement is occasionally skipped.