Bug 2341 – MacTxTrace in PointToPointNetDevice needs to mark actual packet transmission

Bug 2341 - MacTxTrace in PointToPointNetDevice needs to mark actual packet transmission


Summary:	MacTxTrace in PointToPointNetDevice needs to mark actual packet transmission

Status:	RESOLVED INVALID

Product:	ns-3
Classification:	Unclassified
Component:	point-to-point
Version:	ns-3-dev
Hardware:	All All

Importance:	P5 normal
Assigned To:	ns-bugs

URL:
Keywords:

Depends on:
Blocks:
	Show dependency tree / graph

Reported:	2016-03-18 10:43 EDT by l.salameh
Modified:	2016-09-08 05:58 EDT (History)
CC List:	4 users (show)

See Also:

Attachments
Fix to mac tracing in p2p net device. (651 bytes, text/plain) 2016-03-18 10:43 EDT, l.salameh	Details
Fix for mac tracing in p2p (671 bytes, patch) 2016-03-18 10:45 EDT, l.salameh	Details \| Diff
Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description l.salameh 2016-03-18 10:43:36 EDT

Created attachment 2346 [details]
Fix to mac tracing in p2p net device.

With the new changes to the queuing disciplines, when we actually enqueue a packet at the PointToPointNetDevice, if it gets dropped, it actually gets re-enqueued.

As such, the old position of the MacTxTrace callback (m_macTxTrace) in the Enqueue function is no longer valid, and will end up logging duplicate packets which get dropped an re-enqueued.

In fact, m_macTxTrace should be called exactly when we transmit the packets, not when we enqueue them, i.e. in the TransmitStart function. The m_promiscSnifferTrace does this correctly, and m_macTxTrace should be equivalent. 

See attached diff for changes.

Comment 1 l.salameh 2016-03-18 10:45:17 EDT

Created attachment 2347 [details]
Fix for mac tracing in p2p

Comment 2 Tom Henderson 2016-03-19 01:19:54 EDT

The macTxTrace, as implemented and documented by all devices, is defined as tracing the event of a packet arriving for transmission.  I don't think we should accept this patch without redefining the trace (changing its documentation string) and also aligning across all device types, and I'm not convinced to do that.

I previously discussed with Stefano that we do have a problem about the requeuing behavior of traffic control layer causing misleading drop traces at the device layer.  For a while, I had suggested some code in the PointToPointNetDevice that suppresses the drop trace from firing if the traffic control layer was present.  I wasn't convinced by this, however, and suggested to delay this kind of patch until the next release cycle, if at all.

At present, I lean towards keeping the device traces the same, and telling future users that if they want true drop traces, that they need to trace the tc layer if such layer is present.

Comment 3 l.salameh 2016-03-21 11:33:20 EDT

Fair enough. Speaking personally though, MacTxTrace sounds like it should tell me when the packet is transmitted by the mac layer. Such a trace point would definitely be more useful for me than the current one which indicates when a packet is enqueued and re-enqueued.

Thanks!

Comment 4 Tom Henderson 2016-03-22 01:18:53 EDT

This example has been improved to show RedQueueDisc usage with PointToPointNetDevice:
http://code.nsnam.org/ns-3-dev/file/3635926b4cbf/examples/traffic-control/traffic-control.cc

Comment 5 Tom Henderson 2016-05-05 15:52:48 EDT

Based on discussion, I propose to close this as Resolved->Invalid.  Everyone OK with this?

Comment 6 natale.patriciello 2016-05-06 06:37:01 EDT

(In reply to l.salameh from comment #3)
> Fair enough. Speaking personally though, MacTxTrace sounds like it should
> tell me when the packet is transmitted by the mac layer. Such a trace point
> would definitely be more useful for me than the current one which indicates
> when a packet is enqueued and re-enqueued.

I agree here, we currently don't have *any* trace that indicates the current transmission; i.e. we don't know how many packets leave the NetDevice, actually. To do this, the only way (but I'm not sure at 100%) is to connect every trace, and then add MacTx, substracts re-enqueueing and drops from tc layer and netdevice. Wow :-)

My position is that with this re-enqueing mechanism we are following the white rabbit in the magic wonderland; a feature that since the beginning disturbed me (and caused personal issues, citing http://mailman.isi.edu/pipermail/ns-developers/2016-February/013419.html). So, this is time to ask WHY we are focusing on adding implementation-details without focusing on high-level features that interest to users. With order:

- dev_requeue_skb is not declared in ANY public header (so it's obviously an implementation detail). See this lxr.free-electrons.com/source/include/net/sch_generic.h

- The following, and logical, question is why they created this function? Let's do a search on lxr.free-electrons.com/source/include/net/sch_generic.c:

 *) at line 48 we have the definition
 *) at line 134 the first useful case: there's another CPU holding a lock on the netdevice; so, they re-queue and set a delay to retry the operation.
 *) Line 183, the driver returned NETDEV_TX_BUSY . For me, the next thing to learn is to what correspond NETDEV_TX_BUSY, and why it is used. Let's begin another search (we have finished the times dev_requeue_skb is called).

A "freetext search" pointed me to this documentation on device drivers (our NetDevice, practically):
http://lxr.free-electrons.com/source/Documentation/networking/driver.txt

Let's cut'n'paste:

1) The ndo_start_xmit method must not return NETDEV_TX_BUSY under
   any normal circumstances.  It is considered a hard error unless
   there is no way your device can tell ahead of time when it's
   transmit function will become busy.

Uhm Uhm.. this sounds me strange. An Hard Error? In ns-3 we don't have such kinds, and of course "a queue is full in the Netdevice" could not be interpreted as hard error.. 
The second reference is an example usage:

/* This is a hard error log it. */
if (TX_BUFFS_AVAIL(dp) <= (skb_shinfo(skb)->nr_frags + 1)) 
  {
    netif_stop_queue(dev);
    unlock_tx(dp);
    printk(KERN_ERR PFX "%s: BUG! Tx Ring full when queue awake!\n", dev->name);
    return NETDEV_TX_BUSY;
  }

and the third is an advice:

If you return NETDEV_TX_BUSY from the ndo_start_xmit method, 
you must not keep any reference to that SKB and you must 
not attempt to free it up.

Here the thing is more clear, because in that document there's an explanation on how to write a "scatter-gather" driver without issues, that translated is "how to properly keep a DropTail queue with Stop/Awake calls". And there's no need to have re-enqueue, except for unintentional and dangerous and what else strange device that does not have a properly-managed queue in the presence of stop/awake! So, recap, we have COPIED like monkey an implementation detail in linux for two reasons:

*) for multi-core processing on the same node (a thing we don't have in ns-3)
*) to support a remote case, when the device is awake but the ring is full (translated in ns-3 words, a BUG in the NetDevice).

Now, we are using re-enqueue for re-enqueueing packet that have been "legitimately" dropped by NetDevice. Am I wrong? Because if you properly implement awake/stop, the TC will not flood you. And if you don't implement awake/stop, the packet SHOULD BE DROPPED BECAUSE THE NETDEVICE HAS SAID THAT (to be more precise: look at point-to-point-net-device.cc line 553).

Anyway, before I said we copied like monkey, but in reality it was all my fault, since I should have been more present and pointing these out well before. Because, and that's hard to say since I'm not arrogant and I don't want to be arrogant, I have resolved almost 2 years ago these issues with my initial, uncomplete, RFC, patches without INVALIDATING year of assumptions on the traces. So, the shame is completely on me, I am sorry for that. But, back at the time, me pointing out these issues (I have mails that pointed out issues on FlowMonitor, re-enqueue, traces, RED, CODEL, and Pfifo just when all the code was on bitbucket, but pointing out things and saying "well, this is a problem, if you look at my code you see that this problem doesn't happen for this, this and that, and you know, I fixed that problem in that way" doesn't worked at all). One more personal thing before going back in technical: I believe Stefano and Pasquale are the future; I'm near my PhD ending, Matt too, I don't know Anh and Lynne status, but for sure with Stefano and Pasquale they are a great addition to the team and an investment for the future. I'd really like to add SACK to TCP, and maybe MPTCP, but then my future begins to be cloudly. What the future deserves to people near the PhD end, we don't know, but at least they have 3 years ahead, with many things to say and add, and (proved) programming skills. But, said that, I BEG to stop following white rabbits. I BEG to start thinkin' logically and with OUR head.

Technically speaking, there's my proposal. My light is on logically allowing the class in the same layer to provide the same set of traces, in the way that the users should not be worried on the type of NetDevice, and without worrying on different trace names that mean the same thing (remember the Queue size issues?). If you want to design code, please (!!!!) read the following: https://wiki.qt.io/API_Design_Principles and http://www4.in.tum.de/~blanchet/api-design.pdf. PLEASE! In all ns-3, since the beginning, we missed a plan on that. And we are facing the consequences. An example, but please take it with an high degree of uncertainty. For me, Red Codel and Pie and whatelse could have maintained as Queue (they are born in this way) and then, adding one QueueDisc that used only one internal queue, allocated as DropTail, Codel, Red, or whatever. In this way we could have maintained backward compatibility with NetDevices not supporting stop/awake. Linux has these implemented as queue disc for the impossibility of technically placing a queue inside a REAL DEVICE, and so they emulated this in software through QueueDisc. Ns-3 scope is to simulate real devices, and what if I implement a device with an internal queue using Codel ? With the way I described, everything could be implemented smoothly, now is hard.

Anyway, probably some points should be different, giving the current status of the code (and so forgive me, provide an alternative), but I believe these steps could be achieved with a not-so-big effort:

*) Remove MacTx, MacRx, *Drop* and other similar traces from NetDevices
*) Add in NetDevice main class six new traces: MacTx, MacRx, Tx, Rx, MacDrop, ChannelDrop with the obvious meaning (not sure about ChannelDrop, if belongs to NetDevice or the Channel). Maybe the obvious meaning is not the same for all of us, well for me I imagine like IP --> Mac --> Phys and the other way round. The traces are fired in the order MacTx --> Tx and Rx --> MacRx (Tx and Tx for Phys, could be renamed to Phys*)
*) The subclasses are required to use these drop, to offer the SAME LAYER of traces to users (e.g. an user could easily trace these without caring if this is CsmaNetDevice, P2PND, LteENBNetDevice, or whatever). Force a general view from the upper and allow personalization at the bottom
*) Remove Re-Enqueue feature !!!!!!!!
*) NetDevices that don't support Stop/Awake (this is the main feature the TC layer added) will use the previous scheme without any problem: no need to have re-enqueue. If the queue is full and the packet is dropped (example) well, that's a fact of the NetDevice (like the old - but NOT WRONG - implementation)
*) NetDevices that support Stop/Awake will have NO LOSSES anymore (no need to re-enqueue). MacTx and Tx still work, MacRx and Rx as well, MacDrop will not fire anymore for anything; users should attach to the QueueDisc and its traces.

QueueDisc things does not belong here, so I will say nothing. But thanks for the attention, have a nice day!

Comment 7 Tom Henderson 2016-05-06 10:46:14 EDT

(n.b. some elements (more general comment) really should be on the dev list and not in the tracker)

(In reply to natale.patriciello from comment #6)
> (In reply to l.salameh from comment #3)
> > Fair enough. Speaking personally though, MacTxTrace sounds like it should
> > tell me when the packet is transmitted by the mac layer. Such a trace point
> > would definitely be more useful for me than the current one which indicates
> > when a packet is enqueued and re-enqueued.
> 
> I agree here, we currently don't have *any* trace that indicates the current
> transmission; i.e. we don't know how many packets leave the NetDevice,
> actually. To do this, the only way (but I'm not sure at 100%) is to connect
> every trace, and then add MacTx, substracts re-enqueueing and drops from tc
> layer and netdevice. Wow :-)

I disagree, there is a trace for this:

  m_phyTxBeginTrace (m_currentPkt);

Several years ago, Craig Dowell went through all of our NetDevice types and aligned traces with standard definitions across the device types, commenting out those that couldn't be supported for some reason on a particular device.

Lynne's proposal to move macTxTrace to the Phy transmit code here would break that alignment, so I recommend instead to use existing PhyTxBegin trace rather than change definition of MacTxTrace.

> 
> My position is that with this re-enqueing mechanism we are following the
> white rabbit in the magic wonderland; a feature that since the beginning
> disturbed me (and caused personal issues, citing
> http://mailman.isi.edu/pipermail/ns-developers/2016-February/013419.html).
> So, this is time to ask WHY we are focusing on adding implementation-details
> without focusing on high-level features that interest to users. With order:
> 
> - dev_requeue_skb is not declared in ANY public header (so it's obviously an
> implementation detail). See this
> lxr.free-electrons.com/source/include/net/sch_generic.h
> 
> - The following, and logical, question is why they created this function?
> Let's do a search on lxr.free-electrons.com/source/include/net/sch_generic.c:
> 
>  *) at line 48 we have the definition
>  *) at line 134 the first useful case: there's another CPU holding a lock on
> the netdevice; so, they re-queue and set a delay to retry the operation.
>  *) Line 183, the driver returned NETDEV_TX_BUSY . For me, the next thing to
> learn is to what correspond NETDEV_TX_BUSY, and why it is used. Let's begin
> another search (we have finished the times dev_requeue_skb is called).
> 
> A "freetext search" pointed me to this documentation on device drivers (our
> NetDevice, practically):
> http://lxr.free-electrons.com/source/Documentation/networking/driver.txt
> 
> Let's cut'n'paste:
> 
> 1) The ndo_start_xmit method must not return NETDEV_TX_BUSY under
>    any normal circumstances.  It is considered a hard error unless
>    there is no way your device can tell ahead of time when it's
>    transmit function will become busy.
> 
> Uhm Uhm.. this sounds me strange. An Hard Error? In ns-3 we don't have such
> kinds, and of course "a queue is full in the Netdevice" could not be
> interpreted as hard error.. 
> The second reference is an example usage:
> 
> /* This is a hard error log it. */
> if (TX_BUFFS_AVAIL(dp) <= (skb_shinfo(skb)->nr_frags + 1)) 
>   {
>     netif_stop_queue(dev);
>     unlock_tx(dp);
>     printk(KERN_ERR PFX "%s: BUG! Tx Ring full when queue awake!\n",
> dev->name);
>     return NETDEV_TX_BUSY;
>   }
> 
> and the third is an advice:
> 
> If you return NETDEV_TX_BUSY from the ndo_start_xmit method, 
> you must not keep any reference to that SKB and you must 
> not attempt to free it up.
> 
> Here the thing is more clear, because in that document there's an
> explanation on how to write a "scatter-gather" driver without issues, that
> translated is "how to properly keep a DropTail queue with Stop/Awake calls".
> And there's no need to have re-enqueue, except for unintentional and
> dangerous and what else strange device that does not have a properly-managed
> queue in the presence of stop/awake! So, recap, we have COPIED like monkey
> an implementation detail in linux for two reasons:
> 
> *) for multi-core processing on the same node (a thing we don't have in ns-3)
> *) to support a remote case, when the device is awake but the ring is full
> (translated in ns-3 words, a BUG in the NetDevice).
> 
> Now, we are using re-enqueue for re-enqueueing packet that have been
> "legitimately" dropped by NetDevice. Am I wrong? Because if you properly
> implement awake/stop, the TC will not flood you. And if you don't implement
> awake/stop, the packet SHOULD BE DROPPED BECAUSE THE NETDEVICE HAS SAID THAT
> (to be more precise: look at point-to-point-net-device.cc line 553).
> 
> Anyway, before I said we copied like monkey, but in reality it was all my
> fault, since I should have been more present and pointing these out well
> before. Because, and that's hard to say since I'm not arrogant and I don't
> want to be arrogant, I have resolved almost 2 years ago these issues with my
> initial, uncomplete, RFC, patches without INVALIDATING year of assumptions
> on the traces. So, the shame is completely on me, I am sorry for that. But,
> back at the time, me pointing out these issues (I have mails that pointed
> out issues on FlowMonitor, re-enqueue, traces, RED, CODEL, and Pfifo just
> when all the code was on bitbucket, but pointing out things and saying
> "well, this is a problem, if you look at my code you see that this problem
> doesn't happen for this, this and that, and you know, I fixed that problem
> in that way" doesn't worked at all). One more personal thing before going
> back in technical: I believe Stefano and Pasquale are the future; I'm near
> my PhD ending, Matt too, I don't know Anh and Lynne status, but for sure
> with Stefano and Pasquale they are a great addition to the team and an
> investment for the future. I'd really like to add SACK to TCP, and maybe
> MPTCP, but then my future begins to be cloudly. What the future deserves to
> people near the PhD end, we don't know, but at least they have 3 years
> ahead, with many things to say and add, and (proved) programming skills.
> But, said that, I BEG to stop following white rabbits. I BEG to start
> thinkin' logically and with OUR head.
> 
> Technically speaking, there's my proposal. My light is on logically allowing
> the class in the same layer to provide the same set of traces, in the way
> that the users should not be worried on the type of NetDevice, and without
> worrying on different trace names that mean the same thing (remember the
> Queue size issues?). If you want to design code, please (!!!!) read the
> following: https://wiki.qt.io/API_Design_Principles and
> http://www4.in.tum.de/~blanchet/api-design.pdf. PLEASE! In all ns-3, since
> the beginning, we missed a plan on that. And we are facing the consequences.
> An example, but please take it with an high degree of uncertainty. For me,
> Red Codel and Pie and whatelse could have maintained as Queue (they are born
> in this way) and then, adding one QueueDisc that used only one internal
> queue, allocated as DropTail, Codel, Red, or whatever. In this way we could
> have maintained backward compatibility with NetDevices not supporting
> stop/awake. Linux has these implemented as queue disc for the impossibility
> of technically placing a queue inside a REAL DEVICE, and so they emulated
> this in software through QueueDisc. Ns-3 scope is to simulate real devices,
> and what if I implement a device with an internal queue using Codel ? With
> the way I described, everything could be implemented smoothly, now is hard.
> 
> Anyway, probably some points should be different, giving the current status
> of the code (and so forgive me, provide an alternative), but I believe these
> steps could be achieved with a not-so-big effort:
> 
> *) Remove MacTx, MacRx, *Drop* and other similar traces from NetDevices
> *) Add in NetDevice main class six new traces: MacTx, MacRx, Tx, Rx,
> MacDrop, ChannelDrop with the obvious meaning (not sure about ChannelDrop,
> if belongs to NetDevice or the Channel). Maybe the obvious meaning is not
> the same for all of us, well for me I imagine like IP --> Mac --> Phys and
> the other way round. The traces are fired in the order MacTx --> Tx and Rx
> --> MacRx (Tx and Tx for Phys, could be renamed to Phys*)
> *) The subclasses are required to use these drop, to offer the SAME LAYER of
> traces to users (e.g. an user could easily trace these without caring if
> this is CsmaNetDevice, P2PND, LteENBNetDevice, or whatever). Force a general
> view from the upper and allow personalization at the bottom

Here, my reaction to the immediately above three bullets is "we have already a consistent tracing architecture, you are proposing a slight variation of a similar consistent architecture, what is the gain of breaking API for these changes?"  

If users do not like how MacTx is defined and find it not very useful, there are other traces that can be used.  There are about ten or so defined across the devices.

> *) Remove Re-Enqueue feature !!!!!!!!
> *) NetDevices that don't support Stop/Awake (this is the main feature the TC
> layer added) will use the previous scheme without any problem: no need to
> have re-enqueue. If the queue is full and the packet is dropped (example)
> well, that's a fact of the NetDevice (like the old - but NOT WRONG -
> implementation)
> *) NetDevices that support Stop/Awake will have NO LOSSES anymore (no need
> to re-enqueue). MacTx and Tx still work, MacRx and Rx as well, MacDrop will
> not fire anymore for anything; users should attach to the QueueDisc and its
> traces.
> 
> QueueDisc things does not belong here, so I will say nothing. But thanks for
> the attention, have a nice day!

Thanks for sharing your general comments, but I think we should move the discussion of 'should we bother with requeue, is the pain more than the gain?' to the dev list.

Again, I believe that we already have a consistent trace layout such as you are advocating for, so please review it and maybe then come back to the dev list with a counter proposal or enhancement.

Comment 8 natale.patriciello 2016-05-06 11:23:48 EDT

(In reply to Tom Henderson from comment #7)
> I disagree, there is a trace for this:
> 
>   m_phyTxBeginTrace (m_currentPkt);
> 
> Several years ago, Craig Dowell went through all of our NetDevice types and
> aligned traces with standard definitions across the device types, commenting
> out those that couldn't be supported for some reason on a particular device.
> 
> Lynne's proposal to move macTxTrace to the Phy transmit code here would
> break that alignment, so I recommend instead to use existing PhyTxBegin
> trace rather than change definition of MacTxTrace.

Ah ok, I missed that; I should have checked the code instead of a random model description, thanks for pointing out this. The fact that these traces are not in the main NetDevice class worries me a bit, but I agree this is a general question and not related to this bug.

> Thanks for sharing your general comments, but I think we should move the
> discussion of 'should we bother with requeue, is the pain more than the
> gain?' to the dev list.
> 
> Again, I believe that we already have a consistent trace layout such as you
> are advocating for, so please review it and maybe then come back to the dev
> list with a counter proposal or enhancement.

Ook, I will.
Thanks again

Comment 9 Adarsh Patel 2016-09-08 05:58:19 EDT

*** Bug 2489 has been marked as a duplicate of this bug. ***