Bug 188 - global routing does not handle ip aliasing
Status: RESOLVED FIXED
Product: ns-3
Classification: Unclassified
Component: routing
Version: pre-release
Hardware: All All
Importance: P2 normal
Assigned To: Tom Henderson
Reported: 2008-05-27 17:55 EDT by Mathieu Lacage
Modified: 2009-04-21 00:17 EDT

Attachments
sample code (3.63 KB, text/x-c++src)
2008-05-27 17:55 EDT, Mathieu Lacage

Description Mathieu Lacage 2008-05-27 17:55:05 EDT
The attached program segfaults during PopulateTables.
Comment 1 Mathieu Lacage 2008-05-27 17:55:49 EDT
Created attachment 137
sample code
Comment 2 Mathieu Lacage 2008-05-27 17:59:58 EDT
What this program is doing is creating two ip interfaces which both reference the same underlying netdevice. I am sure that other things will break (most notably in the arp layer) but I was expecting that the global routing code would deal with this gracefully.
Comment 3 Tom Henderson 2008-05-28 00:09:25 EDT
I can't compile that against ns-3-dev, but I am guessing that the global routing is assuming one Ipv4Interface per NetDevice and/or that they are all on the same subnet.

But your use of IP aliasing is unusual in that you are assigning addresses from different subnets to the aliased interfaces.  Is this what you want?  Usually they are from the same network.
Comment 4 Mathieu Lacage 2008-05-28 11:10:58 EDT
(In reply to comment #3)
> I can't compile that against ns-3-dev, but I am guessing that the global

ah. I will try to make it build against ns-3-dev.

> routing is assuming one Ipv4Interface per NetDevice and/or that they are all on
> the same subnet.

yes.

> 
> But your use of IP aliasing is unusual in that you are assigning addresses from
> different subnets to the aliased interfaces.  Is this what you want?  Usually
> they are from the same network.

I don't know. I am trying to exercise the system. It seems a bit rude to get a crash for that.

> 

Comment 5 Craig Dowell 2008-06-10 11:32:45 EDT
The test case uses the Ipv4AddressHelper to try and create two virtual
networks on a csma channel.  There are three nodes and it tries to assign
two interfaces with different network/address combinations to the first
node/device.  The second node is supposed to get an interface on a
network/address corresponding to one of the first node's (virtual) networks,
and the third node gets a network/address on the other.

What happens here is that the first set of IP address assignments using
Ipv4AddressHelper::Assign() on a collection of two net devices works as
expected.  The second assignment is expected to add a second interface to
the first node, but the helper uses FindInterfaceForDevice to determine if
there already exists an interface talking to the device.  There is an
existing interface in this case, so a new one is not created and the helper
simply changes/overwrites the old IP address with the new one.  

The end result is a network with two nodes on one network number and one
node on another.
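
The overwrite behaviour described above can be reproduced with a small standalone model. This is only a sketch: NodeModel, Iface and this Assign() are hypothetical stand-ins written for illustration, not the ns-3 classes, and the addresses in the usage note are assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for an IPv4 interface bound to one net device.
struct Iface {
  int deviceId;         // which net device this interface sits on
  std::string address;  // a single address per interface, as assumed in 2008
};

// Hypothetical stand-in for a node's list of IPv4 interfaces.
struct NodeModel {
  std::vector<Iface> ifaces;

  // Returns the index of the first interface bound to deviceId, or -1.
  int FindInterfaceForDevice(int deviceId) const {
    for (std::size_t i = 0; i < ifaces.size(); ++i) {
      if (ifaces[i].deviceId == deviceId) return static_cast<int>(i);
    }
    return -1;
  }

  // Models the helper logic described above: if an interface already
  // exists for the device, reuse it and overwrite its address.
  int Assign(int deviceId, const std::string& address) {
    int idx = FindInterfaceForDevice(deviceId);
    if (idx < 0) {
      ifaces.push_back({deviceId, address});
      return static_cast<int>(ifaces.size()) - 1;
    }
    ifaces[idx].address = address;  // the previous address is silently lost
    return idx;
  }
};
```

Calling Assign(0, "10.1.1.1") and then Assign(0, "10.1.2.1") on the same NodeModel leaves one interface holding only 10.1.2.1, which matches the "two nodes on one network, one node on the other" outcome.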

The next problem is that the global router code is run.  All of the router
LSAs are exported correctly (but recall the first node only has one
interface, not two).  In the router code, it needs to elect a designated
router for a broadcast network segment.  It does this by walking the channel
and looking for the lowest numbered IP address on the channel.  This, in
turn, is done by getting each device from the channel, then doing a
FindIfIndexForDevice() and then a GetAddress().  Yes, since the concept of
one interface per device is implicitly assumed here, the designated router
for network segment 10.1.2.0 is 10.1.1.0 which doesn't make a whole lot of
sense ;-)
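
A rough standalone sketch of the lowest-address election, to show how the chosen designated router can land outside the segment it is supposed to serve. MakeIp, ElectDesignatedRouter and InNetwork are hypothetical names, and the host addresses in the usage note are assumed for illustration; this is not the global routing code itself.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Pack a dotted quad into a host-order 32-bit value.
constexpr std::uint32_t MakeIp(std::uint32_t a, std::uint32_t b,
                               std::uint32_t c, std::uint32_t d) {
  return (a << 24) | (b << 16) | (c << 8) | d;
}

// Lowest address on the channel wins. Assumes exactly one address per
// device, which is the implicit assumption discussed above.
std::uint32_t ElectDesignatedRouter(const std::vector<std::uint32_t>& addrsOnChannel) {
  assert(!addrsOnChannel.empty());
  return *std::min_element(addrsOnChannel.begin(), addrsOnChannel.end());
}

// True if addr belongs to network under mask.
bool InNetwork(std::uint32_t addr, std::uint32_t network, std::uint32_t mask) {
  return (addr & mask) == (network & mask);
}
```

With channel addresses 10.1.2.1, 10.1.1.2 and 10.1.2.2, the election returns 10.1.1.2, and InNetwork() reports that it is not part of 10.1.2.0/24.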

Eventually, when we're calculating the shortest paths, the confusion comes
back to bite us: when we try to make sense of a nonsensical router ID/IP
address combination, the router asserts (it's not a crash;
GetIfIndexByAddress eventually asserts, which is the subject of yet another
bug).

Discussion regarding how to resolve, requirements for a new feature, etc., is required.
Comment 6 Craig Dowell 2008-06-10 11:37:18 EDT
Comment from Tom:

> What happens here is that the first set of IP address assignments using
> Ipv4AddressHelper::Assign() on a collection of two net devices works as
> expected.  The second assignment is expected to add a second interface to
> the first node, but the helper uses FindInterfaceForDevice to determine if
> there already exists an interface talking to the device.  There is an
> existing interface in this case, so a new one is not created and the helper
> simply changes/overwrites the old IP address with the new one.  

FWIW, I think that the helper should raise a warning or assert when this 
condition arises, because it is much more likely that this type of 
network assignment is a programming error.

If/when we decide we want to support this, maybe the low-level API 
should be required.
Comment 7 Craig Dowell 2008-06-10 11:38:15 EDT
Comment from Tom:

> one interface per device is implicitly assumed here, the designated router
> for network segment 10.1.2.0 is 10.1.1.0 which doesn't make a whole lot of
> sense ;-)

Yes.  I doubt that quagga supports this; Cisco might.  In practice, 
these two networks would be like ships-in-the-night from the perspective 
of OSPF.  There would be two network LSAs originated; one for the 
10.1.2.0 network and another for the 10.1.1.0 network.  The interfaces 
on 10.1.2.0 would ignore hellos from the 10.1.1.0 interface, and vice versa.

However, in our code, we should probably check that everyone on the same 
link belongs to the same network, since we do not have a Hello process 
to enforce that for us.
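
One possible shape for such a check, as a standalone sketch. LinkAddr and AllOnSameNetwork are hypothetical names, not existing ns-3 API, and addresses are plain host-order 32-bit values.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical record: one interface address seen on a link.
struct LinkAddr {
  std::uint32_t addr;  // host-order IPv4 address
  std::uint32_t mask;  // host-order network mask
};

// True if every interface on the link uses the same mask and the same
// network prefix. An empty link is trivially consistent.
bool AllOnSameNetwork(const std::vector<LinkAddr>& link) {
  if (link.empty()) return true;
  const std::uint32_t mask = link.front().mask;
  const std::uint32_t net = link.front().addr & mask;
  for (const LinkAddr& a : link) {
    if (a.mask != mask || (a.addr & mask) != net) return false;
  }
  return true;
}
```

A caller could run this once per channel before electing a designated router and warn or assert when it returns false.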
Comment 8 Craig Dowell 2008-06-10 11:39:18 EDT
Comment from Tom:

I can tell you from experience that it is tricky to get multiple 
addresses from different subnets on the same interface to work right on 
a Linux box; for instance, source address selection is non-deterministic 
and you can end up with mismatched source addresses.

However, I do not think this is a big simulation use case and I think 
that, more often than not, this will really be indicative of a 
programming error.

I do think we need to support multiple IP addresses per interface but 
mainly when we get to IPv6 where that is the norm.

Note:  this is somewhat related to bug 85, where we have ifIndexes at 
different levels of the system; ip aliasing will definitely break the 
alignment of Ipv4Interface index to NetDevice index even if we tried to 
align otherwise.
Comment 9 Craig Dowell 2008-06-10 11:40:28 EDT
Comment from Tom:

I think we should consider the solution to bug 85 jointly, and an API 
change is likely.  If we do "FindIndexForDevice" and it returns a single 
integer, what happens when there is more than one to return?

I also think that Ipv6 will cause us to revisit (maybe the Ipv6 group 
has already thought about this).
Comment 10 Craig Dowell 2008-06-10 11:42:26 EDT
Comment from Tom:

Here is some more food for thought that we may consider about solving 
this and bug 85.

In Linux, there is one ifIndex that refers to both the NetDevice (struct 
net_device) and Ipv4Interface (struct in_device).  Each struct in_device 
contains a list of struct in_ifaddr that actually hold the address bits.

So, it seems to me that a solution to these problems may lie in:

1) adding support for multiple IP addresses for each Ipv4Interface
- this changes the API and implementation for class Ipv4Interface, which 
assumes a single address

2) taking care to align ifIndex between NetDevice and Ipv4Interface
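
The two points above could look roughly like this, as a standalone sketch modelled loosely on the Linux layout. IfAddr and MultiAddrInterface are hypothetical names and do not describe the actual Ipv4Interface API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical address record, playing the role of struct in_ifaddr.
struct IfAddr {
  std::uint32_t local;  // host-order IPv4 address
  std::uint32_t mask;   // host-order network mask
};

// One L3 interface per device. It reuses the device's ifIndex (point 2)
// and holds a list of addresses instead of a single one (point 1).
class MultiAddrInterface {
 public:
  explicit MultiAddrInterface(std::uint32_t ifIndex) : m_ifIndex(ifIndex) {}

  std::uint32_t GetIfIndex() const { return m_ifIndex; }

  // Appends an address and returns its index within this interface.
  std::size_t AddAddress(IfAddr a) {
    m_addrs.push_back(a);
    return m_addrs.size() - 1;
  }

  std::size_t GetNAddresses() const { return m_addrs.size(); }

  // Throws std::out_of_range if i is not a valid address index.
  const IfAddr& GetAddress(std::size_t i) const { return m_addrs.at(i); }

 private:
  std::uint32_t m_ifIndex;
  std::vector<IfAddr> m_addrs;
};
```

In this shape, adding 10.1.1.1 and 10.1.2.1 to the same interface keeps a single ifIndex and yields address indexes 0 and 1.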
Comment 11 Craig Dowell 2008-06-10 13:51:22 EDT
Marked as P2/LATER per prior agreement.  Will revisit after 3.1 release.
Comment 12 Gustavo J. A. M. Carneiro 2008-06-10 14:07:37 EDT
(In reply to comment #10)
> Comment from Tom:
> 
> Here is some more food for thought that we may consider about solving 
> this and bug 85.
> 
> In Linux, there is one ifIndex that refers to both the NetDevice (struct 
> net_device) and Ipv4Interface (struct in_device).  Each struct in_device 
> contains a list of struct in_ifaddr that actually hold the address bits.
> 
> So, it seems to me that a solution to these problems may lie in:
> 
> 1) adding support for multiple IP addresses for each Ipv4Interface
> - this changes the API and implementation for class Ipv4Interface, which 
> assumes a single address
> 
> 2) taking care to align ifIndex between NetDevice and Ipv4Interface
> 

I think this is a better solution.  For one thing, having different interface indexes for L3 and L2 interfaces can be a source of confusion, especially wrt tracing.  Also it is always nice to align with a well known implementation, Linux in this case.  Finally, for the IPv6 case it makes so much more sense, because in IPv6 conceptually you have a single interface with a list of addresses (even in ifconfig you only see eth0 with several ipv6 addresses, none of the eth0:1, eth0:2... crap that is in ipv4).

Also, even if we have to drop "IP aliasing" support in IPv4, forever, would anyone really be upset?
Comment 13 Craig Dowell 2008-06-10 19:02:54 EDT
Change back to P2 since there is disagreement about whether or not we should even address this.
Comment 14 Tom Henderson 2008-06-11 01:16:45 EDT
If we address this in the future, I think that we could add a class InetDevice as a container class for Ipv4Interface objects, and add additional API to iterate.  It may be that no API needs to change, but any code that relies on assumed 1:1 relationship between Ipv4Interface and NetDevice might have to change.  For instance, Ipv4::FindInterfaceForDevice() would only return the first of possibly many interfaces.
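
To make the "first of possibly many" point concrete, here is a standalone sketch that contrasts a single-valued lookup with an iterating one. Both function names and the vector-of-device-ids representation are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// interfaceDevice[i] is the id of the device that L3 interface i is bound to.

// Single-valued lookup: returns the first matching interface index, or -1.
int FindFirstInterfaceForDevice(const std::vector<int>& interfaceDevice, int dev) {
  for (std::size_t i = 0; i < interfaceDevice.size(); ++i) {
    if (interfaceDevice[i] == dev) return static_cast<int>(i);
  }
  return -1;
}

// Iterating lookup: returns every interface index bound to the device.
std::vector<int> FindAllInterfacesForDevice(const std::vector<int>& interfaceDevice, int dev) {
  std::vector<int> out;
  for (std::size_t i = 0; i < interfaceDevice.size(); ++i) {
    if (interfaceDevice[i] == dev) out.push_back(static_cast<int>(i));
  }
  return out;
}
```

With interfaces 0 and 2 both bound to device 0, the first function reports only interface 0 while the second reports both.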
Comment 15 Tom Henderson 2009-01-31 11:50:21 EST
I've been looking at this bug again in the context of the ns-3-ip work.  AFAICS, there is no support in the real world for routing protocols to deal with IPv4 aliasing on a single net device.  Instead, how this seems to be universally handled is the creation of virtual devices on top of a physical device.  These are often called subinterfaces in routers.

Since global routing is a port of the quagga OSPFv2 implementation, which cannot handle multiple IP addresses AFAICS (but could handle virtual devices), I am currently thinking that the right way to handle this is:

- do not try to support global routing for multiple Ipv4 addresses on a single Ipv4Interface/NetDevice.  We can allow such configuration, but the global routing code should print out a warning at NS_LOG_WARN level when it encounters a multiply-addressed Ipv4 interface, and just use the first address (index zero) in such a case.  I believe that trying to support properly lots of IP addresses on a single interface will just make the code very complicated, and is low priority IMO compared to other things.

- we should instead encourage some development of virtual net devices for ns-3.  This might make a good GSOC or student project.  The global routing should then support virtual net devices for this IP aliasing functionality (like in the real world).  Virtual net devices may also be useful for other scenarios, such as vlan tagging, channel bonding, and ad hoc routing.

- downgrade this bug to P3 (waiting for virtual net device)
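
The first bullet above (warn, then fall back to address index zero) can be sketched standalone as follows. PrimaryAddress is a hypothetical helper, and std::cerr stands in for NS_LOG_WARN so that the example builds without ns-3.

```cpp
#include <cassert>
#include <cstdint>
#include <iostream>
#include <vector>

// Returns the address at index zero. If the interface carries more than
// one address, prints a warning and ignores the rest.
std::uint32_t PrimaryAddress(const std::vector<std::uint32_t>& addrs) {
  assert(!addrs.empty());
  if (addrs.size() > 1) {
    // Stand-in for an NS_LOG_WARN call in the real code.
    std::cerr << "Warning, interface has multiple IP addresses; "
                 "using only the primary one\n";
  }
  return addrs.front();
}
```

Routing code following this approach would call such a helper wherever it previously assumed a single address per interface.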
Comment 16 Gustavo J. A. M. Carneiro 2009-01-31 14:12:21 EST
(In reply to comment #15)
> I've been looking at this bug again in the context of the ns-3-ip work. 
> AFAICS, there is no support in the real world for routing protocols to deal
> with IPv4 aliasing on a single net device.  Instead, how this seems to be
> universally handled is the creation of virtual devices on top of a physical
> device.  These are often called subinterfaces in routers.
[...]

If you go for this, can you think of another name than "virtual devices"?  That's to avoid confusion with http://code.nsnam.org/gjc/ns-3-virtual-netdevice/
Maybe "alias devices"?  Especially since I was planning to write a paper in the next few weeks including this VirtualNetDevice (among other things, like UMTS/Wifi integration work that my colleague is working on).

I still get the feeling these fake devices will be a source of confusion for people doing work on L2 stuff (like me).  We'll have to take extra care to check whether a device is real or fake before using it...  If you do it, I recommend to at least try to put it on a different layer, so that only layer >= 3 is affected and they do not appear in the list of Node devices (Node::GetDevice).
Comment 17 Tom Henderson 2009-01-31 22:05:58 EST
(In reply to comment #16)
> (In reply to comment #15)
> > I've been looking at this bug again in the context of the ns-3-ip work. 
> > AFAICS, there is no support in the real world for routing protocols to deal
> > with IPv4 aliasing on a single net device.  Instead, how this seems to be
> > universally handled is the creation of virtual devices on top of a physical
> > device.  These are often called subinterfaces in routers.
> [...]
> 
> If you go for this, can you think of another name than "virtual devices"? 
> That's to avoid confusion with
> http://code.nsnam.org/gjc/ns-3-virtual-netdevice/
> Maybe "alias devices"?  Especially since I was planning to write a paper in the
> next few weeks including this VirtualNetDevice (among other things, like
> UMTS/Wifi integration work that my colleague is working on).

I had forgotten about your tap-like VirtualNetDevice when I posted the above; sorry.  Do you have future merge plans for it?

> 
> I still get the feeling these fake devices will be a source of confusion
> for people doing work on L2 stuff (like me).  We'll have to take extra care to
> check whether a device is real or fake before using it...  If you do it, I
> recommend to at least try to put it on a different layer, so that only layer >=
> 3 is affected and they do not appear in the list of Node devices
> (Node::GetDevice).
> 

It seems to me that these should not be in layer-3 but should appear to layer-3 like a real device.  For instance, it seems to me that loopback should be a L2-virtual device rather than the way it is an IPv4 only device now.  I think that people doing L2 work on 802.1q will probably want support for these types of devices.

But I agree with you that there are issues lurking with these types of devices that need to be resolved so that they do not become confusing.  

Comment 18 Gustavo J. A. M. Carneiro 2009-02-01 09:10:18 EST
(In reply to comment #17)
> (In reply to comment #16)
> > (In reply to comment #15)
> > > I've been looking at this bug again in the context of the ns-3-ip work. 
> > > AFAICS, there is no support in the real world for routing protocols to deal
> > > with IPv4 aliasing on a single net device.  Instead, how this seems to be
> > > universally handled is the creation of virtual devices on top of a physical
> > > device.  These are often called subinterfaces in routers.
> > [...]
> > 
> > If you go for this, can you think of another name than "virtual devices"? 
> > That's to avoid confusion with
> > http://code.nsnam.org/gjc/ns-3-virtual-netdevice/
> > Maybe "alias devices"?  Especially since I was planning to write a paper in the
> > next few weeks including this VirtualNetDevice (among other things, like
> > UMTS/Wifi integration work that my colleague is working on).
> 
> I had forgotten about your tap-like VirtualNetDevice when I posted the above;
> sorry.  Do you have future merge plans for it?

I am unsure.  Last time I posted the code I did not get much reaction, which probably means it could be a specialized feature.  It is generally useful for IP-in-IP overlay networks, or "tunnels", but I am guessing not that many people are interested in tunneling.

So, in principle no, I have no plans to merge it.  But I am receptive to merging if pushed.

> 
> > 
> > I still get the feeling these fake devices will be a source of confusion
> > for people doing work on L2 stuff (like me).  We'll have to take extra care to
> > check whether a device is real or fake before using it...  If you do it, I
> > recommend to at least try to put it on a different layer, so that only layer >=
> > 3 is affected and they do not appear in the list of Node devices
> > (Node::GetDevice).
> > 
> 
> It seems to me that these should not be in layer-3 but should appear to layer-3
> like a real device.  For instance, it seems to me that loopback should be a
> L2-virtual device rather than the way it is an IPv4 only device now.  I think
> that people doing L2 work on 802.1q will probably want support for these types
> of devices.

You make a good point regarding 802.1q.

> 
> But I agree with you that there are issues lurking with these types of devices
> that need to be resolved so that they do not become confusing.  
> 

One of the things that scare me is that this will probably mean we will need yet another IsLoopback () pure virtual method in the base class, or IsVirtual (), if we follow the current trend.  The number of IsXxx () methods in the base class is starting to grow too much IMHO.
Comment 19 Mathieu Lacage 2009-02-03 08:24:45 EST
(In reply to comment #18)
> > > I still get the feeling these fake devices will be a source of confusion
> > > for people doing work on L2 stuff (like me).  We'll have to take extra care to
> > > check whether a device is real or fake before using it...  If you do it, I
> > > recommend to at least try to put it on a different layer, so that only layer >=
> > > 3 is affected and they do not appear in the list of Node devices
> > > (Node::GetDevice).
> > > 
> > 
> > It seems to me that these should not be in layer-3 but should appear to layer-3
> > like a real device.  For instance, it seems to me that loopback should be a
> > L2-virtual device rather than the way it is an IPv4 only device now.  I think
> > that people doing L2 work on 802.1q will probably want support for these types
> > of devices.

I believe the ip loopback is implemented the way it is because of ARP, which must not be used for this loopback device. Previously there was no NeedsArp method, so maybe, now that this method is here, we can do what you describe. I am worried that this change could introduce very subtle problems, though.

> 
> You make a good point regarding 802.1q.
> 
> > 
> > But I agree with you that there are issues lurking with these types of devices
> > that need to be resolved so that they do not become confusing.  
> > 
> 
> One of the things that scare me is that this will probably mean we will need
> yet another IsLoopback () pure virtual method in the base class, or IsVirtual
> (), if we follow the current trend.  The number of IsXxx () methods in the base
> class is starting to grow too much IMHO.

If there is a need for yet another IsXXX method, this is a non-starter for me.

Comment 20 Tom Henderson 2009-02-16 01:01:46 EST
sliding to ns-3.5
Comment 21 Tom Henderson 2009-04-21 00:17:25 EDT
(In reply to comment #15)
> I've been looking at this bug again in the context of the ns-3-ip work. 
> AFAICS, there is no support in the real world for routing protocols to deal
> with IPv4 aliasing on a single net device.  Instead, how this seems to be
> universally handled is the creation of virtual devices on top of a physical
> device.  These are often called subinterfaces in routers.
> 
> Since global routing is a port of the quagga OSPFv2 implementation, which
> cannot handle multiple IP addresses AFAICS (but could handle virtual devices),
> I am currently thinking that the right way to handle this is:
> 
> - do not try to support global routing for multiple Ipv4 addresses on a single
> Ipv4Interface/NetDevice.  We can allow such configuration, but the global
> routing code should print out a warning at NS_LOG_WARN level when it encounters
> a multiply-addressed Ipv4 interface, and just use the first address (index
> zero) in such a case.  I believe that trying to support properly lots of IP
> addresses on a single interface will just make the code very complicated, and
> is low priority IMO compared to other things.

I added several such statements in the global routing code; e.g.

NS_LOG_WARN ("Warning, interface has multiple IP addresses; using only the primary one");

> 
> - we should instead encourage some development of virtual net devices for ns-3.
>  This might make a good GSOC or student project.  The global routing should
> then support virtual net devices for this IP aliasing functionality (like in
> the real world).  Virtual net devices may also be useful for other scenarios,
> such as vlan tagging, channel bonding, and ad hoc routing.
> 
> - downgrade this bug to P3 (waiting for virtual net device)
> 

I'm marking this as FIXED, in the absence of a virtual net device (the code will log warnings, as I suggested earlier).  When we get virtual net devices, we will need to make global routing work with them or else file a new bug on that topic.