Bug 188 - global routing does not handle ip aliasing
Status: RESOLVED FIXED
Product: ns-3
Component: routing
Version: pre-release
Platform: All All
Importance: P2 normal
Reported: 2008-05-27 17:55 EDT
Modified: 2009-04-21 00:17 EDT


Attachments
sample code (3.63 KB, text/x-c++src)
2008-05-27 17:55 EDT, Mathieu Lacage

Description From 2008-05-27 17:55:05 EDT
The attached program segfaults during PopulateTables.
------- Comment #1 From 2008-05-27 17:55:49 EDT -------
Created an attachment (id=137)
sample code
------- Comment #2 From 2008-05-27 17:59:58 EDT -------
What this program is doing is creating two ip interfaces which both reference
the same underlying netdevice. I am sure that other things will break (most
notably in the arp layer) but I was expecting that the global routing code
would deal with this gracefully.
------- Comment #3 From 2008-05-28 00:09:25 EDT -------
I can't compile that against ns-3-dev, but I am guessing that the global
routing is assuming one Ipv4Interface per NetDevice and/or that they are all on
the same subnet.

But your use of IP aliasing is unusual in that you are assigning addresses from
different subnets to the aliased interfaces.  Is this what you want?  Usually
they are from the same network.
------- Comment #4 From 2008-05-28 11:10:58 EDT -------
(In reply to comment #3)
> I can't compile that against ns-3-dev, but I am guessing that the global

ah. I will try to make it build against ns-3-dev.

> routing is assuming one Ipv4Interface per NetDevice and/or that they are all on
> the same subnet.

yes.

> 
> But your use of IP aliasing is unusual in that you are assigning addresses from
> different subnets to the aliased interfaces.  Is this what you want?  Usually
> they are from the same network.

I don't know. I am trying to exercise the system. It seems a bit rude to get a
crash for that.

> 
------- Comment #5 From 2008-06-10 11:32:45 EDT -------
The test case uses the Ipv4AddressHelper to try and create two virtual
networks on a csma channel.  There are three nodes and it tries to assign
two interfaces with different network/address combinations to the first
node/device.  The second node is supposed to get an interface on a
network/address corresponding to one of the first node's (virtual) networks,
and the third node gets a network/address on the other.

What happens here is that the first set of IP address assignments using
Ipv4AddressHelper::Assign() on a collection of two net devices works as
expected.  The second assignment is expected to add a second interface to
the first node, but the helper uses FindInterfaceForDevice to determine if
there already exists an interface talking to the device.  There is an
existing interface in this case, so a new one is not created and the helper
simply changes/overwrites the old IP address with the new one.  

The end result is a network with two nodes on one network number and one
node on another.
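
The overwrite described above can be reproduced with a small stand-alone
model.  This is only a sketch: ToyIpStack, ToyIpInterface and the integer
device ids are hypothetical stand-ins, not the ns-3 classes, and the helper's
real logic may differ in detail.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for an IP interface; NOT the ns-3 Ipv4Interface.
struct ToyIpInterface {
  std::string address;  // dotted-quad text, e.g. "10.1.1.1"
};

// Hypothetical stand-in for the per-node IP stack plus the helper logic.
class ToyIpStack {
public:
  // Returns the index of the interface bound to the device, or -1 if none.
  int FindInterfaceForDevice(int deviceId) const {
    std::map<int, int>::const_iterator it = m_byDevice.find(deviceId);
    return it == m_byDevice.end() ? -1 : it->second;
  }

  // Reuses an existing interface for the device and overwrites its address
  // instead of creating a second interface (the behaviour reported here).
  int Assign(int deviceId, const std::string& address) {
    int index = FindInterfaceForDevice(deviceId);
    if (index < 0) {
      index = static_cast<int>(m_interfaces.size());
      m_interfaces.push_back(ToyIpInterface{address});
      m_byDevice[deviceId] = index;
    } else {
      m_interfaces[index].address = address;  // silent overwrite
    }
    return index;
  }

  std::size_t GetNInterfaces() const { return m_interfaces.size(); }

  const std::string& GetAddress(int index) const {
    assert(index >= 0 &&
           static_cast<std::size_t>(index) < m_interfaces.size());
    return m_interfaces[index].address;
  }

private:
  std::vector<ToyIpInterface> m_interfaces;
  std::map<int, int> m_byDevice;  // device id -> interface index
};
```

Calling Assign() twice for the same device id leaves a single interface
holding the second address, which matches the "one interface, not two" state
that the global router later sees.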

The next problem is that the global router code is run.  All of the router
LSAs are exported correctly (but recall the first node only has one
interface, not two).  In the router code, it needs to elect a designated
router for a broadcast network segment.  It does this by walking the channel
and looking for the lowest numbered IP address on the channel.  This, in
turn, is done by getting each device from the channel, then doing a
FindIfIndexForDevice() and then a GetAddress().  Since the concept of
one interface per device is implicitly assumed here, the designated router
for network segment 10.1.2.0 is 10.1.1.0 which doesn't make a whole lot of
sense ;-)
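
A rough illustration of why the election goes wrong, assuming one address per
device and host-order 32-bit addresses.  MakeAddr and ElectLowestAddress are
made-up names for this sketch; they are not the functions used in the global
routing code.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Pack a dotted quad into a host-order 32-bit value (hypothetical helper).
constexpr std::uint32_t MakeAddr(unsigned a, unsigned b, unsigned c,
                                 unsigned d) {
  return (a << 24) | (b << 16) | (c << 8) | d;
}

// Walk the channel and return the numerically lowest address, taking
// exactly one address per attached device and ignoring which network
// each address belongs to.
std::uint32_t ElectLowestAddress(const std::vector<std::uint32_t>& perDevice) {
  assert(!perDevice.empty());
  std::uint32_t lowest = perDevice.front();
  for (std::uint32_t addr : perDevice) {
    if (addr < lowest) {
      lowest = addr;
    }
  }
  return lowest;
}
```

With the channel holding 10.1.2.1 (the overwritten first node), 10.1.1.2 and
10.1.2.2, the winner is 10.1.1.2, so a router on the 10.1.1.0 network ends up
as designated router for nodes addressed on 10.1.2.0.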

Eventually when we're calculating the shortest paths, the confusion comes
back to bite us and when we try to make sense of a nonsensical router ID/IP
address combination the router asserts (it's not a crash,
GetIfIndexByAddress eventually asserts, which is the subject of yet another
bug).

Discussion is required regarding how to resolve this, requirements for a new
feature, etc.
------- Comment #6 From 2008-06-10 11:37:18 EDT -------
Comment from Tom:

> What happens here is that the first set of IP address assignments using
> Ipv4AddressHelper::Assign() on a collection of two net devices works as
> expected.  The second assignment is expected to add a second interface to
> the first node, but the helper uses FindInterfaceForDevice to determine if
> there already exists an interface talking to the device.  There is an
> existing interface in this case, so a new one is not created and the helper
> simply changes/overwrites the old IP address with the new one.  

FWIW, I think that the helper should raise a warning or assert when this 
condition arises, because it is much more likely that this type of 
network assignment is a programming error.

If/when we decide we want to support this, maybe the low-level API 
should be required.
------- Comment #7 From 2008-06-10 11:38:15 EDT -------
Comment from Tom:

> one interface per device is implicitly assumed here, the designated router
> for network segment 10.1.2.0 is 10.1.1.0 which doesn't make a whole lot of
> sense ;-)

Yes.  I doubt that quagga supports this; Cisco might.  In practice, 
these two networks would be like ships-in-the-night from the perspective 
of OSPF.  There would be two network LSAs originated; one for the 
10.1.2.0 network and another for the 10.1.1.0 network.  The interfaces 
on 10.1.2.0 would ignore hellos from the 10.1.1.0 interface, and vice versa.

However, in our code, we should probably check that everyone on the same 
link belongs to the same network, since we do not have a Hello process 
to enforce that for us.
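
Such a check could be as small as the following sketch.  LinkAddress and
AllOnSameNetwork are hypothetical names, and the code assumes each attached
interface contributes exactly one address/mask pair.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One address/mask pair per interface attached to the link (hypothetical).
struct LinkAddress {
  std::uint32_t address;  // host-order IPv4 address
  std::uint32_t mask;     // host-order network mask
};

// True when every entry uses the same mask and the same network number
// as the first entry on the link.
bool AllOnSameNetwork(const std::vector<LinkAddress>& link) {
  assert(!link.empty());
  const std::uint32_t mask = link.front().mask;
  const std::uint32_t network = link.front().address & mask;
  for (const LinkAddress& entry : link) {
    if (entry.mask != mask || (entry.address & mask) != network) {
      return false;
    }
  }
  return true;
}
```

A caller could warn or assert when this returns false before electing a
designated router on the link.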
------- Comment #8 From 2008-06-10 11:39:18 EDT -------
Comment from Tom:

I can tell you from experience that it is tricky to get multiple 
addresses from different subnets on the same interface to work right on 
a Linux box; for instance, source address selection is non-deterministic 
and you can end up with mismatched source addresses.

However, I do not think this is a big simulation use case and I think 
that, more often than not, this will really be indicative of a 
programming error.

I do think we need to support multiple IP addresses per interface but 
mainly when we get to IPv6 where that is the norm.

Note:  this is somewhat related to bug 85, where we have ifIndexes at 
different levels of the system; IP aliasing will definitely break the 
alignment of Ipv4Interface index to NetDevice index even if we tried to 
align otherwise.
------- Comment #9 From 2008-06-10 11:40:28 EDT -------
Comment from Tom:

I think we should consider the solution to bug 85 jointly, and an API 
change is likely.  If we do "FindIndexForDevice" and it returns a single 
integer, what happens when there is more than one to return?

I also think that Ipv6 will cause us to revisit (maybe the Ipv6 group 
has already thought about this).
------- Comment #10 From 2008-06-10 11:42:26 EDT -------
Comment from Tom:

Here is some more food for thought that we may consider about solving 
this and bug 85.

In Linux, there is one ifIndex that refers to both the NetDevice (struct 
net_device) and Ipv4Interface (struct in_device).  Each struct in_device 
contains a list of struct in_ifaddr that actually hold the address bits.

So, it seems to me that a solution to these problems may lie in:

1) adding support for multiple IP addresses for each Ipv4Interface
- this changes the API and implementation for class Ipv4Interface, which 
assumes a single address

2) taking care to align ifIndex between NetDevice and Ipv4Interface
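
To make the two points above concrete, here is a minimal sketch of a
Linux-like layout: one index shared by device and interface, and a list of
addresses per interface.  AliasedIfAddr, AliasedInterface and AliasedNode are
invented names, not a proposed ns-3 API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Holds the address bits, loosely like struct in_ifaddr (hypothetical).
struct AliasedIfAddr {
  std::uint32_t local;  // host-order IPv4 address
  std::uint32_t mask;   // host-order network mask
};

// One interface per device, carrying a list of addresses.
struct AliasedInterface {
  std::uint32_t ifIndex;                 // same value as the device index
  std::vector<AliasedIfAddr> addresses;  // first entry is the primary
};

class AliasedNode {
public:
  // Adds a device together with its interface under one shared index.
  std::uint32_t AddDevice() {
    const std::uint32_t index =
        static_cast<std::uint32_t>(m_interfaces.size());
    m_interfaces.push_back(AliasedInterface{index, {}});
    return index;
  }

  // Appends an address to the interface instead of replacing the old one.
  void AddAddress(std::uint32_t ifIndex, std::uint32_t local,
                  std::uint32_t mask) {
    assert(ifIndex < m_interfaces.size());
    m_interfaces[ifIndex].addresses.push_back(AliasedIfAddr{local, mask});
  }

  std::size_t GetNAddresses(std::uint32_t ifIndex) const {
    assert(ifIndex < m_interfaces.size());
    return m_interfaces[ifIndex].addresses.size();
  }

  // Primary address: the first one that was added.
  const AliasedIfAddr& GetPrimary(std::uint32_t ifIndex) const {
    assert(ifIndex < m_interfaces.size());
    assert(!m_interfaces[ifIndex].addresses.empty());
    return m_interfaces[ifIndex].addresses.front();
  }

private:
  std::vector<AliasedInterface> m_interfaces;
};
```

In this layout a second assignment to the same device adds an alias rather
than overwriting, and a lookup by device still yields exactly one index.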
------- Comment #11 From 2008-06-10 13:51:22 EDT -------
Marked as P2/LATER per prior agreement.  Will revisit after 3.1 release.
------- Comment #12 From 2008-06-10 14:07:37 EDT -------
(In reply to comment #10)
> Comment from Tom:
> 
> Here is some more food for thought that we may consider about solving 
> this and bug 85.
> 
> In Linux, there is one ifIndex that refers to both the NetDevice (struct 
> net_device) and Ipv4Interface (struct in_device).  Each struct in_device 
> contains a list of struct in_ifaddr that actually hold the address bits.
> 
> So, it seems to me that a solution to these problems may lie in:
> 
> 1) adding support for multiple IP addresses for each Ipv4Interface
> - this changes the API and implementation for class Ipv4Interface, which 
> assumes a single address
> 
> 2) taking care to align ifIndex between NetDevice and Ipv4Interface
> 

I think this is a better solution.  For one thing, having different interface
indexes for L3 and L2 interfaces can be a source of confusion, especially wrt
tracing.  Also it is always nice to align with a well known implementation,
Linux in this case.  Finally, for the IPv6 case it makes so much more sense,
because in IPv6 conceptually you have a single interface with a list of
addresses (even in ifconfig you only see eth0 with several ipv6 addresses, none
of the eth0:1, eth0:2... crap that is in ipv4).

Also, even if we have to drop "IP aliasing" support in IPv4, forever, would
anyone really be upset?
------- Comment #13 From 2008-06-10 19:02:54 EDT -------
Change back to P2 since there is disagreement about whether or not we should
even address this.
------- Comment #14 From 2008-06-11 01:16:45 EDT -------
If we address this in the future, I think that we could add a class InetDevice
as a container class for Ipv4Interface objects, and add additional API to
iterate.  It may be that no API needs to change, but any code that relies on
assumed 1:1 relationship between Ipv4Interface and NetDevice might have to
change.  For instance, Ipv4::FindInterfaceForDevice() would only return the
first of possibly many interfaces.
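
As a sketch of what "additional API to iterate" might mean, the table below
keeps several interface indexes per device.  ToyInterfaceTable and
FindInterfacesForDevice are hypothetical; only the name
FindInterfaceForDevice is borrowed from the discussion, and its signature
here is an assumption.

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical container mapping one device to many interface indexes.
class ToyInterfaceTable {
public:
  // Registers a new interface on the device and returns its index.
  std::uint32_t AddInterface(std::uint32_t deviceId) {
    const std::uint32_t index = m_next++;
    m_byDevice[deviceId].push_back(index);
    return index;
  }

  // New-style lookup: every interface on the device, in creation order.
  std::vector<std::uint32_t> FindInterfacesForDevice(
      std::uint32_t deviceId) const {
    std::vector<std::uint32_t> result;
    std::map<std::uint32_t, std::vector<std::uint32_t> >::const_iterator it =
        m_byDevice.find(deviceId);
    if (it != m_byDevice.end()) {
      result = it->second;
    }
    return result;
  }

  // Old-style lookup: only the first interface, or -1 when there is none.
  std::int64_t FindInterfaceForDevice(std::uint32_t deviceId) const {
    const std::vector<std::uint32_t> all = FindInterfacesForDevice(deviceId);
    return all.empty() ? -1 : static_cast<std::int64_t>(all.front());
  }

private:
  std::map<std::uint32_t, std::vector<std::uint32_t> > m_byDevice;
  std::uint32_t m_next = 0;
};
```

Existing callers of the single-result lookup keep compiling but silently see
only the first interface, which is the kind of code that would need review.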
------- Comment #15 From 2009-01-31 11:50:21 EDT -------
I've been looking at this bug again in the context of the ns-3-ip work. 
AFAICS, there is no support in the real world for routing protocols to deal
with IPv4 aliasing on a single net device.  Instead, how this seems to be
universally handled is the creation of virtual devices on top of a physical
device.  These are often called subinterfaces in routers.

Since global routing is a port of the quagga OSPFv2 implementation, which
cannot handle multiple IP addresses AFAICS (but could handle virtual devices),
I am currently thinking that the right way to handle this is:

- do not try to support global routing for multiple Ipv4 addresses on a single
Ipv4Interface/NetDevice.  We can allow such configuration, but the global
routing code should print out a warning at NS_LOG_WARN level when it encounters
a multiply-addressed Ipv4 interface, and just use the first address (index
zero) in such a case.  I believe that trying to support properly lots of IP
addresses on a single interface will just make the code very complicated, and
is low priority IMO compared to other things.

- we should instead encourage some development of virtual net devices for ns-3.
 This might make a good GSOC or student project.  The global routing should
then support virtual net devices for this IP aliasing functionality (like in
the real world).  Virtual net devices may also be useful for other scenarios,
such as vlan tagging, channel bonding, and ad hoc routing.

- downgrade this bug to P3 (waiting for virtual net device)
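
The first bullet above (warn, then fall back to the address at index zero)
could look roughly like this.  It is a stand-alone sketch that writes to a
std::ostream instead of using NS_LOG_WARN, and SelectPrimaryAddress is a
made-up name rather than a function in the global routing code.

```cpp
#include <cassert>
#include <cstdint>
#include <iostream>
#include <vector>

// Returns the address at index zero.  When the interface carries more than
// one address, a warning is written to `log` and the rest are ignored.
std::uint32_t SelectPrimaryAddress(const std::vector<std::uint32_t>& addresses,
                                   std::ostream& log) {
  assert(!addresses.empty());
  if (addresses.size() > 1) {
    log << "Warning, interface has multiple IP addresses; "
           "using only the primary one\n";
  }
  return addresses.front();
}
```

Routing then proceeds as if the interface had a single address, so aliases
are tolerated in the configuration but never advertised.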
------- Comment #16 From 2009-01-31 14:12:21 EDT -------
(In reply to comment #15)
> I've been looking at this bug again in the context of the ns-3-ip work. 
> AFAICS, there is no support in the real world for routing protocols to deal
> with IPv4 aliasing on a single net device.  Instead, how this seems to be
> universally handled is the creation of virtual devices on top of a physical
> device.  These are often called subinterfaces in routers.
[...]

If you go for this, can you think of another name than "virtual devices"? 
That's to avoid confusion with
http://code.nsnam.org/gjc/ns-3-virtual-netdevice/
Maybe "alias devices"?  Especially since I was planning to write a paper in the
next few weeks including this VirtualNetDevice (among other things, like
UMTS/Wifi integration work that my colleague is working on).

I still get the feeling that these fake devices will be a source of confusion
for people doing work on L2 stuff (like me).  We'll have to take extra care to
check whether a device is real or fake before using it...  If you do it, I
recommend to at least try to put it on a different layer, so that only layer >=
3 is affected and they do not appear in the list of Node devices
(Node::GetDevice).
------- Comment #17 From 2009-01-31 22:05:58 EDT -------
(In reply to comment #16)
> (In reply to comment #15)
> > I've been looking at this bug again in the context of the ns-3-ip work. 
> > AFAICS, there is no support in the real world for routing protocols to deal
> > with IPv4 aliasing on a single net device.  Instead, how this seems to be
> > universally handled is the creation of virtual devices on top of a physical
> > device.  These are often called subinterfaces in routers.
> [...]
> 
> If you go for this, can you think of another name than "virtual devices"? 
> That's to avoid confusion with
> http://code.nsnam.org/gjc/ns-3-virtual-netdevice/
> Maybe "alias devices"?  Especially since I was planning to write a paper in the
> next few weeks including this VirtualNetDevice (among other things, like
> UMTS/Wifi integration work that my colleague is working on).

I had forgotten about your tap-like VirtualNetDevice when I posted the above;
sorry.  Do you have future merge plans for it?

> 
> I still get the feeling that these fake devices will be a source of confusion
> for people doing work on L2 stuff (like me).  We'll have to take extra care to
> check whether a device is real or fake before using it...  If you do it, I
> recommend to at least try to put it on a different layer, so that only layer >=
> 3 is affected and they do not appear in the list of Node devices
> (Node::GetDevice).
> 

It seems to me that these should not be in layer-3 but should appear to layer-3
like a real device.  For instance, it seems to me that loopback should be a
L2-virtual device rather than the way it is an IPv4 only device now.  I think
that people doing L2 work on 802.1q will probably want support for these types
of devices.

But I agree with you that there are issues lurking with these types of devices
that need to be resolved so that they do not become confusing.  
------- Comment #18 From 2009-02-01 09:10:18 EDT -------
(In reply to comment #17)
> (In reply to comment #16)
> > (In reply to comment #15)
> > > I've been looking at this bug again in the context of the ns-3-ip work. 
> > > AFAICS, there is no support in the real world for routing protocols to deal
> > > with IPv4 aliasing on a single net device.  Instead, how this seems to be
> > > universally handled is the creation of virtual devices on top of a physical
> > > device.  These are often called subinterfaces in routers.
> > [...]
> > 
> > If you go for this, can you think of another name than "virtual devices"? 
> > That's to avoid confusion with
> > http://code.nsnam.org/gjc/ns-3-virtual-netdevice/
> > Maybe "alias devices"?  Especially since I was planning to write a paper in the
> > next few weeks including this VirtualNetDevice (among other things, like
> > UMTS/Wifi integration work that my colleague is working on).
> 
> I had forgotten about your tap-like VirtualNetDevice when I posted the above;
> sorry.  Do you have future merge plans for it?

I am unsure.  Last time I posted the code I did not get much reaction, which
probably means it could be a specialized feature.  It is generally useful for
IP-in-IP overlay networks, or "tunnels", but I am guessing not that many people
are interested in tunneling.

So, in principle no, I have no plans to merge it.  But I am receptive to
merging if pushed.

> 
> > 
> > I still get the feeling that these fake devices will be a source of confusion
> > for people doing work on L2 stuff (like me).  We'll have to take extra care to
> > check whether a device is real or fake before using it...  If you do it, I
> > recommend to at least try to put it on a different layer, so that only layer >=
> > 3 is affected and they do not appear in the list of Node devices
> > (Node::GetDevice).
> > 
> 
> It seems to me that these should not be in layer-3 but should appear to layer-3
> like a real device.  For instance, it seems to me that loopback should be a
> L2-virtual device rather than the way it is an IPv4 only device now.  I think
> that people doing L2 work on 802.1q will probably want support for these types
> of devices.

You make a good point regarding 802.1q.

> 
> But I agree with you that there are issues lurking with these types of devices
> that need to be resolved so that they do not become confusing.  
> 

One of the things that scares me is that this will probably mean we will need
yet another IsLoopback () pure virtual method in the base class, or IsVirtual
(), if we follow the current trend.  The number of IsXxx () methods in the base
class is starting to grow too much IMHO.
------- Comment #19 From 2009-02-03 08:24:45 EDT -------
(In reply to comment #18)
> > > I still get the feeling that these fake devices will be a source of confusion
> > > for people doing work on L2 stuff (like me).  We'll have to take extra care to
> > > check whether a device is real or fake before using it...  If you do it, I
> > > recommend to at least try to put it on a different layer, so that only layer >=
> > > 3 is affected and they do not appear in the list of Node devices
> > > (Node::GetDevice).
> > > 
> > 
> > It seems to me that these should not be in layer-3 but should appear to layer-3
> > like a real device.  For instance, it seems to me that loopback should be a
> > L2-virtual device rather than the way it is an IPv4 only device now.  I think
> > that people doing L2 work on 802.1q will probably want support for these types
> > of devices.

I believe the IP loopback is implemented the way it is because ARP must not be
used for the loopback device.  Previously, there was no NeedsArp method, so
maybe now that this method exists we can do what you describe.  I am worried
that this change could introduce very subtle problems, though.

> 
> You make a good point regarding 802.1q.
> 
> > 
> > But I agree with you that there are issues lurking with these types of devices
> > that need to be resolved so that they do not become confusing.  
> > 
> 
> One of the things that scares me is that this will probably mean we will need
> yet another IsLoopback () pure virtual method in the base class, or IsVirtual
> (), if we follow the current trend.  The number of IsXxx () methods in the base
> class is starting to grow too much IMHO.

If there is a need for yet another IsXXX method, this is a non-starter for me.
------- Comment #20 From 2009-02-16 01:01:46 EDT -------
sliding to ns-3.5
------- Comment #21 From 2009-04-21 00:17:25 EDT -------
(In reply to comment #15)
> I've been looking at this bug again in the context of the ns-3-ip work. 
> AFAICS, there is no support in the real world for routing protocols to deal
> with IPv4 aliasing on a single net device.  Instead, how this seems to be
> universally handled is the creation of virtual devices on top of a physical
> device.  These are often called subinterfaces in routers.
> 
> Since global routing is a port of the quagga OSPFv2 implementation, which
> cannot handle multiple IP addresses AFAICS (but could handle virtual devices),
> I am currently thinking that the right way to handle this is:
> 
> - do not try to support global routing for multiple Ipv4 addresses on a single
> Ipv4Interface/NetDevice.  We can allow such configuration, but the global
> routing code should print out a warning at NS_LOG_WARN level when it encounters
> a multiply-addressed Ipv4 interface, and just use the first address (index
> zero) in such a case.  I believe that trying to support properly lots of IP
> addresses on a single interface will just make the code very complicated, and
> is low priority IMO compared to other things.

I added several such statements in the global routing code; e.g.

NS_LOG_WARN ("Warning, interface has multiple IP addresses; using only the primary one");

> 
> - we should instead encourage some development of virtual net devices for ns-3.
>  This might make a good GSOC or student project.  The global routing should
> then support virtual net devices for this IP aliasing functionality (like in
> the real world).  Virtual net devices may also be useful for other scenarios,
> such as vlan tagging, channel bonding, and ad hoc routing.
> 
> - downgrade this bug to P3 (waiting for virtual net device)
> 

I'm marking this as FIXED, in the absence of a virtual net device (the code
will log warnings, as I suggested earlier).  When we get virtual net devices,
we will need to make global routing work with them or else file a new bug on
that topic.