Bugzilla – Full Text Bug Listing

Summary: | global routing does not handle ip aliasing | | |
---|---|---|---|
Product: | ns-3 | Reporter: | Mathieu Lacage <mathieu.lacage> |
Component: | routing | Assignee: | Tom Henderson <tomh> |
Status: | RESOLVED FIXED | | |
Severity: | normal | CC: | craigdo, gjcarneiro, ns-bugs |
Priority: | P2 | | |
Version: | pre-release | | |
Hardware: | All | | |
OS: | All | | |
Attachments: | sample code | | |

Description
Mathieu Lacage
2008-05-27 17:55:05 EDT
Created attachment 137
sample code
What this program is doing is creating two ip interfaces which both reference the same underlying netdevice. I am sure that other things will break (most notably in the arp layer) but I was expecting that the global routing code would deal with this gracefully.

I can't compile that against ns-3-dev, but I am guessing that the global routing is assuming one Ipv4Interface per NetDevice and/or that they are all on the same subnet.

But your use of IP aliasing is unusual in that you are assigning addresses from different subnets to the aliased interfaces. Is this what you want? Usually they are from the same network.

(In reply to comment #3)
> I can't compile that against ns-3-dev, but I am guessing that the global

ah. I will try to make it build against ns-3-dev.

> routing is assuming one Ipv4Interface per NetDevice and/or that they are all on
> the same subnet.

yes.

> But your use of IP aliasing is unusual in that you are assigning addresses from
> different subnets to the aliased interfaces. Is this what you want? Usually
> they are from the same network.

I don't know. I am trying to exercise the system. It seems a bit rude to get a crash for that.

The test case uses the Ipv4AddressHelper to try and create two virtual networks on a csma channel. There are three nodes and it tries to assign two interfaces with different network/address combinations to the first node/device. The second node is supposed to get an interface on a network/address corresponding to one of the first node's (virtual) networks, and the third node gets a network/address on the other.

What happens here is that the first set of IP address assignments using Ipv4AddressHelper::Assign() on a collection of two net devices works as expected. The second assignment is expected to add a second interface to the first node, but the helper uses FindInterfaceForDevice to determine if there already exists an interface talking to the device.
There is an existing interface in this case, so a new one is not created and the helper simply changes/overwrites the old IP address with the new one. The end result is a network with two nodes on one network number and one node on another.

The next problem is that the global router code is run. All of the router LSAs are exported correctly (but recall the first node only has one interface, not two). In the router code, it needs to elect a designated router for a broadcast network segment. It does this by walking the channel and looking for the lowest numbered IP address on the channel. This, in turn, is done by getting each device from the channel, then doing a FindIfIndexForDevice() and then a GetAddress(). Yes, since the concept of one interface per device is implicitly assumed here, the designated router for network segment 10.1.2.0 is 10.1.1.0 which doesn't make a whole lot of sense ;-)

Eventually, when we're calculating the shortest paths, the confusion comes back to bite us: when we try to make sense of a nonsensical router ID/IP address combination, the router asserts (it's not a crash; GetIfIndexByAddress eventually asserts, which is the subject of yet another bug).

Discussion regarding how to resolve, requirements for new feature, etc., required

Comment from Tom:
> What happens here is that the first set of IP address assignments using
> Ipv4AddressHelper::Assign() on a collection of two net devices works as
> expected. The second assignment is expected to add a second interface to
> the first node, but the helper uses FindInterfaceForDevice to determine if
> there already exists an interface talking to the device. There is an
> existing interface in this case, so a new one is not created and the helper
> simply changes/overwrites the old IP address with the new one.
FWIW, I think that the helper should raise a warning or assert when this
condition arises, because it is much more likely that this type of
network assignment is a programming error.
If/when we decide we want to support this, maybe the low-level API
should be required.
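
The warn-or-assert behaviour suggested here can be sketched outside ns-3. This is only a minimal standalone illustration: `ToyIpv4` and its methods are hypothetical names and not the ns-3 API. The only idea taken from the discussion is checking whether a device already has an interface before assigning an address to it. The sketch throws an exception so that the condition is easy to observe; a warning or an assert would serve the same purpose.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a node's IPv4 configuration.
// It models the "one address per device" assumption from the thread.
class ToyIpv4
{
public:
  // True if an address has already been assigned to this device.
  bool HasInterfaceForDevice (int deviceId) const
  {
    return m_addressByDevice.count (deviceId) != 0;
  }

  // Strict assignment: refuse to silently overwrite an existing address.
  void Assign (int deviceId, const std::string &address)
  {
    if (HasInterfaceForDevice (deviceId))
      {
        throw std::logic_error ("device " + std::to_string (deviceId) +
                                " already has address " +
                                m_addressByDevice.at (deviceId));
      }
    m_addressByDevice[deviceId] = address;
  }

  // Precondition: the device has an address.
  std::string GetAddress (int deviceId) const
  {
    assert (HasInterfaceForDevice (deviceId));
    return m_addressByDevice.at (deviceId);
  }

private:
  std::map<int, std::string> m_addressByDevice;
};
```

With this sketch, a second `Assign` on the same device fails loudly and the first address is left untouched, instead of being replaced as in the behaviour described above.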
Comment from Tom:
> one interface per device is implicitly assumed here, the designated router
> for network segment 10.1.2.0 is 10.1.1.0 which doesn't make a whole lot of
> sense ;-)
Yes. I doubt that quagga supports this; Cisco might. In practice,
these two networks would be like ships-in-the-night from the perspective
of OSPF. There would be two network LSAs originated; one for the
10.1.2.0 network and another for the 10.1.1.0 network. The interfaces
on 10.1.2.0 would ignore hellos from the 10.1.1.0 interface, and vice versa.
However, in our code, we should probably check that everyone on the same
link belongs to the same network, since we do not have a Hello process
to enforce that for us.
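
A same-network check of this kind could look roughly like the following. It is a standalone sketch, not the ns-3 global routing code: `MakeAddr`, `AllOnSameNetwork`, and `LowestAddress` are hypothetical helpers, addresses are plain 32-bit integers, and the "lowest address wins" rule is taken from the description of designated router election earlier in this thread.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: pack a dotted quad a.b.c.d into a 32-bit value.
constexpr std::uint32_t MakeAddr (std::uint32_t a, std::uint32_t b,
                                  std::uint32_t c, std::uint32_t d)
{
  return (a << 24) | (b << 16) | (c << 8) | d;
}

// True if every address on the link is in the same network under `mask`.
// An empty list is trivially consistent.
bool AllOnSameNetwork (const std::vector<std::uint32_t> &addrs,
                       std::uint32_t mask)
{
  for (std::uint32_t addr : addrs)
    {
      if ((addr & mask) != (addrs.front () & mask))
        {
          return false;
        }
    }
  return true;
}

// Lowest address on the link, i.e. the election rule described in the thread.
// Precondition: addrs is non-empty.
std::uint32_t LowestAddress (const std::vector<std::uint32_t> &addrs)
{
  assert (!addrs.empty ());
  return *std::min_element (addrs.begin (), addrs.end ());
}
```

Running `AllOnSameNetwork` before `LowestAddress` would catch the situation from the test case, where a link that mixes 10.1.1.x and 10.1.2.x addresses ends up with a designated router from the wrong network.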
Comment from Tom:

I can tell you from experience that it is tricky to get multiple addresses from different subnets on the same interface to work right on a Linux box; for instance, source address selection is non-deterministic and you can end up with mismatched source addresses. However, I do not think this is a big simulation use case and I think that, more often than not, this will really be indicative of a programming error. I do think we need to support multiple IP addresses per interface but mainly when we get to IPv6 where that is the norm.

Note: this is somewhat related to bug 85, where we have ifIndexes at different levels of the system; IP aliasing will definitely break the alignment of IPv4Interface index to NetDevice index even if we tried to align otherwise.

Comment from Tom:

I think we should consider the solution to bug 85 jointly, and API change is likely. If we do "FindIndexForDevice" and it returns a single integer, what happens when there is more than one to return? I also think that Ipv6 will cause us to revisit (maybe the Ipv6 group has already thought about this).

Comment from Tom:

Here is some more food for thought that we may consider about solving this and bug 85.

In Linux, there is one ifIndex that refers to both the NetDevice (struct net_device) and Ipv4Interface (struct in_device). Each struct in_device contains a list of struct in_ifaddr that actually hold the address bits.

So, it seems to me that a solution to these problems may lie in:

1) adding support for multiple IP addresses for each Ipv4Interface
   - this changes the API and implementation for class Ipv4Interface, which assumes a single address

2) taking care to align ifIndex between NetDevice and Ipv4Interface

Marked as P2/LATER per prior agreement. Will revisit after 3.1 release.

(In reply to comment #10)
> Comment from Tom:
>
> Here is some more food for thought that we may consider about solving
> this and bug 85.
>
> In Linux, there is one ifIndex that refers to both the NetDevice (struct
> net_device) and Ipv4Interface (struct in_device). Each struct in_device
> contains a list of struct in_ifaddr that actually hold the address bits.
>
> So, it seems to me that a solution to these problems may lie in:
>
> 1) adding support for multiple IP addresses for each Ipv4Interface
> - this changes the API and implementation for class Ipv4Interface, which
> assumes a single address
>
> 2) taking care to align ifIndex between NetDevice and Ipv4Interface
>

I think this is a better solution. For one thing, having different interface indexes for L3 and L2 interfaces can be a source of confusion, especially wrt tracing. Also it is always nice to align with a well known implementation, Linux in this case. Finally, for the IPv6 case it makes so much more sense, because in IPv6 conceptually you have a single interface with a list of addresses (even in ifconfig you only see eth0 with several IPv6 addresses, none of the eth0:1, eth0:2... crap that is in IPv4).

Also, even if we have to drop "IP aliasing" support in IPv4, forever, would anyone really be upset?

Change back to P2 since there is disagreement about whether or not we should even address this.

If we address this in the future, I think that we could add a class InetDevice as a container class for Ipv4Interface objects, and add additional API to iterate. It may be that no API needs to change, but any code that relies on assumed 1:1 relationship between Ipv4Interface and NetDevice might have to change. For instance, Ipv4::FindInterfaceForDevice() would only return the first of possibly many interfaces.

I've been looking at this bug again in the context of the ns-3-ip work. AFAICS, there is no support in the real world for routing protocols to deal with IPv4 aliasing on a single net device. Instead, how this seems to be universally handled is the creation of virtual devices on top of a physical device.
These are often called subinterfaces in routers.

Since global routing is a port of the quagga OSPFv2 implementation, which cannot handle multiple IP addresses AFAICS (but could handle virtual devices), I am currently thinking that the right way to handle this is:

- do not try to support global routing for multiple Ipv4 addresses on a single Ipv4Interface/NetDevice. We can allow such configuration, but the global routing code should print out a warning at NS_LOG_WARN level when it encounters a multiply-addressed Ipv4 interface, and just use the first address (index zero) in such a case. I believe that trying to support properly lots of IP addresses on a single interface will just make the code very complicated, and is low priority IMO compared to other things.

- we should instead encourage some development of virtual net devices for ns-3. This might make a good GSOC or student project. The global routing should then support virtual net devices for this IP aliasing functionality (like in the real world). Virtual net devices may also be useful for other scenarios, such as vlan tagging, channel bonding, and ad hoc routing.

- downgrade this bug to P3 (waiting for virtual net device)

(In reply to comment #15)
> I've been looking at this bug again in the context of the ns-3-ip work.
> AFAICS, there is no support in the real world for routing protocols to deal
> with IPv4 aliasing on a single net device. Instead, how this seems to be
> universally handled is the creation of virtual devices on top of a physical
> device. These are often called subinterfaces in routers.
[...]

If you go for this, can you think of another name than "virtual devices"? That's to avoid confusion with http://code.nsnam.org/gjc/ns-3-virtual-netdevice/ Maybe "alias devices"? Especially since I was planning to write a paper in the next few weeks including this VirtualNetDevice (among other things, like UMTS/Wifi integration work that my colleague is working on).
I still get the feeling these fake devices will be a source of confusion for people doing work on L2 stuff (like me). We'll have to take extra care to check whether a device is real or fake before using it... If you do it, I recommend to at least try to put it on a different layer, so that only layer >= 3 is affected and they do not appear in the list of Node devices (Node::GetDevice).

(In reply to comment #16)
> (In reply to comment #15)
> > I've been looking at this bug again in the context of the ns-3-ip work.
> > AFAICS, there is no support in the real world for routing protocols to deal
> > with IPv4 aliasing on a single net device. Instead, how this seems to be
> > universally handled is the creation of virtual devices on top of a physical
> > device. These are often called subinterfaces in routers.
> [...]
>
> If you go for this, can you think of another name than "virtual devices"?
> That's to avoid confusion with
> http://code.nsnam.org/gjc/ns-3-virtual-netdevice/
> Maybe "alias devices"? Especially since I was planning to write a paper in the
> next few weeks including this VirtualNetDevice (among other things, like
> UMTS/Wifi integration work that my colleague is working on).

I had forgotten about your tap-like VirtualNetDevice when I posted the above; sorry. Do you have future merge plans for it?

>
> I still get the feeling these fake devices will be a source of confusion
> for people doing work on L2 stuff (like me). We'll have to take extra care to
> check whether a device is real or fake before using it... If you do it, I
> recommend to at least try to put it on a different layer, so that only
> layer >= 3 is affected and they do not appear in the list of Node devices
> (Node::GetDevice).
>

It seems to me that these should not be in layer-3 but should appear to layer-3 like a real device. For instance, it seems to me that loopback should be a L2-virtual device rather than the way it is an IPv4 only device now.
I think that people doing L2 work on 802.1q will probably want support for these types of devices.

But I agree with you that there are issues lurking with these types of devices that need to be resolved so that they do not become confusing.

(In reply to comment #17)
> (In reply to comment #16)
> > (In reply to comment #15)
> > > I've been looking at this bug again in the context of the ns-3-ip work.
> > > AFAICS, there is no support in the real world for routing protocols to deal
> > > with IPv4 aliasing on a single net device. Instead, how this seems to be
> > > universally handled is the creation of virtual devices on top of a physical
> > > device. These are often called subinterfaces in routers.
> > [...]
> >
> > If you go for this, can you think of another name than "virtual devices"?
> > That's to avoid confusion with
> > http://code.nsnam.org/gjc/ns-3-virtual-netdevice/
> > Maybe "alias devices"? Especially since I was planning to write a paper in the
> > next few weeks including this VirtualNetDevice (among other things, like
> > UMTS/Wifi integration work that my colleague is working on).
>
> I had forgotten about your tap-like VirtualNetDevice when I posted the above;
> sorry. Do you have future merge plans for it?

I am unsure. Last time I posted the code I did not get much reaction, which probably means it could be a specialized feature. It is generally useful for IP-in-IP overlay networks, or "tunnels", but I am guessing not that many people are interested in tunneling. So, in principle no, I have no plans to merge it. But I am receptive to merging if pushed.

>
> >
> > I still get the feeling these fake devices will be a source of confusion
> > for people doing work on L2 stuff (like me). We'll have to take extra care to
> > check whether a device is real or fake before using it...
> > If you do it, I recommend to at least try to put it on a different layer,
> > so that only layer >= 3 is affected and they do not appear in the list of
> > Node devices (Node::GetDevice).
> >
>
> It seems to me that these should not be in layer-3 but should appear to layer-3
> like a real device. For instance, it seems to me that loopback should be a
> L2-virtual device rather than the way it is an IPv4 only device now. I think
> that people doing L2 work on 802.1q will probably want support for these types
> of devices.

You make a good point regarding 802.1q.

>
> But I agree with you that there are issues lurking with these types of devices
> that need to be resolved so that they do not become confusing.
>

One of the things that scares me is that this will probably mean we will need yet another IsLoopback () pure virtual method in the base class, or IsVirtual (), if we follow the current trend. The number of IsXxx () methods in the base class is starting to grow too much IMHO.

(In reply to comment #18)
> > > I still get the feeling these fake devices will be a source of confusion
> > > for people doing work on L2 stuff (like me). We'll have to take extra care to
> > > check whether a device is real or fake before using it... If you do it, I
> > > recommend to at least try to put it on a different layer, so that only
> > > layer >= 3 is affected and they do not appear in the list of Node devices
> > > (Node::GetDevice).
> > >
> >
> > It seems to me that these should not be in layer-3 but should appear to layer-3
> > like a real device. For instance, it seems to me that loopback should be a
> > L2-virtual device rather than the way it is an IPv4 only device now. I think
> > that people doing L2 work on 802.1q will probably want support for these types
> > of devices.

I believe that the reason why the IP loopback is implemented the way it is is because of ARP, which must not be used for this loopback device. Previously, there was no NeedsArp method so, maybe now that this method is here, we can do what you describe. I am worried that this change could introduce very subtle problems though.

>
> You make a good point regarding 802.1q.
>
> > But I agree with you that there are issues lurking with these types of devices
> > that need to be resolved so that they do not become confusing.
> >
>
> One of the things that scares me is that this will probably mean we will need
> yet another IsLoopback () pure virtual method in the base class, or IsVirtual
> (), if we follow the current trend. The number of IsXxx () methods in the base
> class is starting to grow too much IMHO.

If there is a need for yet another IsXXX method, this is a non-starter for me.

sliding to ns-3.5

(In reply to comment #15)
> I've been looking at this bug again in the context of the ns-3-ip work.
> AFAICS, there is no support in the real world for routing protocols to deal
> with IPv4 aliasing on a single net device. Instead, how this seems to be
> universally handled is the creation of virtual devices on top of a physical
> device. These are often called subinterfaces in routers.
>
> Since global routing is a port of the quagga OSPFv2 implementation, which
> cannot handle multiple IP addresses AFAICS (but could handle virtual devices),
> I am currently thinking that the right way to handle this is:
>
> - do not try to support global routing for multiple Ipv4 addresses on a single
> Ipv4Interface/NetDevice. We can allow such configuration, but the global
> routing code should print out a warning at NS_LOG_WARN level when it encounters
> a multiply-addressed Ipv4 interface, and just use the first address (index
> zero) in such a case.
> I believe that trying to support properly lots of IP
> addresses on a single interface will just make the code very complicated, and
> is low priority IMO compared to other things.

I added several such statements in the global routing code; e.g.
NS_LOG_WARN ("Warning, interface has multiple IP addresses; using only the primary one");

>
> - we should instead encourage some development of virtual net devices for ns-3.
> This might make a good GSOC or student project. The global routing should
> then support virtual net devices for this IP aliasing functionality (like in
> the real world). Virtual net devices may also be useful for other scenarios,
> such as vlan tagging, channel bonding, and ad hoc routing.
>
> - downgrade this bug to P3 (waiting for virtual net device)
>

I'm marking this as FIXED, in the absence of a virtual net device (the code will log warnings, as I suggested earlier). When we get virtual net devices, we will need to make global routing work with them or else file a new bug on that topic.
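
The resolution (warn on a multiply-addressed interface and route on the first address only) can be illustrated with a small standalone sketch. It does not use ns-3: `ToyInterface` and `PrimaryAddressForRouting` are hypothetical names, a plain `std::ostream` stands in for the `NS_LOG_WARN` call, and the list-of-addresses layout only loosely mirrors the Linux `in_device` / `in_ifaddr` arrangement mentioned earlier in the thread.

```cpp
#include <cassert>
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical model of an interface that owns a list of addresses.
struct ToyInterface
{
  std::vector<std::uint32_t> addresses;
};

// Returns the address that routing would use: the one at index zero.
// Writes a warning to `log` when more than one address is configured.
// Precondition: the interface has at least one address.
std::uint32_t PrimaryAddressForRouting (const ToyInterface &iface,
                                        std::ostream &log)
{
  assert (!iface.addresses.empty ());
  if (iface.addresses.size () > 1)
    {
      log << "Warning, interface has multiple IP addresses; "
             "using only the primary one\n";
    }
  return iface.addresses.front ();
}
```

Passing a `std::ostringstream` as `log` makes the warning easy to capture in a test; any secondary addresses are ignored by design, which matches the policy described above and not full aliasing support.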