Message-ID: <20110323082423.321bf8ce@nehalam>
Date: Wed, 23 Mar 2011 08:24:23 -0700
From: Stephen Hemminger <shemminger@...tta.com>
To: David Miller <davem@...emloft.net>
Cc: ja@....bg, netdev@...r.kernel.org
Subject: Re: [BUG] VPN broken in net-next
On Tue, 22 Mar 2011 21:56:55 -0700 (PDT)
David Miller <davem@...emloft.net> wrote:
> From: Julian Anastasov <ja@....bg>
> Date: Fri, 4 Mar 2011 10:39:55 +0200 (EET)
>
> > On Thu, 3 Mar 2011, David Miller wrote:
> >
> >> I suspect that even if we need to handle prefixes, we can still use
> >> the hash for optimistic lookup, and fallback to a local table FIB
> >> inspection if that fails.
> >
> > Yes, as ip_route_output_slow uses __ip_dev_find for
> > fl4_src there should be some kind of fallback to local table,
> > so that traffic from 127.0.0.2 to 127.0.0.3 or other local
> > subnets on loopback can work. Another option is to use
> > inet_addr_onlink but I suspect people can add many addresses
> > on loopback: inet_addr_onlink(loopback_indev, addr, 0)
>
> I just got back to this, sorry for taking so long :-)
>
> Here is the patch I've come up with and will commit to
> net-2.6, thanks!
>
> --------------------
> ipv4: Fallback to FIB local table in __ip_dev_find().
>
> In commit 9435eb1cf0b76b323019cebf8d16762a50a12a19
> ("ipv4: Implement __ip_dev_find using new interface address hash.")
> we reimplemented __ip_dev_find() so that it doesn't have to
> do a full FIB table lookup.
>
> Instead, it consults a hash table of addresses configured to
> interfaces.
>
> This works identically to the old code in all except one case,
> and that is for loopback subnets.
>
> The old code would match the loopback device for any IP address
> that falls within a subnet configured to the loopback device.
>
> Handle this corner case by doing the FIB lookup.
>
> We could implement this via inet_addr_onlink() but:
>
> 1) Someone could configure many addresses to loopback and
> inet_addr_onlink() is a simple list traversal.
>
> 2) We know the old code works.
>
> Reported-by: Julian Anastasov <ja@....bg>
> Signed-off-by: David S. Miller <davem@...emloft.net>
> ---
> net/ipv4/devinet.c | 16 ++++++++++++++++
> 1 files changed, 16 insertions(+), 0 deletions(-)
>
> diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
> index d5a4553..5345b0b 100644
> --- a/net/ipv4/devinet.c
> +++ b/net/ipv4/devinet.c
> @@ -64,6 +64,8 @@
>  #include <net/rtnetlink.h>
>  #include <net/net_namespace.h>
>
> +#include "fib_lookup.h"
> +
>  static struct ipv4_devconf ipv4_devconf = {
>  	.data = {
>  		[IPV4_DEVCONF_ACCEPT_REDIRECTS - 1] = 1,
> @@ -151,6 +153,20 @@ struct net_device *__ip_dev_find(struct net *net, __be32 addr, bool devref)
>  			break;
>  		}
>  	}
> +	if (!result) {
> +		struct flowi4 fl4 = { .daddr = addr };
> +		struct fib_result res = { 0 };
> +		struct fib_table *local;
> +
> +		/* Fallback to FIB local table so that communication
> +		 * over loopback subnets work.
> +		 */
> +		local = fib_get_table(net, RT_TABLE_LOCAL);
> +		if (local &&
> +		    !fib_table_lookup(local, &fl4, &res, FIB_LOOKUP_NOREF) &&
> +		    res.type == RTN_LOCAL)
> +			result = FIB_RES_DEV(res);
> +	}
>  	if (result && devref)
>  		dev_hold(result);
>  	rcu_read_unlock();
Acked-by: Stephen Hemminger <shemminger@...tta.com>
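
For readers who want to see the lookup order outside the kernel, here is a
minimal user-space sketch of the behaviour the patch describes: an exact-match
pass first (standing in for the interface address hash), then a fallback that
matches any address inside a subnet configured on loopback (standing in for
the local table FIB lookup). Every name below (model_addr, model_exact_lookup,
model_local_table_lookup, model_dev_find, the IPV4 macro, the sample addresses
and interface indexes) is hypothetical and exists only for illustration. None
of it is kernel API, and the linear scans are a simplification of the real
hash and FIB structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model, not kernel code: addresses are in host byte order. */
#define IPV4(a, b, c, d) \
	(((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
	 ((uint32_t)(c) << 8) | (uint32_t)(d))

#define MODEL_LOOPBACK_IFINDEX 1	/* assumed index of "lo" */

struct model_addr {
	uint32_t addr;		/* configured address */
	unsigned int prefix_len;	/* subnet prefix length, 0..32 */
	int ifindex;		/* owning interface, 0 means "none" */
};

/* Assumed sample configuration: 127.0.0.1/8 on lo, 192.168.1.10/24 on eth0. */
static const struct model_addr model_addrs[] = {
	{ IPV4(127, 0, 0, 1), 8, MODEL_LOOPBACK_IFINDEX },
	{ IPV4(192, 168, 1, 10), 24, 2 },
};

#define MODEL_NADDRS (sizeof(model_addrs) / sizeof(model_addrs[0]))

static uint32_t model_mask(unsigned int prefix_len)
{
	assert(prefix_len <= 32);
	/* Shifting a 32-bit value by 32 is undefined, so handle /0 apart. */
	return prefix_len ? ~(uint32_t)0 << (32 - prefix_len) : 0;
}

/* Step 1: exact match only, standing in for the address hash lookup. */
static int model_exact_lookup(uint32_t addr)
{
	size_t i;

	for (i = 0; i < MODEL_NADDRS; i++)
		if (model_addrs[i].addr == addr)
			return model_addrs[i].ifindex;
	return 0;
}

/* Step 2: stand-in for the local table lookup. In this model only
 * loopback subnets yield a local match for a non-configured address.
 */
static int model_local_table_lookup(uint32_t addr)
{
	size_t i;

	for (i = 0; i < MODEL_NADDRS; i++) {
		const struct model_addr *a = &model_addrs[i];
		uint32_t mask = model_mask(a->prefix_len);

		if (a->ifindex == MODEL_LOOPBACK_IFINDEX &&
		    (addr & mask) == (a->addr & mask))
			return a->ifindex;
	}
	return 0;
}

/* Optimistic exact lookup first, fallback only when it finds nothing. */
static int model_dev_find(uint32_t addr)
{
	int ifindex = model_exact_lookup(addr);

	if (!ifindex)
		ifindex = model_local_table_lookup(addr);
	return ifindex;
}
```

In this model, model_dev_find(IPV4(127, 0, 0, 2)) resolves to the loopback
index even though only 127.0.0.1 is configured, while 192.168.1.11 resolves to
nothing because the subnet fallback applies to loopback only.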