[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20150710115305.GB1855@localhost.localdomain>
Date: Fri, 10 Jul 2015 08:53:05 -0300
From: Marcelo Ricardo Leitner <mleitner@...hat.com>
To: Michael Tuexen <tuexen@...muenster.de>
Cc: netdev@...r.kernel.org, linux-sctp@...r.kernel.org,
Vlad Yasevich <vyasevich@...il.com>,
Neil Horman <nhorman@...driver.com>
Subject: Re: [RFC PATCH net-next] sctp: fix src address selection if using
secondary addresses
On Thu, Jul 09, 2015 at 09:55:19PM +0200, Michael Tuexen wrote:
> > On 09 Jul 2015, at 18:54, Marcelo Ricardo Leitner <mleitner@...hat.com> wrote:
> >
> > Cc'ing Michael too.
> I'm not familiar with the Linux kernel code, so I can't comment on it.
> But making sure to use a source address belonging to the emitting
> interface makes sense for me.
>
> Best regards
> Michael
That's pretty much what I was looking for, in case I missed something on
SCTP RFCs.
Thanks Michael!
> > On Tue, Jul 07, 2015 at 02:42:25PM -0300, Marcelo Ricardo Leitner wrote:
> >> Hi folks,
> >>
> >> This is an attempt to better choose a src address for sctp packets as
> >> peers with rp_filter could be dropping our packets in some situations.
> >> With this patch, we try to respect and use a src address that belongs to
> >> the interface we are putting the packet out.
> >>
> >> I have that feeling that there is be a better way to do this, but I
> >> just couldn't see it.
> >>
> >> This patch has been tested with and without gateways between the peers
> >> and also just two peers connected via two subnets and results were
> >> pretty good.
> >>
> >> One could think that this limits the address combination we can use, but
> >> such combinations probably are just bogus anyway. Like, if you have an
> >> host with address A1 and B1 and another with A2 and B2, you cannot
> >> expect that A can use A1 to reach B2 through subnet B, because the
> >> return path would be via the other link which, when this switch happens,
> >> we are thinking it's broken.
> >>
> >> Thanks,
> >> Marcelo
> >>
> >> ---8<---
> >>
> >> In short, sctp is likely to incorrectly choose src address if socket is
> >> bound to secondary addresses. This patch fixes it by adding a new check
> >> that tries to anticipate if the src address would be expected by the
> >> next hop/peer on this interface by doing reverse routing.
> >>
> >> Also took the shot to reduce the indentation level on this code.
> >>
> >> Details:
> >>
> >> Currently, sctp will do a routing attempt without specifying the src
> >> address and compare the returned value (preferred source) with the
> >> addresses that the socket is bound to. When using secondary addresses,
> >> this will not match.
> >>
> >> Then it will try specifying each of the addresses that the socket is
> >> bound to and re-routing, checking if that address is valid as src for
> >> that dst. Thing is, this check alone is weak:
> >>
> >> # ip r l
> >> 192.168.100.0/24 dev eth1 proto kernel scope link src 192.168.100.149
> >> 192.168.122.0/24 dev eth0 proto kernel scope link src 192.168.122.147
> >>
> >> # ip a l
> >> 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
> >> link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
> >> inet 127.0.0.1/8 scope host lo
> >> valid_lft forever preferred_lft forever
> >> inet6 ::1/128 scope host
> >> valid_lft forever preferred_lft forever
> >> 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
> >> link/ether 52:54:00:15:18:6a brd ff:ff:ff:ff:ff:ff
> >> inet 192.168.122.147/24 brd 192.168.122.255 scope global dynamic eth0
> >> valid_lft 2160sec preferred_lft 2160sec
> >> inet 192.168.122.148/24 scope global secondary eth0
> >> valid_lft forever preferred_lft forever
> >> inet6 fe80::5054:ff:fe15:186a/64 scope link
> >> valid_lft forever preferred_lft forever
> >> 3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
> >> link/ether 52:54:00:b3:91:46 brd ff:ff:ff:ff:ff:ff
> >> inet 192.168.100.149/24 brd 192.168.100.255 scope global dynamic eth1
> >> valid_lft 2162sec preferred_lft 2162sec
> >> inet 192.168.100.148/24 scope global secondary eth1
> >> valid_lft forever preferred_lft forever
> >> inet6 fe80::5054:ff:feb3:9146/64 scope link
> >> valid_lft forever preferred_lft forever
> >> 4: ens9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
> >> link/ether 52:54:00:05:47:ee brd ff:ff:ff:ff:ff:ff
> >> inet6 fe80::5054:ff:fe05:47ee/64 scope link
> >> valid_lft forever preferred_lft forever
> >>
> >> # ip r g 192.168.100.193 from 192.168.122.148
> >> 192.168.100.193 from 192.168.122.148 dev eth1
> >> cache
> >>
> >> Even if you specify an interface:
> >>
> >> # ip r g 192.168.100.193 from 192.168.122.148 oif eth1
> >> 192.168.100.193 from 192.168.122.148 dev eth1
> >> cache
> >>
> >> Although this would be valid, peers using rp_filter will drop such
> >> packets as their src doesn't match the routes for that interface.
> >>
> >> So we fix this by adding an extra check, we try to do the reverse
> >> routing and check if the interface used would be the same. If not, we
> >> skip such address. If yes, we use it.
> >>
> >> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@...il.com>
> >> ---
> >> net/sctp/protocol.c | 55 +++++++++++++++++++++++++++++++++++++++--------------
> >> 1 file changed, 41 insertions(+), 14 deletions(-)
> >>
> >> diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c
> >> index 59e80356672bdf89777265ae1f8c384792dfb98c..e52fd6f77963426a7cf3e83ca01a9cdae1cb2c01 100644
> >> --- a/net/sctp/protocol.c
> >> +++ b/net/sctp/protocol.c
> >> @@ -53,6 +53,7 @@
> >> #include <net/net_namespace.h>
> >> #include <net/protocol.h>
> >> #include <net/ip.h>
> >> +#include <net/ip_fib.h>
> >> #include <net/ipv6.h>
> >> #include <net/route.h>
> >> #include <net/sctp/sctp.h>
> >> @@ -487,23 +488,49 @@ static void sctp_v4_get_dst(struct sctp_transport *t, union sctp_addr *saddr,
> >> */
> >> rcu_read_lock();
> >> list_for_each_entry_rcu(laddr, &bp->address_list, list) {
> >> + struct flowi4 in;
> >> + struct fib_result res;
> >> +
> >> if (!laddr->valid)
> >> continue;
> >> - if ((laddr->state == SCTP_ADDR_SRC) &&
> >> - (AF_INET == laddr->a.sa.sa_family)) {
> >> - fl4->fl4_sport = laddr->a.v4.sin_port;
> >> - flowi4_update_output(fl4,
> >> - asoc->base.sk->sk_bound_dev_if,
> >> - RT_CONN_FLAGS(asoc->base.sk),
> >> - daddr->v4.sin_addr.s_addr,
> >> - laddr->a.v4.sin_addr.s_addr);
> >> -
> >> - rt = ip_route_output_key(sock_net(sk), fl4);
> >> - if (!IS_ERR(rt)) {
> >> - dst = &rt->dst;
> >> - goto out_unlock;
> >> - }
> >> + if (laddr->state != SCTP_ADDR_SRC ||
> >> + AF_INET != laddr->a.sa.sa_family)
> >> + continue;
> >> +
> >> + fl4->fl4_sport = laddr->a.v4.sin_port;
> >> + flowi4_update_output(fl4,
> >> + asoc->base.sk->sk_bound_dev_if,
> >> + RT_CONN_FLAGS(asoc->base.sk),
> >> + daddr->v4.sin_addr.s_addr,
> >> + laddr->a.v4.sin_addr.s_addr);
> >> +
> >> + rt = ip_route_output_key(sock_net(sk), fl4);
> >> + if (IS_ERR(rt))
> >> + continue;
> >> +
> >> + dst = &rt->dst;
> >> +
> >> + /* Double check if this is really expected. We simulate
> >> + * rp_filter by swapping src and dst. If interfaces are
> >> + * different, means that this src wouldn't be expected
> >> + * by the other host on this interface.
> >> + */
> >> + memcpy(&in, fl4, sizeof(in));
> >> + in.daddr = fl4->saddr;
> >> + in.saddr = fl4->daddr;
> >> + in.flowi4_iif = fl4->flowi4_oif;
> >> + in.flowi4_oif = 0;
> >> +
> >> + if (fib_lookup(sock_net(sk), &in, &res) ||
> >> + res.type != RTN_LOCAL ||
> >> + fl4->flowi4_oif != FIB_RES_DEV(res)->ifindex) {
> >> + /* Failed, so this was a false hit */
> >> + dst_release(dst);
> >> + dst = NULL;
> >> + continue;
> >> }
> >> +
> >> + break;
> >> }
> >>
> >> out_unlock:
> >> --
> >> 2.4.1
> >>
> >> --
> >> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
> >> the body of a message to majordomo@...r.kernel.org
> >> More majordomo info at http://vger.kernel.org/majordomo-info.html
> >>
> >
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists