Message-ID: <3776.1642550885@famine>
Date: Tue, 18 Jan 2022 16:08:05 -0800
From: Jay Vosburgh <jay.vosburgh@...onical.com>
To: Ivan Babrou <ivan@...udflare.com>
cc: Jussi Maki <joamaki@...il.com>,
Daniel Borkmann <daniel@...earbox.net>,
Veaceslav Falico <vfalico@...il.com>,
Andy Gospodarek <andy@...yhouse.net>,
kernel-team <kernel-team@...udflare.com>,
Linux Kernel Network Developers <netdev@...r.kernel.org>
Subject: Re: Empty return from bond_eth_hash in 5.15

Ivan Babrou <ivan@...udflare.com> wrote:
>Hello,
>
>We noticed an issue on Linux 5.15 where it sends packets from a single
>connection via different bond members. Some of our machines are
>connected to multiple TORs, which means that BGP can attract the same
>connection to different servers, depending on which cable you
>traverse.
>
>On Linux 5.10 I can see bond_xmit_hash always return the same hash for
>the same connection:
>
>$ sudo bpftrace --include linux/ip.h -e 'kprobe:bond_xmit_hash {
>@skbs[pid] = arg1 } kretprobe:bond_xmit_hash { $skb_ptr = @skbs[pid];
>if ($skb_ptr) { $skb = (struct sk_buff *) $skb_ptr; $ipheader =
>((struct iphdr *) ($skb->head + $skb->network_header)); printf("%s
>%x\n", ntop($ipheader->daddr), retval); } }' | fgrep --line-buffered
>x.y.z.205
>x.y.z.205 9f24591
>x.y.z.205 9f24591
>x.y.z.205 9f24591
>x.y.z.205 9f24591
>x.y.z.205 9f24591
>... many more of these
>
>On Linux 5.10 I get fewer lines, mostly zeros for hash and one actual hash:

	Presumably you mean 5.15 here.

>$ sudo bpftrace -e 'kprobe:bond_xmit_hash { @skbs[pid] = arg1 }
>kretprobe:bond_xmit_hash { $skb_ptr = @skbs[pid]; if ($skb_ptr) { $skb
>= (struct sk_buff *) $skb_ptr; $ipheader = ((struct iphdr *)
>($skb->head + $skb->network_header)); printf("%s %x\n",
>ntop($ipheader->daddr), retval); } }' | fgrep --line-buffered
>x.y.z.205
>x.y.z.205 0
>x.y.z.205 0
>x.y.z.205 215fec1b
>
>As I mentioned above, this ends up breaking connections for us, which
>is unfortunate.
>
>I suspect that "net, bonding: Refactor bond_xmit_hash for use with
>xdp_buff" commit a815bde56b1 has something to do with this. I don't
>think we use XDP on the machines I tested.

	This sounds similar to the issue resolved by:

commit 429e3d123d9a50cc9882402e40e0ac912d88cfcf (HEAD -> master, origin/master, origin/HEAD)
Author: Moshe Tal <moshet@...dia.com>
Date:   Sun Jan 16 19:39:29 2022 +0200

    bonding: Fix extraction of ports from the packet headers

    Wrong hash sends single stream to multiple output interfaces.

	which was just committed to net a few days ago; are you in a
position that you'd be able to test this change?
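
	To illustrate why an unstable hash splits one connection across
bond members, here is a minimal Python sketch. It is not the kernel's
algorithm: flow_hash and pick_member are hypothetical helpers, CRC32
stands in for the real hash, the addresses are documentation
placeholders, and "ports read as zero" is only an assumed failure mode
for the sake of the example.

```python
import ipaddress
import struct
import zlib


def flow_hash(src_ip, dst_ip, src_port, dst_port):
    # Toy stand-in for a layer3+4 transmit hash (hypothetical helper,
    # NOT the kernel's bond_xmit_hash algorithm).
    key = (
        ipaddress.ip_address(src_ip).packed
        + ipaddress.ip_address(dst_ip).packed
        + struct.pack("!HH", src_port, dst_port)
    )
    return zlib.crc32(key)


def pick_member(hash_value, member_count):
    # Map a hash onto one of member_count bond members.
    return hash_value % member_count


if __name__ == "__main__":
    flow = ("192.0.2.10", "198.51.100.205", 40000, 443)
    good = flow_hash(*flow)
    # Same connection, but with the ports extracted incorrectly
    # (assumed here to come back as 0).
    bad = flow_hash(flow[0], flow[1], 0, 0)
    print(f"ports read correctly: hash={good:x} member={pick_member(good, 2)}")
    print(f"ports read as zero:   hash={bad:x} member={pick_member(bad, 2)}")
```

	As long as every packet of a connection yields the same tuple,
the hash and therefore the chosen member stay fixed. If the ports are
extracted differently from packet to packet, the hash changes and the
packets may leave via different members.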
-J
---
-Jay Vosburgh, jay.vosburgh@...onical.com