[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <05707e9e-08ac-4ee1-b910-883f8b4b2636@blackwall.org>
Date: Mon, 16 Sep 2024 11:48:17 +0300
From: Nikolay Aleksandrov <razor@...ckwall.org>
To: Jiwon Kim <jiwonaid0@...il.com>, jv@...sburgh.net, andy@...yhouse.net,
davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
pabeni@...hat.com, ast@...nel.org, daniel@...earbox.net, hawk@...nel.org,
john.fastabend@...il.com, joamaki@...il.com
Cc: netdev@...r.kernel.org, bpf@...r.kernel.org,
syzbot+c187823a52ed505b2257@...kaller.appspotmail.com
Subject: Re: [PATCH net] bondig: Add bond_xdp_check for bond_xdp_xmit in
bond_main.c
On 16/09/2024 08:50, Jiwon Kim wrote:
> Add bond_xdp_check to ensure the bond interface is in a valid state.
>
> syzbot reported WARNING in bond_xdp_get_xmit_slave.
> In bond_xdp_get_xmit_slave, the comment says
> /* Should never happen. Mode guarded by bond_xdp_check() */.
> However, it does not check the status when entering bond_xdp_xmit.
>
> Reported-by: syzbot+c187823a52ed505b2257@...kaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=c187823a52ed505b2257
> Fixes: 9e2ee5c7e7c3 ("net, bonding: Add XDP support to the bonding driver")
> Signed-off-by: Jiwon Kim <jiwonaid0@...il.com>
> ---
> drivers/net/bonding/bond_main.c | 33 ++++++++++++++++++---------------
> 1 file changed, 18 insertions(+), 15 deletions(-)
>
How did you figure the problem is there? Did you take any time to actually
understand it? This patch doesn't fix anything, the warning can be easily
triggered with it. The actual fix is to remove that WARN_ON() altogether
and downgrade the netdev_err() to a ratelimited version. The reason is that
we can always get to a state where at least 1 bond device has xdp program
installed which increases bpf_master_redirect_enabled_key and another bond
device which uses xdpgeneric, then install an ebpf program that simply
returns ACT_TX on xdpgeneric bond's slave and voila - you get the warning.
setup is[1]:
$ ip l add veth0 type veth peer veth1
$ ip l add veth3 type veth peer veth4
$ ip l add bond0 type bond mode 6 # <- transmit-alb mode, unsupported by xdp
$ ip l add bond1 type bond # <- rr mode by default, supported by xdp
$ ip l set veth0 master bond1
$ ip l set bond1 up
$ ip l set dev bond1 xdpdrv object tx_xdp.o section xdp_tx # <- we need xdpdrv program to increase the static key, more below
$ ip l set veth3 master bond0
$ ip l set bond0 up
$ ip l set veth4 up
$ ip l set veth3 xdpgeneric object tx_xdp.o section xdp_tx # <- now we'll hit the codepath we need after veth3 Rx's a packet
If you take the time to look at the call stack and the actual code, you'll
see it goes something like (for the xdpgeneric bond slave, veth3):
...
bpf_prog_run_generic_xdp() for veth3
-> bpf_prog_run_xdp()
-> __bpf_prog_run() # return ACT_TX
-> xdp_master_redirect() # called because we have ACT_TX && netif_is_bond_slave(xdp->rxq->dev)
-> master->netdev_ops->ndo_xdp_get_xmit_slave(master, xdp); # and here we go, WARN_ON()
I've had a patch for awhile now about this and have taken the time to look into it.
I guess it's time to dust it off and send it out for review. :)
Thanks,
Nik
Powered by blists - more mailing lists