Message-ID: <87h6lxy3zq.fsf@toke.dk>
Date: Tue, 07 Nov 2023 16:31:21 +0100
From: Toke Høiland-Jørgensen <toke@...hat.com>
To: "Nelson, Shannon" <shannon.nelson@....com>, Jesper Dangaard Brouer
 <hawk@...nel.org>, David Ahern <dsahern@...il.com>, Jakub Kicinski
 <kuba@...nel.org>, netdev@...r.kernel.org, bpf@...r.kernel.org, Daniel
 Borkmann <daniel@...earbox.net>, Alexei Starovoitov <ast@...nel.org>,
 Andrii Nakryiko <andrii@...nel.org>
Subject: Re: BPF/XDP: kernel panic when removing an interface that is an
 xdp_redirect target

"Nelson, Shannon" <shannon.nelson@....com> writes:

> While testing new code to support XDP in the ionic driver we found that 
> we could panic the kernel by running a bind/unbind loop on the target 
> interface of an xdp_redirect action.  Obviously this is a stress test 
> that is abusing the system, but it does point to a window of opportunity 
> in bq_enqueue() and bq_xmit_all().  I believe that while the validity of 
> the target interface has been checked in __xdp_enqueue(), the interface 
> can be unbound by the time either bq_enqueue() or bq_xmit_all() tries to 
> use the interface.  There is no locking or reference taken on the 
> interface to hold it in place before the target’s ndo_xdp_xmit() is called.
>
> Below is a stack trace that our tester captured while running our test 
> code on a RHEL 9.2 kernel – yes, I know, unpublished driver code on a 
> non-upstream kernel.  But if you look at the current upstream code in 
> kernel/bpf/devmap.c I think you can see what we ran into.
>
> Other than telling users to not abuse the system with a bind/unbind 
> loop, is there something we can do to limit the potential pain here? 
> Without knowing what interfaces might be targeted by the users’ XDP 
> programs, is there a step the originating driver can do to take 
> precautions?  Did we simply miss a step in the driver, or is this an 
> actual problem in the devmap code?

Sounds like a driver bug :)

The XDP redirect flow guarantees that all outstanding packets are
flushed within a single NAPI cycle, as documented here:
https://docs.kernel.org/bpf/redirect.html

So basically, the driver should be doing a two-step teardown: remove
global visibility of the resource in question, wait for all concurrent
users to finish, and *then* free the data structure. This corresponds to
the usual RCU protection: resources should be kept alive until all
concurrent RCU critical sections have exited on all CPUs. So if your
driver is removing an interface's data structure without waiting for
concurrent NAPI cycles to finish, that's a bug in the driver.

This kind of thing is what the synchronize_net() function is for; for a
usage example, see veth_napi_del_range(). My guess would be that you're
missing this as part of your driver teardown flow?
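
As a rough sketch of that ordering (all the foo_* names are hypothetical,
not from any real driver; this is an illustration of the pattern, not a
drop-in fix):

```c
/* Hypothetical teardown sketch -- foo_netdev_priv, foo_free_queues and
 * the surrounding structure are made up for illustration.
 */
static void foo_teardown_queues(struct foo_netdev_priv *priv)
{
	/* Step 1: remove global visibility -- disable NAPI so no new
	 * cycles can start against these queues.
	 */
	napi_disable(&priv->napi);

	/* Step 2: wait for all concurrent RCU read-side critical
	 * sections (which covers in-flight NAPI cycles performing XDP
	 * redirects into this device) to finish on all CPUs.
	 */
	synchronize_net();

	/* Step 3: only now is it safe to free the structures that a
	 * concurrent ndo_xdp_xmit() call may have been touching.
	 */
	foo_free_queues(priv);
}
```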

Another source of a bug like this could be that your driver does not in
fact call xdp_do_flush() before exiting its NAPI cycle, so that there
will be packets from the previous cycle in the bq queue, in which case
the assumption mentioned in the linked document obviously breaks down.
But that would also be a driver bug :)
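
Again as a hypothetical sketch (foo_* names invented), the flush belongs
inside the poll function, before the cycle completes:

```c
/* Hypothetical NAPI poll sketch. The point is the xdp_do_flush() call
 * before the cycle ends, so no redirected packets sit in the per-CPU
 * bulk queue across NAPI cycles.
 */
static int foo_napi_poll(struct napi_struct *napi, int budget)
{
	struct foo_queue *q = container_of(napi, struct foo_queue, napi);
	int work_done;

	/* RX clean loop; may queue packets via XDP_REDIRECT */
	work_done = foo_clean_rx(q, budget);

	/* Flush everything queued by xdp_do_redirect() during this
	 * cycle before exiting; otherwise packets linger in the bulk
	 * queue while the target interface may go away.
	 */
	xdp_do_flush();

	if (work_done < budget && napi_complete_done(napi, work_done))
		foo_enable_irq(q);

	return work_done;
}
```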

-Toke

