Message-ID: <851ad28e-dc8b-da7c-66fa-ef88d684d7d2@intel.com>
Date:   Fri, 6 Dec 2019 07:25:35 +0100
From:   Björn Töpel <bjorn.topel@...el.com>
To:     William Tu <u9012063@...il.com>,
        Linux Kernel Network Developers <netdev@...r.kernel.org>,
        Xdp <xdp-newbies@...r.kernel.org>
Cc:     "Karlsson, Magnus" <magnus.karlsson@...el.com>
Subject: Re: Possible race condition on xsk_socket__create/xsk_bind

On 2019-12-06 00:21, William Tu wrote:
> Hi,
> 
> While testing XSK using OVS, we hit an issue when we create an xsk,
> destroy it, and create it again within a short time window.
> The call to xsk_socket__create returns EBUSY due to
>    xsk_bind
>      xdp_umem_assign_dev
>        xdp_get_umem_from_qid --> return EBUSY
> 
> I found that when everything works, the sequence is
>    <ovs creates xsk>
>    xsk_bind
>      xdp_umem_assign_dev
>    <ovs destroy xsk> ...
>    xsk_release
>    xsk_destruct
>      xdp_umem_release_deferred
>        xdp_umem_release
>          xdp_umem_clear_dev --> avoid the error above
> 
> But sometimes xsk_destruct has not yet been called when the
> next call to xsk_bind shows up, e.g.:
> 
>    <ovs creates xsk>
>    xsk_bind
>      xdp_umem_assign_dev
>    <ovs destroy xsk> ...
>    xsk_release
>    xsk_bind
>      xdp_umem_assign_dev
>        xdp_get_umem_from_qid (failed!)
>    ....
>    xsk_destruct
> 
> Is there a way to make sure the previous xsk is fully cleaned up,
> so we can safely call xsk_socket__create()?
>

Yes, the async cleanup is annoying. I *think* it can be done
synchronously, since the map doesn't linger on a sockref anymore; see
0402acd683c6 ("xsk: remove AF_XDP socket from map when the socket is
released").

So, it's not a race, it's just asynchronous. :-(

I'll take a stab at fixing this!
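
Until the cleanup is synchronous, one userspace workaround is to retry the
create call while it reports EBUSY. The following is only a minimal sketch
under stated assumptions, not something from the kernel or libbpf: the
names xsk_create_fn, create_with_retry, sleep_ms and fake_create are
hypothetical, and it assumes the create call reports failure as a negative
errno value. fake_create merely simulates a deferred cleanup that needs a
few more attempts to finish; in real use the callback would wrap
xsk_socket__create().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <time.h>

/* Hypothetical callback type: wraps whatever call creates the socket
 * (e.g. xsk_socket__create()) and returns 0 or a negative errno. */
typedef int (*xsk_create_fn)(void *ctx);

/* Sleep for the given number of milliseconds. */
static void sleep_ms(long ms)
{
	struct timespec ts = {
		.tv_sec = ms / 1000,
		.tv_nsec = (ms % 1000) * 1000000L,
	};

	nanosleep(&ts, NULL);
}

/* Call create(ctx) until it returns something other than -EBUSY or
 * max_attempts calls have been made. Returns the last result, or
 * -EINVAL if max_attempts is not positive. */
static int create_with_retry(xsk_create_fn create, void *ctx,
			     int max_attempts, long delay_ms)
{
	int ret = -EINVAL;

	assert(create != NULL);

	for (int i = 0; i < max_attempts; i++) {
		ret = create(ctx);
		if (ret != -EBUSY)
			break;
		/* The previous socket may not be destructed yet; wait. */
		if (i + 1 < max_attempts)
			sleep_ms(delay_ms);
	}
	return ret;
}

/* Stand-in for the real create call: returns -EBUSY until the
 * simulated deferred cleanup has "finished". */
static int fake_create(void *ctx)
{
	int *busy_calls_left = ctx;

	if (*busy_calls_left > 0) {
		(*busy_calls_left)--;
		return -EBUSY;
	}
	return 0;
}
```

A bounded retry only papers over the asynchronous release; the attempt
count and delay are arbitrary and would need tuning for a real setup.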


Cheers,
Björn


> The error is reproduced by OVS using:
> ovs-vsctl -- set interface afxdp-p0 options:n_rxq=1 type="afxdp" options:xdp-mode=native
> ovs-vsctl -- set interface afxdp-p0 options:n_rxq=1 type="afxdp" options:xdp-mode=generic
> ovs-vsctl -- set interface afxdp-p0 options:n_rxq=1 type="afxdp" options:xdp-mode=native
> This just keeps creating and destroying the xsk on the same device.
> 
> Thanks
> William
