Message-ID: <20180222184647.0e1d6631@redhat.com>
Date:   Thu, 22 Feb 2018 18:46:47 +0100
From:   Jesper Dangaard Brouer <brouer@...hat.com>
To:     Jason Wang <jasowang@...hat.com>
Cc:     brouer@...hat.com, netdev@...r.kernel.org,
        linux-kernel@...r.kernel.org, mst@...hat.com,
        christoffer.dall@...aro.org, sergei.shtylyov@...entembedded.com
Subject: Re: [PATCH net v3 2/2] tuntap: correctly add the missing xdp flush

On Thu, 22 Feb 2018 17:36:46 +0800
Jason Wang <jasowang@...hat.com> wrote:

> Commit 762c330d670e ("tuntap: add missing xdp flush") tries to fix the
> devmap stall caused by missed xdp flush by counting the pending xdp
> redirected packets and flush when it exceeds NAPI_POLL_WEIGHT or
> MSG_MORE is clear. This may lead to BUG() since xdp_do_flush() was
> called in the process context with preemption enabled. Simply
> disabling preemption may silence the warning but be not enough since
> process may move between different CPUS during a batch which cause
> xdp_do_flush() misses some CPU where the process run
> previously. Consider the fallouts, that commit was reverted. To fix
> the issue correctly, we can simply call xdp_do_flush() immediately
> after xdp_do_redirect(), a side effect is that this removes any
> possibility of batching which could be addressed in the future.
> 
> Reported-by: Christoffer Dall <christoffer.dall@...aro.org>
> Fixes: 762c330d670e ("tuntap: add missing xdp flush")
> Signed-off-by: Jason Wang <jasowang@...hat.com>
> ---
>  drivers/net/tun.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> index 2823a4a..a363ea2 100644
> --- a/drivers/net/tun.c
> +++ b/drivers/net/tun.c
> @@ -1662,6 +1662,7 @@ static struct sk_buff *tun_build_skb(struct tun_struct *tun,
>  			get_page(alloc_frag->page);
>  			alloc_frag->offset += buflen;
>  			err = xdp_do_redirect(tun->dev, &xdp, xdp_prog);
> +			xdp_do_flush_map();
>  			if (err)
>  				goto err_redirect;
>  			rcu_read_unlock();

As you have noticed, xdp_do_redirect() + xdp_do_flush_map() rely
heavily on being executed in softirq/napi_schedule context.
In particular, the map infra (devmap [1] + cpumap) depends on the rule
that the enqueue and flush operations MUST happen on the same CPU (e.g.
devmap records which devices need flushing in a this_cpu_ptr bitmap [1]).

What context is tun_build_skb() invoked under?

Even when you call xdp_do_redirect() and xdp_do_flush_map() right after
each other, are we sure we cannot be preempted in between?


[1] https://github.com/torvalds/linux/blob/master/kernel/bpf/devmap.c#L209-L215
-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer
