Message-ID: <CAMDBHYJGU-8SgYnzMKJsD6QKdMBSi4evHxJi2p-fJ12+09fBsw@mail.gmail.com>
Date:   Mon, 30 Oct 2017 18:39:03 -0400
From:   Lucas Bates <lucasb@...atatu.com>
To:     Cong Wang <xiyou.wangcong@...il.com>
Cc:     netdev@...r.kernel.org, Chris Mi <chrism@...lanox.com>,
        Daniel Borkmann <daniel@...earbox.net>,
        Jiri Pirko <jiri@...nulli.us>,
        John Fastabend <john.fastabend@...il.com>,
        Jamal Hadi Salim <jhs@...atatu.com>,
        "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Subject: Re: [Patch net 00/16] net_sched: fix races with RCU callbacks

On Thu, Oct 26, 2017 at 9:24 PM, Cong Wang <xiyou.wangcong@...il.com> wrote:
> Recently, the RCU callbacks used in TC filters and TC actions keep
> drawing my attention, they introduce at least 4 race condition bugs:
<snip>
> As suggested by Paul, we could defer the work to a workqueue and
> gain the permission of holding RTNL again without any performance
> impact, however, in tcf_block_put() we could have a deadlock when
> flushing workqueue while holding RTNL lock, the trick here is to
> defer the work itself in workqueue and make it queued after all
> other works so that we keep the same ordering to avoid any
> use-after-free. Please see the first patch for details.

Cong, I don't believe the problem has been resolved just yet. I have
a new kernel, compiled just today, and I'm still tripping over a kernel
bug in this scenario when I run Chris' new test case.

I'm doing this on a machine where I don't have a spare device to use
for the run. Instead I created a veth pair, one end of which gets
migrated into the container.

The bug doesn't reproduce consistently. It takes anywhere between one
and four runs of the d052 test case to hit it.

Steps to reproduce:

sudo ip li add foo type veth
sudo ./tdc.py -d foo -c flower
[repeat until kernel bug encountered]
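
The repeat step above could be automated along these lines. This is
only a rough sketch: repeat_until_bug is a hypothetical helper name,
the limit of 10 runs is arbitrary, and grepping dmesg for "BUG" or
"Oops" is an assumption about how the failure shows up in the kernel
log. If the machine hangs or panics outright, the loop simply stops.

```shell
# Hypothetical helper: run a test command repeatedly until a check
# command reports a kernel bug signature, or a run limit is reached.
# usage: repeat_until_bug MAX_RUNS CHECK_CMD TEST_CMD [ARGS...]
repeat_until_bug() {
    max_runs=$1
    check_cmd=$2
    shift 2
    run=1
    while [ "$run" -le "$max_runs" ]; do
        echo "run $run"
        # A failing test run is reported but does not stop the loop.
        "$@" || echo "test command exited non-zero on run $run"
        # CHECK_CMD is evaluated by sh so it may contain a pipeline.
        if sh -c "$check_cmd"; then
            echo "kernel bug signature seen after run $run"
            return 0
        fi
        run=$((run + 1))
    done
    echo "no bug signature after $max_runs runs"
    return 1
}

# Example invocation (assumed pattern; clear the log first with
# "sudo dmesg -C" so that older messages do not match):
#   repeat_until_bug 10 "sudo dmesg | grep -qE 'BUG|Oops'" \
#       sudo ./tdc.py -d foo -c flower
```

The check is passed as a string so that any log test can be swapped in
without touching the loop itself.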
