Date:   Wed, 5 Oct 2016 11:01:38 -0700
From:   Cong Wang <xiyou.wangcong@...il.com>
To:     Krister Johansen <kjlx@...pleofstupid.com>
Cc:     Jamal Hadi Salim <jhs@...atatu.com>,
        Linux Kernel Network Developers <netdev@...r.kernel.org>
Subject: Re: [PATCH net] Panic when tc_lookup_action_n finds a partially
 initialized action.

On Tue, Oct 4, 2016 at 11:52 PM, Krister Johansen
<kjlx@...pleofstupid.com> wrote:
> On Mon, Oct 03, 2016 at 11:22:33AM -0700, Cong Wang wrote:
>> Please try the attached patch. I also convert the read path to RCU
>> to avoid a possible deadlock. A quick test shows no lockdep splat.
>
> I tried this patch, but it doesn't solve the problem.  I got a panic on
> my very first try:

Thanks for testing it.


> The problem here is the same as before: by using RCU the race isn't
> fixed because the module is still discoverable from act_base before the
> pernet initialization is completed.
>
> You can see from the trap frame that the first two arguments to
> tcf_hash_check were 0.  It couldn't look up the correct per-subsystem
> pointer because the id hadn't yet been registered.

I thought the problem was that we don't do the pernet ops registration and
the action ops registration atomically, so I chose to use mutex+RCU. But I
was wrong: the problem here is just ordering. We need to finish the pernet
initialization before making the action ops visible.

If so, why not just reorder them? Does the attached patch make any
sense now? Our pernet init doesn't rely on act_base, so even if we have
some race, the worst case is that we have initialized the per-netns state
for an action while its ops are still not visible, which seems fine (at
least no crash).
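
To illustrate the ordering idea only, here is a minimal userspace sketch. It
is not the attached patch and not kernel code: every name prefixed with toy_
is hypothetical, and toy_act_base merely stands in for the act_base list.
The point it tries to show is that the ops are published only after the
per-netns setup has finished, so a lookup can never find a half-initialized
action.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct toy_action_ops {
    const char *kind;
    int pernet_ready;               /* set once per-netns state exists */
    struct toy_action_ops *next;
};

/* Stand-in for the act_base list that lookups walk. */
static struct toy_action_ops *toy_act_base;

/* Step 1: set up per-netns state. It does not touch toy_act_base. */
static int toy_pernet_init(struct toy_action_ops *ops)
{
    ops->pernet_ready = 1;
    return 0;
}

/* Step 2: publish the ops so that lookups can find them. */
static void toy_publish(struct toy_action_ops *ops)
{
    ops->next = toy_act_base;
    toy_act_base = ops;
}

/* Register in the proposed order: pernet init first, then publish. */
static int toy_register_action(struct toy_action_ops *ops)
{
    int err = toy_pernet_init(ops);

    if (err)
        return err;     /* never published, so no lookup can see it */
    toy_publish(ops);
    return 0;
}

/* Walk the list by kind; returns NULL if the action is not published. */
static struct toy_action_ops *toy_lookup(const char *kind)
{
    struct toy_action_ops *a;

    for (a = toy_act_base; a; a = a->next)
        if (strcmp(a->kind, kind) == 0)
            return a;
    return NULL;
}
```

In this model, a lookup that runs between the two steps simply fails to find
the action, which matches the "worst case" described above.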

Or am I still missing something here?

(Sorry, I don't have an environment to reproduce your bug.)

Thanks for your patience and testing!

View attachment "act_api_register.diff" of type "text/plain" (1306 bytes)
