netdev - Re: [PATCH net] Panic when tc_lookup_action_n finds a partially initialized action.


Message-ID: <CAM_iQpX7fC0HWz2X2R7uZNeiD_rEAe4QcSQiK+YZEHF8jkGBBA@mail.gmail.com>
Date:   Thu, 6 Oct 2016 12:01:15 -0700
From:   Cong Wang <xiyou.wangcong@...il.com>
To:     Krister Johansen <kjlx@...pleofstupid.com>
Cc:     Jamal Hadi Salim <jhs@...atatu.com>,
        Linux Kernel Network Developers <netdev@...r.kernel.org>
Subject: Re: [PATCH net] Panic when tc_lookup_action_n finds a partially
 initialized action.

On Wed, Oct 5, 2016 at 11:11 PM, Krister Johansen
<kjlx@...pleofstupid.com> wrote:
>
> I'm not sure.  The reason I didn't take this approach from the outset is
> that all of TC's callers of tcf_register_action pass a pointer to a
> static structure as their *ops argument.  The existence of code that
> checks the action for uniqueness suggests that it's possible for
> tcf_register_action to get passed two identical tc_action_ops.  If that
> happens in the current code base, we'll also get passed a duplicate

Each tc action module has its own unique ops, and the kernel doesn't allow
a module to register twice (whether in parallel or not; see
add_unformed_module()), so we should never hit a duplicate case.


> pernet_operations pointer.  The code in register_pernet_subsys() makes
> no attempt to check for duplicates.  If we add a pointer that's already
> in the list, and subsequently call unregister, the results seem
> undefined.  It looks like we'll remove the pernet_operations for the
> existing action, assuming we don't corrupt the list in the process.
>
> Is this actually safe?  If so, what corner case is the act->type /
> act->kind protecting us from?

ops->type and ops->kind should be unique too; user space already
relies on this (tc action ls action xxx). The check probably exists just
as a sanity check.
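
For illustration only, here is a minimal userspace sketch of the kind of
type/kind uniqueness check discussed above. It is not the kernel code:
struct action_ops, register_action() and unregister_action() are
hypothetical stand-ins, and locking and the pernet registration are
left out.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define KIND_MAX 16

/* Hypothetical stand-in for the kernel's per-action ops structure. */
struct action_ops {
	char kind[KIND_MAX];		/* action name, e.g. "mirred" */
	int type;			/* numeric action id */
	struct action_ops *next;	/* singly linked registry */
};

static struct action_ops *act_base;	/* head of the registry */

/* Refuse any entry whose type or kind matches a registered one. */
static int register_action(struct action_ops *act)
{
	struct action_ops *a;

	for (a = act_base; a != NULL; a = a->next) {
		if (a->type == act->type || strcmp(a->kind, act->kind) == 0)
			return -EEXIST;
	}
	act->next = act_base;
	act_base = act;
	return 0;
}

/* Unlink exactly the entry that was passed in, matched by pointer. */
static int unregister_action(struct action_ops *act)
{
	struct action_ops **pp;

	for (pp = &act_base; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == act) {
			*pp = act->next;
			act->next = NULL;
			return 0;
		}
	}
	return -ENOENT;
}
```

In this sketch, registering the same static ops pointer a second time
fails with -EEXIST before the list is touched, so the list never holds a
duplicate entry.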

So please give that patch a try, and let's see whether we are missing any
other problems.

>
>> (Sorry that I don't have the environment to reproduce your bug)
>
> I'm sorry that I didn't do a good job of explaining how we end up in
> this situation in the first place.  I can give a few more details,
> because it may explain some of my concern about the request_module()
> call.
>
> The system that encounters this bug launches a bunch of containers from
> systemd on boot.  Each container creates a new user, net, pid, and mount
> namespace and begins its setup.  When the networking setup in all of
> these containers, each in a new netns, tries to configure TC and no
> modules are loaded, we end up with this race.
>
> I can also reproduce by unloading the modules, and then launching a
> bunch of processes that configure tc in new namespaces.
>
> Part of the desire to inhibit extra modprobe calls is that if hundreds
> of these all start at once on boot, it's really unnecessary to have all
> of the rest of them wait while lots of extra modprobe calls are forked
> by the kernel.

You can tell systemd to load these modules before starting these
containers to avoid blocking, no?
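
A minimal sketch of that suggestion, assuming systemd-modules-load.service
is in use (it reads /etc/modules-load.d/*.conf early in boot). The file
name is hypothetical, and act_mirred and act_gact are only examples,
since the thread does not say which action modules the containers need:

```conf
# /etc/modules-load.d/tc-actions.conf (hypothetical file name)
# One module name per line; list whichever tc action modules are needed.
act_mirred
act_gact
```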

Thanks.