lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87mv3kwxel.fsf@xmission.com>
Date:   Fri, 17 Nov 2017 12:52:18 -0600
From:   ebiederm@...ssion.com (Eric W. Biederman)
To:     Kirill Tkhai <ktkhai@...tuozzo.com>
Cc:     Cong Wang <xiyou.wangcong@...il.com>,
        David Miller <davem@...emloft.net>, vyasevic@...hat.com,
        kstewart@...uxfoundation.org, pombredanne@...b.com,
        Vladislav Yasevich <vyasevich@...il.com>, mark.rutland@....com,
        Greg KH <gregkh@...uxfoundation.org>,
        Alexey Dobriyan <adobriyan@...il.com>,
        Florian Westphal <fw@...len.de>,
        Nicolas Dichtel <nicolas.dichtel@...nd.com>,
        roman.kapl@...go.com, Paul Moore <paul@...l-moore.com>,
        David Ahern <dsahern@...il.com>,
        Daniel Borkmann <daniel@...earbox.net>,
        lucien xin <lucien.xin@...il.com>,
        Matthias Schiffer <mschiffer@...verse-factory.net>,
        rshearma@...cade.com, LKML <linux-kernel@...r.kernel.org>,
        Linux Kernel Network Developers <netdev@...r.kernel.org>,
        avagin@...tuozzo.com, gorcunov@...tuozzo.com
Subject: Re: [PATCH] net: Convert net_mutex into rw_semaphore and down read it on net->init/->exit

Kirill Tkhai <ktkhai@...tuozzo.com> writes:

> On 15.11.2017 19:31, Eric W. Biederman wrote:
>> Kirill Tkhai <ktkhai@...tuozzo.com> writes:
>> 
>>> On 15.11.2017 12:51, Kirill Tkhai wrote:
>>>> On 15.11.2017 06:19, Eric W. Biederman wrote:
>>>>> Kirill Tkhai <ktkhai@...tuozzo.com> writes:
>>>>>
>>>>>> On 14.11.2017 21:39, Cong Wang wrote:
>>>>>>> On Tue, Nov 14, 2017 at 5:53 AM, Kirill Tkhai <ktkhai@...tuozzo.com> wrote:
>>>>>>>> @@ -406,7 +406,7 @@ struct net *copy_net_ns(unsigned long flags,
>>>>>>>>
>>>>>>>>         get_user_ns(user_ns);
>>>>>>>>
>>>>>>>> -       rv = mutex_lock_killable(&net_mutex);
>>>>>>>> +       rv = down_read_killable(&net_sem);
>>>>>>>>         if (rv < 0) {
>>>>>>>>                 net_free(net);
>>>>>>>>                 dec_net_namespaces(ucounts);
>>>>>>>> @@ -421,7 +421,7 @@ struct net *copy_net_ns(unsigned long flags,
>>>>>>>>                 list_add_tail_rcu(&net->list, &net_namespace_list);
>>>>>>>>                 rtnl_unlock();
>>>>>>>>         }
>>>>>>>> -       mutex_unlock(&net_mutex);
>>>>>>>> +       up_read(&net_sem);
>>>>>>>>         if (rv < 0) {
>>>>>>>>                 dec_net_namespaces(ucounts);
>>>>>>>>                 put_user_ns(user_ns);
>>>>>>>> @@ -446,7 +446,7 @@ static void cleanup_net(struct work_struct *work)
>>>>>>>>         list_replace_init(&cleanup_list, &net_kill_list);
>>>>>>>>         spin_unlock_irq(&cleanup_list_lock);
>>>>>>>>
>>>>>>>> -       mutex_lock(&net_mutex);
>>>>>>>> +       down_read(&net_sem);
>>>>>>>>
>>>>>>>>         /* Don't let anyone else find us. */
>>>>>>>>         rtnl_lock();
>>>>>>>> @@ -486,7 +486,7 @@ static void cleanup_net(struct work_struct *work)
>>>>>>>>         list_for_each_entry_reverse(ops, &pernet_list, list)
>>>>>>>>                 ops_free_list(ops, &net_exit_list);
>>>>>>>>
>>>>>>>> -       mutex_unlock(&net_mutex);
>>>>>>>> +       up_read(&net_sem);
>>>>>>>
>>>>>>> After your patch setup_net() could run concurrently with cleanup_net(),
>>>>>>> given that ops_exit_list() is called on error path of setup_net() too,
>>>>>>> it means ops->exit() now could run concurrently if it doesn't have its
>>>>>>> own lock. Not sure if this breaks any existing user.
>>>>>>
>>>>>> Yes, there will be possible concurrent ops->init() for a net namespace,
>>>>>> and ops->exit() for another one. I hadn't found pernet operations, which
>>>>>> have a problem with that. If they exist, they are hidden and not clear seen.
>>>>>> The pernet operations in general do not touch someone else's memory.
>>>>>> If suddenly there is one, KASAN should show it after a while.
>>>>>
>>>>> Certainly the use of hash tables shared between multiple network
>>>>> namespaces would count.  I don't rembmer how many of these we have but
>>>>> there used to be quite a few.
>>>>
>>>> Could you please provide an example of hash tables, you mean?
>>>
>>> Ah, I see, it's dccp_hashinfo etc.
>
> JFI, I've checked dccp_hashinfo, and it seems to be safe.
>
>> 
>> The big one used to be the route cache.  With resizable hash tables
>> things may be getting better in that regard.
>
> I've checked some fib-related things, and wasn't able to find that.
> Excuse me, could you please clarify, if it's an assumption, or
> there is exactly a problem hash table, you know? Could you please
> point it me more exactly, if it's so.

Two things.
1) Hash tables are one case I know where we access data from multiple
   network namespaces.  As such it can not be asserted that is no
   possibility for problems.

2) The responsible way to handle this is one patch for each set of
   methods explaining why those methods are safe to run in parallel.

   That ensures there is opportunity for review and people are going
   slowly enough that they actually look at these issues.

The reason I want to see this broken up is that at 200ish sets of
methods it is too much to review all at once.

I completely agree that odds are that this can be made safe and that it
is mostly likely already safe in practically every instance.    My guess
would be that if there are problems that need to be addressed they
happen in one or two places and we need to find them.  If possible I
don't want to find them after the code has shipped in a stable release.

Eric



   

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ