Message-ID: <06b1d740-d443-ac23-a7b0-675e7b6ff6f9@virtuozzo.com>
Date: Wed, 15 Nov 2017 15:36:34 +0300
From: Kirill Tkhai <ktkhai@...tuozzo.com>
To: "Eric W. Biederman" <ebiederm@...ssion.com>
Cc: Cong Wang <xiyou.wangcong@...il.com>,
David Miller <davem@...emloft.net>, vyasevic@...hat.com,
kstewart@...uxfoundation.org, pombredanne@...b.com,
Vladislav Yasevich <vyasevich@...il.com>, mark.rutland@....com,
Greg KH <gregkh@...uxfoundation.org>,
Alexey Dobriyan <adobriyan@...il.com>,
Florian Westphal <fw@...len.de>,
Nicolas Dichtel <nicolas.dichtel@...nd.com>,
roman.kapl@...go.com, Paul Moore <paul@...l-moore.com>,
David Ahern <dsahern@...il.com>,
Daniel Borkmann <daniel@...earbox.net>,
lucien xin <lucien.xin@...il.com>,
Matthias Schiffer <mschiffer@...verse-factory.net>,
rshearma@...cade.com, LKML <linux-kernel@...r.kernel.org>,
Linux Kernel Network Developers <netdev@...r.kernel.org>,
avagin@...tuozzo.com, gorcunov@...tuozzo.com
Subject: Re: [PATCH] net: Convert net_mutex into rw_semaphore and down read it
on net->init/->exit
On 15.11.2017 12:51, Kirill Tkhai wrote:
> On 15.11.2017 06:19, Eric W. Biederman wrote:
>> Kirill Tkhai <ktkhai@...tuozzo.com> writes:
>>
>>> On 14.11.2017 21:39, Cong Wang wrote:
>>>> On Tue, Nov 14, 2017 at 5:53 AM, Kirill Tkhai <ktkhai@...tuozzo.com> wrote:
>>>>> @@ -406,7 +406,7 @@ struct net *copy_net_ns(unsigned long flags,
>>>>>
>>>>> get_user_ns(user_ns);
>>>>>
>>>>> - rv = mutex_lock_killable(&net_mutex);
>>>>> + rv = down_read_killable(&net_sem);
>>>>> if (rv < 0) {
>>>>> net_free(net);
>>>>> dec_net_namespaces(ucounts);
>>>>> @@ -421,7 +421,7 @@ struct net *copy_net_ns(unsigned long flags,
>>>>> list_add_tail_rcu(&net->list, &net_namespace_list);
>>>>> rtnl_unlock();
>>>>> }
>>>>> - mutex_unlock(&net_mutex);
>>>>> + up_read(&net_sem);
>>>>> if (rv < 0) {
>>>>> dec_net_namespaces(ucounts);
>>>>> put_user_ns(user_ns);
>>>>> @@ -446,7 +446,7 @@ static void cleanup_net(struct work_struct *work)
>>>>> list_replace_init(&cleanup_list, &net_kill_list);
>>>>> spin_unlock_irq(&cleanup_list_lock);
>>>>>
>>>>> - mutex_lock(&net_mutex);
>>>>> + down_read(&net_sem);
>>>>>
>>>>> /* Don't let anyone else find us. */
>>>>> rtnl_lock();
>>>>> @@ -486,7 +486,7 @@ static void cleanup_net(struct work_struct *work)
>>>>> list_for_each_entry_reverse(ops, &pernet_list, list)
>>>>> ops_free_list(ops, &net_exit_list);
>>>>>
>>>>> - mutex_unlock(&net_mutex);
>>>>> + up_read(&net_sem);
>>>>
>>>> After your patch setup_net() could run concurrently with cleanup_net(),
>>>> given that ops_exit_list() is called on error path of setup_net() too,
>>>> it means ops->exit() now could run concurrently if it doesn't have its
>>>> own lock. Not sure if this breaks any existing user.
>>>
>>> Yes, ops->init() for one net namespace may now run concurrently with
>>> ops->exit() for another one. I haven't found any pernet operations that
>>> have a problem with that. If they exist, they are hidden and not clearly visible.
>>> Pernet operations in general do not touch someone else's memory.
>>> If it turns out there is one, KASAN should show it after a while.
>>
>> Certainly the use of hash tables shared between multiple network
>> namespaces would count. I don't remember how many of these we have but
>> there used to be quite a few.
>
> Could you please provide an example of the hash tables you mean?
Ah, I see, it's dccp_hashinfo etc.