[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87bnhv9uz9.fsf@x220.int.ebiederm.org>
Date: Fri, 08 May 2015 06:20:10 -0500
From: ebiederm@...ssion.com (Eric W. Biederman)
To: Cong Wang <cwang@...pensource.com>
Cc: Herbert Xu <herbert@...dor.apana.org.au>,
Ying Xue <ying.xue@...driver.com>,
netdev <netdev@...r.kernel.org>, xemul@...nvz.org,
David Miller <davem@...emloft.net>,
Eric Dumazet <eric.dumazet@...il.com>, maxk@....qualcomm.com,
Stephen Hemminger <stephen@...workplumber.org>,
Thomas Graf <tgraf@...g.ch>,
Nicolas Dichtel <nicolas.dichtel@...nd.com>,
Tom Herbert <tom@...bertland.com>,
James Chapman <jchapman@...alix.com>,
Erik Hugne <erik.hugne@...csson.com>, jon.maloy@...csson.com,
horms@...ge.net.au
Subject: Re: [RFC PATCH net-next 01/11] netns: Fix race between put_net() and netlink_kernel_create()
Cong Wang <cwang@...pensource.com> writes:
> On Thu, May 7, 2015 at 2:04 AM, Herbert Xu <herbert@...dor.apana.org.au> wrote:
>> On Thu, May 07, 2015 at 04:52:40PM +0800, Ying Xue wrote:
>>>
>>> @@ -409,12 +410,17 @@ void __put_net(struct net *net)
>>> {
>>> /* Cleanup the network namespace in process context */
>>> unsigned long flags;
>>> + bool added = false;
>>>
>>> spin_lock_irqsave(&cleanup_list_lock, flags);
>>> - list_add(&net->cleanup_list, &cleanup_list);
>>> + if (list_empty(&net->cleanup_list)) {
>>> + list_add(&net->cleanup_list, &cleanup_list);
>>> + added = true;
>>> + }
>>
>> Stop right there. If a ref count is hitting zero twice you've
>> got big problems. Because it means that after hitting zero the
>> first time it'll become positive again (if only briefly) which
>> opens you up to a race against the cleanup.
>
> That is true.
>
>>
>> So this idea is a non-starter.
>>
>> The rules for ref counts are really simple, if you hit zero then
>> you're dead. Can someone explain to me in simple terms why this
>> ref count is special and needs to hit zero twice?
>
> Because after it hits zero, it queues the final cleanup to a workqueue,
> which could be delayed (depends on when the worker will be scheduled),
> therefore there is a small window that new ->init() could come in.
The bad case in all of this is if we hit that window and increment the
count. The namespace could be removed from the lists ->exit() methods
could run. Then the reference count could drop to zero again.
Having a network namespace alive with cleanup methods having already run
is too nasty to think about. We really don't want to do that.
Eric
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists