netdev - Re: [RFC PATCH net-next 00/11] netns: don't switch namespace while creating kernel sockets


Message-ID: <87ioc39v6v.fsf@x220.int.ebiederm.org>
Date:	Fri, 08 May 2015 06:15:36 -0500
From:	ebiederm@...ssion.com (Eric W. Biederman)
To:	Ying Xue <ying.xue@...driver.com>
Cc:	Cong Wang <cwang@...pensource.com>,
	netdev <netdev@...r.kernel.org>,
	Herbert Xu <herbert@...dor.apana.org.au>,
	Pavel Emelyanov <xemul@...nvz.org>,
	David Miller <davem@...emloft.net>,
	Eric Dumazet <eric.dumazet@...il.com>, <maxk@....qualcomm.com>,
	Stephen Hemminger <stephen@...workplumber.org>,
	Thomas Graf <tgraf@...g.ch>,
	Nicolas Dichtel <nicolas.dichtel@...nd.com>,
	Tom Herbert <tom@...bertland.com>,
	James Chapman <jchapman@...alix.com>,
	Erik Hugne <erik.hugne@...csson.com>, <jon.maloy@...csson.com>,
	Simon Horman <horms@...ge.net.au>
Subject: Re: [RFC PATCH net-next 00/11] netns: don't switch namespace while creating kernel sockets

Ying Xue <ying.xue@...driver.com> writes:

> On 05/08/2015 04:01 AM, Eric W. Biederman wrote:
>> Cong Wang <cwang@...pensource.com> writes:
>> 
>>> On Thu, May 7, 2015 at 11:58 AM, Eric W. Biederman
>>> <ebiederm@...ssion.com> wrote:
>>>> Cong Wang <cwang@...pensource.com> writes:
>>>>
>>>>> On Thu, May 7, 2015 at 11:26 AM, Eric W. Biederman
>>>>> <ebiederm@...ssion.com> wrote:
>>>>>> Cong Wang <cwang@...pensource.com> writes:
>>>>>>
>>>>>>>
>>>>>>> Why does this have to be so complicated? We can simply avoid
>>>>>>> calling ops_init() by skipping those in cleanup_list, no?
>>>>>>
>>>>>> The problem is that there is a single list of methods to call, and if you
>>>>>> simply skip calling the initialization methods for a struct net and add
>>>>>> yourself to the list, cleanup_net will then call the cleanup methods
>>>>>> without the initialization methods having been called.
>>>>>
>>>>> If you mean pernet_list, ops->list has been already added before
>>>>> for_each_net().
>>>>>
>>>>>>
>>>>>> Simply limiting new network namespace registrations to a point when
>>>>>> network namespaces are not being registered or unregistered seems like
>>>>>> the simplest way to achieve this effect.
>>>>>>
>>>>>
>>>>> Literally, any point before ops_init().
>>>>
>>>> Think about what it means to add a set of operations to the
>>>> pernet_list and then to skip a network namespace with a count of 0 and
>>>> then to have that network namespace exit with those methods on
>>>> pernet_list.
>>>>
>>>
>>> That is easy to solve, isn't it?
>> 
>> Nope.  That doesn't work.
>> 
>
> Cong, although I don't know why Eric says your solution does not work, in my
> view it does have a flaw, especially in its locking policy. For instance,
> net->cleanup_list may be linked into the cleanup_list list and is probably
> inserted into net_kill_list too, and the two global lists are protected by two
> different locks. But when we check list_empty(&net->cleanup_list), no lock
> is held.
>
> However, apart from that point, I think your idea is workable overall.
>
> So, Eric, can you please further explain why Cong's proposal doesn't
> work?

Because changing where the list assignment happens under the net_mutex
does not affect anything, and I have already explained once why it does
not work.

In particular, it is possible to call the cleanup methods on a network
namespace where the initialization methods were not called.

If we want to avoid the complexity required to wait for no network
namespaces to be exiting that was implicit in my version of
lock_network_namespace(), we need to allocate some per network namespace,
per operations state to remember whether the per network namespace
operations have been initialized on a network namespace.  Essentially
that means creating a per network namespace copy of the pernet_list.
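
To make the idea of per network namespace state concrete, here is a rough
userspace sketch, not kernel code.  Every name in it (net_model,
pernet_ops_model, model_ops_init(), model_ops_exit(), MAX_PERNET_OPS) is
hypothetical and made up for illustration; none of it is the real net/core
interface.  It only shows the bookkeeping: each namespace remembers which
operations were initialized on it, and the exit path consults that record.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical fixed bound on registered operations, for the sketch only. */
#define MAX_PERNET_OPS 8

struct net_model;

/* Stand-in for a set of pernet operations; id indexes the per-net state. */
struct pernet_ops_model {
	int id;
	int (*init)(struct net_model *net);
	void (*exit)(struct net_model *net);
};

/* Stand-in for struct net, carrying one "initialized" flag per operations. */
struct net_model {
	bool exiting;                      /* namespace is being torn down */
	bool initialized[MAX_PERNET_OPS];  /* per namespace, per operations state */
	int init_calls;                    /* counters used only by the demo ops */
	int exit_calls;
};

/* Run init unless the namespace is exiting, and record that it ran. */
int model_ops_init(const struct pernet_ops_model *ops, struct net_model *net)
{
	assert(ops->id >= 0 && ops->id < MAX_PERNET_OPS);
	if (net->exiting)
		return 0;	/* skipped: nothing is recorded */
	if (ops->init) {
		int err = ops->init(net);
		if (err)
			return err;
	}
	net->initialized[ops->id] = true;
	return 0;
}

/* Run exit only if init was recorded for this namespace. */
void model_ops_exit(const struct pernet_ops_model *ops, struct net_model *net)
{
	assert(ops->id >= 0 && ops->id < MAX_PERNET_OPS);
	if (!net->initialized[ops->id])
		return;		/* init never ran here, so cleanup must not run */
	if (ops->exit)
		ops->exit(net);
	net->initialized[ops->id] = false;
}

/* Demo operations that just count how often they are called. */
int demo_init(struct net_model *net) { net->init_calls++; return 0; }
void demo_exit(struct net_model *net) { net->exit_calls++; }

const struct pernet_ops_model demo_ops = {
	.id = 0, .init = demo_init, .exit = demo_exit,
};
```

With this bookkeeping, a namespace that was already exiting when the
operations were registered has its cleanup skipped, because no
initialization was ever recorded for it.  The sketch ignores locking
entirely, which is the hard part in the real code.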

My end game would be to work towards reducing the scope of the net_mutex
so potentially we could have two network namespaces exiting at the same
time.

Eric