[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <55B7A8B5.6020305@cumulusnetworks.com>
Date: Tue, 28 Jul 2015 10:07:17 -0600
From: David Ahern <dsa@...ulusnetworks.com>
To: Eric Dumazet <eric.dumazet@...il.com>,
Hannes Frederic Sowa <hannes@...essinduktion.org>
CC: "Eric W. Biederman" <ebiederm@...ssion.com>,
netdev@...r.kernel.org, shm@...ulusnetworks.com,
roopa@...ulusnetworks.com, gospo@...ulusnetworks.com,
jtoppins@...ulusnetworks.com, nikolay@...ulusnetworks.com,
ddutt@...ulusnetworks.com, nicolas.dichtel@...nd.com,
stephen@...workplumber.org, hadi@...atatu.com, davem@...emloft.net,
svaidya@...cade.com, mingo@...nel.org, luto@...capital.net
Subject: Re: [PATCH net-next 14/16] net: Add sk_bind_dev_if to task_struct
On 7/28/15 10:01 AM, Eric Dumazet wrote:
> On Tue, 2015-07-28 at 14:19 +0200, Hannes Frederic Sowa wrote:
>> Hello Eric,
>>
>> On Mon, 2015-07-27 at 15:33 -0500, Eric W. Biederman wrote:
>>> David Ahern <dsa@...ulusnetworks.com> writes:
>>>
>>>> Allow tasks to have a default device index for binding sockets. If
>>>> set
>>>> the value is passed to all AF_INET/AF_INET6 sockets when they are
>>>> created.
>>>>
>>>> The task setting is passed parent to child on fork, but can be set
>>>> or
>>>> changed after task creation using prctl (if task has CAP_NET_ADMIN
>>>> permissions). The setting for a socket can be retrieved using
>>>> prctl().
>>>> This option allows an administrator to restrict a task to only
>>>> send/receive
>>>> packets through the specified device. In the case of VRF devices
>>>> this
>>>> option restricts tasks to a specific VRF.
>>>>
>>>> Correlation of the device index to a specific VRF, ie.,
>>>> ifindex --> VRF device --> VRF id
>>>> is left to userspace.
>>>
>>> Nacked-by: "Eric W. Biederman" <ebiederm@...ssion.com>
>>>
>>> Because it is broken by design. Your routing device is only safe for
>>> programs that know it's limitations it is not appropriate for general
>>> applications.
>>>
>>> Since you don't even seen to know it's limitations I think this is a
>>> bad path to walk down.
>>
>> Can you please elaborate about the broken by design?
>>
>> Different operating systems are already using this approach with good
>> success. I read your other mail regarding isolation of different VRFs
>> and I agree that all code which persists state depending solely on the
>> IP address is affected by this and this must be dealt with and fixed
>> (actually, there aren't too many).
>>
>> But I wouldn't call that broken by design. This stuff will get fixed
>> like e.g. cross-talk between fragmentation queues, icmp rate limiters
>> etc, which could already happen in the past.
>>
>> What is your opinion on the fundamental approach only from a user
>> perspective? Do you think that is broken, too?
>
> I agree with Eric here.
>
> This sk_bind_dev_if on task_struct is quite a hack.
>
> What will be added next ? An array of dev_if ? netfilter support ?
> af_packet support ? What about /proc files and netlink dumps ?
It could just as easily be a pointer to a struct (e.g., struct net_ctx)
such that the intrusion to task_struct is simply 8 bytes -- very similar
to the nsproxy used for the assorted namespaces. The struct can then
contain whatever network config is imposed on the task.
>
> We already have network namespaces. Extend this if needed, instead of
> bypassing them.
Problems with using network namespaces for VRFs has been discussed in
the past. e.g.,
http://www.spinics.net/lists/netdev/msg298368.html
David
>
> No need to add something else (with lack of proper reporting for various
> tools)
>
>
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists