netdev - Re: [PATCH v2 net-next 0/5] Add bpf support to set sk_bound_dev_if

Message-ID: <69b9c20b-9566-5f75-cebd-a0bd243c2d65@cumulusnetworks.com>
Date:   Mon, 31 Oct 2016 11:16:39 -0600
From:   David Ahern <dsa@...ulusnetworks.com>
To:     David Miller <davem@...emloft.net>
Cc:     netdev@...r.kernel.org, daniel@...que.org, ast@...com,
        daniel@...earbox.net, maheshb@...gle.com, tgraf@...g.ch
Subject: Re: [PATCH v2 net-next 0/5] Add bpf support to set sk_bound_dev_if

On 10/31/16 11:01 AM, David Miller wrote:
> From: David Ahern <dsa@...ulusnetworks.com>
> Date: Wed, 26 Oct 2016 17:58:37 -0700
> 
>> The recently added VRF support in Linux leverages the bind-to-device
>> API for programs to specify an L3 domain for a socket. While
>> SO_BINDTODEVICE has been around for ages, not every ipv4/ipv6 capable
>> program has support for it. Even for those programs that do support it,
>> the API requires processes to be started as root (CAP_NET_RAW) which
>> is not desirable from a general security perspective.
>>
>> This patch set leverages Daniel Mack's work to attach bpf programs to
>> a cgroup:
>>
>>     https://www.mail-archive.com/netdev@vger.kernel.org/msg134028.html
>>
>> to provide a capability to set sk_bound_dev_if for all AF_INET{6}
>> sockets opened by a process in a cgroup when the sockets are allocated.
>>
>> This capability enables running any program in a VRF context and is key
>> to deploying Management VRF, a fundamental configuration for networking
>> gear, with any Linux OS installation.
> 
> Ok, after some review I think I understand what's going on here.
> 
> It would initially seem simpler to just support forced sk_bound_dev_if
> in cgroups.  But I think I understand why you may have gone this way:

That's what the l3mdev cgroup patch does -- it forces sk_bound_dev_if for sockets. Tejun pushed back on adding new controllers, so cgroup+bpf is another way to accomplish the same end goal. The key is reusing the cgroup infrastructure for parent-child inheritance of the policy, for holding the policy "data" to be applied, and for tracking which processes are in a group, which group a given process belongs to, and so on. No need to reinvent that part.

> 
> 1) The cgroup-bpf code always has the cgroup hierarchy propagation
>    logic.
> 
> 2) There may be use cases for doing things with other sock members.
> 
> With respect to #2, do you know of any such planned use cases already?

One suggestion is the local port binding limitations that Mahesh and Anoop were looking into.

> 
> Also, any reason why you don't allow the cgroup bpf sk filter to return
> an error code so that the sock creation could be cancelled if the eBPF
> program desires that?  It could be useful, I suppose.

My first draft of this feature had that, but I removed it for simplicity for now. I can certainly add it back.
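
To make the semantics under discussion concrete, here is a toy userspace model of the idea: a socket-creation hook found by walking up the group hierarchy, which can set the bound device and, if the error return is added back, cancel the creation. None of this is kernel code. All toy_* names and the ifindex 42 are hypothetical, and the real cgroup-bpf propagation logic is more involved than a simple parent walk.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the socket fields a hook may touch. */
struct toy_sock {
    int bound_dev_if;
};

/* A hook returns 0 to allow the socket or a negative errno to cancel it. */
typedef int (*toy_sock_hook)(struct toy_sock *sk);

struct toy_cgroup {
    struct toy_cgroup *parent;  /* NULL for the root group */
    toy_sock_hook hook;         /* NULL means inherit from the parent */
};

/* Walk up the hierarchy to the nearest group with a hook attached. */
static toy_sock_hook toy_effective_hook(const struct toy_cgroup *cg)
{
    for (; cg; cg = cg->parent)
        if (cg->hook)
            return cg->hook;
    return NULL;
}

/* Model of socket allocation for a process in group cg. */
int toy_sock_create(const struct toy_cgroup *cg, struct toy_sock *sk)
{
    toy_sock_hook hook = toy_effective_hook(cg);

    sk->bound_dev_if = 0;           /* unbound by default */
    return hook ? hook(sk) : 0;     /* nonzero cancels the creation */
}

/* Example policy: force every socket into device 42 (made-up ifindex). */
int toy_bind_to_mgmt(struct toy_sock *sk)
{
    sk->bound_dev_if = 42;
    return 0;
}

/* Example policy: refuse socket creation outright. */
int toy_deny(struct toy_sock *sk)
{
    (void)sk;
    return -EPERM;
}
```

A child group with no hook of its own picks up the parent's policy, and a group whose hook returns an error gets no socket at all.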