linux-kernel - Re: [PATCH net-next 3/4] bpf: add support for persistent maps/progs

Message-ID: <562518B8.2070401@plumgrid.com>
Date:	Mon, 19 Oct 2015 09:22:16 -0700
From:	Alexei Starovoitov <ast@...mgrid.com>
To:	Daniel Borkmann <daniel@...earbox.net>,
	Hannes Frederic Sowa <hannes@...essinduktion.org>,
	"Eric W. Biederman" <ebiederm@...ssion.com>
Cc:	davem@...emloft.net, viro@...IV.linux.org.uk, tgraf@...g.ch,
	netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
	Alexei Starovoitov <ast@...nel.org>
Subject: Re: [PATCH net-next 3/4] bpf: add support for persistent maps/progs

On 10/19/15 7:23 AM, Daniel Borkmann wrote:
>>> The mknod is not the holder; rather, the kobject, which should be
>>> represented in sysfs, will be. So you can still get the map major:minor
>>> by looking up the /dev file in the corresponding sysfs directory, or I
>>> think we should provide an 'unbind' file, which will drop the kobject if
>>> the user writes a '1' to it.
>>
>> I agree, this could still be done.

imo doing 'rm' is way cleaner than dealing with an 'unbind' file.

> As Hannes said, under /sys/class/bpf/ an admin can see all held nodes, so
> visibility is there for free at all times. The device management (creation/
> deletion) itself and the mknod's pointing to it are simply decoupled.
>
> This whole approach looks sound to me, also integrates nicely into the
> existing Linux facilities, and works on top of every fs supporting special
> files. Much cleaner than an extra file-system that would be required by a
> syscall in order to make the syscall work.

thanks for the explanations. I think I now have a complete picture of
how such cdevs will be used, and I don't like it.
There is nothing in linux or any unix that creates thousands of cdevs
on the fly, but here user apps will create and destroy them back and
forth, and they would need to do it quickly. The whole sysfs/kobj
baggage is completely unnecessary here. The kernel will consume more
memory for no real reason other than that cdevs are used to keep
progs/maps around.
imo an fs is cleaner, and we can tailor it to be similar to the cdev
style. For example, we can make bpffs automount in /sys/kernel/bpf/ as
the standard location and have one directory structure for all mounts
(like tracefs). Then within it, have an idr mechanism to create
bpf_progX and bpf_mapY special files via a BPF_PIN_FD bpf syscall
command with a single FD argument.
At this point the fs and cdev approaches look exactly the same from the
user's point of view, but the overhead of the fs is significantly lower,
and normal 'rm' works just fine and is much faster.
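
To make the naming idea concrete, here is a rough user-space sketch of how an idr-like allocator could hand out the X in bpf_progX and the Y in bpf_mapY. It is not kernel code and not the proposed implementation: struct id_pool, id_alloc(), id_free(), pin_name() and MAX_IDS are all hypothetical names, and keeping separate pools for progs and maps is an assumption. It only mimics handing out the lowest free ID and reusing it after removal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define MAX_IDS 64 /* hypothetical cap, for this sketch only */

enum obj_type { OBJ_PROG, OBJ_MAP };

/* Toy stand-in for an idr: one "in use" flag per ID. */
struct id_pool {
	unsigned char used[MAX_IDS];
};

/* Hand out the lowest free ID, or -1 if the pool is exhausted. */
int id_alloc(struct id_pool *p)
{
	for (int i = 0; i < MAX_IDS; i++) {
		if (!p->used[i]) {
			p->used[i] = 1;
			return i;
		}
	}
	return -1;
}

/* Release an ID so a later allocation can reuse it (the 'rm' case). */
void id_free(struct id_pool *p, int id)
{
	assert(id >= 0 && id < MAX_IDS);
	p->used[id] = 0;
}

/*
 * Allocate an ID and write the file name ("bpf_prog<ID>" or
 * "bpf_map<ID>") into buf. Returns the ID, or -1 on failure.
 */
int pin_name(struct id_pool *p, enum obj_type t, char *buf, size_t len)
{
	int id = id_alloc(p);
	int n;

	if (id < 0)
		return -1;
	n = snprintf(buf, len, "%s%d",
		     t == OBJ_PROG ? "bpf_prog" : "bpf_map", id);
	if (n < 0 || (size_t)n >= len) {
		/* Name did not fit: give the ID back. */
		id_free(p, id);
		return -1;
	}
	return id;
}
```

With a zero-initialized pool per object type, the first pin_name() call for a prog yields "bpf_prog0", the next "bpf_prog1", and after id_free() on 0 the following call gets "bpf_prog0" again. User space never picks the name, which is why a single FD argument would be enough.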
