Message-Id: <1444991103.2861759.411876897.42C807BD@webmail.messagingengine.com>
Date:	Fri, 16 Oct 2015 12:25:03 +0200
From:	Hannes Frederic Sowa <hannes@...essinduktion.org>
To:	Daniel Borkmann <daniel@...earbox.net>, davem@...emloft.net
Cc:	ast@...mgrid.com, viro@...IV.linux.org.uk, ebiederm@...ssion.com,
	tgraf@...g.ch, netdev@...r.kernel.org,
	linux-kernel@...r.kernel.org, Alexei Starovoitov <ast@...nel.org>
Subject: Re: [PATCH net-next 3/4] bpf: add support for persistent maps/progs

On Fri, Oct 16, 2015, at 03:09, Daniel Borkmann wrote:
> This eventually leads us to this patch, which implements a minimal
> eBPF file system. The idea is a bit similar, but to the point that
> these inodes reside at one or multiple mount points. A directory
> hierarchy can be tailored to a specific application use-case from the
> various subsystem users and maps/progs pinned inside it. Two new eBPF
> commands (BPF_PIN_FD, BPF_NEW_FD) have been added to the syscall in
> order to create one or multiple special inodes from an existing file
> descriptor that points to a map/program (we call it eBPF fd pinning),
> or to create a new file descriptor from an existing special inode.
> BPF_PIN_FD requires CAP_SYS_ADMIN capabilities, whereas BPF_NEW_FD
> can also be done unprivileged when having appropriate permissions
> to the path.

In my opinion this is very un-Unix-like, to say the least.

Namespaces dealt with the same problem at some point; nowadays they use
bind mounts of /proc/$$/ns/* to some place in the file hierarchy to keep
the namespace alive. This at least allows someone to build up their own
hierarchy with normal unix tools instead of hiding it inside a C program.
For file descriptors we already have /proc/$$/fd/*, but it seems that
doesn't work out of the box nowadays.
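
To make the comparison concrete, here is a rough sketch of both
mechanisms. The helper names (ns_path, pin_namespace, reopen_fd) are
made up for illustration, and the mount call assumes a Linux libc
reachable through ctypes. pin_namespace needs CAP_SYS_ADMIN and an
existing regular file as the target. Note that a /proc/self/fd entry
only exists while the owning process is alive, so by itself it gives no
persistence.

```python
import ctypes
import os

MS_BIND = 4096  # value of MS_BIND in <sys/mount.h>


def ns_path(pid, kind):
    """Return the procfs path of a namespace, e.g. /proc/1234/ns/net."""
    return "/proc/%s/ns/%s" % (pid, kind)


def pin_namespace(pid, kind, target):
    """Bind-mount a namespace file onto target (hypothetical helper).

    The bind mount holds a reference, so the namespace stays alive
    after the last process in it has exited. Requires CAP_SYS_ADMIN.
    """
    libc = ctypes.CDLL(None, use_errno=True)
    libc.mount.argtypes = (ctypes.c_char_p, ctypes.c_char_p,
                           ctypes.c_char_p, ctypes.c_ulong,
                           ctypes.c_void_p)
    rc = libc.mount(ns_path(pid, kind).encode(), target.encode(),
                    None, MS_BIND, None)
    if rc != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err), target)


def reopen_fd(fd, flags=os.O_RDONLY):
    """Open a fresh descriptor for an already open file via procfs."""
    return os.open("/proc/self/fd/%d" % fd, flags)
```

With sufficient privileges, something like
pin_namespace(os.getpid(), "net", "/run/myns") would keep the network
namespace around; umount(8) on the target releases it again.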

I don't know how many objects bpf should be able to handle, or whether
such a bind-mount based solution would work; I guess not.

I still favor a user space approach. Subsystems which use eBPF in a way
that no user space program needs to be running to control them would
need to export the fds themselves, e.g. something like sysfs/kobject
for tc? The hierarchy would then be under the control of the subsystem,
which could also create a proper naming hierarchy or maybe even use an
already given one. Do most other eBPF users really need to persist file
descriptors somewhere without user space control and pick them up
later?

Sorry for the rant and thanks for posting this patchset,
Hannes