Message-ID: <56214FAC.5060704@plumgrid.com>
Date: Fri, 16 Oct 2015 12:27:40 -0700
From: Alexei Starovoitov <ast@...mgrid.com>
To: "Eric W. Biederman" <ebiederm@...ssion.com>,
Daniel Borkmann <daniel@...earbox.net>
Cc: Hannes Frederic Sowa <hannes@...essinduktion.org>,
davem@...emloft.net, viro@...IV.linux.org.uk, tgraf@...g.ch,
netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
Alexei Starovoitov <ast@...nel.org>
Subject: Re: [PATCH net-next 3/4] bpf: add support for persistent maps/progs

On 10/16/15 11:41 AM, Eric W. Biederman wrote:
> Daniel Borkmann <daniel@...earbox.net> writes:
>
>> On 10/16/2015 07:42 PM, Alexei Starovoitov wrote:
>>> On 10/16/15 10:21 AM, Hannes Frederic Sowa wrote:
>>>> Another question:
>>>> Should multiple mounts of the filesystem result in an empty fs (a new
>>>> instance) or in one where one can see other ebpf-fs entities? I think
>>>> Daniel wanted to already use the mountpoint as some kind of hierarchy
>>>> delimiter. I would have used directories for that and multiple mounts
>>>> would then have resulted in the same content of the filesystem. IMHO
>>>> this would remove some ambiguity but then the question arises how this
>>>> is handled in a namespaced environment. Was there some specific reason
>>>> to do so?
>>>
>>> That's an interesting question!
>>> I think all mounts should be independent.
>>> I can see tracing using one and networking using another one
>>> with different hierarchies suitable for their own use cases.
>>> What's an advantage to have the same content everywhere?
>>> Feels harder to manage, since different users would need to
>>> coordinate.
>>
>> I initially had it as a mount_single() file system, where I was thinking
>> to have an entry under /sys/fs/bpf/, so all subsystems would work on top
>> of that mount point, but for the same reasons above I lifted that restriction.
>
> I am missing something.
>
> When I suggested using a filesystem it was my thought there would be
> exactly one superblock per map, and the map would be specified at mount
> time. You clearly are not implementing that.

I don't think it's practical to have an sb per map, since that would also
mean an sb per prog, and that won't scale.
Also, a map today is an fd that belongs to a process. I cannot see
an API for a C program to do a 'mount of an FD' that wouldn't look like
an ugly hack.

> A filesystem per map makes sense as you have a key-value store with one
> file per key.
>
> The idea is that something resembling your bpf_pin_fd function would be
> the mount system call for the filesystem.
>
> Then the keys in the map could be read by "ls /mountpoint/".
> Key values could be inspected with "cat /mountpoint/key".

Yes, that is still the goal for follow-up patches, but contained
within a given bpffs. A bpf_pin_fd-like command for the bpf syscall
would create files for the keys in a map and allow 'cat' via open/read.
Such an API would be much cleaner from a C app's point of view.
Potentially we can allow mounting a file created via BPF_PIN_FD,
which would expand into keys/values.
All of that is in our future plans.

There, actually, the main contention point is how to represent keys
and values: is a key a hex representation, or do we need
pretty-printers via a format string or a schema? Etc.
We tried a few ideas for representing keys in our FUSE implementations,
but don't have an agreement yet.
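
To make the hex option concrete, here is a minimal sketch of turning a raw
key into a file name. It is only an illustration, not code from this patch
set: the helper name key_to_hex and its signature are hypothetical, and it
assumes the key is treated as an opaque byte string.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of any posted patch): render a raw map
 * key as lowercase hex so it could be used as a file name.
 * Returns the number of characters written, excluding the trailing NUL,
 * or -1 if the output buffer is too small.
 */
static int key_to_hex(const void *key, size_t key_len,
		      char *out, size_t out_len)
{
	static const char digits[] = "0123456789abcdef";
	const unsigned char *p = key;
	size_t i;

	/* Two hex digits per key byte, plus the terminating NUL. */
	if (out_len < key_len * 2 + 1)
		return -1;

	for (i = 0; i < key_len; i++) {
		out[2 * i]     = digits[p[i] >> 4];   /* high nibble */
		out[2 * i + 1] = digits[p[i] & 0x0f]; /* low nibble */
	}
	out[2 * key_len] = '\0';
	return (int)(2 * key_len);
}
```

A 4-byte key stored as the bytes 0a 00 00 01 becomes the name "0a000001".
The name follows the in-memory byte order of the key, so an integer key
reads differently on little- and big-endian machines, which is one reason
a format string or schema may still be wanted.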