Message-ID: <57172ED3.30101@6wind.com>
Date:	Wed, 20 Apr 2016 09:25:07 +0200
From:	Quentin Monnet <quentin.monnet@...nd.com>
To:	Alexei Starovoitov <alexei.starovoitov@...il.com>,
	Daniel Borkmann <daniel@...earbox.net>
Cc:	netdev@...r.kernel.org
Subject: Re: [PATCH net-next 0/2] act_bpf, cls_bpf: send eBPF bytecode through

Hi Daniel, Alexei, and many thanks for your answers,

2016-04-15 (11:44 UTC-0700) ~ Alexei Starovoitov:
> On Fri, Apr 15, 2016 at 12:41:05PM +0200, Daniel Borkmann wrote:
>> Hi Quentin,
>>
>> On 04/15/2016 12:07 PM, Quentin Monnet wrote:
>>> When a new BPF traffic control filter or action is set up with tc, the
>>> bytecode is sent back to userspace through a netlink socket for cBPF, but
>>> not for eBPF (the file descriptor pointing to the object file containing
>>> the bytecode is sent instead).
>>>
>>> This patch makes cls_bpf and act_bpf modules send the bytecode for eBPF as
>>> well (in addition to the file descriptor).
>>>
[…]
>>
>> Thanks for working on this, but it's unfortunately not that easy. Let
>> me ask, what would be the intended use-case to dump the insns?
> 
> +1
> 
>> I'm asking because if you dump them as-is, then a reinject at a later
>> time of that bytecode back into the kernel will most likely be rejected
>> by the verifier.
>>
>> This is because on load time, verifier does rewrites/expansion on some
>> of the insns (f.e. map pointers, helper functions, ctx access etc, see
>> also appendix in [1]), so the code as seen in the kernel would need to
>> be sanitized first.
> 
> +1
> we had similar discussion about this in seccomp context and decided that
> the only sensible way is to keep original instructions, but it's wasteful
> to do unconditionally and snapshotting of maps is not possible,
> so there was no use for such dumping facility other than debugging.
> Is it what the patch after?
> We need to discuss it in the proper context.

I am experimenting with BPF, and so far I was just trying to dump the
bytecode that tc sends to the kernel. I had not realized that the
verifier rewrites some of the instructions at load time. I agree that a
more complete debugging solution would also need some way to get a
snapshot of the maps.
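
To make the rewrite issue discussed above concrete, here is a minimal
userspace sketch of the map fd handling. It is not the kernel code. The
struct is a simplified copy of the bpf_insn layout, and fake_map_addr()
is a hypothetical stand-in for the kernel's fd-to-map lookup. It only
illustrates why instructions dumped after load no longer carry the file
descriptor that userspace originally supplied.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified copy of the kernel's struct bpf_insn (illustration only). */
struct insn {
	uint8_t code;        /* opcode */
	uint8_t dst_reg:4;   /* destination register */
	uint8_t src_reg:4;   /* source register, or a pseudo marker */
	int16_t off;         /* signed offset */
	int32_t imm;         /* signed immediate */
};

#define OP_LD_IMM64   0x18 /* BPF_LD | BPF_IMM | BPF_DW, a two-slot load */
#define PSEUDO_MAP_FD 1    /* BPF_PSEUDO_MAP_FD: imm holds a map fd */

/* Hypothetical stand-in for the fd -> kernel map object lookup. */
static uint64_t fake_map_addr(int32_t fd)
{
	return 0xffff880012340000ULL + (uint64_t)fd * 0x100;
}

/*
 * Replace each map fd carried by a 64-bit immediate load with the
 * (fake) address of the map, split across the two instruction slots.
 * Returns the number of loads that were rewritten.
 */
static int rewrite_map_fds(struct insn *prog, size_t len)
{
	int rewritten = 0;

	for (size_t i = 0; i + 1 < len; i++) {
		if (prog[i].code != OP_LD_IMM64)
			continue;

		if (prog[i].src_reg == PSEUDO_MAP_FD) {
			uint64_t addr = fake_map_addr(prog[i].imm);

			prog[i].imm = (int32_t)(uint32_t)addr;             /* low 32 bits */
			prog[i + 1].imm = (int32_t)(uint32_t)(addr >> 32); /* high 32 bits */
			prog[i].src_reg = 0; /* the pseudo marker is cleared */
			rewritten++;
		}
		i++; /* skip the second slot of the 64-bit load */
	}
	return rewritten;
}
```

After such a pass, the immediates hold an address that is meaningful
only inside the running kernel, so feeding the dumped instructions back
through bpf(2) would not pass as a valid map reference unless they were
sanitized first.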

>> Also, how would you make sense/transform maps into a meaningful
>> representation (probably possible to find a scheme when they are pinned)?
>>
>> Another possibility is that such programs need to be pinned (can be done
>> easily by tc in the background) and then implement a CRIU facility into
>> the bpf(2) syscall to retrieve them. tc could make use of this w/o too
>> much effort, and at the same time it would help CRIU folks, too. It
>> also seems cleaner to have only one central api (bpf(2)) to dump them,
>> but needs a bit of thought.
> 
> +1
> any debugging or criu needs to be done in a centralized way via syscall
> and/or bpffs.

Maintaining a central API around bpf() makes sense to me. I have been
looking at the BPF filesystem to see what information I can obtain from
it, but I do not understand it well yet. I read the log of Daniel's
commit b2197755b263 (“bpf: add support for persistent maps/progs”), but
I am unsure how to use it to gather data about the maps and programs
(if this is possible at all). I tried to set up some BPF filters that
use maps, but I could not find any file under /sys/fs/bpf/tc.

Would you have a pointer to some documentation about this filesystem? Or
is there only the kernel code?