netdev - Re: [PATCH net-next 0/3] eBPF Seccomp filters

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAGXu5jJZgrgLrhkZO33RNdOds8zwnnOZh+rqwguxJM+zm=EJ7g@mail.gmail.com>
Date:   Tue, 13 Feb 2018 12:35:46 -0800
From:   Kees Cook <keescook@...omium.org>
To:     Tom Hromatka <tom.hromatka@...cle.com>
Cc:     Network Development <netdev@...r.kernel.org>,
        Sargun Dhillon <sargun@...gun.me>,
        Will Drewry <wad@...omium.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        Linux Containers <containers@...ts.linux-foundation.org>,
        Alexei Starovoitov <ast@...nel.org>,
        Andy Lutomirski <luto@...capital.net>
Subject: Re: [PATCH net-next 0/3] eBPF Seccomp filters

On Tue, Feb 13, 2018 at 12:33 PM, Tom Hromatka <tom.hromatka@...cle.com> wrote:
> On Tue, Feb 13, 2018 at 7:42 AM, Sargun Dhillon <sargun@...gun.me> wrote:
>>
>> This patchset enables seccomp filters to be written in eBPF. Although,
>> this patchset doesn't introduce much of the functionality enabled by
>> eBPF, it lays the ground work for it.
>>
>> It also introduces the capability to dump eBPF filters via the PTRACE
>> API in order to make it so that CHECKPOINT_RESTORE will be satisifed.
>> In the attached samples, there's an example of this. One can then use
>> BPF_OBJ_GET_INFO_BY_FD in order to get the actual code of the program,
>> and use that at reload time.
>>
>> The primary reason for not adding maps support in this patchset is
>> to avoid introducing new complexities around PR_SET_NO_NEW_PRIVS.
>> If we have a map that the BPF program can read, it can potentially
>> "change" privileges after running. It seems like doing writes only
>> is safe, because it can be pure, and side effect free, and therefore
>> not negatively effect PR_SET_NO_NEW_PRIVS. Nonetheless, if we come
>> to an agreement, this can be in a follow-up patchset.
>
>
>
> Coincidentally I also sent an RFC for adding eBPF hash maps to the seccomp
> userspace mailing list just last week:
> https://groups.google.com/forum/#!topic/libseccomp/pX6QkVF0F74
>
> The kernel changes I proposed are in this email:
> https://groups.google.com/d/msg/libseccomp/pX6QkVF0F74/ZUJlwI5qAwAJ
>
> In that email thread, Kees requested that I try out a binary tree in cBPF
> and evaluate its performance.  I just got a rough prototype working, and
> while not as fast as an eBPF hash map, the cBPF binary tree was a
> significant
> improvement over the linear list of ifs that are currently generated.  Also,
> it only required changing a single function within the libseccomp libary
> itself.
>
> https://github.com/drakenclimber/libseccomp/commit/87b36369f17385f5a7a4d95101185577fbf6203b
>
> Here are the results I am currently seeing using an in-house customer's
> seccomp filter and a simplistic test program that runs getppid() thousands
> of times.
>
> Test Case                      minimum TSC ticks to make syscall
> ----------------------------------------------------------------
> seccomp disabled                                             620
> getppid() at the front of 306-syscall seccomp filter         722
> getppid() in middle of 306-syscall seccomp filter           1392
> getppid() at the end of the 306-syscall filter              2452
> seccomp using a 306-syscall-sized EBPF hash map              800
> cBPF filter using a binary tree                              922

I still think that's a crazy filter. :) It should be inverted to just
check the 26 syscalls and a final "greater than" test. I would expect
it to be faster still. :)

-Kees

-- 
Kees Cook
Pixel Security