netdev - Re: bpf bounded loops. Was: [flamebait] xdp

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAMp4zn9xEDpe2h8iGeHog+Ztm6V9PXR47GE=ptjEGtMRCP9BPg@mail.gmail.com>
Date:   Sat, 3 Dec 2016 01:11:44 -0800
From:   Sargun Dhillon <sargun@...gun.me>
To:     Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc:     John Fastabend <john.fastabend@...il.com>,
        Hannes Frederic Sowa <hannes@...essinduktion.org>,
        Tom Herbert <tom@...bertland.com>, Thomas Graf <tgraf@...g.ch>,
        Linux Kernel Network Developers <netdev@...r.kernel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        "David S. Miller" <davem@...emloft.net>
Subject: Re: bpf bounded loops. Was: [flamebait] xdp

On Fri, Dec 2, 2016 at 4:20 PM, Alexei Starovoitov
<alexei.starovoitov@...il.com> wrote:
> On Fri, Dec 02, 2016 at 11:42:15AM -0800, John Fastabend wrote:
>> >> As far as pattern search for DNS packets...
>> >> it was requested by Cloudflare guys back in March:
>> >> https://github.com/iovisor/bcc/issues/471
>> >> and it is useful for several tracing use cases as well.
>> >> Unfortunately no one had time to implement it yet.
>> >
>> > The string operations you proposed on the other hand, which would count
>> > as one eBPF instructions, would give a lot more flexibility and allow
>> > more cycles to burn, but don't help parsing binary protocols like IPv6
>> > extension headers.
>
> these are two separate things. we need pattern search regardless
> of bounded loops. bpf program shouldn't be doing any complicated
> algorithms. The main reasons to have loops are:
> - speed up execution (smaller I-cache footprint)
> - avoid forcing compiler to unroll loops (easier for users)
> - support loops where unroll is not possible (like example below)
>
>> My rough thinking on this was the verifier had to start looking for loop
>> invariants and to guarantee termination. Sounds scary in general but
>> LLVM could put these in some normal form for us and the verifier could
>> only accept decreasing loops, the invariants could be required to be
>> integers, etc. By simplifying the loop enough the problem becomes
>> tractable.
>
> yep. I think what Hannes was proposing earlier is straighforward
> to implement for a compiler guy. The following:
> for (int i = 0; i < (var & 0xff); i++)
>   sum += map->value[i];  /* map value_size >= 0xff */
> is obviously bounded and dataflow analysis can easily prove
> that all memory operations are valid.
> Static analysis tools do way way more than this.
>
>> I think this would be better than new instructions and/or multiple
>> verifiers.
>
> agree that it's better than new instructions that would have
> required JIT changes. Though there are pros to new insns too :)
>
Has there been any thought to adding a map, or foldl helper a la the
tail call helper? Although you'd want to allocate an accumulator of
kinds for the foldl, I imagine this could be bounded in size quite
small for things like binary parsing operations -- we could reasonably
allow the accumulator to be updated, and return a special value to
exit the loop. I also started working on a map function a while ago
which would call a bpf program for each set cell in an arraymap, and
each set key/value in a hash map.

My intent was to intentionally make it so I could do this on the
context itself, so I could do encryption in BPF. I wanted to be able
to fold over the packet 16, or 32 bytes at a time, and (1) modify the
content, and (2) generate the authentication tag.

Any opinions on that approach?