netdev - Re: eBPF verifier slowness, more than 2 cpu seconds for about 600 instructions

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAADnVQ+CPP+tFR_9fh-omxQoJXqkuz3OV-MDSqjEDsVGnMXi3A@mail.gmail.com>
Date:   Wed, 26 Jun 2019 12:38:46 -0700
From:   Alexei Starovoitov <alexei.starovoitov@...il.com>
To:     Andreas Steinmetz <ast@...dv.de>,
        Daniel Borkmann <daniel@...earbox.net>,
        Andrii Nakryiko <andriin@...com>,
        Edward Cree <ecree@...arflare.com>
Cc:     bpf <bpf@...r.kernel.org>,
        Network Development <netdev@...r.kernel.org>
Subject: Re: eBPF verifier slowness, more than 2 cpu seconds for about 600 instructions

On Tue, Jun 18, 2019 at 2:40 PM Alexei Starovoitov
<alexei.starovoitov@...il.com> wrote:
>
> On Mon, Jun 17, 2019 at 11:26:28AM -0700, Alexei Starovoitov wrote:
> > On Sun, Jun 16, 2019 at 11:59 PM Alexei Starovoitov
> > <alexei.starovoitov@...il.com> wrote:
> > >
> > > On Thu, Jun 6, 2019 at 6:31 PM Andreas Steinmetz <ast@...dv.de> wrote:
> > > >
> > > > Below is the source in question. It may look a bit strange but I
> > > > had to extract it from the project and preset parameters to fixed
> > > > values.
> > > > It takes from 2.8 to 4.5 seconds to load, depending on the processor.
> > > > Just compile and run the code below.
> > >
> > > Thanks for the report.
> > > It's interesting one indeed.
> > > 600+ instructions consume
> > > processed 280464 insns (limit 1000000) max_states_per_insn 15
> > > total_states 87341 peak_states 580 mark_read 45
> > >
> > > The verifier finds a lot of different ways to go through branches
> > > in the program and majority of the states are not equivalent and
> > > do not help pruning, so it's doing full brute force walk of all possible
> > > combinations.
> > > We need to figure out whether there is a way to make it smarter.
> >
> > btw my pending backtracking logic helps it quite a bit:
> > processed 164110 insns (limit 1000000) max_states_per_insn 11
> > total_states 13398 peak_states 349 mark_read 10
> >
> > and it's 2x faster to verify, but 164k insns processed shows that
> > there is still room for improvement.
>
> Hi Andreas,
>
> Could you please create selftests/bpf/verifier/.c out of it?
> Currently we don't have a single test that exercises the verifier this way.
> Could you also annotate instructions with comments like you did
> at the top of your file?

Andreas, ping.

> The program logic is interesting.
> If my understanding of assembler is correct it has unrolled
> parsing of ipv6 extension headers. Then unrolled parsing of tcp options.
> The way the program is using packet pointers forces the verifier to try
> all possible combinations of extension headers and tcp options.
>
> The precise backtracking logic helps to reduce amount of walking.
> Also I think it's safe to reduce precision of variable part
> of packet pointers. The following patch on top of bounded loop
> series help to reduce it further.
>
> Original:
>   processed 280464 insns (limit 1000000) max_states_per_insn 15
>   total_states 87341 peak_states 580 mark_read 45
>
> Backtracking:
>   processed 164110 insns (limit 1000000) max_states_per_insn 11
>   total_states 13398 peak_states 349 mark_read 10
>
> Backtracking + pkt_ptr var precision:
>   processed 96739 insns (limit 1000000) max_states_per_insn 11
>   total_states 7891 peak_states 329 mark_read 10
>
> The patch helps w/o backtracking as well:
>   processed 165254 insns (limit 1000000) max_states_per_insn 15
>   total_states 51434 peak_states 572 mark_read 45
>
> Backtracking and bounded loop heuristics reduce total memory
> consumption quite a bit. Which was nice to see.
>
> Anyway would be great if you could create a test out of it.
> Would be even more awesome if you convert it to C code
> and try to use bounded loops to parse extension headers
> and tcp options. That would be a true test for both loops
> and 'reduce precision' features.
>
> Thanks!
>
> From 4681224057af73335de0fdd629a2149bad91d59d Mon Sep 17 00:00:00 2001
> From: Alexei Starovoitov <ast@...nel.org>
> Date: Tue, 18 Jun 2019 13:40:29 -0700
> Subject: [PATCH bpf-next] bpf: relax tracking of variable offset in packet pointers
>
> Signed-off-by: Alexei Starovoitov <ast@...nel.org>
> ---
>  kernel/bpf/verifier.c | 21 +++++++++++++++++++++
>  1 file changed, 21 insertions(+)
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index d2c8a6677ac4..e37c69ad57b3 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -3730,6 +3730,27 @@ static int adjust_ptr_min_max_vals(struct bpf_verifier_env *env,
>                         dst_reg->id = ++env->id_gen;
>                         /* something was added to pkt_ptr, set range to zero */
>                         dst_reg->raw = 0;
> +                       if (bpf_prog_is_dev_bound(env->prog->aux))
> +                               /* nfp offload needs accurate max_pkt_offset */
> +                               break;
> +                       if (env->strict_alignment)
> +                               break;
> +                       /* scalar added to pkt pointer is within BPF_MAX_VAR_OFF bounds.
> +                        * 64-bit pkt_data pointer can be safely compared with pkt_data_end
> +                        * even on 32-bit architectures.
> +                        * In case this scalar was positive the verifier
> +                        * doesn't need to track it precisely.
> +                        */
> +                       if (dst_reg->smin_value >= 0)
> +                               /* clear variable part of pkt pointer */
> +                               __mark_reg_known_zero(dst_reg);
> +                               /* no need to clear dst_reg->off.
> +                                * It's a known part of the pointer.
> +                                * When this pkt_ptr compared with pkt_end
> +                                * the 'range' will be initialized from 'off' and
> +                                * *(u8*)(dst_reg - off) is still more than packet start,
> +                                * since unknown value was positive.
> +                                */
>                 }
>                 break;
>         case BPF_SUB:
> --
> 2.20.0
>