netdev - Re: eBPF verifier slowness, more than 2 cpu seconds for about 600 instructions

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20190618214028.y2qzbtonozr5cc7a@ast-mbp.dhcp.thefacebook.com>
Date:   Tue, 18 Jun 2019 14:40:30 -0700
From:   Alexei Starovoitov <alexei.starovoitov@...il.com>
To:     Andreas Steinmetz <ast@...dv.de>,
        Daniel Borkmann <daniel@...earbox.net>,
        Andrii Nakryiko <andriin@...com>,
        Edward Cree <ecree@...arflare.com>
Cc:     bpf <bpf@...r.kernel.org>, netdev@...r.kernel.org
Subject: Re: eBPF verifier slowness, more than 2 cpu seconds for about 600
 instructions

On Mon, Jun 17, 2019 at 11:26:28AM -0700, Alexei Starovoitov wrote:
> On Sun, Jun 16, 2019 at 11:59 PM Alexei Starovoitov
> <alexei.starovoitov@...il.com> wrote:
> >
> > On Thu, Jun 6, 2019 at 6:31 PM Andreas Steinmetz <ast@...dv.de> wrote:
> > >
> > > Below is the source in question. It may look a bit strange but I
> > > had to extract it from the project and preset parameters to fixed
> > > values.
> > > It takes from 2.8 to 4.5 seconds to load, depending on the processor.
> > > Just compile and run the code below.
> >
> > Thanks for the report.
> > It's interesting one indeed.
> > 600+ instructions consume
> > processed 280464 insns (limit 1000000) max_states_per_insn 15
> > total_states 87341 peak_states 580 mark_read 45
> >
> > The verifier finds a lot of different ways to go through branches
> > in the program and majority of the states are not equivalent and
> > do not help pruning, so it's doing full brute force walk of all possible
> > combinations.
> > We need to figure out whether there is a way to make it smarter.
> 
> btw my pending backtracking logic helps it quite a bit:
> processed 164110 insns (limit 1000000) max_states_per_insn 11
> total_states 13398 peak_states 349 mark_read 10
> 
> and it's 2x faster to verify, but 164k insns processed shows that
> there is still room for improvement.

Hi Andreas,

Could you please create selftests/bpf/verifier/.c out of it?
Currently we don't have a single test that exercises the verifier this way.
Could you also annotate instructions with comments like you did
at the top of your file?
The program logic is interesting.
If my understanding of assembler is correct it has unrolled
parsing of ipv6 extension headers. Then unrolled parsing of tcp options.
The way the program is using packet pointers forces the verifier to try
all possible combinations of extension headers and tcp options.

The precise backtracking logic helps to reduce amount of walking.
Also I think it's safe to reduce precision of variable part
of packet pointers. The following patch on top of bounded loop
series help to reduce it further.

Original:
  processed 280464 insns (limit 1000000) max_states_per_insn 15
  total_states 87341 peak_states 580 mark_read 45

Backtracking:
  processed 164110 insns (limit 1000000) max_states_per_insn 11
  total_states 13398 peak_states 349 mark_read 10

Backtracking + pkt_ptr var precision:
  processed 96739 insns (limit 1000000) max_states_per_insn 11
  total_states 7891 peak_states 329 mark_read 10

The patch helps w/o backtracking as well:
  processed 165254 insns (limit 1000000) max_states_per_insn 15
  total_states 51434 peak_states 572 mark_read 45

Backtracking and bounded loop heuristics reduce total memory
consumption quite a bit. Which was nice to see.

Anyway would be great if you could create a test out of it.
Would be even more awesome if you convert it to C code
and try to use bounded loops to parse extension headers
and tcp options. That would be a true test for both loops
and 'reduce precision' features.

Thanks!

>From 4681224057af73335de0fdd629a2149bad91d59d Mon Sep 17 00:00:00 2001
From: Alexei Starovoitov <ast@...nel.org>
Date: Tue, 18 Jun 2019 13:40:29 -0700
Subject: [PATCH bpf-next] bpf: relax tracking of variable offset in packet pointers

Signed-off-by: Alexei Starovoitov <ast@...nel.org>
---
 kernel/bpf/verifier.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index d2c8a6677ac4..e37c69ad57b3 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -3730,6 +3730,27 @@ static int adjust_ptr_min_max_vals(struct bpf_verifier_env *env,
 			dst_reg->id = ++env->id_gen;
 			/* something was added to pkt_ptr, set range to zero */
 			dst_reg->raw = 0;
+			if (bpf_prog_is_dev_bound(env->prog->aux))
+				/* nfp offload needs accurate max_pkt_offset */
+				break;
+			if (env->strict_alignment)
+				break;
+			/* scalar added to pkt pointer is within BPF_MAX_VAR_OFF bounds.
+			 * 64-bit pkt_data pointer can be safely compared with pkt_data_end
+			 * even on 32-bit architectures.
+			 * In case this scalar was positive the verifier
+			 * doesn't need to track it precisely.
+			 */
+			if (dst_reg->smin_value >= 0)
+				/* clear variable part of pkt pointer */
+				__mark_reg_known_zero(dst_reg);
+				/* no need to clear dst_reg->off.
+				 * It's a known part of the pointer.
+				 * When this pkt_ptr compared with pkt_end
+				 * the 'range' will be initialized from 'off' and
+				 * *(u8*)(dst_reg - off) is still more than packet start,
+				 * since unknown value was positive.
+				 */
 		}
 		break;
 	case BPF_SUB:
-- 
2.20.0