[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20201116063150.GU16653@GaryWorkstation>
Date: Mon, 16 Nov 2020 14:31:50 +0800
From: Gary Lin <glin@...e.com>
To: Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc: Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Network Development <netdev@...r.kernel.org>,
bpf <bpf@...r.kernel.org>, andreas.taschner@...e.com
Subject: Re: [PATCH RFC] bpf, x64: allow not-converged images when
BPF_JIT_ALWAYS_ON is set
On Fri, Nov 13, 2020 at 05:48:31PM -0800, Alexei Starovoitov wrote:
> On Fri, Nov 13, 2020 at 12:40 AM Gary Lin <glin@...e.com> wrote:
> >
> > The x64 bpf jit expects the bpf images converge within the given passes.
> > However there is a corner case:
> >
> > l0: ldh [4]
> > l1: jeq #0x537d, l2, l40
> > l2: ld [0]
> > l3: jeq #0xfa163e0d, l4, l40
> > l4: ldh [12]
> > l5: ldx #0xe
> > l6: jeq #0x86dd, l41, l7
> > l7: jeq #0x800, l8, l41
> > l8: ld [x+16]
> > l9: ja 41
> >
> > [... repeated ja 41 ]
> >
> > l40: ja 41
> > l41: ret #0
> > l42: ld #len
> > l43: ret a
> >
> > The bpf program contains 32 "ja 41" and do_jit() only removes one "ja 41"
> > right before "l41: ret #0" for offset==0 in each pass, so
> > bpf_int_jit_compile() needs to run do_jit() at least 32 times to
> > eliminate those JMP instructions. Since the current max number of passes
> > is 20, the bpf program couldn't converge within 20 passes and got rejected
> > when BPF_JIT_ALWAYS_ON is set even though it's legit as a classic socket
> > filter.
> >
> > A not-converged image may be not optimal but at least the bpf
> > instructions are translated into x64 machine code. Maybe we could just
> > issue a warning instead so that the program is still loaded and the user
> > is also notified.
>
> Non-convergence is not about being optimal. It's about correctness.
> If size is different it likely means that at least one jump has the
> wrong offset.
>
Ah, I see.
> Bumping from 20 to 64 also won't solve it.
> There could be a case where 64 isn't enough either.
True. Increasing the number of passes is just a workaround.
> One of the test_bpf.ko tests can hit any limit, iirc.
Thanks for the pointer. Will look into the tests.
>
> Also we've seen a case where JIT might never converge.
> The iteration N can have size 40, iteration N+1 size 38, iteration N+2 size 40
> and keep oscillating like this.
>
> I think the fix could be is to avoid optimality in size when pass
> number is getting large.
> Like after pass > 10 BPF_JA could always use 32-bit offset regardless
> of actual addrs[i + insn->off] - addrs[i]; difference.
> There could be other solutions too.
>
So the size convergence can be ignored if all BPF_JMPs are translated
into 32-bit offset jmp?
Thanks,
Gary Lin
Powered by blists - more mailing lists