Message-ID: <20220819034635.67875-1-kuniyu@amazon.com>
Date: Thu, 18 Aug 2022 20:46:35 -0700
From: Kuniyuki Iwashima <kuniyu@...zon.com>
To: <alexei.starovoitov@...il.com>
CC: <andrii@...nel.org>, <ast@...nel.org>, <bpf@...r.kernel.org>,
<daniel@...earbox.net>, <kuni1840@...il.com>, <kuniyu@...zon.com>,
<netdev@...r.kernel.org>
Subject: Re: [PATCH v1 bpf 1/4] bpf: Fix data-races around bpf_jit_enable.
From: Alexei Starovoitov <alexei.starovoitov@...il.com>
Date: Thu, 18 Aug 2022 20:27:49 -0700
> On Thu, Aug 18, 2022 at 6:15 PM Kuniyuki Iwashima <kuniyu@...zon.com> wrote:
> >
> > From: Alexei Starovoitov <alexei.starovoitov@...il.com>
> > Date: Thu, 18 Aug 2022 18:05:44 -0700
> > > On Thu, Aug 18, 2022 at 5:56 PM Kuniyuki Iwashima <kuniyu@...zon.com> wrote:
> > > >
> > > > From: Alexei Starovoitov <alexei.starovoitov@...il.com>
> > > > Date: Thu, 18 Aug 2022 17:13:25 -0700
> > > > > On Thu, Aug 18, 2022 at 5:07 PM Kuniyuki Iwashima <kuniyu@...zon.com> wrote:
> > > > > >
> > > > > > From: Alexei Starovoitov <alexei.starovoitov@...il.com>
> > > > > > Date: Thu, 18 Aug 2022 15:49:46 -0700
> > > > > > > On Wed, Aug 17, 2022 at 9:24 PM Kuniyuki Iwashima <kuniyu@...zon.com> wrote:
> > > > > > > >
> > > > > > > > A sysctl variable bpf_jit_enable is accessed concurrently, and there is
> > > > > > > > always a chance of data-race. So, all readers and a writer need some
> > > > > > > > basic protection to avoid load/store-tearing.
> > > > > > > >
> > > > > > > > Fixes: 0a14842f5a3c ("net: filter: Just In Time compiler for x86-64")
> > > > > > > > Signed-off-by: Kuniyuki Iwashima <kuniyu@...zon.com>
> > > > > > > > ---
> > > > > > > > arch/arm/net/bpf_jit_32.c | 2 +-
> > > > > > > > arch/arm64/net/bpf_jit_comp.c | 2 +-
> > > > > > > > arch/mips/net/bpf_jit_comp.c | 2 +-
> > > > > > > > arch/powerpc/net/bpf_jit_comp.c | 5 +++--
> > > > > > > > arch/riscv/net/bpf_jit_core.c | 2 +-
> > > > > > > > arch/s390/net/bpf_jit_comp.c | 2 +-
> > > > > > > > arch/sparc/net/bpf_jit_comp_32.c | 5 +++--
> > > > > > > > arch/sparc/net/bpf_jit_comp_64.c | 5 +++--
> > > > > > > > arch/x86/net/bpf_jit_comp.c | 2 +-
> > > > > > > > arch/x86/net/bpf_jit_comp32.c | 2 +-
> > > > > > > > include/linux/filter.h | 2 +-
> > > > > > > > net/core/sysctl_net_core.c | 4 ++--
> > > > > > > > 12 files changed, 19 insertions(+), 16 deletions(-)
> > > > > > > >
> > > > > > > > diff --git a/arch/arm/net/bpf_jit_32.c b/arch/arm/net/bpf_jit_32.c
> > > > > > > > index 6a1c9fca5260..4b6b62a6fdd4 100644
> > > > > > > > --- a/arch/arm/net/bpf_jit_32.c
> > > > > > > > +++ b/arch/arm/net/bpf_jit_32.c
> > > > > > > > @@ -1999,7 +1999,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
> > > > > > > > }
> > > > > > > > flush_icache_range((u32)header, (u32)(ctx.target + ctx.idx));
> > > > > > > >
> > > > > > > > - if (bpf_jit_enable > 1)
> > > > > > > > + if (READ_ONCE(bpf_jit_enable) > 1)
> > > > > > >
> > > > > > > Nack.
> > > > > > > Even if the compiler decides to use single byte loads for some
> > > > > > > odd reason there is no issue here.
> > > > > >
> > > > > > I see, and same for 2nd/3rd patches, right?
> > > > > >
> > > > > > Then how about this part?
> > > > > > It's not data-race nor problematic in practice, but should the value be
> > > > > > consistent in the same function?
> > > > > > The 2nd/3rd patches also have this kind of part.
> > > > >
> > > > > The bpf_jit_enable > 1 is unsupported and buggy.
> > > > > It will be removed eventually.
> > > >
> > > > Ok, then I'm fine with no change.
> > > >
> > > > >
> > > > > Why are you doing these changes if they're not fixing any bugs?
> > > > > Just to shut up some race sanitizer?
> > > >
> > > > For data-race, that's one of the reasons. I should have made sure the change
> > > > fixes an actual bug, my apologies.
> > > >
> > > > For the two reads, an inconsistent snapshot feels buggy to me.
> > > > e.g.) in the 2nd patch, bpf_jit_blinding_enabled() could return true
> > > > even though bpf_jit_harden == 0. Given that the previous value was 1, it
> > > > looks like a timing issue, but it is not intuitive.
> > >
> > > it's also used in bpf_jit_kallsyms_enabled.
> > > So the patch 2 doesn't make anything 'intuitive'.
> >
> > Exactly...
> >
> > So finally, should I repost 4th patch or drop it?
>
> This?
> - if (atomic_long_add_return(size, &bpf_jit_current) > bpf_jit_limit) {
> + if (atomic_long_add_return(size, &bpf_jit_current) >
> READ_ONCE(bpf_jit_limit)) {
>
> same question. What does it fix?
bpf_jit_limit is a long, so compiler optimisation could split the load
into smaller accesses (load tearing [0]). A torn read that races with
concurrent writes could then see a bigger limit than intended:

  write      0xFFFFFFFF00000000
  torn read  0xFFFFFFFF            (upper half)
  write      0x00000000FFFFFFFF
  torn read  0xFFFFFFFFFFFFFFFF    (lower half, combined with the above)

[0]: https://lwn.net/Articles/793253/#Load%20Tearing
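
The interleaving above can be replayed deterministically in userspace. This is only a sketch, not kernel code: shared_limit, load_upper_half(), load_lower_half() and replay_torn_read() are hypothetical names, the value is treated as an unsigned 64-bit bit pattern, and the split into two 32-bit loads is modelled by hand rather than produced by a compiler.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit sysctl value such as bpf_jit_limit. */
uint64_t shared_limit;

/* Model a 64-bit load that was split into two 32-bit loads. */
uint32_t load_upper_half(void)
{
	return (uint32_t)(shared_limit >> 32);
}

uint32_t load_lower_half(void)
{
	return (uint32_t)(shared_limit & 0xFFFFFFFFu);
}

/*
 * Replay the sequence from the message: the reader loads the upper
 * half after the first write and the lower half after the second
 * write, then combines the two halves into one value.
 */
uint64_t replay_torn_read(uint64_t first_write, uint64_t second_write)
{
	uint64_t observed;

	shared_limit = first_write;
	observed = (uint64_t)load_upper_half() << 32;

	shared_limit = second_write;
	observed |= load_lower_half();

	return observed;
}

/* The reader ends up with a value that no writer ever stored. */
void check_message_example(void)
{
	assert(replay_torn_read(0xFFFFFFFF00000000ULL,
				0x00000000FFFFFFFFULL) ==
	       0xFFFFFFFFFFFFFFFFULL);
}
```

Calling check_message_example() from a test harness passes silently. When both writes store the same value, the replayed read returns that value unchanged, so the mismatch only appears when the two halves come from different writes.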