Message-ID: <CANk7y0jtm9yYobPLsMEHAem+R-wKjVOLWo=EeU-bojYks9tetQ@mail.gmail.com>
Date:   Thu, 22 Jun 2023 10:47:08 +0200
From:   Puranjay Mohan <puranjay12@...il.com>
To:     Mark Rutland <mark.rutland@....com>
Cc:     ast@...nel.org, daniel@...earbox.net, andrii@...nel.org,
        martin.lau@...ux.dev, song@...nel.org, catalin.marinas@....com,
        bpf@...r.kernel.org, kpsingh@...nel.org,
        linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH bpf-next v3 3/3] bpf, arm64: use bpf_jit_binary_pack_alloc

Hi Mark,

On Thu, Jun 22, 2023 at 10:23 AM Mark Rutland <mark.rutland@....com> wrote:
>
> On Wed, Jun 21, 2023 at 10:57:20PM +0200, Puranjay Mohan wrote:
> > On Wed, Jun 21, 2023 at 5:31 PM Mark Rutland <mark.rutland@....com> wrote:
> > > On Mon, Jun 19, 2023 at 10:01:21AM +0000, Puranjay Mohan wrote:
> > > > @@ -1562,34 +1610,39 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
> > > >
> > > >       /* 3. Extra pass to validate JITed code. */
> > > >       if (validate_ctx(&ctx)) {
> > > > -             bpf_jit_binary_free(header);
> > > >               prog = orig_prog;
> > > > -             goto out_off;
> > > > +             goto out_free_hdr;
> > > >       }
> > > >
> > > >       /* And we're done. */
> > > >       if (bpf_jit_enable > 1)
> > > >               bpf_jit_dump(prog->len, prog_size, 2, ctx.image);
> > > >
> > > > -     bpf_flush_icache(header, ctx.image + ctx.idx);
> > > > +     bpf_flush_icache(ro_header, ctx.ro_image + ctx.idx);
> > >
> > > I think this is too early; we haven't copied the instructions into the
> > > ro_header yet, so that still contains stale instructions.
> > >
> > > IIUC the whole point of this is to pack multiple programs into shared ROX
> > > pages, and so there can be an executable mapping of the RO page at this point,
> > > and the CPU can fetch stale instructions through that.
> > >
> > > Note that *regardless* of whether there is an executable mapping at this point
> > > (and even if no executable mapping exists until after the copy), we at least
> > > need a data cache clean to the PoU *after* the copy (so fetches don't get a
> > > stale value from the PoU), and the I-cache maintenance has to happen on the VA
> > > the instructions will be executed from (or VIPT I-caches can still contain stale
> > > instructions).
> >
> > Thanks for catching this; it is a big miss on my side.
> >
> > I was able to reproduce the boot issue from the other thread on my
> > Raspberry Pi. I think it is connected to my incorrect I-cache handling.
> >
> > As you rightly pointed out, we need to call bpf_flush_icache() after
> > copying the instructions to the ro_header, or the CPU can run
> > incorrect instructions.
> >
> > When I move the call to bpf_flush_icache() after
> > bpf_jit_binary_pack_finalize() (which does the copy to ro_header), the
> > boot issue is fixed. Would this change be enough to make this work, or
> > would I need to do more with the data cache as well to catch other
> > edge cases?
>
> AFAICT, bpf_flush_icache() calls flush_icache_range(). Despite its name,
> flush_icache_range() has d-cache maintenance, i-cache maintenance, and context
> synchronization (i.e. it does everything necessary).
>
> As long as you call that with the VAs the code will be executed from, that
> should be sufficient, and you don't need to do any other work.

Thanks for explaining this.
After reading your explanation, I feel this should work.

bpf_jit_binary_pack_finalize() will copy the instructions from
rw_header to ro_header.
After the copy, calling bpf_flush_icache(ro_header, ctx.ro_image + ctx.idx)
will perform the cache maintenance for the VAs in the ro_header, which is
where the code will be executed from.
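
To make the ordering concrete, here is a toy userspace model of it. This is
only a sketch and not kernel code: the struct and all toy_* names are
hypothetical, the "icache" array merely stands in for what the CPU would
fetch, and the two memcpy() calls stand in for the copy done by
bpf_jit_binary_pack_finalize() and for the maintenance done by
bpf_flush_icache() on the RO range.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_WORDS 4

/* Toy model, not kernel code: every name here is hypothetical. */
struct toy_jit {
	uint32_t rw_image[TOY_WORDS]; /* writable staging buffer (stands in for rw_header) */
	uint32_t ro_image[TOY_WORDS]; /* final location (stands in for ro_header) */
	uint32_t icache[TOY_WORDS];   /* what the modelled CPU would fetch */
};

/* Models the copy from the RW buffer to the RO location. */
static void toy_pack_finalize(struct toy_jit *j)
{
	assert(j);
	memcpy(j->ro_image, j->rw_image, sizeof(j->ro_image));
}

/* Models cache maintenance on the RO range: fetches now see ro_image. */
static void toy_flush_icache(struct toy_jit *j)
{
	assert(j);
	memcpy(j->icache, j->ro_image, sizeof(j->icache));
}

/* Returns 1 if the modelled CPU would fetch the freshly JITed words. */
static int toy_fetch_is_fresh(const struct toy_jit *j)
{
	return memcmp(j->icache, j->rw_image, sizeof(j->icache)) == 0;
}

/* Ordering as in v3: flush first, copy second. */
static int toy_flush_then_copy(struct toy_jit *j)
{
	toy_flush_icache(j);
	toy_pack_finalize(j);
	return toy_fetch_is_fresh(j);
}

/* Ordering planned for v4: copy first, flush second. */
static int toy_copy_then_flush(struct toy_jit *j)
{
	toy_pack_finalize(j);
	toy_flush_icache(j);
	return toy_fetch_is_fresh(j);
}
```

Starting from a struct whose ro_image and icache are zeroed (stale) and
whose rw_image holds new words, toy_flush_then_copy() returns 0 because the
modelled fetch still sees the old contents, while toy_copy_then_flush()
returns 1.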

I will send the v4 patchset with this change.

Thanks,
Puranjay
