[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHk-=wi62LDc5B3DOr5pyVtOUOuLkLzHvmZQApH9q=raqaGkUg@mail.gmail.com>
Date: Thu, 21 Apr 2022 15:30:26 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Song Liu <songliubraving@...com>
Cc: Song Liu <song@...nel.org>,
Alexei Starovoitov <alexei.starovoitov@...il.com>,
bpf <bpf@...r.kernel.org>, Linux-MM <linux-mm@...ck.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Kernel Team <Kernel-team@...com>,
Andrew Morton <akpm@...ux-foundation.org>,
"Edgecombe, Rick P" <rick.p.edgecombe@...el.com>,
Christoph Hellwig <hch@...radead.org>,
Andrii Nakryiko <andrii@...nel.org>
Subject: Re: [PATCH bpf] bpf: invalidate unused part of bpf_prog_pack
On Thu, Apr 21, 2022 at 2:53 PM Song Liu <songliubraving@...com> wrote:
>
> However, we cannot really use the same function at free time. The
> huge page is RO+X at free time, but we are only zeroing out a chunk
> of it. So regular memset/memcpy won’t work. Instead, we will need
> something like bpf_arch_text_copy().
I actually think bpf_arch_text_copy() is another horribly badly done thing.
It seems only implemented on x86 (I'm not sure how anything else is
supposed to work, I didn't go look), and there it is horribly badly
done, using __text_poke() that does all these magical things just to
make it atomic wrt concurrent code execution.
None of which is *AT*ALL* relevant for this case, since concurrent
code execution simply isn't a thing (and if it were, you would already
have lost).
And if that wasn't pointless enough, it does all that magic "map the
page writably at a different virtual address using poking_addr in
poking_mm" and a different address space entirely.
All of that is required for REAL KERNEL CODE.
But the thing is, for bpf_prog_pack, all of that is just completely
pointless and stupid complexity.
We already *have* the other non-executable address that is writable:
it's the actual pages that got vmalloc'ed. Just use vmalloc_to_page()
and it's RIGHT THERE.
At which point you just use the same bpf_jit_fill_hole() function, and
you're done.
In other words, all of this seems excessively stupidly done, for no
good reason. It's only making it much too complicated, and just not
doing the right thing at all.
Linus
Powered by blists - more mailing lists