[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20201117045305.ejenig33no3xzzi4@maharaja.localdomain>
Date: Mon, 16 Nov 2020 20:53:05 -0800
From: Daniel Xu <dxu@...uu.xyz>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: bpf <bpf@...r.kernel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Song Liu <songliubraving@...com>, andrii.nakryiko@...il.com,
kernel-team@...com
Subject: Re: [PATCH bpf v6 1/2] lib/strncpy_from_user.c: Don't overcopy bytes
after NUL terminator
Hi Linus,
On Mon, Nov 16, 2020 at 02:15:52PM -0800, Linus Torvalds wrote:
> On Mon, Nov 16, 2020 at 1:17 PM Daniel Xu <dxu@...uu.xyz> wrote:
> >
> > Based on on-list discussion and some off-list discussion with Alexei,
> > I'd like to propose the v4-style patch without the `(*out & ~mask)`
> > bit.
>
> So I've verified that at least on x86-64, this doesn't really make
> code generation any worse, and I'm ok with the patch from that
> standpoint.
>
> However, this was not what the discussion actually amended at as far
> as I'm concerned.
>
> I mentioned that if BPF cares about the bytes past the end of the
> string, I want a *BIG COMMENT* about it. Yes, in strncpy_from_user()
> itself, but even more in the place that cares.
Sure, sorry. Will send another version with comments.
> And no, that does not mean bpf_probe_read_user_str(). That function
> clearly doesn't care at all, and doesn't access anything past the end
> of the string. I want a comment in whatever code that accesses past
> the end of the string.
If I'm reading things right, that place is technically in
kernel/bpf/hashtab.c:alloc_htab_elem. In line:
memcpy(l_new->key, key, key_size);
where `key_size` is the width of the hashtab key. The flow looks like:
<bpf prog code>
bpf_map_update_elem()
htab_map_update_elem()
alloc_htab_elem()
It feels a bit weird to me to add a comment about strings in there
because the hash table code is string-agnostic, as mentioned in the
commit message.
> And your ABI point is actively misleading:
>
> > We can't really zero out the rest of the buffer due to ABI issues.
> > The bpf docs state for bpf_probe_read_user_str():
> >
> > > In case the string length is smaller than *size*, the target is not
> > > padded with further NUL bytes.
>
> This comment is actively wrong and misleading.
>
> The code (after the patch) clearly *does* pad a bit with "further NUL
> bytes". It's just that it doesn't pad all the way to the end.
Right, it's a bit ugly and perhaps misleading. But it fixes the real
problem of keying a map with potentially random memory that
strncpy_from_user() might append on. If we pad a deterministic number of
zeros it should be ok.
> Where is the actual buffer zeroing done?
Usually the bpf prog does it. I believe (could be wrong) the verifier
enforces the key is initialized in some form.
For my specific use case, it's done in the bpftrace code:
https://github.com/iovisor/bpftrace/blob/0c88a1a4711a3d13e8c9475f7d08f83a5018fdfd/src/ast/codegen_llvm.cpp#L529-L530
> Because without the buffer zeroing, this whole patch is completely
> pointless. Which is why I want that comment, and I want a pointer to
> where that zeroing is done.
>
> Really. You have two cases:
>
> (a) the buffer isn't zeroed before the strncpy_from_user()
>
> (b) the buffer is guaranteed to be zero before that
>
> and in case (a), this patch is pointless, since the data after the
> string is already undefined.
See above -- I think the verifier enforces that the data is initialized.
> And in case (b), I want to see a comment and a pointer to the code
> that actually does the zeroing.
See above also.
> HOWEVER. Look at bpf_probe_read_user_str_common(), and notice how it
> ALREADY does the zeroing of the buffer past the end, it's just that it
> only does it in the error case.
I don't think the error case is very relevant here. I normally wouldn't
make any assumptions about the state of a buffer after a failed function
call.
> Why do you send this patch, instead of
>
> (a) get rid of the pointless pre-zeroing
>
> (b) change bpf_probe_read_user_str_common() to do
>
> int ret;
> u32 offset;
>
> ret = strncpy_from_user_nofault(dst, unsafe_ptr, size);
> offset = ret < 0 ? 0 : ret;
> memset(dst+offset, 0, size-offset);
> return ret;
>
> which seems to be much simpler anyway. The comment you quote about
> "target is not padded with further NUL bytes" is clearly wrong anyway,
> since that error case *does* pad the target with NUL bytes, and always
> has.
I think on the bpf side there's the same concern about performance.
You can't really dynamically size buffers in bpf land so users usually
use a larger buffer than necessary, sometimes on the order of KBs. So if
we unnecessarily zero out the rest of the buffer it could cause perf
regressions.
[...]
Thanks,
Daniel
Powered by blists - more mailing lists