[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20201113205746.htvdzudtqrw6h7oa@ast-mbp>
Date: Fri, 13 Nov 2020 12:57:46 -0800
From: Alexei Starovoitov <alexei.starovoitov@...il.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Daniel Xu <dxu@...uu.xyz>, bpf <bpf@...r.kernel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>, kernel-team@...com
Subject: Re: [PATCH bpf v5 1/2] lib/strncpy_from_user.c: Don't overcopy bytes
after NUL terminator
On Fri, Nov 13, 2020 at 12:10:57PM -0800, Linus Torvalds wrote:
> On Fri, Nov 13, 2020 at 11:17 AM Alexei Starovoitov
> <alexei.starovoitov@...il.com> wrote:
> >
> > The v4 approach preserves performance. It wasn't switching to byte_at_a_time:
>
> That v4 looks better, but still pointless.
>
> But it might be acceptable, with that final
>
> *out = (*out & ~mask) | (c & mask);
>
> just replaced with
>
> *out = c & mask;
>
> which still writes past the end, but now it only writes zeroes.
>
> But the only reason for that to be done is if you have exposed the
> destination buffer to another thread before (and you zeroed it before
> exposing it), and you want to make sure that any concurrent reader can
> never see anything past the end of the string.
>
> Again - the *only* case that could possibly matter is when you
> pre-zeroed the buffer, because if you didn't, then a concurrent reader
> would see random garbage *anyway*, particularly since there is no SMP
> memory ordering imposed with the strncpy. So nothing but "pre-zeroed"
> makes any possible sense, which means that the whole "(*out & ~mask)"
> in that v4 patch is completely and utterly meaningless. There's
> absolutely zero reason to try to preserve any old data.
>
> In other words, you have two cases:
>
> (a) no threaded and unlocked accesses to the resulting string
>
> (b) you _do_ have concurrent threaded accesses to the string and no
> locking (really? That's seriously questionable),
>
> If you have case (a), then the only correct thing to do is to
> explicitly pad afterwards. It's optimal, and doesn't make any
> assumptions about implementation of strncpy_from_user().
(a) is the only case.
There is no concurrent access to dest.
Theoretically it's possible for two bpf progs on different cpus
to write into the same dest, but it's completely racy and buggy
for other reasons.
I've looked through most of the kernel code where strncpy_from_user()
is used and couldn't find a case where dest is not used as a string.
In particular I was worried about the code like:
struct foo {
...
char name[64];
...
} *f;
f = kcalloc();
...
ret = strncpy_from_user(f->name, user_addr, sizeof(f->name));
if (ret <= 0)
goto ...;
f->name[ret] = 0;
and then the whole *f would be passed to a hash function
that will go over the sizeof(struct foo) assuming
that strncpy_from_user() didn't add the garbage.
The extra zeroing:
f->name[ret] = 0;
didn't save it from the garbage in the "name".
I can convince myself that your new definition of strncpy_from_user:
"
You told it that the destination buffer was some amount of bytes, and
strncpy_from_user() will use up to that maximum number of bytes.
That's the only guarantee you have - it won't write _past_ the buffer
you gave it.
"
makes sense from the performance point of view.
But I think if glibc's strncpy() did something like this it would
probably caused a lot of pain for user space.
The hash element example above is typical bpf usage.
One real bpf prog was converted as a test:
tools/testing/selftests/bpf/progs/pyperf.h
There it's populating:
typedef struct {
char name[FUNCTION_NAME_LEN];
char file[FILE_NAME_LEN];
} Symbol;
with two calls:
bpf_probe_read_user_str(&symbol->name, sizeof(symbol->name),
frame->co_name + pidData->offsets.String_data);
These are the function name inside python interpreter.
The 'len' result is ignored by bpf prog.
And later the whole 'Symbol' is passed into a hash map.
The kernel code doesn't seem to be doing anything like this, so
it's fine to adopt your definition of strncpy_from_user().
The garbage copying part can be cleared on bpf side.
It's easy enough to do in bpf_probe_read_user_str().
We can just zero up to sizeof(long) bytes there.
Powered by blists - more mailing lists