linux-kernel - Re: [PATCH 1/2 v2] kprobe: Do not use uaccess functions to access kernel memory that can fault

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <20190223135126.1722237ffda9c50e66fff135@kernel.org>
Date:   Sat, 23 Feb 2019 13:51:26 +0900
From:   Masami Hiramatsu <mhiramat@...nel.org>
To:     Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        David Miller <davem@...emloft.net>,
        Masami Hiramatsu <mhiramat@...nel.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        Andy Lutomirski <luto@...capital.net>,
        Linux List Kernel Mailing <linux-kernel@...r.kernel.org>,
        Ingo Molnar <mingo@...nel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        stable <stable@...r.kernel.org>,
        Changbin Du <changbin.du@...il.com>,
        Jann Horn <jannh@...gle.com>,
        Kees Cook <keescook@...omium.org>,
        Andrew Lutomirski <luto@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        Netdev <netdev@...r.kernel.org>, bpf@...r.kernel.org
Subject: Re: [PATCH 1/2 v2] kprobe: Do not use uaccess functions to access
 kernel memory that can fault

On Fri, 22 Feb 2019 15:56:20 -0800
Alexei Starovoitov <alexei.starovoitov@...il.com> wrote:

> On Fri, Feb 22, 2019 at 03:16:35PM -0800, Linus Torvalds wrote:
> > 
> > So a kernel pointer value of 0x12345678 could be a value kernel
> > pointer pointing to some random kmalloc'ed kernel memory, and a user
> > pointer value of 0x12345678 could be a valid _user_ pointer pointing
> > to some user mapping.
> > 
> > See?
> > 
> > If you access a user pointer, you need to use a user accessor function
> > (eg "get_user()"), while if you access a kernel pointer you need to
> > just dereference it directly (unless you can't trust it, in which case
> > you need to use a _different_ accessor function).
> 
> that was clear already.
> Reading 0x12345678 via probe_kernel_read can return valid value
> and via get_user() can return another valid value on _some_ architectures.
> 
> > The fact that user and kernel pointers happen to be distinct on x86-64
> > (right now) is just a random implementation detail.
> 
> yes and my point that people already rely on this implementation detail.
> Say we implement 
> int bpf_probe_read(void *val, void *unsafe_ptr)
> {
>   if (probe_kernel_read(val, unsafe_ptr) == OK) {
>      return 0;
>   } else (get_user(val, unsafe_ptr) == OK) {
>      return 0;
>   } else {
>      *val = 0;
>      return -EFAULT;
>   }
> }

Note that we can not use get_user() form kprobe handler. If you use it,
you have to prepare fault_handler() and make bpf itself can be aborted.
So, maybe you can use probe_user_read().

Hmm, however, it still doesn't work correctly on "some" architecture,
since whether a pointer (address) points user-space or kernel-space
depends on the context. In kprobe/bpf, the context means where you
put the probe and which pointer you record.

I think only "__user" tag tells us which one is user-space. But 
unfortunately, that "__user" tag is only for compiler or checker, not
for runtime binary. Such useful attribute goes away when we execute it.

So, even if we introduce "ustring", ftrace/perf users has to decide to use
it by themselves. As far as I know, DWARF(debuginfo) also doesn't have
that attribute. So perf-probe can not help it from debuginfo.
(Maybe if we introduce C parser, it might be detected...)

> It will preserve existing bpf_probe_read() behavior on x86.
> If x86 implementation changes tomorrow then progs that read user
> addresses may start failing randomly because first probe_kernel_read()
> will be returning random values from kernel memory and that's no good,
> but at least we won't be breaking them today, so we have time to
> introduce bpf_user_read and bpf_kernel_read and folks have time to adopt them.

I see. I think bpf also has to introduce new bpf_probe_read_user() and
keep bpf_probe_read() for kernel dataa only.

> Imo that's much better than making current bpf_probe_read() fail
> on user addresses today and not providing a non disruptive path forward.

Agreed.

Thank you,


-- 
Masami Hiramatsu <mhiramat@...nel.org>