linux-kernel - Re: [PATCHv2 1/3] uprobe: Add uretprobe syscall to speed up return probe

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAEf4BzZ2RFfz8PNgJ4ENZ0us4uX=DWhYFimXdtWms-VvGXOjgQ@mail.gmail.com>
Date: Wed, 3 Apr 2024 09:58:12 -0700
From: Andrii Nakryiko <andrii.nakryiko@...il.com>
To: Masami Hiramatsu <mhiramat@...nel.org>
Cc: Jiri Olsa <olsajiri@...il.com>, Steven Rostedt <rostedt@...dmis.org>, 
	Oleg Nesterov <oleg@...hat.com>, Alexei Starovoitov <ast@...nel.org>, Daniel Borkmann <daniel@...earbox.net>, 
	Andrii Nakryiko <andrii@...nel.org>, linux-kernel@...r.kernel.org, 
	linux-trace-kernel@...r.kernel.org, bpf@...r.kernel.org, 
	Song Liu <songliubraving@...com>, Yonghong Song <yhs@...com>, 
	John Fastabend <john.fastabend@...il.com>, Peter Zijlstra <peterz@...radead.org>, 
	Thomas Gleixner <tglx@...utronix.de>, "Borislav Petkov (AMD)" <bp@...en8.de>, x86@...nel.org
Subject: Re: [PATCHv2 1/3] uprobe: Add uretprobe syscall to speed up return probe

On Wed, Apr 3, 2024 at 7:09 AM Masami Hiramatsu <mhiramat@...nel.org> wrote:
>
> On Wed, 3 Apr 2024 11:47:41 +0200
> Jiri Olsa <olsajiri@...il.com> wrote:
>
> > On Wed, Apr 03, 2024 at 10:07:08AM +0900, Masami Hiramatsu wrote:
> > > Hi Jiri,
> > >
> > > On Tue,  2 Apr 2024 11:33:00 +0200
> > > Jiri Olsa <jolsa@...nel.org> wrote:
> > >
> > > > Adding uretprobe syscall instead of trap to speed up return probe.
> > >
> > > This is interesting approach. But I doubt we need to add additional
> > > syscall just for this purpose. Can't we use another syscall or ioctl?
> >
> > so the plan is to optimize entry uprobe in a similar way and given
> > the syscall is not a scarce resource I wanted to add another syscall
> > for that one as well
> >
> > tbh I'm not sure sure which syscall or ioctl to reuse for this, it's
> > possible to do that, the trampoline will just have to save one or
> > more additional registers, but adding new syscall seems cleaner to me
>
> Hmm, I think a similar syscall is ptrace? prctl may also be a candidate.

I think both ptrace and prctl are for completely different use cases
and it would be an abuse of existing API to reuse them for uretprobe
tracing. Also, keep in mind, that any extra argument that has to be
passed into this syscall means that we need to complicate and slow
generated assembly code that is injected into user process (to
save/restore registers) and also kernel-side (again, to deal with all
the extra registers that would be stored/restored on stack).

Given syscalls are not some kind of scarce resources, what's the
downside to have a dedicated and simple syscall?

>
> >
> > >
> > > Also, we should run syzkaller on this syscall. And if uretprobe is
> >
> > right, I'll check on syzkaller
> >
> > > set in the user function, what happen if the user function directly
> > > calls this syscall? (maybe it consumes shadow stack?)
> >
> > the process should receive SIGILL if there's no pending uretprobe for
> > the current task, or it will trigger uretprobe if there's one pending
>
> No, that is too aggressive and not safe. Since the syscall is exposed to
> user program, it should return appropriate error code instead of SIGILL.
>

This is the way it is today with uretprobes even through interrupt.
E.g., it could happen that user process is using fibers and is
replacing stack pointer without kernel realizing this, which will
trigger some defensive checks in uretprobe handling code and kernel
will send SIGILL because it can't support such cases. This is
happening today already, and it works fine in practice (except for
applications that manually change stack pointer, too bad, you can't
trace them with uretprobes, unfortunately).

So I think it's absolutely adequate to have this behavior if the user
process is *intentionally* abusing this API.

> >
> > but we could limit the syscall to be executed just from the trampoline,
> > that should prevent all the user space use cases, I'll do that in next
> > version and add more tests for that
>
> Why not limit? :) The uprobe_handle_trampoline() expects it is called
> only from the trampoline, so it is natural to check the caller address.
> (and uprobe should know where is the trampoline)
>
> Since the syscall is always exposed to the user program, it should
> - Do nothing and return an error unless it is properly called.
> - check the prerequisites for operation strictly.
> I concern that new system calls introduce vulnerabilities.
>

As Oleg and Jiri mentioned, this syscall can't harm kernel or other
processes, only the process that is abusing the API. So any extra
checks that would slow down this approach is an unnecessary overhead
and complication that will never be useful in practice.

Also note that sys_uretprobe is a kind of internal and unstable API
and it is explicitly called out that its contract can change at any
time and user space shouldn't rely on it. It's purely for the kernel's
own usage.

So let's please keep it fast and simple.

> Thank you,
>
>
> >
> > thanks,
> > jirka
> >
> >
> > >

[...]