lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date: Mon, 11 Mar 2024 20:49:17 +0100
From: Thomas Gleixner <tglx@...utronix.de>
To: André Rösti <an.roesti@...il.com>,
 linux-kernel@...r.kernel.org
Cc: peterz@...radead.org, luto@...nel.org
Subject: Re: [PATCH] Respect system call number changes by sys_enter probes

André!

On Sat, Mar 09 2024 at 05:53, André Rösti wrote:

Nice finding!

Just a few nitpicks:

  The subject line lacks a subsystem prefix. In this case is should be:

  Subject: [PATCH] entry: Respect ......

Documentation/process/ has quite some information about change logs
along with the tip tree specific one:

  https://www.kernel.org/doc/html/latest/process/maintainer-tip.html

> When a probe  is registered at the `trace_sys_enter` tracepoint, and

s/`trace_sys_enter`/trace_sys_enter()/ 

> that probe changes the system call number, the old system call still
> gets executed on x86_64 (and potentially other architectures). This

s/on x86_64 (and potentially other architectures//

Why? It's clear that this code is only executed on architectures which
utilize the generic entry code.

> is inconsistent with how ARM64 (and potentially other architectures)
> handles this, and inconsistent with the tracepoint semantics prior to
> change b6ec41346103 (core/entry: Report syscall correctly for trace
> and audit).

  This worked correctly until commit b6ec41346103 ("core/entry: Report
  syscall correctly for trace and audit") which removed the re-evaluation
  of the syscall number after the trace point.

The ARM64 info is nice to have, but not really relevant because the real
issue is the commit which changed the semantics of the entry code.

> With this patch, the semantics are restored to be the same as before

Please search for "This patch" in the process documentation

> the aforementioned change (and thus made consistent with ARM64). The
> change adds one line to re-read the system call number register into
> the `syscall` variable. By reading twice, the benefits of the
> aforementioned change b6ec41346103 are kept.

My version for the above would be:

  Restore the original semantics by re-evaluating the system call number
  after trace_sys_enter().

That's the gist of the change, right?

It does not matter where the number comes from and which variable is it
written to. That's what is in the patch itself. It's neither interesting
that reading twice keeps some benefit because again, that's what can be
seen from the actual change. You'd need to mention it if your fix would
change it.

> There should be no performance impact if no sys_enter tracepoints are
> registered, since re-reading the system call number from `regs` is
> only done conditonally if the tracepoint is in use. If a probe is
> registered, the performance impact should still be minimal, since the
> additional call to `syscall_get_nr` amounts to only an inlined read
> of `regs->orig_ax` (on x86_64).

That's pretty x86 specific. What about something like this:

  The performance impact of this re-evaluation is minimal because it is
  only relevant when a trace point is active and compared to the actual
  trace point overhead the read from a cache hot variable is
  neglectible.

This lacks a:

Fixes: commit b6ec41346103 ("core/entry: Report syscall correctly for trace and audit")

tag.

> Signed-off-by: André Rösti <an.roesti@...il.com>
> ---
> @Thomas Gleixner: You may have received this e-mail twice. My apologies!
> This is my first attempt to contribute, and I made a mistake using git
> send-email. Thanks for your work maintaining this and sorry again.

Don't worry. The first time submission is a learning experience. I
lively remember the healthy lesson I got a few decades ago and I still
get them today when I fail to see or describe something.

That said, I'm impressed by the detective work you did to put an
explanatory change log together on your first submission. Keep up the
good work!

I'm looking forward to the V2. Feel free to use my suggestions above as
guideline and rephrase it in your own words.

> ---
>  kernel/entry/common.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/entry/common.c b/kernel/entry/common.c
> index 88cb3c88aaa5..89b14ba9ed14 100644
> --- a/kernel/entry/common.c
> +++ b/kernel/entry/common.c
> @@ -57,8 +57,11 @@ long syscall_trace_enter(struct pt_regs *regs, long syscall,
>  	/* Either of the above might have changed the syscall number */
>  	syscall = syscall_get_nr(current, regs);
>  
> -	if (unlikely(work & SYSCALL_WORK_SYSCALL_TRACEPOINT))
> +	if (unlikely(work & SYSCALL_WORK_SYSCALL_TRACEPOINT)) {
>  		trace_sys_enter(regs, syscall);
> +		/* Tracers may have changed system call number as well */

I'd rather say 'Probes or BPF hooks in the tracepoint ...'

Because a trace point per se does not change anything.

> +		syscall = syscall_get_nr(current, regs);
> +	}
>  
>  	syscall_enter_audit(regs, syscall);

Thanks,

        tglx

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ