Message-ID: <20250501154503.2308f177@gandalf.local.home>
Date: Thu, 1 May 2025 15:45:03 -0400
From: Steven Rostedt <rostedt@...dmis.org>
To: Paul Cacheux via B4 Relay <devnull+paulcacheux.gmail.com@...nel.org>
Cc: paulcacheux@...il.com, Masami Hiramatsu <mhiramat@...nel.org>, Mathieu
 Desnoyers <mathieu.desnoyers@...icios.com>, linux-kernel@...r.kernel.org,
 linux-trace-kernel@...r.kernel.org
Subject: Re: [PATCH] tracing: fix race when creating trace probe log error
 message

On Tue, 22 Apr 2025 20:33:13 +0200
Paul Cacheux via B4 Relay <devnull+paulcacheux.gmail.com@...nel.org> wrote:

> From: Paul Cacheux <paulcacheux@...il.com>

Sorry for the late reply, I just noticed this patch.

> 
> When creating a trace probe, a global variable is modified, and this
> data is used when an error is raised and the error message is generated.
> 
> Modification of this global variable is done without any lock and
> multiple trace operations will race, causing some potential issues
> when generating the error.
> 
> This commit moves away from the global variable and passes the
> error context as a regular function argument.
> 
> Fixes: ab105a4fb894 ("tracing: Use tracing error_log with probe events")
> 
> Signed-off-by: Paul Cacheux <paulcacheux@...il.com>
> ---
> As reported in [1] a race exists in the shared trace probe log
> used to build error messages. This can cause kernel crashes
> when building the actual error message, but the race happens
> even for non-error tracefs uses, it's just not visible.
> 
> Reproducer first reported that is still crashing:
> 
>   # 'p4' is an invalid command, which makes the kernel run into trace_probe_log_err()
>   cd /sys/kernel/debug/tracing
>   while true; do
>     echo 'p4:myprobe1 do_sys_openat2 dfd=%ax filename=%dx flags=%cx mode=+4($stack)' >> kprobe_events &
>     echo 'p4:myprobe2 do_sys_openat2' >> kprobe_events &
>     echo 'p4:myprobe3 do_sys_openat2 dfd=%ax filename=%dx' >> kprobe_events &
>   done;
> 
> The original email suggested using a mutex or allocating the
> trace_probe_log on the stack. The mutex can cause performance
> issues and requires high confidence in the correctness of the
> current trace_probe_log_clear calls. This patch implements
> the stack solution instead and passes a pointer to the functions
> that use it.
> 
> [1] https://lore.kernel.org/all/20221121081103.3070449-1-zhengyejian1@huawei.com/T/

Honestly, I don't like either approach.

What could be done is wrap the internals of the functions in a mutex so they
are not re-entrant (using guard(mutex)). If two errors are being logged at
the same time, just let the log get corrupted. There should never be two
additions at the same time, and if the admin is doing that then they deserve
what they get.

I don't care if the error log gets garbage if there's multiple accesses at
the same time. The fix should only prevent it from crashing.
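
Something along these lines, as a rough userspace sketch only. It uses
pthread_mutex_lock()/pthread_mutex_unlock() where the kernel code would
use guard(mutex), and the plog_* names and the struct layout are made up
for illustration, they are not the actual trace_probe_log code. It also
assumes the caller keeps argv valid until plog_clear() is called.

```c
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the shared probe log (not the kernel struct). */
struct probe_log {
	const char *subsystem;
	const char **argv;	/* assumed to stay valid until plog_clear() */
	int argc;
	int index;
};

static struct probe_log plog;
static pthread_mutex_t plog_lock = PTHREAD_MUTEX_INITIALIZER;

/* Each helper takes the lock only around its own body. */
void plog_init(const char *subsystem, int argc, const char **argv)
{
	pthread_mutex_lock(&plog_lock);
	plog.subsystem = subsystem;
	plog.argc = argc;
	plog.argv = argv;
	plog.index = 0;
	pthread_mutex_unlock(&plog_lock);
}

void plog_clear(void)
{
	pthread_mutex_lock(&plog_lock);
	memset(&plog, 0, sizeof(plog));
	pthread_mutex_unlock(&plog_lock);
}

void plog_set_index(int index)
{
	pthread_mutex_lock(&plog_lock);
	plog.index = index;
	pthread_mutex_unlock(&plog_lock);
}

/*
 * Join the logged arguments into buf, separated by spaces.
 * Returns the length written, or -1 if there is nothing to report,
 * e.g. because a concurrent writer already cleared the log.
 */
int plog_err(char *buf, size_t size)
{
	size_t len = 0;
	int i;

	pthread_mutex_lock(&plog_lock);
	if (!plog.argv || plog.argc <= 0 || size == 0) {
		pthread_mutex_unlock(&plog_lock);
		return -1;
	}
	buf[0] = '\0';
	for (i = 0; i < plog.argc; i++) {
		int n = snprintf(buf + len, size - len, "%s%s",
				 i ? " " : "", plog.argv[i]);
		if (n < 0)
			break;
		if ((size_t)n >= size - len) {
			len = size - 1;	/* output was truncated */
			break;
		}
		len += (size_t)n;
	}
	pthread_mutex_unlock(&plog_lock);
	return (int)len;
}
```

With this, two writers can still overwrite each other's state between
calls, so the message may describe the wrong command. But the reader
never walks a half-cleared log, which is the only thing I care about here.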

-- Steve

