Message-ID: <20250819135008.5f1ba00e@gandalf.local.home>
Date: Tue, 19 Aug 2025 13:50:08 -0400
From: Steven Rostedt <rostedt@...dmis.org>
To: Luo Gengkun <luogengkun@...weicloud.com>
Cc: mhiramat@...nel.org, mathieu.desnoyers@...icios.com,
 linux-kernel@...r.kernel.org, linux-trace-kernel@...r.kernel.org
Subject: Re: [PATCH] tracing: Fix tracing_marker may trigger page fault
 during preempt_disable

On Tue, 19 Aug 2025 10:51:52 +0000
Luo Gengkun <luogengkun@...weicloud.com> wrote:

> Both tracing_mark_write and tracing_mark_raw_write call
> __copy_from_user_inatomic during preempt_disable. But in some cases,
> __copy_from_user_inatomic may trigger a page fault and subtly end up
> calling schedule(). And if the task is then migrated to another CPU, the
> following warning will

Wait! What?

__copy_from_user_inatomic() is allowed to be called from atomic context.
Hence the name it has. How the hell can it sleep? If it does, it's totally
broken!

Now, I'm not against using nofault() as it is better named, but I want to
know why you are suggesting this change. Did you actually trigger a bug here?

> be triggered:
>         if (RB_WARN_ON(cpu_buffer,
>                        !local_read(&cpu_buffer->committing)))
> 
> An example can illustrate this issue:
> 
> process flow						CPU
> ---------------------------------------------------------------------
> 
> tracing_mark_raw_write():				cpu:0
>    ...
>    ring_buffer_lock_reserve():				cpu:0
>       ...
>       cpu = raw_smp_processor_id()			cpu:0
>       cpu_buffer = buffer->buffers[cpu]			cpu:0
>       ...
>    ...
>    __copy_from_user_inatomic():				cpu:0
>       ...
>       # page fault
>       do_mem_abort():					cpu:0

Sounds to me that arm64 __copy_from_user_inatomic() may be broken.

>          ...
>          # Call schedule
>          schedule()					cpu:0
> 	 ...
>    # the task schedule to cpu1
>    __buffer_unlock_commit():				cpu:1
>       ...
>       ring_buffer_unlock_commit():			cpu:1
> 	 ...
> 	 cpu = raw_smp_processor_id()			cpu:1
> 	 cpu_buffer = buffer->buffers[cpu]		cpu:1
> 
> As shown above, the process reads the CPU id twice and the two return
> values are not the same.
> 
> To fix this problem, use copy_from_user_nofault instead of
> __copy_from_user_inatomic, as the former performs 'access_ok' before
> copying.
> 
> Fixes: 656c7f0d2d2b ("tracing: Replace kmap with copy_from_user() in trace_marker writing")

The above commit was introduced in 2016. copy_from_user_nofault() was
introduced in 2020. I don't think this would be the fix for that kernel.

So no, I'm not taking this patch. If you see __copy_from_user_inatomic()
sleeping, its users are not the issue. That function is.

-- Steve



> Signed-off-by: Luo Gengkun <luogengkun@...weicloud.com>
