lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aLWRUTFeumwr1--E@arm.com>
Date: Mon, 1 Sep 2025 13:28:01 +0100
From: Catalin Marinas <catalin.marinas@....com>
To: Mark Rutland <mark.rutland@....com>
Cc: Steven Rostedt <rostedt@...dmis.org>,
	Luo Gengkun <luogengkun@...weicloud.com>, mhiramat@...nel.org,
	mathieu.desnoyers@...icios.com, linux-kernel@...r.kernel.org,
	linux-trace-kernel@...r.kernel.org, Will Deacon <will@...nel.org>,
	linux-arm-kernel@...ts.infradead.org,
	Al Viro <viro@...iv.linux.org.uk>
Subject: Re: [PATCH] tracing: Fix tracing_marker may trigger page fault
 during preempt_disable

On Mon, Sep 01, 2025 at 10:56:47AM +0100, Mark Rutland wrote:
> On Sat, Aug 30, 2025 at 11:22:51AM +0100, Catalin Marinas wrote:
> > On Fri, Aug 29, 2025 at 06:13:11PM -0400, Steven Rostedt wrote:
> > > On Fri, 29 Aug 2025 20:53:40 +0100
> > > Catalin Marinas <catalin.marinas@....com> wrote:
> > > valid user address.
> > > > BTW, arm64 also bails out early in do_page_fault() if in_atomic() but I
> > > > suspect that's not the case here.
> > > > 
> > > > Adding Al Viro since since he wrote a large part of uaccess.h.
> > > 
> > > So, __copy_from_user_inatomic() is supposed to be called if
> > > pagefault_disable() has already been called? If this is the case, can we
> > > add more comments to this code? I've been using the inatomic() version this
> > > way in preempt disabled locations since 2016.
> > 
> > This should work as long as in_atomic() returns true as it's checked in
> > the arm64 fault code. With PREEMPT_NONE, however, I don't think this
> > works.
> 
> Sorry, what exactly breaks for the PREEMPT_NONE case?

This code would trigger a warning:

	preempt_disable();
	WARN_ON(!in_atomic());
	preempt_enable();

More importantly, a faulting __copy_from_user_inatomic() between
get/put_cpu() could trigger migration.

> > > I just wanted to figure out why __copy_from_user_inatomic() wasn't atomic.
> > > If anything, it needs to be better documented.
> > 
> > Yeah, I had no idea until I looked at the code. I guess it means it can
> > be safely used if in_atomic() == true (well, making it up, not sure what
> > the intention was).
> 
> I think that was the intention -- it's the caller asserting that they
> know the access won't fault (and hence won't sleep), and that's why
> __copy_to_user_inatomic() and __copy_to_user() only differ by the latter
> calling might_sleep().
> 
> It looks like other inconsistencies have crept in by accident. AFAICT
> the should_fail_usercopy() check in __copy_from_user() was accidentally
> missed from __copy_from_user_inatomic() when it was introduced in
> commit:

I was wondering about that but some code comment for the inatomic
variant states that it's the responsibility of the caller to ensure it
doesn't fault. Not sure one can do other than pinning the page _and_
taking the mm lock. So I agree we should add the fault injection here as
well.

> ... so there's a bunch of scope for cleanup, and we could probably have:
> 
> 	/*
> 	 * TODO: comment saying to only call this directly when you know
> 	 * that the fault handler won't sleep.
> 	 */
> 	static __always_inline __must_check unsigned long
> 	__copy_from_user_inatomic(void *to, const void __user *from, unsigned long n)
> 	{
> 		...
> 	}
> 
> 	static __always_inline __must_check unsigned long
> 	__copy_from_user(void *to, const void __user *from, unsigned long n)
> 	{
> 		might_fault();
> 		return __copy_from_user_inatomic();
> 	}
> 
> ... to avoid the inconsistency.

I think the _inatomic variant should be reserved to arch code that knows
the conditions. Generic code/drivers may not necessarily be aware of
what the arch fault handler does. The _nofault API I think is better
suited in generic code.

-- 
Catalin

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ