netdev - Re: [PATCH net-next v2 5/6] rust: Add read_poll

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <ZwPq4pdvcz9ADNTM@boqun-archlinux>
Date: Mon, 7 Oct 2024 07:06:26 -0700
From: Boqun Feng <boqun.feng@...il.com>
To: Andrew Lunn <andrew@...n.ch>
Cc: FUJITA Tomonori <fujita.tomonori@...il.com>, netdev@...r.kernel.org,
	rust-for-linux@...r.kernel.org, hkallweit1@...il.com,
	tmgross@...ch.edu, ojeda@...nel.org, alex.gaynor@...il.com,
	gary@...yguo.net, bjorn3_gh@...tonmail.com, benno.lossin@...ton.me,
	a.hindborg@...sung.com, aliceryhl@...gle.com,
	anna-maria@...utronix.de, frederic@...nel.org, tglx@...utronix.de,
	arnd@...db.de, linux-kernel@...r.kernel.org
Subject: Re: [PATCH net-next v2 5/6] rust: Add read_poll_timeout function

On Mon, Oct 07, 2024 at 03:48:09PM +0200, Andrew Lunn wrote:
> On Mon, Oct 07, 2024 at 05:28:28AM -0700, Boqun Feng wrote:
> > On Sun, Oct 06, 2024 at 04:45:21PM +0200, Andrew Lunn wrote:
> > [...]
> > > > > > +    if sleep {
> > > > > > +        // SAFETY: FFI call.
> > > > > > +        unsafe { bindings::might_sleep() }
> > > > > > +    }
> > > > > 
> > > > > What is actually unsafe about might_sleep()? It is a void foo(void)
> > > > 
> > > > Every extern "C" function is by default unsafe, because C doesn't have
> > > > the concept of safe/unsafe. If you want to avoid unsafe, you could
> > > > introduce a Rust's might_sleep() which calls into
> > > > `bindings::might_sleep()`:
> > > > 
> > > > 	pub fn might_sleep() {
> > > > 	    // SAFETY: ??
> > > > 	    unsafe { bindings::might_sleep() }
> > > > 	}
> > > > 
> > > > however, if you call a might_sleep() in a preemption disabled context
> > > > when CONFIG_DEBUG_ATOMIC_SLEEP=n and PREEMPT=VOLUNTERY, it could means
> > > > an unexpected RCU quiescent state, which results an early RCU grace
> > > > period, and that may mean a use-after-free. So it's not that safe as you
> > > > may expected.
> > > 
> > > If you call might_sleep() in a preemption disabled context you code is
> > > already unsafe, since that is the whole point of it, to find bugs
> > 
> > Well, in Rust, the rule is: any type-checked (compiled successfully)
> > code that only calls safe Rust functions cannot be unsafe. So the fact
> > that calling might_sleep() in a preemption disabled context is unsafe
> > means that something has to be unsafe.
> > 
> > This eventually can turn into a "blaming game" in the design space: we
> > can either design the preemption disable function as unsafe or the
> > might_sleep() function as unsafe. But one of them has to be unsafe
> > function, otherwise we are breaking the safe code guarantee.
> 
> Just keep in mind, it could of been C which put you into atomic
> context before calling into Rust. An interrupt handler would be a good
> example, and i'm sure there are others.
> 

That's why the klint approach is preferred right now. Without klint, and
if we don't want to mark might_sleep() as unsafe, we probably need to
mark the registration of an interrupt handler unsafe, and the safety
requirement would be "making sure the handler doesn't call schedule()".

> > However, this is actually a special case: currently we want to use klint
> > [1] to detect all context mis-matches at compile time. So the above rule
> > extends for kernel: any type-checked *and klint-checked* code that only
> > calls safe Rust functions cannot be unsafe. I.e. we add additional
> > compile time checking for unsafe code. So if might_sleep() has the
> > proper klint annotation, and we actually enable klint for kernel code,
> > then we can make it safe (along with preemption disable functions being
> > safe).
> > 
> > > where you use a sleeping function in atomic context. Depending on why
> > > you are in atomic context, it might appear to work, until it does not
> > > actually work, and bad things happen. So it is not might_sleep() which
> > > is unsafe, it is the Rust code calling it.
> > 
> > The whole point of unsafe functions is that calling it may result into
> > unsafe code, so that's why all extern "C" functions are unsafe, so are
> > might_sleep() (without klint in the picture).
> 
> There is a psychological part to this. might_sleep() is a good debug
> tool, which costs very little in normal builds, but finds logic bugs
> when enabled in debug builds. What we don't want is Rust developers
> not scattering it though their code because it adds unsafe code, and
> the aim is not to have any unsafe code.
> 

Sure, but my point is these need to be put together into a proper
design. For example, spin_lock() is currently exposed into Rust as a
safe API lock(), so the following code is unsafe:

	let g = lock1.lock(); // lock1 is a spinlock
	might_sleep();
	drop(g);

without the klint rule, if we want to mark might_sleep() as safe, then
we need to mark lock() as unsafe, otherwise, it's an unsafe code block
constructed by pure safe functions. However, compared to might_sleep(),
I think we would like keep lock() as safe since it is used more widely.

Regards,
Boqun

> 	Andrew