[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZwRq7PzAPzCAIBVv@boqun-archlinux>
Date: Mon, 7 Oct 2024 16:12:44 -0700
From: Boqun Feng <boqun.feng@...il.com>
To: Andrew Lunn <andrew@...n.ch>
Cc: Alice Ryhl <aliceryhl@...gle.com>,
	FUJITA Tomonori <fujita.tomonori@...il.com>, netdev@...r.kernel.org,
	rust-for-linux@...r.kernel.org, hkallweit1@...il.com,
	tmgross@...ch.edu, ojeda@...nel.org, alex.gaynor@...il.com,
	gary@...yguo.net, bjorn3_gh@...tonmail.com, benno.lossin@...ton.me,
	a.hindborg@...sung.com, anna-maria@...utronix.de,
	frederic@...nel.org, tglx@...utronix.de, arnd@...db.de,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH net-next v2 5/6] rust: Add read_poll_timeout function
On Mon, Oct 07, 2024 at 07:13:40PM +0200, Andrew Lunn wrote:
> > > pub fn might_sleep() {
> > >     // SAFETY: Always safe to call.
> > >     unsafe { bindings::might_sleep() };
> > 
> > It's not always safe to call, because might_sleep() has a
> > might_resched() and in preempt=voluntary kernel, that's a
> > cond_resched(), which may eventually call __schedule() and report a
> > quiescent state of RCU. This could means an unexpected early grace
> > period, and that means a potential use-afer-free.
> 
> How does C handle this?
> 
> I'm not an RCU person...
> 
> But if you have called might_sleep() you are about to do something
> which could sleep. If it does sleep, the scheduler is going to be
> called, the grace period has ended, and RCU is going to do its
> thing. If that results in a use-after-free, your code is
> broken. might_sleep makes no difference here, the code is still
> broken, it just happens to light the fuse for the explosion a bit
> earlier.
> 
Because of the might_resched() in might_sleep(), it will report the
quiescent state of the current CPU, and RCU will pass a grace period if
all CPUs have passed a quiescent state. So for example if someone writes
the following:
    <reader>			<updater>
    rcu_read_lock();
    p = rcu_dereference(gp);
    might_sleep():
      might_resched():
				todo = gp;
				rcu_assign_pointer(gp, NULL);
				synchronize_rcu();
        rcu_all_qs(); // report a quiescent state inside RCU read-side
	              // critical section, which may make a grace period
		      // pass even there is an active RCU reader
				kfree(todo);
    a = READ_ONCE(p->a); // UAF
    rcu_read_unlock();
We probably call the reader side code a "wrong annotation", however,
it's still unsafe code because of the UAF. Also you seems to assume that
might_sleep() is always attached to a sleepable function, which is not
an invalid assumption, but we couldn't use it for reasoning the
safe/unsafe property of Rust functions unless we can encode this in the
type system. For Rust code, without klint rule, might_sleep() needs to
be unsafe. So we have two options for might_sleep().
* Since we rely on klint for atomic context detection, we can mark the
  trivial wrapper (as what Alice presented in the other email) as safe,
  but we need to begin to add klint annotation for that function, unless
  Gary finds a smart way to auto-annotate functions.
* Instead of might_sleep(), we provide the wrapper of __might_sleep(),
  since it doesn't have might_resched() in it, it should be safe. And
  all we care about here is the debugging rather than voluntary context
  switch. (Besides I think preempt=volunatry is eventually going to be
  gone because of PREEMPT_AUTO [1], if that happens I think the
  might_resched() might be dropped entirely).
Does this make sense?
[1]: https://lore.kernel.org/lkml/20240528003521.979836-1-ankur.a.arora@oracle.com/
Regards,
Boqun
> Or, i'm missing something, not being an RCU person.
> 
> 	Andrew
Powered by blists - more mailing lists
 
