lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZmoLi57aT4EB_97W@pollux>
Date: Wed, 12 Jun 2024 22:56:43 +0200
From: Danilo Krummrich <dakr@...hat.com>
To: Greg KH <gregkh@...uxfoundation.org>
Cc: Boqun Feng <boqun.feng@...il.com>, rafael@...nel.org, mcgrof@...nel.org,
	russell.h.weight@...el.com, ojeda@...nel.org, alex.gaynor@...il.com,
	wedsonaf@...il.com, gary@...yguo.net, bjorn3_gh@...tonmail.com,
	benno.lossin@...ton.me, a.hindborg@...sung.com,
	aliceryhl@...gle.com, airlied@...il.com, fujita.tomonori@...il.com,
	pstanner@...hat.com, ajanulgu@...hat.com, lyude@...hat.com,
	rust-for-linux@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 1/2] rust: add abstraction for struct device

On Wed, Jun 12, 2024 at 07:13:31PM +0200, Greg KH wrote:
> On Wed, Jun 12, 2024 at 06:18:38PM +0200, Danilo Krummrich wrote:
> > On Wed, Jun 12, 2024 at 05:50:42PM +0200, Greg KH wrote:
> > > On Wed, Jun 12, 2024 at 05:35:21PM +0200, Danilo Krummrich wrote:
> > > > On Wed, Jun 12, 2024 at 05:02:52PM +0200, Greg KH wrote:
> > > > > On Wed, Jun 12, 2024 at 04:51:42PM +0200, Danilo Krummrich wrote:
> > > > > > On 6/11/24 18:13, Boqun Feng wrote:
> > > > > > > On Tue, Jun 11, 2024 at 03:29:22PM +0200, Greg KH wrote:
> > > > > > > > On Tue, Jun 11, 2024 at 03:21:31PM +0200, Danilo Krummrich wrote:
> > > > > > > > > ...hence, I agree we should indeed add to the #Invariants and #Safety section
> > > > > > > > > that `->release` must be callable  from any thread.
> > > > > > > > > 
> > > > > > > > > However, this is just theory, do we actually have cases where `device::release`
> > > > > > > 
> > > > > > > @Danilo, right, it's only theorical, but it's good to call it out since
> > > > > > > it's the requirement for a safe Rust abstraction.
> > > > > > 
> > > > > > Similar to my previous reply, if we want to call this out as safety requirement
> > > > > > in `Device::from_raw`, we probably want to add it to the documentation of the C
> > > > > > `struct device`, such that we can argue that this is an invariant of C's
> > > > > > `struct device`.
> > > > > > 
> > > > > > Otherwise we'd have to write something like:
> > > > > > 
> > > > > > "It must also be ensured that the `->release` function of a `struct device` can
> > > > > > be called from any non-atomic context. While not being officially documented this
> > > > > > is guaranteed by the invariant of `struct device`."
> > > > > 
> > > > > In the 20+ years of the driver model being part of the kernel, I don't
> > > > > think this has come up yet, so maybe you can call the release function
> > > > > in irq context.  I don't know, I was just guessing :)
> > > > 
> > > > Ah, I see. I thought you know and it's defined, but just not documented.
> > > > 
> > > > This means it's simply undefined what we expect to happen when the last
> > > > reference of a device is dropped from atomic context.
> > > > 
> > > > Now, I understand (and would even expect) that practically this has never been
> > > > an issue. You'd need two circumstances, release() actually does something that
> > > > is not allowed in atomic context plus the last device reference is dropped from
> > > > atomic context - rather unlikely.
> > > > 
> > > > > 
> > > > > So let's not go adding constraints that we just do not have please.
> > > > > Same goes for the C code, so the rust code is no different here.
> > > > 
> > > > I agree we shouldn't add random constraints, but for writing safe code we also
> > > > have to rely on defined behavior.
> > > 
> > > As the rust code is relying on C code that could change at any point in
> > > time, how can that ever be "safe"?  :)
> > 
> > That's the same as with any other API. If the logic of an API is changed the
> > users (e.g a Rust abstraction) of the API have to be adjusted.
> 
> Agreed, just like any other in-kernel code, so there shouldn't be
> anything special here.
> 
> > > Sorry, this type of definition annoys me.
> > > 
> > > > I see two options:
> > > > 
> > > > (1) We globally (for struct device) define from which context release() is
> > > >     allowed to be called.
> > > 
> > > If you want, feel free to do that work please.  And then find out how to
> > > enforce it in the driver core.
> > 
> > If we *would* define non-atomic context only, we could enforce it with
> > might_sleep() for instance.
> 
> might_sleep() isn't always correct from what I remember.
> 
> > If we *would* define any context, there is nothing to enforce, but we'd need to
> > validate that no implementer of release() voids that.
> 
> Trying to validate that might be hard, again, I don't think it's worth
> it.
> 
> > The former is a constaint you don't want to add, the latter a lot of work. What
> > if we at least define that implementers of release() must *minimally* make sure
> > that it can be call from any non-atomic context.
> > 
> > That'd be something we can rely on in Rust.
> 
> Determining if you are, or are not, in atomic context is almost
> impossible in C, I don't know how you are going to do that at build time
> in Rust.  Good luck!

We can't always enforce things, but we can still follow rules even if we can't
enforce them with tools or language features.

In particular, nothing prevents C code to define from which context things are
allowed to be called. When I was working on drm_gpuvm [1] this was one of the
major aspects I had to get right and documented; keyword dma_fence signalling
critical sections.

[1] https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/drm_gpuvm.c

> 
> > Oh, I fully agree with that. Let me try to explain a bit what this is about:
> > 
> > In Rust we have the `Send` and `Sync` marker traits. If a type (e.g. `Device`)
> > implements `Send` it means that it's safe to pass an instance of this type
> > between threads. Which is clearly something we want to do with a `Device`.
> > 
> > If I don't implement `Sync` for `Device` the compiler will prevent me from
> > sending it between threads, e.g. by disallowing me to put an instance of
> > `Device` into another data structure that is potentially passed between threads.
> > 
> > If I implement `Sync` I have to add a safety comment on why it is safe to pass
> > `Device` between threads. And here we have what Boqun pointed out: `Device` can
> > only be passed between threads when we're allowed to drop the last reference
> > from any thread. In the case of the kernel this can be any non-atomic context,
> > any context or any other subset. But I have to write something here that is
> > a defined rule and can be relied on.
> 
> You really have two things here, a matrix of:
> 	can transfer between threads
> 	can call in irq context
> that are independent and not related to each other at all.
> 
> Looks like Rust has built in support for the first.  And nothing for the
> second as that is a very kernel-specific thing.

The language documentation defines `Send` as "can be transferred between
threads", likely because it's written from a userspace perspective. But in
the kernel context it actually means can be transferred between any context,
thread, IRQ, etc.

If this isn't true, then we have to add a comment what is allowed (e.g. any
non-atomic context) and what's not allowed.

> 
> So let's not confuse the two please.  `Send` and `Sync` should be fine
> for a device pointer to be passed around, as long as the reference is
> incremented, as that's what all of the kernel C code does today.  Let's
> not worry about irq context at all, that's independent and can be
> handled at a later time, if at all, with a different "marking" as it's
> independent of the current two things.

That'd be great, but as mentioned above, we only have `Send`, but nothing like
`SendIrq`, hence `Send` really means any context.

Given your proposal, to just say it's fine to pass between (actual) threads and
ignore IRQ context for now, we have to implement `Send`, but document that IRQ
context is not covered.

We can either do that in the Rust abstraction as safety requirement, or we can,
as proposed previously, add a comment to the C `struct device` documentation
that implementers of release() must *at least* make sure that it can be called
from any non-atomic context. We can then refer to that.

- Danilo

> 
> thanks,
> 
> greg k-h
> 


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ