[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CANeycqpYXOdLbUmTce_o6FhSC1rg8TZqe5MkLpsF8E4Dr9352A@mail.gmail.com>
Date: Sun, 5 Mar 2023 03:52:08 -0300
From: Wedson Almeida Filho <wedsonaf@...il.com>
To: Asahi Lina <lina@...hilina.net>
Cc: Robin Murphy <robin.murphy@....com>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Miguel Ojeda <ojeda@...nel.org>,
Alex Gaynor <alex.gaynor@...il.com>,
Boqun Feng <boqun.feng@...il.com>, Gary Guo <gary@...yguo.net>,
Björn Roy Baron <bjorn3_gh@...tonmail.com>,
Will Deacon <will@...nel.org>, Joerg Roedel <joro@...tes.org>,
Hector Martin <marcan@...can.st>,
Sven Peter <sven@...npeter.dev>, Arnd Bergmann <arnd@...db.de>,
"Rafael J. Wysocki" <rafael@...nel.org>,
Alyssa Rosenzweig <alyssa@...enzweig.io>,
Neal Gompa <neal@...pa.dev>, rust-for-linux@...r.kernel.org,
linux-kernel@...r.kernel.org, asahi@...ts.linux.dev
Subject: Re: [PATCH 2/5] rust: device: Add a minimal RawDevice trait
On Fri, 24 Feb 2023 at 13:23, Asahi Lina <lina@...hilina.net> wrote:
>
> On 25/02/2023 00.14, Robin Murphy wrote:
> > On 2023-02-24 14:48, Asahi Lina wrote:
> >>
> >>
> >> On 2023/02/24 23:32, Robin Murphy wrote:
> >>> FWIW the DMA API *has* to know which specific device it's operating
> >>> with, since the relevant properties can and do vary even between
> >>> different devices within a single bus_type (e.g. DMA masks).
> >>>
> >>> In the case of io-pgtable at least, there's no explicit refcounting
> >>> since the struct device must be the one representing the physical
> >>> platform/PCI/etc. device consuming the pagetable, so if that were to
> >>> disappear from underneath its driver while the pagetable is still in
> >>> use, things would already have gone very very wrong indeed :)
> >>
> >> There's no terribly good way to encode this relationship in safe Rust as
> >> far as I know. So although it might be "obvious" (and I think my driver
> >> can never violate it as it is currently designed), this means the Rust
> >> abstraction will have to take the device reference if the C side does
> >> not, because safe rust abstractions have to actually make these bugs
> >> impossible and nothing stops a Rust driver from, say, stashing an
> >> io_pgtable reference into a global and letting the device go away.
> >
> > If someone did that, then simply holding a struct device reference
> > wouldn't guarantee much, since it only prevents the pointer itself from
> > becoming invalid - it still doesn't say any of the data *in* the
> > structure is still valid and "safe" for what a DMA API call might do
> > with it.
> >
> > At the very least you'd probably have to somehow also guarantee that the
> > device has a driver bound (which is the closest thing to a general
> > indication of valid DMA ops across all architectures) and block it from
> > unbinding for the lifetime of the reference, but that would then mean a
> > simple driver which expects to tear down its io-pgtable from its .remove
> > callback could never be unbound due to the circular dependency :/
>
> I'd like to hear from the other Rust folks about ideas on how to solve
> this ^^. I think it might fit into the Revocable model for device
> resources, but it's going to be a bit of an issue for my driver because
> for me, the io_pgtable objects are deep in the Vm object that has its
> own refcounting, and those are created and destroyed dynamically at
> runtime, not just during device probe...
Our bus abstractions all require that driver data stored in devices
implement the `DeviceRemoval` trait, which has a single function:
device_remove. The idea is that this function will do something to
relinquish access to resources that shouldn't be accessed after the
device is unbound (device_remove is called, for whatever reason). This
mechanism is generic and allows us to break circular dependencies.
One way to implement the `DeviceRemoval` trait is to hold all
resources in "revocable" wrappers (of which we have several that have
different constraints/requirements), and then revoke access when
`device_remove` is called (this results in blocking while concurrent
users are running, then running "destructors" when they're all done
and before returning). The part that isn't great about this approach
is that we introduce error paths everywhere these resources are used
(because they may have been revoked).
Powered by blists - more mailing lists