Message-ID: <20240729154203.GF3371438@nvidia.com>
Date: Mon, 29 Jul 2024 12:42:03 -0300
From: Jason Gunthorpe <jgg@...dia.com>
To: Jonathan Cameron <Jonathan.Cameron@...wei.com>
Cc: Dan Williams <dan.j.williams@...el.com>, ksummit@...ts.linux.dev,
linux-cxl@...r.kernel.org, linux-rdma@...r.kernel.org,
netdev@...r.kernel.org, shiju.jose@...wei.com,
Borislav Petkov <bp@...en8.de>,
Mauro Carvalho Chehab <mchehab@...nel.org>
Subject: Re: [MAINTAINERS SUMMIT] Device Passthrough Considered Harmful?
On Mon, Jul 29, 2024 at 01:45:12PM +0100, Jonathan Cameron wrote:
> If we expose that particular Feature via Set Feature we may run into
> future problems. It is probably possible to make the driver stateless
> so any interference from a userspace program using fwctl is not fatal
> - in this case userspace code should probably be safe to state changes
> anyway. We know about this clash today, so could easily block fwctl
> from exposing this feature, but it is illustrative of a wider problem.
> We will get some decisions about what should be exposed via fwctl wrong
> in the long term, even if they are correct at time of initial decision.
> So how do we cope with that?
>
> 1) Make no guarantees on ABI for taint causing operations.
> So we can block this FWCTL in a kernel if EDAC / ras control is in place
> for the same feature. I'm fine with this but it's not obviously
> a correct thing to do!
Maybe, but I think that is a bit suboptimal.
> 2) Allow the footgun. Keep the fwctl interface and harden the other kernel
> support against state changes that result. If userspace code breaks,
> then tough luck. (Another form of ABI break, perhaps comprehended by
> existing proposed FWCTL rules).
This one is certainly closer to being in line with the fwctl doc as
written, but I'd say fwctl should be unable to hijack a scrubber block
from the kernel in the first place.
> 3) We are stuck for ever with not supporting anything via other interfaces
> that would break if fwctl was in use. Ouch.
Definitely not this option.
> Note that I think this only matters for the Set path as Get side shouldn't
> have side effects and is fine to expose without synchronization with
> a clear statement that values read are a snapshot only.
Yes, that makes sense.
> We could say it can only be used for features we have 'opted' in +
> vendor defined features, but I'm not sure that helps. If a vendor
> defines a feature for generation A, and does what we want them to by
> proposing a spec addition they use in generation B, we would want a
> path to single upstream interface for both generations. So I don't
> think restricting this to particular classes of command helps us.
My expectation for fwctl was that it would own things that are
reasonably sharable by the kernel and userspace.
As an example, instead of turning on a feature dynamically at run
time, you'd tell the FW that the feature will be forced on at the
next reboot.
Another take would be multi-instance features that are clearly
contained to fwctl, where fwctl gets its own private instance that
cannot disturb the kernel.
I'm really not familiar enough with cxl to comment here - but
dynamically controlling the single global scrubber unit seems like a
poor fit to me.
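To make the split above concrete, here is a rough sketch of the kind of
gate I have in mind. Everything in it is hypothetical and for
illustration only: the feat_scope classes, struct feat_desc and
fwctl_feat_allowed() are made-up names, not the existing fwctl or cxl
code. It assumes each feature has been classified up front, lets Get
through unconditionally (a snapshot with no side effects), and refuses
Set on anything the kernel drives itself:

```c
#include <errno.h>

/* Hypothetical classification of a firmware feature (illustration only). */
enum feat_scope {
	FEAT_KERNEL_OWNED,	/* e.g. a single global scrubber the kernel drives */
	FEAT_NEXT_BOOT_ONLY,	/* setting only takes effect after a reboot */
	FEAT_FWCTL_PRIVATE,	/* per-instance resource private to fwctl */
};

enum feat_op { FEAT_GET, FEAT_SET };

/* Hypothetical per-feature descriptor. */
struct feat_desc {
	unsigned int id;
	enum feat_scope scope;
};

/* Returns 0 if userspace may issue the op through fwctl, -EPERM if not. */
static int fwctl_feat_allowed(const struct feat_desc *f, enum feat_op op)
{
	/* Reads have no side effects; the value is only a snapshot. */
	if (op == FEAT_GET)
		return 0;

	switch (f->scope) {
	case FEAT_NEXT_BOOT_ONLY:
	case FEAT_FWCTL_PRIVATE:
		/* Cannot disturb state the running kernel depends on. */
		return 0;
	case FEAT_KERNEL_OWNED:
	default:
		/* fwctl must not hijack a block the kernel is using. */
		return -EPERM;
	}
}
```

With this, a Set on the global scrubber would fail with -EPERM while a
Get on it, or a Set on a next-boot or fwctl-private feature, would be
allowed.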
Jason