Message-ID: <20210727190317.GJ1721383@nvidia.com>
Date: Tue, 27 Jul 2021 16:03:17 -0300
From: Jason Gunthorpe <jgg@...dia.com>
To: Alex Williamson <alex.williamson@...hat.com>
Cc: Cornelia Huck <cohuck@...hat.com>, Christoph Hellwig <hch@....de>,
Kirti Wankhede <kwankhede@...dia.com>, kvm@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/2] vfio/mdev: don't warn if ->request is not set

On Tue, Jul 27, 2021 at 12:53:09PM -0600, Alex Williamson wrote:
> On Tue, 27 Jul 2021 14:32:09 -0300
> Jason Gunthorpe <jgg@...dia.com> wrote:
>
> > On Tue, Jul 27, 2021 at 08:04:16AM +0200, Cornelia Huck wrote:
> > > On Mon, Jul 26 2021, Alex Williamson <alex.williamson@...hat.com> wrote:
> > >
> > > > On Mon, 26 Jul 2021 20:09:06 -0300
> > > > Jason Gunthorpe <jgg@...dia.com> wrote:
> > > >
> > > >> On Mon, Jul 26, 2021 at 07:07:04PM +0200, Cornelia Huck wrote:
> > > >>
> > > >> > But I wonder why nobody else implements this? Lack of surprise removal?
> > > >>
> > > >> The only implementation triggers an eventfd that seems to be the same
> > > >> eventfd as the interrupt..
> > > >>
> > > >> Do you know how this works in userspace? I'm surprised that the
> > > >> interrupt eventfd can trigger an observation that the kernel driver
> > > >> wants to be unplugged?
> > > >
> > > > I think we're talking about ccw, but I see QEMU registering separate
> > > > eventfds for each of the 3 IRQ indexes and the mdev driver specifically
> > > > triggering the req_trigger...? Thanks,
> > > >
> > > > Alex
> > >
> > > Exactly, ccw has a trigger for normal I/O interrupts, CRW (machine
> > > checks), and this one.
> >
> > If it is a dedicated eventfd for 'device being removed' why is it in
> > the CCW implementation and not core code?
>
> The CCW implementation (likewise the vfio-pci implementation) owns
> the IRQ index address space and the decision to make this a signal
> to userspace rather than perhaps some handling a device might be
> able to do internally.

The core code holds the vfio_device_get() reference so long as the FD
is open. There is no way to get past the wait_for_completion without
userspace closing the FD, so there isn't really much choice for the
drivers beyond signaling userspace to close the FD??

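
To make the constraint concrete, here is a rough userspace model of
that loop. It is only a sketch: every name in it (model_device,
model_request, model_unregister, close_after, max_rounds) is
hypothetical and not a kernel API, and the bounded loop stands in for
the core's timed wait on the completion.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_device {
	int refs;                 /* 1 for registration + 1 while the FD is open */
	bool fd_open;             /* does userspace still hold the device FD? */
	unsigned int close_after; /* user closes the FD after this many requests */
	unsigned int requests;    /* how many times the request callback ran */
};

/* Stand-in for a driver's ->request callback: all it can do is signal. */
static void model_request(struct model_device *dev, unsigned int count)
{
	dev->requests = count + 1;

	/* Model a cooperative user reacting to the signal by closing the FD. */
	if (dev->fd_open && dev->requests >= dev->close_after) {
		dev->fd_open = false;
		dev->refs--;	/* the open FD's reference goes away on close */
	}
}

/*
 * Stand-in for the core unregister path: drop the registration
 * reference, then keep asking until the last reference is gone.
 * max_rounds replaces the timed wait so the model always terminates.
 * Returns true if the device could be released.
 */
static bool model_unregister(struct model_device *dev, unsigned int max_rounds)
{
	unsigned int i = 0;

	dev->refs--;
	while (dev->refs > 0 && i < max_rounds)
		model_request(dev, i++);

	return dev->refs == 0;
}
```

With refs = 2, fd_open = true and close_after = 3, model_unregister()
succeeds after three request calls. With a user that never closes the
FD it never succeeds, no matter what the request callback does.
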
> For instance an alternate vfio-pci implementation might zap all
> mmaps, block all r/w access, and turn this into a surprise removal.

This is nice, but wouldn't close the FD, so needs core changes
anyhow..

> Another implementation might be more aggressive to sending SIGKILL
> to the user process.

We don't try to revoke FDs from the kernel, it is racy, dangerous and
unreliable.

> This was the thought behind why vfio-core triggers the driver
> request callback with a counter, leaving the policy to the driver.

IMHO subsystem policy does not belong in drivers. Down that road lies
a mess for userspace.

Jason