Message-ID: <20220726151232.GF4438@nvidia.com>
Date: Tue, 26 Jul 2022 12:12:32 -0300
From: Jason Gunthorpe <jgg@...dia.com>
To: "Tian, Kevin" <kevin.tian@...el.com>
Cc: Alex Williamson <alex.williamson@...hat.com>,
Yishai Hadas <yishaih@...dia.com>,
"saeedm@...dia.com" <saeedm@...dia.com>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"kuba@...nel.org" <kuba@...nel.org>,
"Martins, Joao" <joao.m.martins@...cle.com>,
"leonro@...dia.com" <leonro@...dia.com>,
"maorg@...dia.com" <maorg@...dia.com>,
"cohuck@...hat.com" <cohuck@...hat.com>
Subject: Re: [PATCH V2 vfio 06/11] vfio: Introduce the DMA logging feature support

On Tue, Jul 26, 2022 at 07:34:55AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg@...dia.com>
> > Sent: Monday, July 25, 2022 10:37 PM
> >
> > On Mon, Jul 25, 2022 at 07:38:52AM +0000, Tian, Kevin wrote:
> >
> > > > Yes. qemu has to select a static aperture at start.
> > > >
> > > > The entire aperture is best, if that fails
> > > >
> > > > A smaller aperture and hope the guest doesn't use the whole space, if
> > > > that fails,
> > > >
> > > > The entire guest physical map and hope the guest is in PT mode
> > >
> > > That sounds a bit hacky... does it instead suggest that an interface
> > > for reporting the supported ranges on a tracker could be helpful once
> > > trying the entire aperture fails?
> >
> > It is the "try and fail" approach. It gives the driver the most
> > flexibility in processing the ranges to try and make them work. If we
> > attempt to describe all the device constraints that might exist we
> > will be here forever.
>
> Usually the caller of a 'try and fail' interface knows exactly what to
> try and then calls the interface to see whether the callee can
> meet its requirement.

Which is exactly this case.

qemu has one thing to try that meets its full requirement - the entire
vIOMMU aperture.

The other two are possible options based on assumptions of how the
guest VM is operating that might work - but this guessing is entirely
between qemu and the VM, not something the kernel can help with.

So, from the kernel perspective qemu will try three things in order of
preference and the first to work will be the right one. Making the
kernel API more complicated is not going to help qemu guess what the
guest is doing any better.
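
A minimal sketch of that ordering on the qemu side, for illustration
only. Every name here (struct iova_range, struct candidate,
start_logging_fn, pick_tracking_range, fake_start_logging) is
hypothetical and is not part of this series or of qemu. The callback
stands in for whatever call asks the device to start DMA logging on a
range; fake_start_logging is just a stand-in device with a size limit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one IOVA range qemu might ask to track. */
struct iova_range {
    uint64_t start;
    uint64_t length;
};

/* The three options from the discussion, plus "nothing worked". */
enum candidate_kind {
    CAND_FULL_APERTURE,    /* meets the full requirement */
    CAND_SMALLER_APERTURE, /* hope the guest doesn't use the whole space */
    CAND_GUEST_PHYS_MAP,   /* hope the guest is in PT mode */
    CAND_NONE
};

struct candidate {
    enum candidate_kind kind;
    struct iova_range range;
};

/*
 * Hypothetical callback wrapping the "start DMA logging" request.
 * Returns 0 on success, non-zero if the device rejects the range.
 */
typedef int (*start_logging_fn)(const struct iova_range *range, void *opaque);

/*
 * Try each candidate in the order given; the first one the device
 * accepts wins. The kernel side only ever sees plain "try this range".
 */
enum candidate_kind pick_tracking_range(const struct candidate *cands,
                                        size_t n, start_logging_fn start,
                                        void *opaque,
                                        struct iova_range *chosen)
{
    assert(cands != NULL && start != NULL);

    for (size_t i = 0; i < n; i++) {
        if (start(&cands[i].range, opaque) == 0) {
            if (chosen)
                *chosen = cands[i].range;
            return cands[i].kind;
        }
    }
    return CAND_NONE;
}

/* Stand-in device: accepts a range only if it is at most *opaque bytes. */
int fake_start_logging(const struct iova_range *range, void *opaque)
{
    uint64_t max_len = *(const uint64_t *)opaque;

    return range->length <= max_len ? 0 : -1;
}
```

The point of the shape is that all the guessing lives in how the
candidate list is built; the try loop itself needs nothing from the
kernel beyond success or failure.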

In any case this is vIOMMU mode so if the VM establishes mappings
outside the tracked IOVA then qemu is aware of it and qemu can
perma-dirty those pages as part of its migration logic. It is not
broken, it just might not meet the SLA.

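
As a rough illustration of the perma-dirty fallback, assuming qemu
sees each vIOMMU mapping as an (iova, len) pair: the names below
(struct iova_range, mapping_is_tracked, policy_for_mapping, enum
dirty_policy) are hypothetical, not qemu or kernel API. A mapping that
is even partly outside the tracked range is conservatively treated as
always dirty.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: the IOVA range the device agreed to track. */
struct iova_range {
    uint64_t start;
    uint64_t length;
};

enum dirty_policy {
    DIRTY_FROM_DEVICE_LOG, /* trust the device's DMA log */
    DIRTY_ALWAYS           /* perma-dirty: resend on every pass */
};

/*
 * True if [iova, iova + len) lies entirely inside the tracked range.
 * Written with subtractions so it cannot overflow near UINT64_MAX.
 */
bool mapping_is_tracked(const struct iova_range *t, uint64_t iova,
                        uint64_t len)
{
    assert(t != NULL);

    if (iova < t->start)
        return false;

    uint64_t off = iova - t->start;

    return off <= t->length && len <= t->length - off;
}

/* Mappings outside the tracked range stay correct, just more costly. */
enum dirty_policy policy_for_mapping(const struct iova_range *t,
                                     uint64_t iova, uint64_t len)
{
    return mapping_is_tracked(t, iova, len) ? DIRTY_FROM_DEVICE_LOG
                                            : DIRTY_ALWAYS;
}
```

Migration still converges on correct memory contents this way; the
cost is extra page transfers, which is where the SLA risk comes from.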
> But I can see why a reporting mechanism doesn't fit well with
> your example below. In the worst case probably the user has to
> decide between using vIOMMU vs. vfio DMA logging if a simple
> policy of using the entire aperture doesn't work...

Well, yes, this is exactly the situation unfortunately. Without
special HW support vIOMMU is not going to work perfectly, but there
are reasonable use cases where vIOMMU is on but the guest is in PT
mode that could work, or where the IOVA aperture is limited, and
so on.

Jason