Message-ID: <20251018221911.GC1034710.vipinsh@google.com>
Date: Sat, 18 Oct 2025 15:19:11 -0700
From: Vipin Sharma <vipinsh@...gle.com>
To: Lukas Wunner <lukas@...ner.de>
Cc: bhelgaas@...gle.com, alex.williamson@...hat.com,
pasha.tatashin@...een.com, dmatlack@...gle.com, jgg@...pe.ca,
graf@...zon.com, pratyush@...nel.org, gregkh@...uxfoundation.org,
chrisl@...nel.org, rppt@...nel.org, skhawaja@...gle.com,
parav@...dia.com, saeedm@...dia.com, kevin.tian@...el.com,
jrhilke@...gle.com, david@...hat.com, jgowans@...zon.com,
dwmw2@...radead.org, epetron@...zon.de, junaids@...gle.com,
linux-kernel@...r.kernel.org, linux-pci@...r.kernel.org,
kvm@...r.kernel.org, linux-kselftest@...r.kernel.org
Subject: Re: [RFC PATCH 11/21] vfio/pci: Skip clearing bus master on live
update device during kexec
On 2025-10-18 09:09:06, Lukas Wunner wrote:
> On Fri, Oct 17, 2025 at 05:07:03PM -0700, Vipin Sharma wrote:
> > Set skip_kexec_clear_master on live update prepare() so that the device
> > participating in live update can continue to perform DMA during kexec
> > phase.
>
> Instead of introducing the skip_kexec_clear_master flag,
> could you introduce a function to check whether a device
> participates in live update and call that in pci_device_shutdown()?
>
> I think that would be cleaner. Otherwise someone reading
> the code has to chase down the meaning of skip_kexec_clear_master,
> i.e. search for places where the bit is set.
That is one way to do it. In our internal implementation we have an API
which checks for the device participation in the live update, similar to
what you have suggested.
The PCI series posted by Chris [1] provides a different way to know
whether a device participates in live update. There, pci_dev has a new
struct which contains the participation information.
In this VFIO series, my intention is to make minimal changes to PCI or
any other subsystem. I opted for a simple variable that indicates what
the device should do during kexec reboot.
My hunch is that we will end up needing some state information in
struct pci_dev{} which denotes device participation, and whatever that
ends up being, we can use it here.
[1] https://lore.kernel.org/linux-pci/20250916-luo-pci-v2-0-c494053c3c08@kernel.org/
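To make the comparison concrete, here is a rough userspace sketch of the
helper-based check suggested above. It is only a model under stated
assumptions: struct pci_dev_model, its liveupdate_prepared field and both
function names are hypothetical and come from neither this series nor [1],
and the power-state check done by the real pci_device_shutdown() is left out.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the relevant bits of struct pci_dev. */
struct pci_dev_model {
	bool bus_master;          /* models the Bus Master bit in PCI_COMMAND */
	bool liveupdate_prepared; /* hypothetical: set by live update prepare() */
};

/* Hypothetical helper: does this device participate in live update? */
static bool pci_dev_model_in_liveupdate(const struct pci_dev_model *dev)
{
	return dev->liveupdate_prepared;
}

/*
 * Models only the kexec branch of pci_device_shutdown(): bus mastering is
 * cleared unless the device participates in live update.
 */
static void pci_device_shutdown_model(struct pci_dev_model *dev,
				      bool kexec_in_progress)
{
	if (kexec_in_progress && !pci_dev_model_in_liveupdate(dev))
		dev->bus_master = false;
}
```

With a helper like this, the shutdown path asks a question about the device
instead of reading a flag whose setter has to be searched for, whatever the
underlying participation state ends up being.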
>
> When the device is unbound from vfio-pci, don't you have to
> clear the skip_kexec_clear_master flag? I'm not seeing this
> in your patches but maybe I'm missing something. That problem
> would solve itself if you follow the suggestion above.
The VFIO subsystem blocks removal from vfio-pci while there is still a
reference to the device (references are taken/dropped when the device is
opened/closed, see vfio_unregister_group_dev()). LUO also does an fget()
on the VFIO FD, which means we will not get the close callback on the
VFIO FD until that reference is dropped, in addition to userspace closing
its opened file.
So, prior to kexec, LUO will drop its reference only if live update
cancel happens, and that is when this patch series resets the flag.
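The ordering described above can be modeled in userspace roughly as
follows. This is a simplified sketch, not the VFIO or LUO code: the struct
and all model_* names are hypothetical, a single counter stands in for the
references on the VFIO file, and the real vfio_unregister_group_dev() waits
for users to go away rather than returning a boolean.

```c
#include <stdbool.h>

/* Hypothetical model of a VFIO device file and the flag from this patch. */
struct vfio_dev_model {
	unsigned int file_refs;        /* references held on the VFIO FD */
	bool skip_kexec_clear_master;  /* flag set on live update prepare() */
};

/* Userspace opens the device: one file reference. */
static void model_userspace_open(struct vfio_dev_model *d)
{
	d->file_refs++;
}

/* Userspace closes its FD: drops that reference. */
static void model_userspace_close(struct vfio_dev_model *d)
{
	d->file_refs--;
}

/* LUO prepare: models the extra fget() and sets the flag. */
static void model_luo_prepare(struct vfio_dev_model *d)
{
	d->file_refs++;
	d->skip_kexec_clear_master = true;
}

/* LUO cancel: resets the flag, then drops LUO's reference. */
static void model_luo_cancel(struct vfio_dev_model *d)
{
	d->skip_kexec_clear_master = false;
	d->file_refs--;
}

/* Unbind can only proceed once nobody holds the file any more. */
static bool model_can_unbind(const struct vfio_dev_model *d)
{
	return d->file_refs == 0;
}
```

In this model the device cannot be unbound with the flag still set, because
the only path that drops LUO's reference before kexec is the cancel path,
which also clears the flag.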