Message-ID: <20220902124234.472737cd.alex.williamson@redhat.com>
Date: Fri, 2 Sep 2022 12:42:34 -0600
From: Alex Williamson <alex.williamson@...hat.com>
To: Abhishek Sahu <abhsahu@...dia.com>
Cc: Cornelia Huck <cohuck@...hat.com>,
Yishai Hadas <yishaih@...dia.com>,
Jason Gunthorpe <jgg@...dia.com>,
Shameer Kolothum <shameerali.kolothum.thodi@...wei.com>,
Kevin Tian <kevin.tian@...el.com>,
"Rafael J . Wysocki" <rafael@...nel.org>,
Max Gurtovoy <mgurtovoy@...dia.com>,
Bjorn Helgaas <bhelgaas@...gle.com>,
<linux-kernel@...r.kernel.org>, <kvm@...r.kernel.org>,
<linux-pm@...r.kernel.org>, <linux-pci@...r.kernel.org>
Subject: Re: [PATCH v7 0/5] vfio/pci: power management changes
On Mon, 29 Aug 2022 17:18:45 +0530
Abhishek Sahu <abhsahu@...dia.com> wrote:
> This is part 2 of the vfio-pci driver power management support.
> Part 1 of this patch series added D3cold support for the case where
> there is no user of the VFIO device, and it has already been merged
> into the mainline kernel. If runtime power management is enabled for
> a vfio-pci device in the guest OS, the device is runtime suspended
> (for a Linux guest OS) and the PCI device is put into the D3hot
> state (in function vfio_pm_config_write()). If the D3cold state
> can be used instead of D3hot, it will save the most power. The
> D3cold state cannot be reached with native PCI PM; it requires
> interaction with platform firmware, which is system-specific. To
> enter low power states (including D3cold), the runtime PM framework
> can be used; it internally interacts with PCI and platform firmware
> and puts the device into the lowest possible D-state.
>
> This patch series adds support for engaging runtime power management
> initiated by the user. Since the D3cold state can't be reached by
> writing the standard PCI PM config registers, new device features
> have been added to the DEVICE_FEATURE IOCTL for low power entry and
> exit handling. For a PCI device, this low power state will be D3cold
> (if the platform supports the D3cold state). Hypervisors can implement
> virtual ACPI methods to integrate with the guest OS.
> For example, in a Linux guest OS, if the PCI device ACPI node has
> _PR3 and _PR0 power resources with _ON/_OFF methods, then the guest
> makes the _OFF call during the D3cold transition and the _ON call
> during the D0 transition. The hypervisor can intercept these virtual
> ACPI calls and then issue the low power related IOCTL.
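
As a rough illustration of the user-facing side described above, the
sketch below shows how a request for low power entry or exit might be
laid out in memory before being handed to the DEVICE_FEATURE ioctl.
All sketch_/SKETCH_ names are hypothetical stand-ins defined locally so
the example is self-contained. The layout (a size field, a flags word
carrying a SET bit plus a feature index, then an optional payload) and
the numeric values are assumptions and should be checked against
<linux/vfio.h> from a kernel that includes this series.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical local stand-ins; values are assumptions, not taken from
 * this mail. Real code should use the definitions in <linux/vfio.h>. */
#define SKETCH_FEATURE_SET             (1u << 17)
#define SKETCH_LOW_POWER_ENTRY         3u
#define SKETCH_LOW_POWER_ENTRY_WAKEUP  4u
#define SKETCH_LOW_POWER_EXIT          5u

/* Assumed request header: total size, flags, then feature payload. */
struct sketch_device_feature {
    uint32_t argsz;
    uint32_t flags;
    uint8_t  data[];
};

/* Assumed payload for the entry variant that carries a wakeup eventfd. */
struct sketch_wakeup_payload {
    int32_t  wakeup_eventfd;
    uint32_t reserved;
};

/* Allocate and fill a SET request for the given feature index.
 * The caller frees the returned buffer. Returns NULL on allocation
 * failure. */
struct sketch_device_feature *
sketch_build_request(uint32_t feature, const void *payload, size_t payload_len)
{
    size_t argsz = sizeof(struct sketch_device_feature) + payload_len;
    struct sketch_device_feature *req;

    /* A payload length without a payload pointer is a caller bug. */
    assert(payload != NULL || payload_len == 0);

    req = calloc(1, argsz);
    if (!req)
        return NULL;

    req->argsz = (uint32_t)argsz;
    req->flags = SKETCH_FEATURE_SET | feature;
    if (payload_len)
        memcpy(req->data, payload, payload_len);
    return req;
}
```

On a real system the returned buffer would be passed to the ioctl on an
open VFIO device file descriptor, for example when the hypervisor sees
the guest's _OFF call (entry) or _ON call (exit). That call is omitted
here because it needs a device bound to vfio-pci.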
>
> The entry device feature has two variants, which mainly exist to
> support different behaviour for low power entry.
> If there is any access to the VFIO device on the host side, the
> device will be moved out of the low power state without the
> involvement of the user's guest driver. Some devices (for example
> NVIDIA VGA or 3D controller) require the involvement of the user's
> guest driver for each low power entry. In the first variant, the host
> can move the device into low power without any guest driver
> involvement, while in the second variant, the host sends a
> notification to the user through an eventfd and the user's guest
> driver then needs to move the device into low power. The hypervisor
> can implement virtual PME support to notify the guest OS. Please
> refer to
> https://lore.kernel.org/lkml/20220701110814.7310-7-abhsahu@nvidia.com/
> where this virtual PME was initially implemented in the vfio-pci
> driver itself, but it was later decided that the hypervisor can
> implement this.
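
For the second variant, the user-side handling of the eventfd
notification might look roughly like the sketch below. It only
exercises the eventfd mechanics: the host side is simulated by a local
helper that writes to the eventfd, and both sketch_ function names are
hypothetical. What the user does after the notification (for example
raising a virtual PME so the guest driver can put the device back into
low power) is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Stand-in for the host side: signal the wakeup eventfd once.
 * In the real flow the kernel would do this after a host-side access
 * has moved the device out of low power. Returns 0 on success. */
int sketch_host_signal_wakeup(int efd)
{
    assert(efd >= 0);
    return eventfd_write(efd, 1);
}

/* User side: consume pending wakeup notifications.
 * Returns the number of notifications collected, or -1 if none are
 * pending (non-blocking eventfd) or the read failed. */
int64_t sketch_user_consume_wakeup(int efd)
{
    eventfd_t count = 0;

    assert(efd >= 0);
    if (eventfd_read(efd, &count) != 0)
        return -1;
    return (int64_t)count;
}
```

A hypervisor would typically add the eventfd to its event loop rather
than poll it directly; the non-blocking read above is only meant to
show that a notification stays pending until the user consumes it.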
>
> * Changes in v7
Applied to vfio next branch for v6.1. Thanks,
Alex