Message-ID: <55436bdc-fd09-fdb8-50b1-af088d594a72@nvidia.com>
Date:   Thu, 19 May 2022 10:21:11 +0530
From:   Abhishek Sahu <abhsahu@...dia.com>
To:     Alex Williamson <alex.williamson@...hat.com>
Cc:     Cornelia Huck <cohuck@...hat.com>,
        Yishai Hadas <yishaih@...dia.com>,
        Jason Gunthorpe <jgg@...dia.com>,
        Shameer Kolothum <shameerali.kolothum.thodi@...wei.com>,
        Kevin Tian <kevin.tian@...el.com>,
        "Rafael J . Wysocki" <rafael@...nel.org>,
        Max Gurtovoy <mgurtovoy@...dia.com>,
        Bjorn Helgaas <bhelgaas@...gle.com>,
        linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
        linux-pm@...r.kernel.org, linux-pci@...r.kernel.org
Subject: Re: [PATCH v5 0/4] vfio/pci: power management changes

On 5/18/2022 11:21 PM, Alex Williamson wrote:
> On Wed, 18 May 2022 16:46:08 +0530
> Abhishek Sahu <abhsahu@...dia.com> wrote:
> 
>> Currently, there is very limited power management support available
>> in the upstream vfio-pci driver. If there is no user of a vfio-pci device,
>> then it will be moved into the D3hot state. Similarly, if we enable
>> runtime power management for a vfio-pci device in the guest OS, then the
>> device is runtime suspended (for a Linux guest OS) and the PCI
>> device will be put into the D3hot state (in function
>> vfio_pm_config_write()). If the D3cold state can be used instead of
>> D3hot, then it will help in saving maximum power. The D3cold state is
>> not reachable with native PCI PM alone. It requires interaction with
>> platform firmware, which is system-specific. To go into low power states
>> (including D3cold), the runtime PM framework can be used, which
>> internally interacts with PCI and platform firmware and puts the device
>> into the lowest possible D-state.
>>
>> This patch series registers the vfio-pci driver with the runtime
>> PM framework and uses it to move the physical PCI
>> device into the low power state for unused idle devices.
>> A separate patch series will add support
>> for using the runtime PM framework for used idle devices.
>>
>> The current PM support was added with commit 6eb7018705de ("vfio-pci:
>> Move idle devices to D3hot power state") where the following point was
>> mentioned regarding D3cold state.
>>
>>  "It's tempting to try to use D3cold, but we have no reason to inhibit
>>   hotplug of idle devices and we might get into a loop of having the
>>   device disappear before we have a chance to try to use it."
>>
>> With runtime PM, if the user wants to prevent going into D3cold, then
>> /sys/bus/pci/devices/.../d3cold_allowed can be set to 0 for the
>> devices where the above functionality is required, instead of
>> disallowing the D3cold state in all cases.
>>
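
The d3cold_allowed knob mentioned above is an ordinary sysfs attribute, so it can be toggled from user space. A minimal sketch, assuming the caller supplies the full attribute path (the device address is elided in the cover letter and is left to the caller here); the helper name set_d3cold_allowed() is hypothetical and not part of this series:

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: write "0" or "1" to a d3cold_allowed attribute.
 * attr_path is the full path of the attribute, chosen by the caller.
 * Returns 0 on success, -1 on failure.
 */
int set_d3cold_allowed(const char *attr_path, int allowed)
{
    assert(attr_path != NULL);

    FILE *f = fopen(attr_path, "w");
    if (f == NULL)
        return -1;

    int ok = fputs(allowed ? "1\n" : "0\n", f) >= 0;

    /* Buffered data reaches the kernel on flush, so check fclose() too. */
    if (fclose(f) != 0)
        ok = 0;

    return ok ? 0 : -1;
}
```

Calling set_d3cold_allowed(path, 0) for one device keeps only that device out of D3cold, which matches the per-device opt-out described in the cover letter. Writing to sysfs normally requires root privileges.
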
>> BAR access needs to be disabled if the device is in the D3hot state.
>> Also, there should not be any config access if the device is in the
>> D3cold state. For SR-IOV, the PF power state should be higher than the
>> VF's power state.
>>
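
The access rules in the paragraph above can be summarized as a small user-space model. This is only an illustrative sketch, not the driver code: the enum and the function names are hypothetical, and treating BAR access as blocked in D3cold as well as D3hot is an assumption based on the changelog entries about tracking both D3 states.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the states discussed; not the kernel's pci_power_t. */
enum dev_power_state { DEV_D0, DEV_D3HOT, DEV_D3COLD };

static bool valid_state(enum dev_power_state s)
{
    return s == DEV_D0 || s == DEV_D3HOT || s == DEV_D3COLD;
}

/* BAR access is blocked in D3hot; D3cold is assumed to be blocked too. */
bool bar_access_allowed(enum dev_power_state s)
{
    assert(valid_state(s));
    return s == DEV_D0;
}

/* Config space is still reachable in D3hot, but not in D3cold. */
bool config_access_allowed(enum dev_power_state s)
{
    assert(valid_state(s));
    return s != DEV_D3COLD;
}

/* The PF is moved to D0 before VFs are enabled (see patch 2 of the series). */
bool pf_ready_for_vfs(enum dev_power_state pf)
{
    assert(valid_state(pf));
    return pf == DEV_D0;
}
```

In this model a device in D3hot still answers config accesses but rejects BAR accesses, while a device in D3cold rejects both.
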
>> * Changes in v5
>>
>> - Rebased over https://github.com/awilliam/linux-vfio/tree/next.
>> - Renamed vfio_pci_lock_and_set_power_state() to
>>   vfio_lock_and_set_power_state() and made it static.
>> - Inside vfio_pci_core_sriov_configure(), protected setting of
>>   power state and sriov enablement with 'memory_lock'.
>> - Removed CONFIG_PM macro use since it is not needed with current
>>   code.
> 
> Applied to vfio next branch for v5.19.  Thanks!
> 
> Alex
> 

 Thanks, Alex, for your thorough review and support in getting
 this series merged. I will start exploring the second part
 and look for a generic way to support all the use cases.

 Regards,
 Abhishek
  
>> * Changes in v4
>>   (https://lore.kernel.org/lkml/20220517100219.15146-1-abhsahu@nvidia.com)
>>
>> - Rebased over https://github.com/awilliam/linux-vfio/tree/next.
>> - Split the patch series into 2 parts. This part contains the patches
>>   for using runtime PM for unused idle device.
>> - Used 'pdev->current_state' for checking if the device is in D3 state.
>> - Added the check in the __vfio_pci_memory_enabled() function itself
>>   instead of adding a power state check at each caller.
>> - Made vfio_pci_lock_and_set_power_state() global since it is needed
>>   in different files.
>> - Used vfio_pci_lock_and_set_power_state() instead of
>>   vfio_pci_set_power_state() before pci_enable_sriov().
>> - Inside vfio_pci_core_sriov_configure(), handled both the cases
>>   (the device is in low power state with and without user).
>> - Used list_for_each_entry_continue_reverse() in
>>   vfio_pci_dev_set_pm_runtime_get().
>>
>> * Changes in v3
>>   (https://lore.kernel.org/lkml/20220425092615.10133-1-abhsahu@nvidia.com)
>>
>> - Rebased patches on v5.18-rc3.
>> - Marked this series as PATCH instead of RFC.
>> - Addressed the review comments given in v2.
>> - Removed the limitation to keep the device in D0 state if there is any
>>   access from the host side. This is specific to the NVIDIA use case and
>>   will be handled separately.
>> - Used the existing DEVICE_FEATURE IOCTL itself instead of adding a new
>>   IOCTL for power management.
>> - Removed all custom code related to power management in runtime
>>   suspend/resume callbacks and IOCTL handling. Now, the callbacks
>>   contain code related to INTx handling and a few other things, and
>>   all the PCI state and platform PM handling will be done by the PCI
>>   core functions themselves.
>> - Added support for wake-up in the main vfio layer itself since now we
>>   have more vfio/pci based drivers.
>> - Instead of assigning the 'struct dev_pm_ops' in each individual parent
>>   driver, now vfio_pci_core itself assigns the 'struct dev_pm_ops'.
>> - Added handling of power management around SR-IOV handling.
>> - Moved the setting of drvdata in a separate patch.
>> - Masked INTx during the runtime suspended state.
>> - Changed the order of patches so that fix-related changes are at the
>>   beginning of this patch series.
>> - Removed storing the power state locally and used one new boolean to
>>   track the D3 (D3cold and D3hot) power state.
>> - Removed check for IO access in D3 power state.
>> - Used another helper function vfio_lock_and_set_power_state() instead
>>   of touching vfio_pci_set_power_state().
>> - Considered the fixes made in
>>   https://lore.kernel.org/lkml/20220217122107.22434-1-abhsahu@nvidia.com
>>   and updated the patches accordingly.
>>
>> * Changes in v2
>>   (https://lore.kernel.org/lkml/20220124181726.19174-1-abhsahu@nvidia.com)
>>
>> - Rebased patches on v5.17-rc1.
>> - Included the patch to handle BAR access in D3cold.
>> - Included the patch to fix memory leak.
>> - Made a separate IOCTL that can be used to change the power state from
>>   D3hot to D3cold and D3cold to D0.
>> - Addressed the review comments given in v1.
>>
>> * v1
>>   https://lore.kernel.org/lkml/20211115133640.2231-1-abhsahu@nvidia.com/
>>
>> Abhishek Sahu (4):
>>   vfio/pci: Invalidate mmaps and block the access in D3hot power state
>>   vfio/pci: Change the PF power state to D0 before enabling VFs
>>   vfio/pci: Virtualize PME related registers bits and initialize to zero
>>   vfio/pci: Move the unused device into low power state with runtime PM
>>
>>  drivers/vfio/pci/vfio_pci_config.c |  56 ++++++++-
>>  drivers/vfio/pci/vfio_pci_core.c   | 178 ++++++++++++++++++++---------
>>  2 files changed, 178 insertions(+), 56 deletions(-)
>>
> 
