Date:   Wed, 26 May 2021 00:49:26 +0800
From:   Kai-Heng Feng <kai.heng.feng@...onical.com>
To:     Christoph Hellwig <hch@....de>
Cc:     Koba Ko <koba.ko@...onical.com>, Keith Busch <kbusch@...nel.org>,
        Jens Axboe <axboe@...com>, Sagi Grimberg <sagi@...mberg.me>,
        linux-nvme <linux-nvme@...ts.infradead.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Henrik Juul Hansen <hjhansen2020@...il.com>,
        Bjorn Helgaas <bhelgaas@...gle.com>,
        Linux PCI <linux-pci@...r.kernel.org>
Subject: Re: [PATCH] nvme-pci: Avoid to go into d3cold if device can't use npss.

On Tue, May 25, 2021 at 3:44 PM Christoph Hellwig <hch@....de> wrote:
>
> On Thu, May 20, 2021 at 11:33:15AM +0800, Koba Ko wrote:
> > After resume, host can't change power state of the closed controller
> > from D3cold to D0.
>
> Why?

IIUC, it's a regression introduced by commit b97120b15ebd ("nvme-pci:
use simple suspend when a HMB is enabled"). The affected NVMe device
uses an HMB (Host Memory Buffer).

That commit intentionally puts the device into D3hot instead of D0 on
suspend. Because the root port of the NVMe device has _PR3, the NVMe
device ends up in D3cold as a result. I believe the other OS doesn't
put the NVMe device into D3cold, so turning off the power resource was
never tested by the vendor.

I think the proper fix would be to revert that commit and instead
tear down and set up DMA on suspend/resume for NVMe devices with an HMB.

Kai-Heng

>
> > For these devices, just avoid to go deeper than d3hot.
>
> What are "these devices"?
>
> > @@ -2958,6 +2959,15 @@ static int nvme_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> >
> >       dev_info(dev->ctrl.device, "pci function %s\n", dev_name(&pdev->dev));
> >
> > +     if (pm_suspend_via_firmware() || !dev->ctrl.npss ||
> > +         !pcie_aspm_enabled(pdev) ||
> > +         dev->nr_host_mem_descs ||
> > +         (dev->ctrl.quirks & NVME_QUIRK_SIMPLE_SUSPEND)) {
>
> Before we start open coding this in even more places we really want a
> little helper function for these checks, which should be accompanied by
> the comment near the existing copy of the checks.
>
> > +             pdev->d3cold_allowed = false;
> > +             pci_d3cold_disable(pdev);
> > +             pm_runtime_resume(&pdev->dev);
>
> Why do we need to both set d3cold_allowed and call pci_d3cold_disable?
>
> What is the pm_runtime_resume doing here?
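
The helper suggested in the review could look roughly like the sketch
below. It is a standalone userspace model, not kernel code: the struct
nvme_dev_model, the MODEL_QUIRK_SIMPLE_SUSPEND bit and the function name
nvme_model_needs_simple_suspend() are all hypothetical stand-ins, and each
field only mirrors one of the conditions in the quoted hunk so that the
combined check lives in a single place.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder bit for illustration only; not the kernel's quirk value. */
#define MODEL_QUIRK_SIMPLE_SUSPEND (1UL << 0)

/* Hypothetical stand-in for the state the quoted patch inspects. */
struct nvme_dev_model {
	bool suspend_via_firmware;      /* models pm_suspend_via_firmware() */
	unsigned int npss;              /* models dev->ctrl.npss */
	bool aspm_enabled;              /* models pcie_aspm_enabled(pdev) */
	unsigned int nr_host_mem_descs; /* models dev->nr_host_mem_descs */
	unsigned long quirks;           /* models dev->ctrl.quirks */
};

/*
 * Returns true when any of the conditions from the quoted hunk holds,
 * i.e. when the driver would fall back to the simple suspend path.
 * Keeping the checks in one function avoids open coding them twice.
 */
static bool nvme_model_needs_simple_suspend(const struct nvme_dev_model *dev)
{
	assert(dev != NULL);

	return dev->suspend_via_firmware ||
	       dev->npss == 0 ||
	       !dev->aspm_enabled ||
	       dev->nr_host_mem_descs != 0 ||
	       (dev->quirks & MODEL_QUIRK_SIMPLE_SUSPEND) != 0;
}
```

Both the suspend path and any probe-time d3cold decision could then call
the same helper, which is the point of the review comment; whether the
probe-time check is wanted at all is a separate question in this thread.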
