Message-ID: <5633857.hqcsiBO4aL@kreacher>
Date:   Wed, 18 Sep 2019 23:27:01 +0200
From:   "Rafael J. Wysocki" <rjw@...ysocki.net>
To:     Mario.Limonciello@...l.com
Cc:     kbusch@...nel.org, axboe@...com, hch@....de, sagi@...mberg.me,
        linux-nvme@...ts.infradead.org, linux-kernel@...r.kernel.org,
        Ryan.Hong@...l.com, Crag.Wang@...l.com, sjg@...gle.com,
        Jared.Dominguez@...l.com
Subject: Re: [PATCH] nvme-pci: Save PCI state before putting drive into deepest state
On Wednesday, September 18, 2019 6:52:31 PM CEST Mario.Limonciello@...l.com wrote:
> > -----Original Message-----
> > From: Rafael J. Wysocki <rjw@...ysocki.net>
> > Sent: Tuesday, September 17, 2019 4:36 PM
> > To: Keith Busch
> > Cc: Limonciello, Mario; Jens Axboe; Christoph Hellwig; Sagi Grimberg; linux-
> > nvme@...ts.infradead.org; LKML; Hong, Ryan; Wang, Crag; sjg@...gle.com;
> > Dominguez, Jared
> > Subject: Re: [PATCH] nvme-pci: Save PCI state before putting drive into deepest
> > state
> > 
> > On Tuesday, September 17, 2019 11:24:14 PM CEST Keith Busch wrote:
> > > On Wed, Sep 11, 2019 at 06:42:33PM -0500, Mario Limonciello wrote:
> > > > The action of saving the PCI state will cause numerous PCI configuration
> > > > space reads which depending upon the vendor implementation may cause
> > > > the drive to exit the deepest NVMe state.
> > > >
> > > > In these cases ASPM will typically resolve the PCIe link state and APST
> > > > may resolve the NVMe power state.  However it has also been observed
> > > > that this register access after quiesced will cause PC10 failure
> > > > on some device combinations.
> > > >
> > > > To resolve this, move the PCI state saving to before SetFeatures has been
> > > > called.  This has been proven to resolve the issue across a 5000 sample
> > > > test on previously failing disk/system combinations.
> > > >
> > > > Signed-off-by: Mario Limonciello <mario.limonciello@...l.com>
> > > > ---
> > > >  drivers/nvme/host/pci.c | 13 +++++++------
> > > >  1 file changed, 7 insertions(+), 6 deletions(-)
> > > >
> > > > diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> > > > index 732d5b6..9b3fed4 100644
> > > > --- a/drivers/nvme/host/pci.c
> > > > +++ b/drivers/nvme/host/pci.c
> > > > @@ -2894,6 +2894,13 @@ static int nvme_suspend(struct device *dev)
> > > >  	if (ret < 0)
> > > >  		goto unfreeze;
> > > >
> > > > +	/*
> > > > +	 * A saved state prevents pci pm from generically controlling the
> > > > +	 * device's power. If we're using protocol specific settings, we don't
> > > > +	 * want pci interfering.
> > > > +	 */
> > > > +	pci_save_state(pdev);
> > > > +
> > > >  	ret = nvme_set_power_state(ctrl, ctrl->npss);
> > > >  	if (ret < 0)
> > > >  		goto unfreeze;
> > > > @@ -2908,12 +2915,6 @@ static int nvme_suspend(struct device *dev)
> > > >  		ret = 0;
> > > >  		goto unfreeze;
> > > >  	}
> > > > -	/*
> > > > -	 * A saved state prevents pci pm from generically controlling the
> > > > -	 * device's power. If we're using protocol specific settings, we don't
> > > > -	 * want pci interfering.
> > > > -	 */
> > > > -	pci_save_state(pdev);
> > > >  unfreeze:
> > > >  	nvme_unfreeze(ctrl);
> > > >  	return ret;
> > >
> > > In the event that something else fails after the point you've saved
> > > the state, we need to fallback to the behavior for when the driver
> > > doesn't save the state, right?
> > 
> > Depending on whether or not an error is going to be returned.
> > 
> > When returning an error, it is not necessary to worry about the saved state,
> > because that will cause the entire system-wide suspend to be aborted.
> 
> It looks like in this case an error would be returned.
Not necessarily.

If nvme_set_power_state() returns a positive number, you need to clear
pdev->state_saved before jumping to unfreeze.

Actually, you can drop the "goto unfreeze" after the "ret = 0" (in the
"if (ret)" block) and add the clearing of pdev->state_saved before it.

Let me reply to the original patch, though.
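
For illustration only, the control flow suggested above might look roughly
like the userspace sketch below. It is not the kernel code: struct pci_dev
and struct nvme_ctrl are cut-down hypothetical stand-ins, the stubbed
pci_save_state() and nvme_set_power_state() only mimic the behavior
discussed in this thread, and suspend_sketch() and set_power_result are
names invented for the example. Freezing, unfreezing and whatever other
fallback handling the real "if (ret)" block performs are left out.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; the real kernel structures are much larger. */
struct pci_dev {
	bool state_saved;
};

struct nvme_ctrl {
	int npss;
	int set_power_result;	/* test knob: what the stub below returns */
};

/* Stub: only models the flag that the PCI layer checks later. */
static void pci_save_state(struct pci_dev *pdev)
{
	pdev->state_saved = true;
}

/* Stub: <0 is an error, 0 is success, >0 means the device refused. */
static int nvme_set_power_state(struct nvme_ctrl *ctrl, int ps)
{
	(void)ps;
	return ctrl->set_power_result;
}

/* Sketch of the tail of nvme_suspend() with the suggested change applied. */
static int suspend_sketch(struct pci_dev *pdev, struct nvme_ctrl *ctrl)
{
	int ret;

	/* Saved before SetFeatures, as in the patch under discussion. */
	pci_save_state(pdev);

	ret = nvme_set_power_state(ctrl, ctrl->npss);
	if (ret < 0)
		goto unfreeze;	/* error: the whole suspend is aborted */

	if (ret) {
		/*
		 * Positive return: no error is reported, so clear the flag
		 * to let the PCI layer take over. No "goto unfreeze" is
		 * needed because the label follows immediately.
		 */
		pdev->state_saved = false;
		ret = 0;
	}
unfreeze:
	return ret;
}
```

With the stub returning a positive value, suspend_sketch() returns 0 and
leaves state_saved cleared; with 0 it returns 0 and keeps state_saved set;
with a negative value it returns that error unchanged.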
> 
> > 
> > Otherwise it is sufficient to clear the state_saved flag of the PCI device
> > before returning 0 to make the PCI layer take over.
> 