linux-kernel - Re: [PATCH 4/5] PCI: wait device ready after pci_pm

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <d886e7d4-db9a-9dc9-e8f1-5dd48cb9beb4@codeaurora.org>
Date:   Mon, 16 Oct 2017 08:51:49 -0400
From:   Sinan Kaya <okaya@...eaurora.org>
To:     Bjorn Helgaas <helgaas@...nel.org>
Cc:     "linux-pci@...r.kernel.org >> Linux PCI" <linux-pci@...r.kernel.org>,
        timur@...eaurora.org, alex.williamson@...hat.com,
        linux-arm-msm@...r.kernel.org, Bjorn Helgaas <bhelgaas@...gle.com>,
        linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org
Subject: Re: [PATCH 4/5] PCI: wait device ready after pci_pm_reset()

On 10/12/2017 12:48 PM, Sinan Kaya wrote:
> On 10/11/2017 6:06 PM, Bjorn Helgaas wrote:
>>> static int pci_pm_reset(struct pci_dev *dev, int probe)
>>>  {
>>> +	unsigned int delay = dev->d3_delay;
>>>  	u16 csr;
>>>  
>>>  	if (!dev->pm_cap || dev->dev_flags & PCI_DEV_FLAGS_NO_PM_RESET)
>>> @@ -3988,7 +3989,10 @@ static int pci_pm_reset(struct pci_dev *dev, int probe)
>>>  	pci_write_config_word(dev, dev->pm_cap + PCI_PM_CTRL, csr);
>>>  	pci_dev_d3_sleep(dev);
>>>  
>>> -	return 0;
>>> +	if (delay < pci_pm_d3_delay)
>>> +		delay = pci_pm_d3_delay;
>>> +
>>> +	return pci_dev_wait(dev, "PM D3->D0", delay, 1000);
>> 1) Why do we wait up to 1 second here, when we wait up to 60 seconds
>> for the other methods?  Can they all be the same?  Maybe a #define for
>> it?
> 
> I know you want to have similar behavior for systems that do and do not support
> CRS. That was the reason why I converted flr wait function to into dev_wait function.
> 
> However, here is the problem:
> 
> For systems that do not support CRS, there is no way of knowing whether we
> are reading 0xFFFFFFFF because the endpoint is not reachable due to an error
> like "it doesn't support this reset type" or if it is actually emitting a CRS.
> 
> If one system has a problem with pm_reset, this code would add an unnecessary
> 1 second delay into the reset path. If I make it 60 it would be something like:
> 
> 1. try reset method A
> 2. wait 60 seconds
> 3. try reset method B
> 4. wait 60 seconds. 
> 5. try reset method C
> 6. wait 60 seconds
> 
> This might end up being a regression on some system. 
> 
> I'm still leaning towards a wait only if we are observing a CRS. What's your
> thought on this?
> 
> then the sequence would be.
> 
> 1. try reset method A
> 2. if CRS pending, wait 60 seconds
> 3. try reset method B
> 4. if CRS pending, wait 60 seconds. 
> 5. try reset method C
> 6. if CRS pending, wait 60 seconds
> 

Thinking more about this. Another possibility is to have an adjustable sleep time.
Start with 60 seconds for all reset types. If somebody doesn't like it,
have a kernel command line override.

>>
>> 2) I don't really like the fact that we do the initial sleep one place
>> and then pass the length of that sleep here.  It's hard to verify
>> they're the same and keep them in sync.  I think the only thing you
>> use initial_wait for is to include that time in the dmesg messages.
>> Maybe we should just omit that time from the message and drop the
>> parameter?
>>
> 
> This was for printing reasons like you spotted, I can certainly get rid of
> the initial_wait.
> 


-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.