lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Sat, 25 May 2019 05:31:20 +0000
From:   "Robert R. Howell" <RHowell@...o.edu>
To:     "Rafael J. Wysocki" <rjw@...ysocki.net>
CC:     "Rafael J. Wysocki" <rafael@...nel.org>,
        Hans de Goede <hdegoede@...hat.com>,
        Kai-Heng Feng <kai.heng.feng@...onical.com>,
        "lenb@...nel.org" <lenb@...nel.org>,
        "linux-acpi@...r.kernel.org" <linux-acpi@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] ACPI / LPSS: Don't skip late system PM ops for hibernate
 on BYT/CHT

On 5/16/19 5:11 AM, Rafael J. Wysocki wrote:
> 
> On Thursday, April 25, 2019 6:38:34 PM CEST Robert R. Howell wrote:
>> On 4/24/19 1:20 AM, Rafael J. Wysocki wrote:
>>
>>> On Tue, Apr 23, 2019 at 10:03 PM Robert R. Howell <RHowell@...o.edu> wrote:
>>>>
>>>> On 4/23/19 2:07 AM, Rafael J. Wysocki wrote:
>>>>>
>>>>> On Sat, Apr 20, 2019 at 12:44 AM Robert R. Howell <RHowell@...o.edu> wrote:
>>>>>>
>>>>>> On 4/18/19 5:42 AM, Hans de Goede wrote:
>>>>>>
>>>>>>>> On 4/8/19 2:16 AM, Hans de Goede wrote:>
>>>>>>>>>
>>>>>>>>> Hmm, interesting so you have hibernation working on a T100TA
>>>>>>>>> (with 5.0 + 02e45646d53b reverted), right ?
>>>>>>>>>
>>>>>>
>>>>>>
>>>>>> I've managed to find a way around the i2c_designware timeout issues
>>>>>> on the T100TA's.  The key is to NOT set DPM_FLAG_SMART_SUSPEND,
>>>>>> which was added in the 02e45646d53b commit.
>>>>>>
>>>>>> To test that I've started with a 5.1-rc5 kernel, applied your recent patch
>>>>>> to acpi_lpss.c, then apply the following patch of mine, removing
>>>>>> DPM_FLAG_SMART_SUSPEND.  (For the T100 hardware I need to apply some
>>>>>> other patches as well but those are not related to the i2c-designware or
>>>>>> acpi issues addressed here.)
>>>>>>
>>>>>> On a resume from hibernation I still see one error:
>>>>>>   "i2c_designware 80860F41:00: Error i2c_dw_xfer called while suspended"
>>>>>> but I no longer get the i2c_designware timeouts, and audio does now work
>>>>>> after the resume.
>>>>>>
>>>>>> Removing DPM_FLAG_SMART_SUSPEND may not be what you want for other
>>>>>> hardware, but perhaps this will give you a clue as to what is going
>>>>>> wrong with hibernate/resume on the T100TA's.
>>>>>
>>>>> What if you drop DPM_FLAG_LEAVE_SUSPENDED alone instead?
>>>>>
>>>>
>>>> I did try dropping just DPM_FLAG_LEAVE_SUSPENDED, dropping just
>>>> DPM_FLAG_SMART_SUSPEND, and dropping both flags.  When I just drop
>>>> DPM_FLAG_LEAVE_SUSPENDED I still get the i2c_designware timeouts
>>>> after the resume.  If I drop just DPM_FLAG_SMART_SUSPEND or drop both,
>>>> then the timeouts go away.
>>>
>>> OK, thanks!
>>>
>>> Is non-hibernation system suspend affected too?
>>
>> I just ran some tests on a T100TA, using the 5.1-rc5 code with Hans' patch applied
>> but without any changes to i2c-designware-platdrv.c, so the
>> DPM_FLAG_SMART_PREPARE, DPM_FLAG_SMART_SUSPEND, and DPM_FLAG_LEAVE_SUSPENDED flags
>> are all set.
>>
>> Suspend does work OK, and after resume I do NOT get any of the crippling
>> i2c_designware timeout errors which cause sound to fail after hibernate.  I DO see one
>>   "i2c_designware 80860F41:00: Error i2c_dw_xfer call while suspended"
>> error on resume, just as I do on hibernate.  I've attached a portion of dmesg below.
>> The "asus_wmi:  Unknown key 79 pressed" error is a glitch which occurs
>> intermittently on these machines, but doesn't seem related to the other issues.
>> I had one test run when it was absent but the rest of the messages were the
>> same -- but then kept getting that unknown key error on all my later tries.
>>
>> I did notice the "2sidle" in the following rather than "shallow" or "deep".  A
>> cat of /sys/power/state shows "freeze mem disk" but a
>> cat of /sys/power/mem_sleep" shows only "[s2idle] so it looks like shallow and deep
>> are not enabled for this system.  I did check the input power (or really current)
>> as it went into suspend and the micro-usb power input drops from about
>> 0.5 amps to 0.05 amps.  But clearly a lot of devices are still active, as movement
>> of a bluetooth mouse (the MX Anywhere 2) will wake it from suspend.  That presumably is
>> why suspend doesn't trigger the same i2c_designware problems as hibernate.
>>
>> Let me know if I can do any other tests.
> 
> Can you please check if the appended patch makes the hibernate issue go away for you, without any other changes?
> 
> ---
>  drivers/pci/pci-driver.c |   36 ++++++++++--------------------------
>  1 file changed, 10 insertions(+), 26 deletions(-)
> 
> Index: linux-pm/drivers/pci/pci-driver.c
> ===================================================================
> --- linux-pm.orig/drivers/pci/pci-driver.c
> +++ linux-pm/drivers/pci/pci-driver.c
> @@ -957,15 +957,14 @@ static int pci_pm_freeze(struct device *
>         }
> 
>         /*
> -        * This used to be done in pci_pm_prepare() for all devices and some
> -        * drivers may depend on it, so do it here.  Ideally, runtime-suspended
> -        * devices should not be touched during freeze/thaw transitions,
> -        * however.
> +        * Resume all runtime-suspended devices before creating a snapshot
> +        * image of system memory, because the restore kernel generally cannot
> +        * be expected to always handle them consistently and pci_pm_restore()
> +        * always leaves them as "active", so ensure that the state saved in the
> +        * image will always be consistent with that.
>          */
> -       if (!dev_pm_smart_suspend_and_suspended(dev)) {
> -               pm_runtime_resume(dev);
> -               pci_dev->state_saved = false;
> -       }
> +       pm_runtime_resume(dev);
> +       pci_dev->state_saved = false;
> 
>         if (pm->freeze) {
>                 int error;
> @@ -992,9 +991,6 @@ static int pci_pm_freeze_noirq(struct de
>         struct pci_dev *pci_dev = to_pci_dev(dev);
>         struct device_driver *drv = dev->driver;
> 
> -       if (dev_pm_smart_suspend_and_suspended(dev))
> -               return 0;
> -
>         if (pci_has_legacy_pm_support(pci_dev))
>                 return pci_legacy_suspend_late(dev, PMSG_FREEZE);
> 
> @@ -1024,16 +1020,6 @@ static int pci_pm_thaw_noirq(struct devi
>         struct device_driver *drv = dev->driver;
>         int error = 0;
> 
> -       /*
> -        * If the device is in runtime suspend, the code below may not work
> -        * correctly with it, so skip that code and make the PM core skip all of
> -        * the subsequent "thaw" callbacks for the device.
> -        */
> -       if (dev_pm_smart_suspend_and_suspended(dev)) {
> -               dev_pm_skip_next_resume_phases(dev);
> -               return 0;
> -       }
> -
>         if (pcibios_pm_ops.thaw_noirq) {
>                 error = pcibios_pm_ops.thaw_noirq(dev);
>                 if (error)
> @@ -1093,8 +1079,10 @@ static int pci_pm_poweroff(struct device
> 
>         /* The reason to do that is the same as in pci_pm_suspend(). */
>         if (!dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_SUSPEND) ||
> -           !pci_dev_keep_suspended(pci_dev))
> +           !pci_dev_keep_suspended(pci_dev)) {
>                 pm_runtime_resume(dev);
> +               pci_dev->state_saved = false;
> +       }
> 
>         pci_dev->state_saved = false;
>         if (pm->poweroff) {
> @@ -1168,10 +1156,6 @@ static int pci_pm_restore_noirq(struct d
>         struct device_driver *drv = dev->driver;
>         int error = 0;
> 
> -       /* This is analogous to the pci_pm_resume_noirq() case. */
> -       if (dev_pm_smart_suspend_and_suspended(dev))
> -               pm_runtime_set_active(dev);
> -
>         if (pcibios_pm_ops.restore_noirq) {
>                 error = pcibios_pm_ops.restore_noirq(dev);
>                 if (error)
> 
> 
> 

I've finally managed to complete a reasonable set of tests on my T100TA using your 
2nd patch from above, and on a 5.1.4 based kernel with ONLY this patch applied I can 
successfully suspend and hibernate the system.  On resume the i2c devices (in particular the sound)
do still work OK.  On the first resume from suspend or hibernate I DO still see 
the "i2c_designware 80860F41:00: Transfer while suspended" error, followed by a 
stack trace.  However as before that error doesn't seem to cause any permanent problems.  
and with your patch applied I do NOT get the subsequent i2c timeout errors 
that prevent further use of the sound system.  I don't see any other reported errors.  
As usual I need to have a systemd script which removes the brcmfmac and hci_uart drivers 
before hibernate, then installs them on resume, or those drivers fail to work after resume.

I had initially tried testing your patch with the 5.1.0 kernel that I had been 
using for testing a few weeks ago, and also the just released 5.2-rc1 kernel.  
I think your patch is also fixing the i2c timeout errors with those, but with both of those 
other kernels I have a lot of other issues which have confused the results.  I've been trying 
to sort out exactly what causes or avoids those issues over the last several days.  
However that has turned into enough of a can of worms that unless you really want 
to know those details, I'll avoid that "noise" here except a VERY brief summary.

With 5.1.0 and ONLY your patch, resume from hibernate does seem to work, and I DO NOT 
get i2c_timeout errors, but I do see a number of errors and stack traces reported 
from other i2c devices such as the ATML1000 touchscreen controller.  However the devices 
all do seem to work once the resume is completed.  The just released 5.2-rc1 kernel 
seems to have introduced problems with at least one other device, brcmfmac, with 
or without your patch.  I have to blacklist it or the system hangs on either suspend 
or hibernate.  (And working wifi never did come up even on the initial boot.)  
But with brcmfmac blacklisted your patch seems to fix the i2c_designware SOUND timeouts.  

Let me know if I can do any further testing.

Bob Howell

Powered by blists - more mailing lists