lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 17 Jan 2017 16:57:49 +0000
From:   <Mario.Limonciello@...l.com>
To:     <pmenzel@...gen.mpg.de>, <rafael.j.wysocki@...el.com>
CC:     <linux@...mhuis.info>, <gregkh@...uxfoundation.org>,
        <tomas.winkler@...el.com>, <jan@...dor.com>,
        <alexander.usyskin@...el.com>, <linux-kernel@...r.kernel.org>,
        <yu.c.chen@...el.com>, <tomi.p.sarvela@...el.com>,
        <daniel@...ra.org>, <len.brown@...el.com>,
        <linux-pm@...r.kernel.org>
Subject: RE: Regression on Dell XPS13 (was: [char-misc for 4.10-rc4 V2] mei:
 bus: enable OS version only for SPT and newer)

Hi Paul,

Thanks for raising this topic and including me.  
Suspend to Idle support in Linux as an alternative S3 on x86 is a new 
topic.  In all, I expected that some problems would arise as a result 
of this patch, and I hope they will spur interesting discussions and 
solutions.

Something that might not be immediately obvious is that by supporting
suspend to idle, new driver bugs are going to be identified for drivers
that don't properly put hardware into low enough power modes for the CPU
to go into the proper state.  Traditionally with S3 the BIOS would be responsible 
for powering devices down before S3 and back up when exiting S3.
This onus is now on the OS.

> -----Original Message-----
> From: Paul Menzel [mailto:pmenzel@...gen.mpg.de]
> Sent: Tuesday, January 17, 2017 8:34 AM
> To: Rafael J. Wysocki <rafael.j.wysocki@...el.com>; Limonciello, Mario
> <Mario_Limonciello@...l.com>
> Cc: Thorsten Leemhuis <linux@...mhuis.info>; Greg Kroah-Hartman
> <gregkh@...uxfoundation.org>; Tomas Winkler <tomas.winkler@...el.com>;
> Jan Niehusmann <jan@...dor.com>; Usyskin, Alexander
> <alexander.usyskin@...el.com>; linux-kernel@...r.kernel.org; Chen, Yu C
> <yu.c.chen@...el.com>; Sarvela, Tomi P <tomi.p.sarvela@...el.com>; Daniel
> Blueman <daniel@...ra.org>; Len Brown <len.brown@...el.com>; linux-
> pm@...r.kernel.org
> Subject: Regression on Dell XPS13 (was: [char-misc for 4.10-rc4 V2] mei: bus:
> enable OS version only for SPT and newer)
> 
> Dear Rafael, dear Mario,
> 
> 
> On 01/17/17 09:14, Thorsten Leemhuis wrote:
> > Greg Kroah-Hartman wrote on 16.01.2017 12:05:
> >> On Sun, Jan 15, 2017 at 11:58:30AM +0100, Greg Kroah-Hartman wrote:
> >>> On Sun, Jan 15, 2017 at 07:19:03AM +0000, Winkler, Tomas wrote:
> >>>>> Subject: Re: [char-misc for 4.10-rc4 V2] mei: bus: enable OS
> >>>>> version only for SPT and newer On Sat, Jan 14, 2017 at 08:27:31PM
> >>>>> +0100, Paul Menzel wrote:
> >>>>>> On 2017-01-13 14:00, Greg Kroah-Hartman wrote:
> >>>>>>> On Wed, Jan 11, 2017 at 03:26:06PM +0100, Paul Menzel wrote:
> >>>>>>>> On 01/11/17 15:12, Winkler, Tomas wrote:
> >>>>>>>>>>> On 01/11/17 10:24, Winkler, Tomas wrote:
> >>>>>>>>>>>>> On Wed, Jan 11, 2017 at 01:27:21AM +0200, Tomas Winkler
> wrote:
> >>>>>>>>>>>>>> On older platforms the command should be just ignored
> by
> >>>>>>>>>>>>>> the firmware but some older platforms misbehave so it's
> >>>>>>>>>>>>>> safer to send the command only if required.
> >>>>>>>>>>>>> Thanks! This fixes suspend-to-ram for me (on a Thinkpad
> x201s).
> >>>>>>>>>>>> What about Dell XPS13?
> 
> There is a regression with Linux 4.10-rc{1,2,3} on the Intel Kaby Lake device
> Dell XPS13 (9360). Hitting the power button, the light of the power button
> never goes off. Pressing it again, nothing happen. Only pressing it like ten
> seconds the screen comes back for five seconds, and then it seems to try to
> suspend again.


When it's in suspend to idle, the EC will modify some behaviors that align
to vendor specifications.  These include both the power button and 
the power light.  Both are actually expected, but the power button is 
indicative of other problems in the stack that need work.  
Let me explain how it all works:

When in suspend to idle, if the power button is pressed only momentarily (~ <6s),
the EC sends a wakeup event to the firmware which sends it to the OS 
through an ACPI event to the Intel HID event filter driver (intel-hid).

If it's pressed for longer (~6-9 seconds) then an IRQ is triggered to wake up 
the system the traditional way through a power button press.

If the button is pressed for a very long time (~10+ seconds) then 
the system will be force powered off.

So in the <6s scenario, the intel-hid driver is responsible to receive the ACPI
event and process accordingly.  The maintainer has a patch ready for the intel-hid
portion of this work, but it's currently being reviewed by Intel to ensure it
can be legally submitted into the kernel.

After that patch is approved for disclosure, submitted and accepted, the other 
missing part is that the ACPI subsystem itself is frozen when in suspend to idle, 
so this event can't be received by intel-hid.  This part needs discussion on how to fix.

In the 6-9s scenario there are no problems as this is the traditional wakeup
(as you found).

> 
> >>>>>>>>>>> With Linus' master branch from today, and Greg's
> >>>>>>>>>>> char-misc-linus merged
> >>>>>>>>>>> (Merge: 807b93e995d1 546cf3ef9c92), the regression is still
> there.
> >> […]
> >>>>>>>> I am currently bisecting to find the culprit. 13 steps will
> >>>>>>>> take some time though.
> >>>>>>> I can duplicate this on my laptop here as well :(
> >>>>>> Which system do you have?
> >>>>> A Dell XPS13, don't know what cpu type it is, here's the output of
> >>>>> one cpu from /proc/cpuinfo
> >> […]
> >>>>> model name	: Intel(R) Core(TM) i7-6560U CPU @ 2.20GHz
> >>>>>>> Did you get anywhere with your bisection?
> >>>>>> Sorry, I replied to a different message with my status.
> >>>> You are close!  I'll try bisection tomorrow if I have some spare time.
> >
> > Paul, did you get any closer? I have trouble finding time for a proper
> > bisection (it's a bit chaotic here right now :-/ )
> >
> >>>> Greg,  is that same Laptop mode as Paul's, you've experience the issue
> on?
> >>> It's the same model name, but as this model has been shipped with
> >>> many different CPU versions over the years, I'm not sure if it is
> >>> the exact same one.
> >
> > Gregs afaics is Skylake generation (XPS 13 (9350)), Paul and I have a
> > Kaby Lake (XPS 13 (9360)), which is one generation newer (Wifi is also
> > different; other components likely as well).
> >
> >> Ok, 4.10-rc4 seems to have fixed this issue with me.  I don't know
> >> what it was, but I can't duplicate it anymore.
> >
> > Good to hear.
> >
> >> Paul, are you still having this issue?
> >
> > Don't know about Paul, but I did a quick test with rc4 on my machine
> > and the issue is still there :-/
> 
> I didn’t test Linux 4.10-rc4 yet, but I completed the bisection.
> 
> ```
> 406e79385f3223d82272cf2be86bc95cd000a258 is the first bad commit commit
> 406e79385f3223d82272cf2be86bc95cd000a258
> Author: Rafael J. Wysocki <rafael.j.wysocki@...el.com>
> Date:   Mon Nov 21 22:45:40 2016 +0100
> 
>      PM / sleep: System sleep state selection interface rework
> 
>      There are systems in which the platform doesn't support any special
>      sleep states, so suspend-to-idle (PM_SUSPEND_FREEZE) is the only
>      available system sleep state.  However, some user space frameworks
>      only use the "mem" and (sometimes) "standby" sleep state labels, so
>      the users of those systems need to modify user space in order to be
>      able to use system suspend at all and that may be a pain in practice.
> 
>      Commit 0399d4db3edf (PM / sleep: Introduce command line argument for
>      sleep state enumeration) attempted to address this problem by adding
>      a command line argument to change the meaning of the "mem" string in
>      /sys/power/state to make it trigger suspend-to-idle (instead of
>      suspend-to-RAM).
> 
>      However, there also are systems in which the platform does support
>      special sleep states, but suspend-to-idle is the preferred one anyway
>      (it even may save more energy than the platform-provided sleep states
>      in some cases) and the above commit doesn't help in those cases.
> 
>      For this reason, rework the system sleep state selection interface
>      again (but preserve backwards compatibiliby).  Namely, add a new
>      sysfs file, /sys/power/mem_sleep, that will control the system
>      suspend mode triggered by writing "mem" to /sys/power/state (in
>      analogy with what /sys/power/disk does for hibernation).  Make it
>      select suspend-to-RAM ("deep" sleep) by default (if supported) and
>      fall back to suspend-to-idle ("s2idle") otherwise and add a new
>      command line argument, mem_sleep_default, allowing that default to
>      be overridden if need be.
> 
>      At the same time, drop the relative_sleep_states command line
>      argument that doesn't make sense any more.
> 
>      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@...el.com>
>      Tested-by: Mario Limonciello <mario.limonciello@...l.com>
> 
> :040000 040000 a5770fe0413cbe7794eab28f72dfe8ede1f090c2
> a2882c77659517aa7137c930e0e7f178bc76bfbd M      Documentation
> :040000 040000 2594b1a87815173e97199ef9d9c5918fec22fcfd
> fe0b69953be653644c5366ac631131fdbbdb9bcc M      kernel
> $ git tag --contains 406e79385f3223d82272cf2be86bc95cd000a258
> v4.10-rc1
> v4.10-rc2
> v4.10-rc3
> ```
> 
> Please find the config and the bisection log attached. Sometimes I had to
> cherry-pick build fix commits for PIE enabled GCC on top.
> 
> Rafael, do you want me to open a bug report for that? Mario, what systems
> did you actually test this on? (Why isn’t that listed in the commit message?)
> Mario, do you have access to Dell XPS13 (9360) devices to help getting this
> fixed?
> 

I tested this on several systems and ensured that the kernel was doing the
right thing in terms of choosing the correct state to go into.

As I mentioned above, those behaviors are currently expected until these
types issues are identified and fixed in the proper subsystems.  If they
end up being troublesome to resolve, it's possible to quirk individual systems,
to disable this behavior and return to traditional S3 but I would prefer to actually 
identify and fix the various problems so that we can push the Linux stack forward.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ