lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <e583fd64-2cff-4595-a559-a675c6f5ad0d@amd.com>
Date:   Mon, 27 Nov 2023 14:55:48 -0600
From:   Mario Limonciello <mario.limonciello@....com>
To:     Luke Jones <luke@...nes.dev>
Cc:     hdegoede@...hat.com, ilpo.jarvinen@...ux.intel.com,
        corentin.chary@...il.com, platform-driver-x86@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 1/1] platform/x86: asus-wmi: disable USB0 hub on ROG
 Ally before suspend

On 11/27/2023 14:46, Luke Jones wrote:
> 
> 
> On Mon, Nov 27 2023 at 02:14:23 PM -06:00:00, Mario Limonciello 
> <mario.limonciello@....com> wrote:
>> On 11/26/2023 17:05, Luke D. Jones wrote:
>>> ASUS have worked around an issue in XInput where it doesn't support USB
>>> selective suspend, which causes suspend issues in Windows. They worked
>>> around this by adjusting the MCU firmware to disable the USB0 hub when
>>> the screen is switched off during the Microsoft DSM suspend path in 
>>> ACPI.
>>>
>>> The issue we have with this however is one of timing - the call the 
>>> tells
>>> the MCU to this isn't able to complete before suspend is done so we call
>>> this in a prepare() and add a small msleep() to ensure it is done. This
>>> must be done before the screen is switched off to prevent a variety of
>>> possible races.
>>
>> Right now the way that Linux handles the LPS0 calls is that they're 
>> all back to back.  Luke did try to inject a delay after the LPS0 calls 
>> were done but before it went to sleep but this wasn't sufficient.
>>
>> Another "potential" way to solve this problem from Linux may be to 
>> actually glue the LPS0 screen off call to when DRM actually has eDP 
>> turned off.
>>
>> Making such a change would essentially push back the "screen off" LPS0 
>> command to when the user has run 'systemctl suspend' (or an action 
>> that did this) because the compositor usually turns it off with DPMS 
>> at this time.
> 
> I would be willing to test this if you want some concrete data. 

It would require some cross subsystem plumbing to evaluate feasibility.
I don't currently have any plans to do it.

I think your patch makes sense; I just want to make it known that 
"might" clean this up if it ever happens.

> See my 
> big block of text below.
> 
>>
>> This is a much bigger change though and *much more ripe for breakage*.
>>
>> So I think in may be worth leaving a TODO comment to look into doing 
>> that in the future.
> Do you mean add the TODO to a line in this patch?

Yeah.  In case someone ever does it (me or otherwise) I think it would 
be good to have some reference in the comments that the commit 'might' 
be possible to revert.

> 
>>
>> If that ever happens; it's possible that this change could be reverted 
>> too.
>>
>>>
>>> Further to this the MCU powersave option must also be disabled as it can
>>> cause a number of issues such as:
>>> - unreliable resume connection of N-Key
>>> - complete loss of N-Key if the power is plugged in while suspended
>>> Disabling the powersave option prevents this.
>>>
>>> Without this the MCU is unable to initialise itself correctly on resume.
>>
>> initialize
> 
> Are we forced to use USA spelling? I'm from NZ
> "initialise is predominantly used in British English (used in UK/AU/NZ) 
> ( en-GB )"
> 

Ah I didn't realize it's an acceptable spelling for en-GB, and thought 
it was just a typo; sorry.

>>
>>>
>>> Signed-off-by: Luke D. Jones <luke@...nes.dev>
>>
>> I think it would be good to add a Closes: tag to the AMD Gitlab issue 
>> that this was discussed within as well as any other public references 
>> you know about.
>>
>> Additionally as Phoenix APU support goes back as far as kernel 6.1 and 
>> this is well contained to only run on the ROG I suggest to CC stable 
>> so that people can use the ROG on that LTS kernel or later.
>>
>>> ---
>>> -SNIP-
>>>   @@ -4701,6 +4749,8 @@ static const struct dev_pm_ops asus_pm_ops = {
>>>       .thaw = asus_hotk_thaw,
>>>       .restore = asus_hotk_restore,
>>>       .resume = asus_hotk_resume,
>>> +    .resume_early = asus_hotk_resume_early,
>>> +    .prepare = asus_hotk_prepare,
>>
>> Have you experimented with only using the prepare() call or only the 
>> resume_early() call?  Are both really needed?
> 
> I have yes. Although the device comes back eventually in resume after 
> only a prepare call it's not preferable as it tends to change the device 
> path. With resume_early we can get the device replugged super early 
> (before anything notices it's gone in fact).
> 
> This whole thing is a bit of a mess. It ends up being a race between 
> various things to prevent a HUB0 disconnect being registered by the xhci 
> subsystem, and adding the device back before the xhci subsystem gets 
> control.
> 
> If I add a sleep longer than 1300ms in prepare then the xhci subsys 
> registers a disconnect of the USB0 hub. If the sleep is under 250ms it 
> isn't quite enough for the MCU to do its thing, and on battery it seems 
> worse.
> 
> I have asked the ASUS guys I'm in contact with for something to disable 
> this MCU behaviour since it is purely a workaround for a broken Windows 
> thing :( They are open to something, maybe an OS detect in ACPI or a WMI 
> method addition similar to the MCU powersave method, from what I'm told 
> it would require an MCU firmware update along with BIOS update. If this 
> eventuates I'll submit an additional patch to check and set that plus 
> disable this.

Don't let them do an OS detection in ACPI, it's going to be too painful.
I would instead suggest that they can have a bit that you can program in 
via ACPI or WMI from the ASUS WMI driver that says to skip the MCU 
disconnect behavior.

> 
> I may possibly write a new version of this patch as we've seen that 
> enabling powersave reduces suspend power use by at least half. And 
> looking through my DSDT dumps, there are a few laptops with the same 
> feature as Ally. The patch for powersave being enabled requires also AC 
> power state on suspend change detection, and a later forced reset in 
> late resume (and the device paths change regardless when powersave is on).
> 
> When I look at it objectively, the device path changing should be a 
> non-issue really as it is fully handled by USB subsystem and behaves 
> exactly like what it is - a USB hub disconnect. It's just that some 
> userspace apps don't expect this. I will experiment some more.
> 
> Regards,
> Luke.
> 

As another experiment - what happens if you "comment out" the LPS0 calls 
that do this problematic stuff?

It's important to make sure the callback to amd-pmc stays in place, but 
if you just skip those ACPI ones does it still get to the deepest state 
and are there other problems?

>>
>>>   };
>>>     /* Registration 
>>> ***************************************************************/
>>> diff --git a/include/linux/platform_data/x86/asus-wmi.h 
>>> b/include/linux/platform_data/x86/asus-wmi.h
>>> index 63e630276499..ab1c7deff118 100644
>>> --- a/include/linux/platform_data/x86/asus-wmi.h
>>> +++ b/include/linux/platform_data/x86/asus-wmi.h
>>> @@ -114,6 +114,9 @@
>>>   /* Charging mode - 1=Barrel, 2=USB */
>>>   #define ASUS_WMI_DEVID_CHARGE_MODE    0x0012006C
>>>   +/* MCU powersave mode */
>>> +#define ASUS_WMI_DEVID_MCU_POWERSAVE   0x001200E2
>>> +
>>>   /* epu is connected? 1 == true */
>>>   #define ASUS_WMI_DEVID_EGPU_CONNECTED    0x00090018
>>>   /* egpu on/off */
>>
> 
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ