Message-ID: <ZbrNLxHL03R66PxQ@x1-carbon>
Date: Wed, 31 Jan 2024 23:43:59 +0100
From: Niklas Cassel <cassel@...nel.org>
To: Daniel Drake <drake@...lessos.org>,
	Vitalii Solomonov <solomonov.v@...il.com>
Cc: Jian-Hong Pan <jhp@...lessos.org>,
	Mika Westerberg <mika.westerberg@...ux.intel.com>,
	David Box <david.e.box@...ux.intel.com>,
	Damien Le Moal <dlemoal@...nel.org>,
	Nirmal Patel <nirmal.patel@...ux.intel.com>,
	Jonathan Derrick <jonathan.derrick@...ux.dev>,
	linux-ide@...r.kernel.org, linux-pci@...r.kernel.org,
	linux-kernel@...r.kernel.org, linux@...lessos.org
Subject: Re: [PATCH 1/2] ata: ahci: Add force LPM policy quirk for ASUS
 B1400CEAE

On Wed, Jan 31, 2024 at 07:08:12AM -0400, Daniel Drake wrote:
> On Wed, Jan 31, 2024 at 6:57 AM Niklas Cassel <cassel@...nel.org> wrote:
> > Unfortunately, we don't have any of these laptops that has a Tiger Lake
> > AHCI controller (with a disappearing drive), so until someone who does
> > debugs this, I think we are stuck at status quo.
> 
> I've asked for volunteers to help test things on those original bug
> reports (and may have one already) and would appreciate any suggested
> debugging approaches from those more experienced with SATA/AHCI. What
> would be your first few suggested debugging steps?
> 
> Non-LPM boot:
> ata1: SATA max UDMA/133 abar m2048@...0202000 port 0x50202100 irq 145
> ata2: SATA max UDMA/133 abar m2048@...0202000 port 0x50202180 irq 145
> ata2: SATA link down (SStatus 0 SControl 300)
> ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
> 
> LPM failed boot:
> ata1: SATA max UDMA/133 abar m2048@...0202000 port 0x50202100 irq 145
> ata2: SATA max UDMA/133 abar m2048@...0202000 port 0x50202180 irq 145
> ata1: SATA link down (SStatus 4 SControl 300)
> ata2: SATA link down (SStatus 4 SControl 300)
> 
> note SStatus=4 on both ports  (means "PHY in offline mode"?)

Hello Daniel, Vitalii,

Looking at the attachments that you uploaded to bugzilla:
https://bugzilla.kernel.org/show_bug.cgi?id=217114
namely dmesg_VMD_off and dmesg_VMD_on:

dmesg_VMD_off:[    1.020080] ahci 0000:00:17.0: AHCI 0001.0301 32 slots 1 ports 6 Gbps 0x1 impl SATA mode
dmesg_VMD_off:[    1.020095] ahci 0000:00:17.0: flags: 64bit ncq sntf pm clo only pio slum part deso sadm sds 
dmesg_VMD_off:[    1.020645] ata1: SATA max UDMA/133 abar m2048@...0902000 port 0x80902100 irq 123 lpm-pol 0
dmesg_VMD_off:[    1.330090] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)


dmesg_VMD_on:[    0.973901] ahci 10000:e0:17.0: AHCI 0001.0301 32 slots 1 ports 6 Gbps 0x1 impl SATA mode
dmesg_VMD_on:[    0.973904] ahci 10000:e0:17.0: flags: 64bit ncq sntf pm clo only pio slum part deso sadm sds 
dmesg_VMD_on:[    0.974094] ata1: SATA max UDMA/133 abar m2048@...2102000 port 0x82102100 irq 142 lpm-pol 0
dmesg_VMD_on:[    1.287221] ata1: SATA link down (SStatus 4 SControl 300)
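
For reference, libata prints the SStatus value in hex, and it can be split
into the DET, SPD and IPM fields defined by the SATA specification. A
minimal Python sketch of that decoding follows. The name decode_sstatus
and the description tables are made up for illustration, and only the
commonly seen field values are listed:

```python
# Field layout of the SATA SStatus register (SCR0):
#   bits 3:0   DET - device detection
#   bits 7:4   SPD - negotiated interface speed
#   bits 11:8  IPM - interface power management state
DET = {
    0x0: "no device detected, PHY communication not established",
    0x1: "device detected, PHY communication not established",
    0x3: "device detected, PHY communication established",
    0x4: "PHY offline (interface disabled or in BIST loopback mode)",
}
SPD = {
    0x0: "no speed negotiated",
    0x1: "1.5 Gbps",
    0x2: "3.0 Gbps",
    0x3: "6.0 Gbps",
}
IPM = {
    0x0: "device not present or communication not established",
    0x1: "active",
    0x2: "partial",
    0x6: "slumber",
    0x8: "devsleep",
}


def decode_sstatus(text):
    """Decode an SStatus value as printed in dmesg (hex, no 0x prefix)."""
    value = int(text, 16)
    det = value & 0xF
    spd = (value >> 4) & 0xF
    ipm = (value >> 8) & 0xF
    return {
        "DET": (det, DET.get(det, "reserved")),
        "SPD": (spd, SPD.get(spd, "reserved")),
        "IPM": (ipm, IPM.get(ipm, "reserved")),
    }


# The two values seen in the logs above.
for raw in ("133", "4"):
    print(raw, decode_sstatus(raw))
```

With this reading, 133 is an active 6.0 Gbps link with a device present,
while 4 has DET=4, i.e. the PHY is offline, which matches Daniel's guess.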


I assume that both of these logs are from the same kernel binary.
Does this kernel binary have the Tiger Lake LPM enablement patch included?
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/patch/?id=104ff59af73aba524e57ae0fef70121643ff270e

If so, then since we can detect the SATA drive with Intel VMD turned off,
but not with Intel VMD turned on, this strongly suggests that the problem
is related to Intel VMD.



In libata, we perform a reset of the port at boot, see:
libata-sata.c:sata_link_hardreset()
After writing to SControl, we call
libata-core.c:ata_wait_ready(), which polls for the port being ready
by calling the check_ready callback.
For AHCI, this callback function is set to:
libahci.c:ahci_check_ready().
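
To illustrate the polling pattern described above, here is a simplified
Python model. It is not the kernel code: the name wait_ready and its
parameters are made up for illustration, and the special handling that the
real ata_wait_ready() has for a transient -ENODEV is left out.

```python
import errno
import time


def wait_ready(check_ready, deadline, interval=0.05,
               now=time.monotonic, sleep=time.sleep):
    """Poll check_ready() until the port is ready or the deadline passes.

    check_ready() is expected to return a positive value when the port is
    ready, 0 when it is not ready yet, and a negative value on error.
    Returns 0 on success, the negative error from check_ready(), or
    -EBUSY on timeout.
    """
    while True:
        ready = check_ready()
        if ready > 0:
            return 0
        if ready < 0:
            return ready
        if now() >= deadline:
            return -errno.EBUSY
        sleep(interval)


# Example: a fake port that only reports ready on the third poll.
polls = iter([0, 0, 1])
rc = wait_ready(lambda: next(polls), deadline=time.monotonic() + 1.0,
                interval=0.01)
print("wait_ready returned", rc)
```

If the callback never reports ready, as seems to happen with VMD enabled,
the loop runs until the deadline and the link is reported as down.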

A reset should take the device out of any deep power state and should be
sufficient to establish a connection (which also seems to be the case
when not using Intel VMD).

However, if you want to debug, I would start by adding prints to
libata-sata.c:sata_link_hardreset()
libata-core.c:ata_wait_ready()
libahci.c:ahci_check_ready().



Vitalii,

I noticed that the prints say "lpm-pol 0".
Do you have:
CONFIG_SATA_MOBILE_LPM_POLICY set to 0?

CONFIG_SATA_MOBILE_LPM_POLICY=0
means do not touch any settings set by firmware.

This means that this code:
https://github.com/torvalds/linux/blob/v6.8-rc2/drivers/ata/libata-eh.c#L3572-L3579
https://github.com/torvalds/linux/blob/v6.8-rc2/drivers/ata/libata-eh.c#L3067-L3072

will not call ata_eh_set_lpm(), which means that ahci_set_lpm() will not
be called, which means that sata_link_scr_lpm() will not be called.

This shouldn't make a difference, and I didn't check the code to see
where "SATA link up" is printed. But possibly, when using VMD, the device
is put into a deeper power state very quickly after a reset, and with
policy == 0 we never send an ATA_LPM_WAKE_ONLY that sets PxCMD.ICC to the
active state (which brings the device out of sleep).

Could you please try setting:
CONFIG_SATA_MOBILE_LPM_POLICY=3
then enable VMD again, and see if that lets you detect the SATA drive
even with VMD enabled?
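
To confirm that the new setting took effect, you can check the "lpm-pol"
value in the ataN line of dmesg after rebuilding. A small Python sketch of
that check follows. The helper name lpm_policies is made up for
illustration, and only policy values 0 to 3 are named here:

```python
import re

# Meaning of the CONFIG_SATA_MOBILE_LPM_POLICY values discussed here.
# Higher values exist but are left out of this sketch.
LPM_POLICIES = {
    0: "keep firmware settings",
    1: "maximum performance",
    2: "medium power",
    3: "medium power with device initiated PM",
}

# Matches e.g. "ata1: SATA max UDMA/133 ... irq 142 lpm-pol 0"
LPM_RE = re.compile(r"(ata\d+): .*\blpm-pol (\d+)")


def lpm_policies(dmesg_text):
    """Return {port: (policy number, description)} parsed from dmesg text."""
    result = {}
    for line in dmesg_text.splitlines():
        match = LPM_RE.search(line)
        if match:
            policy = int(match.group(2))
            result[match.group(1)] = (
                policy,
                LPM_POLICIES.get(policy, "not listed in this sketch"),
            )
    return result


# Line taken from the dmesg_VMD_on attachment quoted earlier.
sample = ("[    0.974094] ata1: SATA max UDMA/133 abar m2048@...2102000 "
          "port 0x82102100 irq 142 lpm-pol 0")
print(lpm_policies(sample))
```

With CONFIG_SATA_MOBILE_LPM_POLICY=3, I would expect the same line to end
in "lpm-pol 3" instead.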


Kind regards,
Niklas
