Message-ID: <59f6ff78-6b45-465a-bd41-28c7a5d10931@davidgow.net>
Date:   Fri, 15 Sep 2023 11:22:14 +0800
From:   David Gow <david@...idgow.net>
To:     Niklas Cassel <Niklas.Cassel@....com>,
        Bagas Sanjaya <bagasdotme@...il.com>
Cc:     Damien Le Moal <dlemoal@...nel.org>,
        Bjorn Helgaas <bhelgaas@...gle.com>,
        patenteng <dimitar@...kalov.co.uk>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Linux Regressions <regressions@...ts.linux.dev>,
        Linux IDE and libata <linux-ide@...r.kernel.org>,
        Linux PCI <linux-pci@...r.kernel.org>
Subject: Re: Fwd: Kernel 6.5.2 Causes Marvell Technology Group 88SE9128 PCIe
 SATA to Constantly Reset

On 2023/09/13 at 23:12, Niklas Cassel wrote:
> On Wed, Sep 13, 2023 at 06:25:31PM +0700, Bagas Sanjaya wrote:
>> Hi,
>>
>> I notice a regression report on Bugzilla [1]. Quoting from it:
>>
>>> After upgrading to 6.5.2 from 6.4.12 I keep getting the following kernel messages around three times per second:
>>>
>>> [ 9683.269830] ata16: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
>>> [ 9683.270399] ata16.00: configured for UDMA/66
>>>
>>> So I've tracked the offending device:
>>>
>>> ll /sys/class/ata_port/ata16
>>> lrwxrwxrwx 1 root root 0 Sep 10 21:51 /sys/class/ata_port/ata16 -> ../../devices/pci0000:00/0000:00:1c.7/0000:0a:00.0/ata16/ata_port/ata16
>>>
>>> cat /sys/bus/pci/devices/0000:0a:00.0/uevent
>>> DRIVER=ahci
>>> PCI_CLASS=10601
>>> PCI_ID=1B4B:9130
>>> PCI_SUBSYS_ID=1043:8438
>>> PCI_SLOT_NAME=0000:0a:00.0
>>> MODALIAS=pci:v00001B4Bd00009130sv00001043sd00008438bc01sc06i01
>>>
>>> lspci | grep 0a:00.0
>>> 0a:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9128 PCIe SATA 6 Gb/s RAID controller with HyperDuo (rev 11)
>>>
>>> I am not using the 88SE9128, so I have no way of knowing whether it works or not. It may simply be getting reset a couple of times per second or it may not function at all.
>>
>> See Bugzilla for the full thread.
>>
>> patenteng: I have asked you to bisect this regression. Any conclusion?
>>
>> Anyway, I'm adding this regression to regzbot:
>>
>> #regzbot: introduced: v6.4..v6.5 https://bugzilla.kernel.org/show_bug.cgi?id=217902
> 
> Hello Bagas, patenteng,
> 
> 
> FYI, the prints:
> [ 9683.269830] ata16: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> [ 9683.270399] ata16.00: configured for UDMA/66
> 
> Just show that ATA error handler has been invoked.
> There was no reset performed.
> 
> If there was a reset, you would have seen something like:
> [    1.441326] ata8: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> [    1.541250] ata8.00: configured for UDMA/133
> [    1.541411] ata8: hard resetting link
> 
> 
> Could you please try this patch and see if it improves things for you:
> https://lore.kernel.org/linux-ide/20230913150443.1200790-1-nks@flawful.org/T/#u
> 
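
For reference, the SStatus and SControl values in the quoted log lines can be split into their register fields. The sketch below assumes the kernel prints both values in hexadecimal and uses the SATA register layout (DET in bits 3:0, SPD in bits 7:4, IPM in bits 11:8). The function and table names are illustrative only and not taken from the kernel source.

```python
# Field layout assumed from the SATA SStatus/SControl register definitions:
# bits 3:0 = DET, bits 7:4 = SPD, bits 11:8 = IPM.
SSTATUS_DET = {
    0x0: "no device detected, no PHY communication",
    0x1: "device detected, PHY communication not established",
    0x3: "device detected, PHY communication established",
    0x4: "PHY offline",
}
SSTATUS_SPD = {
    0x0: "no negotiated speed",
    0x1: "Gen1 (1.5 Gbps)",
    0x2: "Gen2 (3.0 Gbps)",
    0x3: "Gen3 (6.0 Gbps)",
}
SSTATUS_IPM = {
    0x0: "device not present",
    0x1: "active",
    0x2: "partial",
    0x6: "slumber",
}


def split_fields(hex_text):
    """Split a register value printed in hex into (DET, SPD, IPM)."""
    value = int(hex_text, 16)
    return value & 0xF, (value >> 4) & 0xF, (value >> 8) & 0xF


def describe_sstatus(hex_text):
    """Return a human-readable description of an SStatus value."""
    det, spd, ipm = split_fields(hex_text)
    return "; ".join([
        SSTATUS_DET.get(det, "reserved DET value"),
        SSTATUS_SPD.get(spd, "reserved SPD value"),
        SSTATUS_IPM.get(ipm, "reserved IPM value"),
    ])


if __name__ == "__main__":
    for sstatus in ("113", "123"):
        print(sstatus, "->", describe_sstatus(sstatus))
```

Under these assumptions, SStatus 113 and 123 differ only in the SPD field, which matches the 1.5 Gbps and 3.0 Gbps link speeds shown in the two log excerpts.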

FWIW, I'm seeing a very similar issue both in 6.5.2 and in git master
[aed8aee11130 ("Merge tag 'pmdomain-v6.6-rc1' of
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm")] with that
patch applied.


The log is similar (the last two lines repeat several times a second):
[    0.369632] ata14: SATA max UDMA/133 abar m2048@...7c10000 port 0xf7c10480 irq 33
[    0.683693] ata14: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[    1.031662] ata14.00: ATAPI: MARVELL VIRTUALL, 1.09, max UDMA/66
[    1.031852] ata14.00: configured for UDMA/66
[    1.414145] ata14: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[    1.414505] ata14.00: configured for UDMA/66
[    1.744094] ata14: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[    1.744368] ata14.00: configured for UDMA/66
[    2.073916] ata14: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[    2.074276] ata14.00: configured for UDMA/66
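
One way to put a number on "several times a second" is to compute the rate of the repeating "configured for" line from the dmesg timestamps. This is only a rough sketch: the regular expression assumes the message format shown in the log above, and the helper name reconfigure_rate is hypothetical.

```python
import re
from collections import defaultdict

# Matches lines such as "[    1.414505] ata14.00: configured for UDMA/66".
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(ata\d+)\.\d+: configured for ")


def reconfigure_rate(log_text):
    """Return {port: 'configured for' events per second} from dmesg text."""
    stamps = defaultdict(list)
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            stamps[match.group(2)].append(float(match.group(1)))

    rates = {}
    for port, times in stamps.items():
        span = times[-1] - times[0]
        # At least two events are needed to measure an interval.
        if len(times) > 1 and span > 0:
            rates[port] = (len(times) - 1) / span
    return rates


SAMPLE = """\
[    0.683693] ata14: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[    1.031852] ata14.00: configured for UDMA/66
[    1.414145] ata14: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[    1.414505] ata14.00: configured for UDMA/66
[    1.744094] ata14: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[    1.744368] ata14.00: configured for UDMA/66
[    2.073916] ata14: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[    2.074276] ata14.00: configured for UDMA/66
"""

if __name__ == "__main__":
    for port, rate in reconfigure_rate(SAMPLE).items():
        print(f"{port}: {rate:.2f} events/s")
```

On the excerpt above this gives a little under three events per second for ata14. Feeding it a longer capture (for example, saved dmesg output) would give a steadier estimate.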


lspci shows:
09:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9230 PCIe 2.0 x2 4-port SATA 6 Gb/s RAID Controller (rev 10) (prog-if 01 [AHCI 1.0])
         Subsystem: Gigabyte Technology Co., Ltd Device b000
         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
         Latency: 0, Cache Line Size: 64 bytes
         Interrupt: pin A routed to IRQ 33
         Region 0: I/O ports at b050 [size=8]
         Region 1: I/O ports at b040 [size=4]
         Region 2: I/O ports at b030 [size=8]
         Region 3: I/O ports at b020 [size=4]
         Region 4: I/O ports at b000 [size=32]
         Region 5: Memory at f7c10000 (32-bit, non-prefetchable) [size=2K]
         Expansion ROM at f7c00000 [disabled] [size=64K]
         Capabilities: <access denied>
         Kernel driver in use: ahci

The controller in question lives on a Gigabyte Z87X-UD5H-CF motherboard.
I'm using the controller for several drives, and it's working; it's just
spammy. (At worst, there's some performance hitching, but that might
just be journald rotating logs as they fill up with the message.)

I haven't had a chance to bisect yet (this is a slightly awkward machine
for me to install test kernels on), but I can also confirm it worked with
6.4.12.

Hopefully that's useful. I'll get back to you if I manage to bisect it.

Cheers,
-- David
