Message-ID: <47b775c5-57fa-5edf-b59e-8a9041ffbee7@candelatech.com>
Date: Tue, 30 Aug 2022 13:47:48 -0700
From: Ben Greear <greearb@...delatech.com>
To: Greg Kroah-Hartman <gregkh@...uxfoundation.org>, bjorn@...gaas.com
Cc: LKML <linux-kernel@...r.kernel.org>, stable@...r.kernel.org,
Stefan Roese <sr@...x.de>, Bjorn Helgaas <bhelgaas@...gle.com>,
Pali Rohár <pali@...nel.org>,
"Rafael J. Wysocki" <rjw@...ysocki.net>,
Bharat Kumar Gogada <bharat.kumar.gogada@...inx.com>,
Michal Simek <michal.simek@...inx.com>,
Yao Hongbo <yaohongbo@...ux.alibaba.com>,
Naveen Naidu <naveennaidu479@...il.com>,
Sasha Levin <sashal@...nel.org>
Subject: Re: [PATCH 5.4 182/389] PCI/portdrv: Dont disable AER reporting in
get_port_device_capability()
On 8/23/22 11:41 PM, Greg Kroah-Hartman wrote:
> On Tue, Aug 23, 2022 at 07:20:14AM -0500, Bjorn Helgaas wrote:
>> On Tue, Aug 23, 2022, 6:35 AM Greg Kroah-Hartman <gregkh@...uxfoundation.org>
>> wrote:
>>
>>> From: Stefan Roese <sr@...x.de>
>>>
>>> [ Upstream commit 8795e182b02dc87e343c79e73af6b8b7f9c5e635 ]
>>>
>>
>> There's an open regression related to this commit:
>>
>> https://bugzilla.kernel.org/show_bug.cgi?id=216373
>
> This is already in the following released stable kernels:
> 5.10.137 5.15.61 5.18.18 5.19.2
>
> I'll go drop it from the 4.19 and 5.4 queues, but when this gets
> resolved in Linus's tree, make sure there's a cc: stable on the fix so
> that we know to backport it to the above branches as well. Or at the
> least, a "Fixes:" tag.
This is still in 5.19.5. We saw some funny iwlwifi crashes in 5.19.3+
that we did not see in 5.19.0+. I just bisected the scary-looking AER errors to this
patch, though I do not yet know for certain whether it also causes the iwlwifi crashes.
From reading the commit message, this patch doesn't seem to be a great candidate
for stable. Does it fix some important problem?
In case it helps, here is an example of what I see in dmesg. The kernel crashes in iwlwifi
had to do with rx messages from the firmware, and some warnings led me to believe that
PCI messages were slow coming back and/or maybe duplicated. So maybe this AER patch changes
timing or otherwise screws up the PCI adapter boards we use...
[ 50.905809] iwlwifi 0000:04:00.0: AER: can't recover (no error_detected callback)
[ 50.905830] pcieport 0000:03:01.0: AER: device recovery failed
[ 50.905831] pcieport 0000:00:1c.0: AER: Uncorrected (Non-Fatal) error received: 0000:03:01.0
[ 50.905845] pcieport 0000:03:01.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
[ 50.915679] pcieport 0000:03:01.0: device [10b5:8619] error status/mask=00100000/00000000
[ 50.922735] pcieport 0000:03:01.0: [20] UnsupReq (First)
[ 50.928230] pcieport 0000:03:01.0: AER: TLP Header: 34000000 04001f10 00000000 88c888c8
[ 50.935126] iwlwifi 0000:04:00.0: AER: can't recover (no error_detected callback)
[ 50.935133] pcieport 0000:03:01.0: AER: device recovery failed
[ 50.935134] pcieport 0000:00:1c.0: AER: Multiple Uncorrected (Non-Fatal) error received: 0000:03:01.0
[ 50.935222] pcieport 0000:03:01.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
[ 50.945059] pcieport 0000:03:01.0: device [10b5:8619] error status/mask=00100000/00000000
[ 50.952120] pcieport 0000:03:01.0: [20] UnsupReq (First)
[ 50.957614] pcieport 0000:03:01.0: AER: TLP Header: 34000000 04001f10 00000000 88c888c8
[ 50.964492] pcieport 0000:03:01.0: AER: Error of this Agent is reported first
[ 50.970519] pcieport 0000:03:02.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
[ 50.980344] pcieport 0000:03:02.0: device [10b5:8619] error status/mask=00100000/00000000
[ 50.987399] pcieport 0000:03:02.0: [20] UnsupReq (First)
[ 50.992891] pcieport 0000:03:02.0: AER: TLP Header: 34000000 05001f10 00000000 88c888c8
[ 50.999785] pcieport 0000:03:03.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
[ 51.009611] pcieport 0000:03:03.0: device [10b5:8619] error status/mask=00100000/00000000
[ 51.016665] pcieport 0000:03:03.0: [20] UnsupReq (First)
[ 51.022161] pcieport 0000:03:03.0: AER: TLP Header: 34000000 06001f10 00000000 88c888c8
[ 51.029052] pcieport 0000:03:05.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
[ 51.038881] pcieport 0000:03:05.0: device [10b5:8619] error status/mask=00100000/00000000
[ 51.045931] pcieport 0000:03:05.0: [20] UnsupReq (First)
[ 51.051430] pcieport 0000:03:05.0: AER: TLP Header: 34000000 07001f10 00000000 88c888c8
[ 51.058320] pcieport 0000:03:07.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
[ 51.068147] pcieport 0000:03:07.0: device [10b5:8619] error status/mask=00100000/00000000
[ 51.075200] pcieport 0000:03:07.0: [20] UnsupReq (First)
[ 51.080696] pcieport 0000:03:07.0: AER: TLP Header: 34000000 08001f10 00000000 88c888c8
[ 51.087589] iwlwifi 0000:04:00.0: AER: can't recover (no error_detected callback)
[ 51.087598] pcieport 0000:03:01.0: AER: device recovery failed
[ 51.087611] iwlwifi 0000:05:00.0: AER: can't recover (no error_detected callback)
[ 51.087615] pcieport 0000:03:02.0: AER: device recovery failed
[ 51.087628] iwlwifi 0000:06:00.0: AER: can't recover (no error_detected callback)
[ 51.087631] pcieport 0000:03:03.0: AER: device recovery failed
[ 51.087643] iwlwifi 0000:07:00.0: AER: can't recover (no error_detected callback)
[ 51.087646] pcieport 0000:03:05.0: AER: device recovery failed
[ 51.087659] iwlwifi 0000:08:00.0: AER: can't recover (no error_detected callback)
[ 51.087662] pcieport 0000:03:07.0: AER: device recovery failed
[ 51.103761] pcieport 0000:00:1c.0: AER: Uncorrected (Non-Fatal) error received: 0000:03:0f.0
[ 51.103778] pcieport 0000:03:0f.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
[ 51.113608] pcieport 0000:03:0f.0: device [10b5:8619] error status/mask=00100000/00000000
[ 51.120658] pcieport 0000:03:0f.0: [20] UnsupReq (First)
[ 51.126152] pcieport 0000:03:0f.0: AER: TLP Header: 34000000 0f001f10 00000000 88c888c8
[ 51.133044] iwlwifi 0000:0f:00.0: AER: can't recover (no error_detected callback)
[ 51.133068] pcieport 0000:03:0f.0: AER: device recovery failed
[ 51.168925] pcieport 0000:00:1c.0: AER: Uncorrected (Non-Fatal) error received: 0000:03:0f.0
[ 51.168940] pcieport 0000:03:0f.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
[ 51.178773] pcieport 0000:03:0f.0: device [10b5:8619] error status/mask=00100000/00000000
[ 51.185823] pcieport 0000:03:0f.0: [20] UnsupReq (First)
[ 51.191318] pcieport 0000:03:0f.0: AER: TLP Header: 34000000 0f001f10 00000000 88c888c8
[ 51.198211] iwlwifi 0000:0f:00.0: AER: can't recover (no error_detected callback)
[ 51.198234] pcieport 0000:03:0f.0: AER: device recovery failed
[ 51.260695] pcieport 0000:00:1c.0: AER: Uncorrected (Non-Fatal) error received: 0000:03:0f.0
[ 51.260710] pcieport 0000:03:0f.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
[ 51.270548] pcieport 0000:03:0f.0: device [10b5:8619] error status/mask=00100000/00000000
[ 51.277605] pcieport 0000:03:0f.0: [20] UnsupReq (First)
[ 51.283103] pcieport 0000:03:0f.0: AER: TLP Header: 34000000 0f001f10 00000000 88c888c8
[ 51.290009] iwlwifi 0000:0f:00.0: AER: can't recover (no error_detected callback)
[ 51.290033] pcieport 0000:03:0f.0: AER: device recovery failed
[ 51.328514] pcieport 0000:00:1c.0: AER: Uncorrected (Non-Fatal) error received: 0000:03:0f.0
[ 51.328530] pcieport 0000:03:0f.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
[ 51.331638] ACPI: \: failed to evaluate _DSM bf0212f2-788f-c64d-a5b3-1f738e285ade (0x1001)
[ 51.338363] pcieport 0000:03:0f.0: device [10b5:8619] error status/mask=00100000/00000000
[ 51.338364] pcieport 0000:03:0f.0: [20] UnsupReq (First)
[ 51.345413] ACPI: \: failed to evaluate _DSM bf0212f2-788f-c64d-a5b3-1f738e285ade (0x1001)
[ 51.350900] pcieport 0000:03:0f.0: AER: TLP Header: 34000000 0f001f10 00000000 88c888c8
[ 51.350927] iwlwifi 0000:0f:00.0: AER: can't recover (no error_detected callback)
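For anyone triaging a log like the one above, here is a rough sketch of how the "error status/mask" lines could be tallied per port. The helper names (decode_status, tally) are made up for illustration, the bit table is only a subset of the PCIe AER Uncorrectable Error Status bits, and it assumes every status line comes from an uncorrectable report, as is the case in this excerpt.

```python
import re
from collections import Counter

# Subset of PCIe AER Uncorrectable Error Status bits (bit index -> short name).
# Assumption: only uncorrectable reports are parsed; correctable errors use a
# different bit layout and would be mislabeled by this table.
UNCOR_BITS = {
    4: "DLP",
    12: "PoisonedTLP",
    14: "CmpltTO",
    15: "CmpltAbrt",
    16: "UnxCmplt",
    17: "RxOF",
    18: "MalfTLP",
    19: "ECRC",
    20: "UnsupReq",
}

# Matches lines such as:
#   pcieport 0000:03:01.0: device [10b5:8619] error status/mask=00100000/00000000
STATUS_RE = re.compile(
    r"pcieport (\S+): .*error status/mask=([0-9a-fA-F]{8})/([0-9a-fA-F]{8})"
)


def decode_status(status):
    """Return the names of all bits set in a 32-bit status value."""
    return [
        UNCOR_BITS.get(bit, f"bit{bit}")
        for bit in range(32)
        if (status >> bit) & 1
    ]


def tally(dmesg_text):
    """Count (port, error name) pairs over all status lines in dmesg_text."""
    counts = Counter()
    for line in dmesg_text.splitlines():
        m = STATUS_RE.search(line)
        if m is None:
            continue  # not an AER status line
        port = m.group(1)
        for name in decode_status(int(m.group(2), 16)):
            counts[(port, name)] += 1
    return counts
```

Feeding it saved dmesg output, e.g. tally(open("dmesg.txt").read()), gives a per-port count that makes it easier to see whether every downstream port of the switch reports the same error or only some of them.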
Thanks,
Ben
--
Ben Greear <greearb@...delatech.com>
Candela Technologies Inc http://www.candelatech.com