Message-ID: <0FCA67B6-4D93-458F-856C-33AB2A4AC93B@oracle.com>
Date: Wed, 24 May 2023 14:59:30 +0000
From: Chuck Lever III <chuck.lever@...cle.com>
To: Linux regressions mailing list <regressions@...ts.linux.dev>
CC: Leon Romanovsky <leon@...nel.org>, Eli Cohen <elic@...dia.com>,
 Saeed Mahameed <saeedm@...dia.com>,
 linux-rdma <linux-rdma@...r.kernel.org>,
 "open list:NETWORKING [GENERAL]" <netdev@...r.kernel.org>
Subject: Re: system hang on start-up (mlx5?)
> On May 23, 2023, at 10:20 AM, Linux regression tracking (Thorsten Leemhuis) <regressions@...mhuis.info> wrote:
>
> [CCing the regression list, as it should be in the loop for regressions:
> https://docs.kernel.org/admin-guide/reporting-regressions.html]
>
> On 16.05.23 21:23, Chuck Lever III wrote:
>>> On May 4, 2023, at 3:02 PM, Chuck Lever III <chuck.lever@...cle.com> wrote:
>>>> On May 4, 2023, at 3:29 AM, Leon Romanovsky <leon@...nel.org> wrote:
>>>> On Wed, May 03, 2023 at 02:02:33PM +0000, Chuck Lever III wrote:
>>>>>> On May 3, 2023, at 2:34 AM, Eli Cohen <elic@...dia.com> wrote:
>>>>>> Just verifying, could you make sure your server and card firmware are up to date?
>>>>> Device firmware updated to 16.35.2000; no change.
>>>>> System firmware is dated September 2016. I'll see if I can get
>>>>> something more recent installed.
>>>> We are trying to reproduce this issue internally.
>>> More information. I captured the serial console during boot.
>>> Here are the last messages:
>> […]
>> Following up.
>>
>> Jason shamed me into replacing a working CX-3Pro in one of
>> my lab systems with a CX-5 VPI, and the same problem occurs.
>> Removing the CX-5 from the system alleviates the problem.
>>
>> Supermicro SYS-6028R-T/X10DRi, v6.4-rc2
>
> I wondered what happened to this, as this looks stalled. Or was progress
> made to fix this regression and I just missed it?
I have not heard of an available fix for this issue.
> I noticed the patch "net/mlx5: Fix irq affinity management" (
> https://lore.kernel.org/all/20230523054242.21596-15-saeed@kernel.org/
> ) refers to the culprit of this regression. Is that supposed to fix this
> issue and just lacks proper tags to indicate that?
This patch was suggested to me when I initially reported the crash,
and I tried it at that time. It does not address the problem for me.
--
Chuck Lever