Message-ID: <d2979424-bb3e-3e1f-d53c-2b3580811533@gmail.com>
Date: Thu, 25 Feb 2021 17:24:48 +0200
From: Tariq Toukan <ttoukan.linux@...il.com>
To: Ido Schimmel <idosch@...sch.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
Steven Rostedt <rostedt@...dmis.org>,
"Peter Zijlstra (Intel)" <peterz@...radead.org>,
Ingo Molnar <mingo@...nel.org>
Cc: David Miller <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>,
Jay Vosburgh <j.vosburgh@...il.com>,
Veaceslav Falico <vfalico@...il.com>,
Andy Gospodarek <andy@...yhouse.net>,
Moshe Shemesh <moshe@...dia.com>,
Itay Aveksis <itayav@...dia.com>,
Ran Rozenstein <ranro@...dia.com>,
Tariq Toukan <tariqt@...dia.com>,
Saeed Mahameed <saeedm@...dia.com>,
Leon Romanovsky <leonro@...dia.com>
Subject: Re: bug report: WARNING in bonding
On 2/18/2021 7:10 PM, Tariq Toukan wrote:
>
>
> On 11/12/2020 6:33 PM, Ido Schimmel wrote:
>> On Thu, Nov 12, 2020 at 05:54:30PM +0200, Tariq Toukan wrote:
>>>
>>>
>>> On 11/12/2020 5:46 PM, Ido Schimmel wrote:
>>>> On Thu, Nov 12, 2020 at 05:38:44PM +0200, Tariq Toukan wrote:
>>>>> Hi all,
>>>>>
>>>>> In the past ~2-3 weeks, we started seeing the following WARNING and
>>>>> traces in our regression testing systems, almost every day.
>>>>>
>>>>> Reproduction is not stable, and not isolated to a specific test, so
>>>>> it's hard to bisect.
>>>>>
>>>>> Any idea what this could be?
>>>>> Or what is the suspected offending patch?
>>>>
>>>> Do you have commit f8e48a3dca06 ("lockdep: Fix preemption WARN for
>>>> spurious IRQ-enable")? I think it fixed the issue for me
>>>>
>>>
>>> We do have it. Yet issue still exists.
>>
>> I checked my mail and apparently we stopped seeing this warning after I
>> fixed a lockdep issue (spin_lock() vs spin_lock_bh()) in a yet to be
>> submitted patch. Do you see any other lockdep warnings in the log
>> besides this one? Maybe something in mlx4/5 which is why syzbot didn't
>> hit it?
>>
>
> Hi,
>
> Issue still reproduces. Even in GA kernel.
> It is always preceded by some other lockdep warning.
>
> So to get the reproduction:
> - First, have any lockdep issue.
> - Then, open bond interface.
>
> Any idea what it could be?
>
> We'll share any new info as soon as we have it.
>
> Regards,
> Tariq

Bisect shows this is the offending commit:

commit 4d004099a668c41522242aa146a38cc4eb59cb1e
Author: Peter Zijlstra <peterz@...radead.org>
Date:   Fri Oct 2 11:04:21 2020 +0200

    lockdep: Fix lockdep recursion

    Steve reported that lockdep_assert*irq*(), when nested inside lockdep
    itself, will trigger a false-positive.

    One example is the stack-trace code, as called from inside lockdep,
    triggering tracing, which in turn calls RCU, which then uses
    lockdep_assert_irqs_disabled().

    Fixes: a21ee6055c30 ("lockdep: Change hardirq{s_enabled,_context} to per-cpu variables")
    Reported-by: Steven Rostedt <rostedt@...dmis.org>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
    Signed-off-by: Ingo Molnar <mingo@...nel.org>
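
To make the false positive described in that commit message concrete, here is a minimal user-space sketch. It is not kernel code and not the actual patch. Every name in it (checker_recursion, tracked_irqs_enabled, checker_acquire, and so on) is hypothetical, and it assumes that the checker disables interrupts with a raw primitive that does not update the software-tracked IRQ state, so an assertion reached from inside the checker reads a stale value. A recursion counter lets the assertion skip itself in that case.

```c
#include <stdbool.h>

/* All names below are hypothetical user-space stand-ins, not kernel symbols. */
static int checker_recursion;            /* models a per-CPU recursion counter */
static bool tracked_irqs_enabled = true; /* software-tracked IRQ state */
static int warnings;                     /* number of assertion failures seen */

/* Assertion on the tracked IRQ state; optionally skipped inside the checker. */
static void assert_irqs_disabled(bool guarded)
{
    if (guarded && checker_recursion > 0)
        return; /* nested inside the checker: tracked state is not trustworthy */
    if (tracked_irqs_enabled)
        warnings++;
}

/* Models the stack-trace -> tracing -> RCU call chain ending in the assertion. */
static void save_stack_trace(bool guarded)
{
    assert_irqs_disabled(guarded);
}

/*
 * Models a checker entry point. Assumption: the checker disables interrupts
 * with a raw primitive that leaves tracked_irqs_enabled untouched, so the
 * tracked state is stale for the duration of the call.
 */
static void checker_acquire(bool guarded)
{
    checker_recursion++;
    save_stack_trace(guarded);
    checker_recursion--;
}

/* Returns how many warnings a single checker call produces. */
static int run_nested_check(bool guarded)
{
    warnings = 0;
    checker_recursion = 0;
    checker_acquire(guarded);
    return warnings;
}
```

Under these assumptions, run_nested_check(false) counts one spurious warning from the nested assertion, while run_nested_check(true) counts none because the assertion is skipped while the recursion counter is non-zero.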