Message-ID: <b644ff1714731cfb652d809d4864f0d178b24a97.camel@web.de>
Date: Wed, 14 May 2025 11:32:58 +0200
From: Bert Karwatzki <spasswolf@....de>
To: linux-kernel@...r.kernel.org
Cc: linux-next@...r.kernel.org, llvm@...ts.linux.dev, Johannes Berg
<johannes.berg@...el.com>, spasswolf@....de, Thomas Gleixner
<tglx@...utronix.de>
Subject: Re: lockup and kernel panic in linux-next-202505{09,12} when
compiled with clang
On Wednesday, 14.05.2025 at 02:11 +0200, Bert Karwatzki wrote:
> On Wednesday, 14.05.2025 at 00:33 +0200, Thomas Gleixner wrote:
> > On Tue, May 13 2025 at 18:48, Bert Karwatzki wrote:
> > > >
> > > > I'll now start a bisection where I revert 76a853f86c97 where possible in
> > > > order to find the remaining bugs.
> > >
> > > The second bisection (from v6.15-rc6 to next-20250512) is finished now:
> > >
> > > This commit leads to lockups and kernel panics after
> > > watching ~5-10min of a youtube video while compiling a kernel,
> > > reverting it in next-20250512 is possible:
> > > 76a853f86c97 ("wifi: free SKBTX_WIFI_STATUS skb tx_flags flag")
> > > This commit leads to the boot failure, reverting leads to the
> > > compile error it is supposed to fix:
> > > 97f4b999e0c8 ("genirq: Use scoped_guard() to shut clang up")
> >
> > I really have a hard time to understand what you are trying to explain
> > here. 'This commit leads..' is so unspecified that I can't make any
> > sense of it.
> >
> > Also please make sure that you have commit b5fcb6898202 ("genirq: Ensure
> > flags in lock guard is consistently initialized") in your tree when
> > re-testing. That's fixing another subtle (AFAICT clang only) problem in
> > the guard conversion. If it's not in next yet, you can just merge
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git irq/core
> >
> > into next or wait for the next next integration.
> >
> > Thanks
> >
> > tglx
>
>
> I merged git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git irq/core into
> next-20250513 and this fixes the boot failure but the system still locks up
> after a few minutes (with flashing capslock). To solve this I need to revert
> 76a853f86c97 ("wifi: free SKBTX_WIFI_STATUS skb tx_flags flag")
>
> Also, commit 97f4b999e0c8 did not actually cause the boot failure; that was a
> bisection error.
>
> Bert Karwatzki
To investigate the problem with commit 76a853f86c97 ("wifi: free
SKBTX_WIFI_STATUS skb tx_flags flag"), I used next-20250513 with irq/core merged
(to fix the boot issue) and with commit 76a853f86c97 reverted:
$ git log --oneline
bb3ff0e21a16 Revert "wifi: free SKBTX_WIFI_STATUS skb tx_flags flag"
28d1f7734aa3 Merge branch 'irq/core' into clang_panic
aa94665adc28 (tag: next-20250513, origin/master, origin/HEAD, master) Add linux-next specific files for 20250513
Then I reapplied commit 76a853f86c97 hunk by hunk and found the one hunk that
causes the problem:
diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c
index 3e751dd3ae7b..63df21228029 100644
--- a/net/mac80211/tx.c
+++ b/net/mac80211/tx.c
@@ -4648,8 +4648,7 @@ static void ieee80211_8023_xmit(struct ieee80211_sub_if_data *sdata,
 		memcpy(IEEE80211_SKB_CB(seg), info, sizeof(*info));
 	}
 
-	if (unlikely(skb->sk &&
-		     skb_shinfo(skb)->tx_flags & SKBTX_WIFI_STATUS)) {
+	if (unlikely(skb->sk && sock_flag(skb->sk, SOCK_WIFI_STATUS))) {
 		info->status_data = ieee80211_store_ack_skb(local, skb,
 							    &info->flags, NULL);
 		if (info->status_data)
This hunk alone is enough to cause a kernel panic when the kernel is compiled
with clang (clang-19.1.7 from debian sid). Compiling the same kernel with gcc
(gcc-14.2.0 from debian sid) shows no problem.
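For anyone not familiar with the two conditions, here is a minimal userspace
sketch of what the hunk changes. Everything in it is a hypothetical stand-in:
model_sock, model_skb and the MODEL_* bits are not the kernel's definitions,
and the bit values are arbitrary. It only illustrates that the old test reads a
flag stored with the packet, while the new test dereferences skb->sk and reads
the socket's flags. It does not explain why only the clang build panics.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel's definitions. */
#define MODEL_SKBTX_WIFI_STATUS (1u << 4)	/* per-packet bit, value arbitrary */
#define MODEL_SOCK_WIFI_STATUS  (1u << 0)	/* per-socket bit, value arbitrary */

struct model_sock {
	unsigned int flags;		/* socket-wide flags */
};

struct model_skb {
	struct model_sock *sk;		/* owning socket, may be NULL */
	unsigned int tx_flags;		/* flags stored with this packet */
};

/* Shape of the removed lines: the decision uses the packet's own tx_flags. */
static bool old_wants_status(const struct model_skb *skb)
{
	return skb->sk && (skb->tx_flags & MODEL_SKBTX_WIFI_STATUS);
}

/* Shape of the added line: the decision reads flags through skb->sk. */
static bool new_wants_status(const struct model_skb *skb)
{
	return skb->sk && (skb->sk->flags & MODEL_SOCK_WIFI_STATUS);
}
```

In this model the two tests agree whenever the packet bit was copied from the
socket bit at send time, and they can disagree otherwise. Whether that
difference matters for the panic is not something the sketch can show.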
The wifi card used is:
04:00.0 Network controller [0280]: MEDIATEK Corp. MT7921K (RZ608) Wi-Fi 6E 80MHz [14c3:0608]
Bert Karwatzki