Message-ID: <CAOFcj8SnLPZoq7JLhUGZKmJVTt=iGC6AcTxghauz91PfDCm+ew@mail.gmail.com>
Date: Wed, 14 Jan 2026 23:19:21 -0800
From: Zac Bowling <zbowling@...il.com>
To: Sean Wang <sean.wang@...nel.org>, linux@...me.work
Cc: deren.wu@...iatek.com, kvalo@...nel.org, linux-kernel@...r.kernel.org,
linux-mediatek@...ts.infradead.org, linux-wireless@...r.kernel.org,
lorenzo@...nel.org, nbd@....name, ryder.lee@...iatek.com,
sean.wang@...iatek.com
Subject: Re: [PATCH] wifi: mt76: mt792x: fix firmware reload failure after
previous load crash
While I'm waiting for feedback on these patches, I've set up a public
repository with all the fixes and created a DKMS package, so anyone
hitting these issues can easily load the patched driver as an
alternative. A lot of people are running into the same problems on
several popular commercial laptops and desktops:
https://github.com/zbowling/mt7925
The repository has:
- All 18 patches from this series I've sent here (different versions
of these patches that apply cleanly to different kernel versions)
- Pre-patched kernel branches (6.17.x, 6.18.x, 6.19-rc5) in another
repo linked in the README
- A new DKMS package for out-of-tree builds (requires kernel 6.17+)
that uses #ifdef checks on the kernel version so that a single
package works across all recent kernels.
The DKMS package builds mt76, mt76-connac-lib, mt792x-lib,
mt7925-common, and mt7925e modules with all fixes applied.
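
As a rough illustration of the kernel-version gating mentioned above (not
the actual code from the repository), an out-of-tree module usually
compares LINUX_VERSION_CODE against KERNEL_VERSION(). In this userspace
sketch both macros are defined locally so it compiles without kernel
headers; the COMPAT_HAS_NEW_API switch, the 6.18 threshold, and
compat_api_name() are hypothetical names made up for the example.

```c
/* KERNEL_VERSION() uses the same encoding as the kernel's <linux/version.h>. */
#define KERNEL_VERSION(a, b, c) \
	(((a) << 16) + ((b) << 8) + ((c) > 255 ? 255 : (c)))

/*
 * In a real DKMS build LINUX_VERSION_CODE comes from <linux/version.h>.
 * It is hard-coded here (assumption) so the sketch builds in userspace.
 */
#ifndef LINUX_VERSION_CODE
#define LINUX_VERSION_CODE KERNEL_VERSION(6, 18, 0)
#endif

/* Hypothetical switch: pretend some driver API changed in 6.18. */
#if LINUX_VERSION_CODE >= KERNEL_VERSION(6, 18, 0)
#define COMPAT_HAS_NEW_API 1
#else
#define COMPAT_HAS_NEW_API 0
#endif

/* Picks the code path that matches the kernel being built against. */
const char *compat_api_name(void)
{
	return COMPAT_HAS_NEW_API ? "new-style call" : "legacy call";
}
```

The same source then builds against 6.17.x, 6.18.x and 6.19-rc kernels,
with the preprocessor selecting the matching branch at compile time.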
From community testing with people who hit these same panics on the
current upstream version, I've heard from many folks that this patch
series (either applying the patches directly or using the DKMS build)
fixes most of their issues.
There still seem to be ongoing issues inside the firmware related to
MLO and deauths with certain APs (especially with my Unifi U7 Pros),
but at least this keeps machines from crashing while the chip
resets, so you only suffer momentary losses in connectivity instead of
a straight-up kernel panic or a deadlock.
For anyone still hitting the NULL pointer dereferences, mutex
deadlocks with NetworkManager and friends during MLO and deauth
situations, or suspend/resume hangs with mt7925 - this DKMS package or
these patches should greatly help.
Happy to address any review feedback whenever you finally have a
chance to look at these.
Zac Bowling
On Sat, Jan 3, 2026 at 10:42 AM Zac Bowling <zbowling@...il.com> wrote:
>
> Hi Sean,
>
> Thanks! Unfortunately no: I don't have an MT7921, only an MT7925. I
> ordered one off Amazon and it should be here in a week or two.
>
> Zac Bowling
>
>
> On Fri, Jan 2, 2026 at 10:46 PM Sean Wang <sean.wang@...nel.org> wrote:
> >
> > On Fri, Jan 2, 2026 at 2:03 PM Zac Bowling <zbowling@...il.com> wrote:
> > >
> > > If the firmware loading process crashes or is interrupted after
> > > acquiring the patch semaphore but before releasing it, subsequent
> > > firmware load attempts will fail with 'Failed to get patch semaphore'
> > > because the semaphore is still held.
> > >
> > > This issue manifests as devices becoming unusable after suspend/resume
> > > failures or firmware crashes, requiring a full hardware reboot to
> > > recover. This has been widely reported on MT7921 and MT7925 devices.
> > >
> > > Apply the same fix that was applied to MT7915 in commit 79dd14f:
> > > 1. Release the patch semaphore before starting firmware load (in case
> > > it was held by a previous failed attempt)
> > > 2. Restart MCU firmware to ensure clean state
> > > 3. Wait briefly for MCU to be ready
> > >
> > > This fix applies to both MT7921 and MT7925 drivers which share the
> > > mt792x_load_firmware() function.
> > >
> > > Fixes: 'Failed to get patch semaphore' errors after firmware crash
> > > Signed-off-by: Zac Bowling <zac@...bowling.com>
> > > ---
> > > mt792x_core.c | 14 ++++++++++++++
> > > 1 file changed, 14 insertions(+)
> > >
> > > diff --git a/mt792x_core.c b/mt792x_core.c
> > > index cc488ee9..b82e4470 100644
> > > --- a/mt792x_core.c
> > > +++ b/mt792x_core.c
> > > @@ -927,6 +927,20 @@ int mt792x_load_firmware(struct mt792x_dev *dev)
> > >  {
> > >         int ret;
> > >
> > > +       /* Release semaphore if taken by previous failed load attempt.
> > > +        * This prevents "Failed to get patch semaphore" errors when
> > > +        * recovering from firmware crashes or suspend/resume failures.
> > > +        */
> > > +       ret = mt76_connac_mcu_patch_sem_ctrl(&dev->mt76, false);
> > > +       if (ret < 0)
> > > +               dev_dbg(dev->mt76.dev, "Semaphore release returned %d (may be expected)\n", ret);
> > > +
> > > +       /* Always restart MCU to ensure clean state before loading firmware */
> > > +       mt76_connac_mcu_restart(&dev->mt76);
> > > +
> > > +       /* Wait for MCU to be ready after restart */
> > > +       msleep(100);
> > > +
> >
> > Hi Zac,
> >
> > This is a good finding. Since this is a common mt792x code path, have you
> > also had a chance to test it on MT7921?
> >
> > One small nit: the Fixes tag should reference the actual commit being
> > fixed, e.g.
> >
> > Fixes: <commit-sha> ("mt76: mt792x: ...")
> >
> > instead of the error string.
> >
> > Sean
> >
> > > >         ret = mt76_connac2_load_patch(&dev->mt76, mt792x_patch_name(dev));
> > > >         if (ret)
> > > >                 return ret;
> > > --
> > > 2.51.0
> > >
> > >
> > >
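
The recovery sequence described in the quoted patch (release a possibly
stale patch semaphore, restart the MCU, then load) can be modeled in plain
userspace C. This is only a sketch of the idea, not driver code: struct
fake_mcu and every fake_* / load_firmware_* function below are
hypothetical stand-ins, and the 100 ms wait from the patch is left out.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the MCU-side state the driver talks to. */
struct fake_mcu {
	bool sem_held;	/* true if an earlier load never released it */
	int restarts;	/* number of restarts requested */
};

/* Acquire fails while the semaphore is held, which models the
 * "Failed to get patch semaphore" error. */
int fake_sem_get(struct fake_mcu *m)
{
	if (m->sem_held)
		return -1;
	m->sem_held = true;
	return 0;
}

/* Release is harmless when the semaphore was not held. */
void fake_sem_release(struct fake_mcu *m)
{
	m->sem_held = false;
}

void fake_mcu_restart(struct fake_mcu *m)
{
	m->restarts++;
}

/* Load path without recovery: a stale semaphore makes it fail. */
int load_firmware_plain(struct fake_mcu *m)
{
	if (fake_sem_get(m) < 0)
		return -1;
	/* ... patch download would happen here ... */
	fake_sem_release(m);
	return 0;
}

/* Load path with the recovery steps from the patch description:
 * 1. release the semaphore, 2. restart the MCU, 3. load as usual. */
int load_firmware_with_recovery(struct fake_mcu *m)
{
	fake_sem_release(m);
	fake_mcu_restart(m);
	return load_firmware_plain(m);
}
```

Starting from a state where sem_held is true, load_firmware_plain()
returns -1 every time, while load_firmware_with_recovery() clears the
stale state first and then succeeds.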