Message-ID: <aUFOl4euBSyPtA5F@horms.kernel.org>
Date: Tue, 16 Dec 2025 12:20:39 +0000
From: Simon Horman <horms@...nel.org>
To: Dipayaan Roy <dipayanroy@...ux.microsoft.com>
Cc: kys@...rosoft.com, haiyangz@...rosoft.com, wei.liu@...nel.org,
decui@...rosoft.com, andrew+netdev@...n.ch, davem@...emloft.net,
edumazet@...gle.com, kuba@...nel.org, pabeni@...hat.com,
longli@...rosoft.com, kotaranov@...rosoft.com,
shradhagupta@...ux.microsoft.com, ssengar@...ux.microsoft.com,
ernis@...ux.microsoft.com, shirazsaleem@...rosoft.com,
linux-hyperv@...r.kernel.org, netdev@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-rdma@...r.kernel.org,
dipayanroy@...rosoft.com
Subject: Re: [PATCH net-next] net: mana: Fix use-after-free in reset service rescan path

On Tue, Dec 16, 2025 at 02:55:08AM -0800, Dipayaan Roy wrote:
> When mana_serv_reset() encounters -ETIMEDOUT or -EPROTO from
> mana_gd_resume(), it performs a PCI rescan via mana_serv_rescan().
>
> mana_serv_rescan() calls pci_stop_and_remove_bus_device(), which can
> invoke the driver's remove path and free the gdma_context associated
> with the device. After returning, mana_serv_reset() currently jumps to
> the out label and attempts to clear gc->in_service, dereferencing a
> freed gdma_context.
>
> The issue was observed with the following call logs:
> [ 698.942636] BUG: unable to handle page fault for address: ff6c2b638088508d
> [ 698.943121] #PF: supervisor write access in kernel mode
> [ 698.943423] #PF: error_code(0x0002) - not-present page
> [S[ 698.943793] Pat Dec 6 07:GD5 100000067 P4D 1002f7067 PUD 1002f8067 PMD 101bef067 PTE 0
> 0:56 2025] hv_[n e 698.944283] Oops: Oops: 0002 [#1] SMP NOPTI
> tvsc f8615163-00[ 698.944611] CPU: 28 UID: 0 PID: 249 Comm: kworker/28:1
> ...
> [Sat Dec 6 07:50:56 2025] R10: [ 699.121594] mana 7870:00:00.0 enP30832s1: Configured vPort 0 PD 18 DB 16
> 000000000000001b R11: 0000000000000000 R12: ff44cf3f40270000
> [Sat Dec 6 07:50:56 2025] R13: 0000000000000001 R14: ff44cf3f402700c8 R15: ff44cf3f4021b405
> [Sat Dec 6 07:50:56 2025] FS: 0000000000000000(0000) GS:ff44cf7e9fcf9000(0000) knlGS:0000000000000000
> [Sat Dec 6 07:50:56 2025] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [Sat Dec 6 07:50:56 2025] CR2: ff6c2b638088508d CR3: 000000011fe43001 CR4: 0000000000b73ef0
> [Sat Dec 6 07:50:56 2025] Call Trace:
> [Sat Dec 6 07:50:56 2025] <TASK>
> [Sat Dec 6 07:50:56 2025] mana_serv_func+0x24/0x50 [mana]
> [Sat Dec 6 07:50:56 2025] process_one_work+0x190/0x350
> [Sat Dec 6 07:50:56 2025] worker_thread+0x2b7/0x3d0
> [Sat Dec 6 07:50:56 2025] kthread+0xf3/0x200
> [Sat Dec 6 07:50:56 2025] ? __pfx_worker_thread+0x10/0x10
> [Sat Dec 6 07:50:56 2025] ? __pfx_kthread+0x10/0x10
> [Sat Dec 6 07:50:56 2025] ret_from_fork+0x21a/0x250
> [Sat Dec 6 07:50:56 2025] ? __pfx_kthread+0x10/0x10
> [Sat Dec 6 07:50:56 2025] ret_from_fork_asm+0x1a/0x30
> [Sat Dec 6 07:50:56 2025] </TASK>
>
> Fix this by returning immediately after mana_serv_rescan() to avoid
> accessing GC state that may no longer be valid.
>
> Fixes: 9bf66036d686 ("net: mana: Handle hardware recovery events when probing the device")
>
nit: no blank line here please - tags should all appear in one block
> Signed-off-by: Dipayaan Roy <dipayanroy@...ux.microsoft.com>
I see that this patch is targeted at net-next.
But this is a fix for a patch present in net.
So it should be targeted at net instead:
Subject: [PATCH net] ...
Probably it is not necessary to repost in order to address the minor
feedback I've provided above. But if you do, please be sure to observe
the 24h rule and wait that long between posting revisions of that patch.
https://docs.kernel.org/process/maintainer-netdev.html
The above notwithstanding, this patch looks good to me.
Reviewed-by: Simon Horman <horms@...nel.org>
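
For illustration only, the lifetime problem described in the commit
message can be sketched in user-space C. This is not the driver code:
fake_gc, fake_probe(), fake_resume(), fake_rescan() and fake_serv_reset()
are hypothetical stand-ins for gdma_context, the probe path,
mana_gd_resume(), mana_serv_rescan() and mana_serv_reset(), and the
sketch assumes the rescan always frees the context, whereas the commit
message only says that it can.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for gdma_context; only the flag discussed above. */
struct fake_gc {
	bool in_service;
};

/* Models the device's context; NULL once the remove path has freed it. */
static struct fake_gc *live_gc;

/* Stand-in for the probe path: allocate a fresh context. */
static bool fake_probe(void)
{
	live_gc = calloc(1, sizeof(*live_gc));
	return live_gc != NULL;
}

/* Stand-in for mana_gd_resume(): simply reports the given error code. */
static int fake_resume(int err)
{
	return err;
}

/* Stand-in for mana_serv_rescan(): the remove path frees the context. */
static void fake_rescan(void)
{
	free(live_gc);
	live_gc = NULL;
}

/*
 * Stand-in for mana_serv_reset() with the fix applied.
 * Returns true if the context survived and in_service was cleared,
 * false if a rescan happened and the context must not be touched.
 */
static bool fake_serv_reset(int resume_err)
{
	struct fake_gc *gc = live_gc;
	int err;

	assert(gc != NULL);
	gc->in_service = true;

	err = fake_resume(resume_err);
	if (err == -ETIMEDOUT || err == -EPROTO) {
		fake_rescan();
		/* gc is dangling here: return without dereferencing it. */
		return false;
	}

	/* Normal path: the context is still valid, so clear the flag. */
	gc->in_service = false;
	return true;
}
```

Without the early return, the final store to gc->in_service would run
after fake_rescan() and write to freed memory, which is the pattern
the fault in the quoted trace points to.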