Message-ID: <2025041615-CVE-2025-22086-babb@gregkh>
Date: Wed, 16 Apr 2025 16:12:57 +0200
From: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
To: linux-cve-announce@...r.kernel.org
Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Subject: CVE-2025-22086: RDMA/mlx5: Fix mlx5_poll_one() cur_qp update flow
Description
===========
In the Linux kernel, the following vulnerability has been resolved:
RDMA/mlx5: Fix mlx5_poll_one() cur_qp update flow
When cur_qp isn't NULL, we check whether the QP of the next CQE is
identical to the one we already have, in order to avoid fetching the QP
from the radix tree again.
The bug, however, is that we check whether the QP is identical by
comparing the QP number inside the CQE against the QP number inside
mlx5_ib_qp. That is wrong: the QP number in the CQE comes from FW, so
it should be matched against the one in mlx5_core_qp, which is our FW
QP number.
Otherwise we could use the wrong QP when handling a CQE, which could
cause the kernel trace below.
This issue is mainly noticeable over QPs 0 & 1, since for now they are
the only QPs in our driver for which the QP number inside mlx5_ib_qp
doesn't match the QP number inside mlx5_core_qp.
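The mismatch described above can be illustrated with a small user-space
model. This is only a sketch, not the driver code: the struct names
(core_qp_model, ib_qp_model), the table, the helper functions and the QP
numbers are all hypothetical stand-ins for mlx5_core_qp, mlx5_ib_qp and
the radix tree lookup. It assumes two QPs that both report IB-layer
number 1 but carry different FW numbers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for mlx5_core_qp and mlx5_ib_qp. */
struct core_qp_model {
    uint32_t qpn;               /* QP number as known to firmware */
};

struct ib_qp_model {
    uint32_t qp_num;            /* QP number exposed to the IB layer */
    struct core_qp_model mqp;
};

/* Made-up numbers: both QPs report IB number 1, FW numbers differ. */
static struct ib_qp_model qp_table[] = {
    { .qp_num = 1, .mqp = { .qpn = 0x01 } },
    { .qp_num = 1, .mqp = { .qpn = 0x4a } },
};

/* Stand-in for the radix tree lookup, which is keyed by the FW number. */
struct ib_qp_model *lookup_by_fw_qpn(uint32_t qpn)
{
    size_t i;

    for (i = 0; i < sizeof(qp_table) / sizeof(qp_table[0]); i++)
        if (qp_table[i].mqp.qpn == qpn)
            return &qp_table[i];
    return NULL;
}

/* Buggy cache check: compares the CQE's FW number to the IB number. */
struct ib_qp_model *pick_qp_buggy(struct ib_qp_model *cur, uint32_t cqe_qpn)
{
    if (!cur || cqe_qpn != cur->qp_num)
        cur = lookup_by_fw_qpn(cqe_qpn);
    return cur;
}

/* Fixed cache check: compares FW number against FW number. */
struct ib_qp_model *pick_qp_fixed(struct ib_qp_model *cur, uint32_t cqe_qpn)
{
    if (!cur || cqe_qpn != cur->mqp.qpn)
        cur = lookup_by_fw_qpn(cqe_qpn);
    return cur;
}
```

In this model, if the cached QP is the one with FW number 0x4a and the
next CQE carries FW number 0x01, pick_qp_buggy() keeps the cached (wrong)
QP because 0x01 equals its IB-layer number, while pick_qp_fixed() performs
the lookup and returns the QP that actually owns the CQE.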
BUG: kernel NULL pointer dereference, address: 0000000000000012
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: Oops: 0000 [#1] SMP
CPU: 0 UID: 0 PID: 7927 Comm: kworker/u62:1 Not tainted 6.14.0-rc3+ #189
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
Workqueue: ib-comp-unb-wq ib_cq_poll_work [ib_core]
RIP: 0010:mlx5_ib_poll_cq+0x4c7/0xd90 [mlx5_ib]
Code: 03 00 00 8d 58 ff 21 cb 66 39 d3 74 39 48 c7 c7 3c 89 6e a0 0f b7 db e8 b7 d2 b3 e0 49 8b 86 60 03 00 00 48 c7 c7 4a 89 6e a0 <0f> b7 5c 98 02 e8 9f d2 b3 e0 41 0f b7 86 78 03 00 00 83 e8 01 21
RSP: 0018:ffff88810511bd60 EFLAGS: 00010046
RAX: 0000000000000010 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff88885fa1b3c0 RDI: ffffffffa06e894a
RBP: 00000000000000b0 R08: 0000000000000000 R09: ffff88810511bc10
R10: 0000000000000001 R11: 0000000000000001 R12: ffff88810d593000
R13: ffff88810e579108 R14: ffff888105146000 R15: 00000000000000b0
FS: 0000000000000000(0000) GS:ffff88885fa00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000012 CR3: 00000001077e6001 CR4: 0000000000370eb0
Call Trace:
<TASK>
? __die+0x20/0x60
? page_fault_oops+0x150/0x3e0
? exc_page_fault+0x74/0x130
? asm_exc_page_fault+0x22/0x30
? mlx5_ib_poll_cq+0x4c7/0xd90 [mlx5_ib]
__ib_process_cq+0x5a/0x150 [ib_core]
ib_cq_poll_work+0x31/0x90 [ib_core]
process_one_work+0x169/0x320
worker_thread+0x288/0x3a0
? work_busy+0xb0/0xb0
kthread+0xd7/0x1f0
? kthreads_online_cpu+0x130/0x130
? kthreads_online_cpu+0x130/0x130
ret_from_fork+0x2d/0x50
? kthreads_online_cpu+0x130/0x130
ret_from_fork_asm+0x11/0x20
</TASK>
The Linux kernel CVE team has assigned CVE-2025-22086 to this issue.
Affected and fixed versions
===========================
Issue introduced in 3.11 with commit e126ba97dba9edeb6fafa3665b5f8497fc9cdf8c and fixed in 5.4.292 with commit 3b97d77049856865ac5ce8ffbc6e716928310f7f
Issue introduced in 3.11 with commit e126ba97dba9edeb6fafa3665b5f8497fc9cdf8c and fixed in 5.10.236 with commit 856d9e5d72dc44eca6d5a153581c58fbd84e92e1
Issue introduced in 3.11 with commit e126ba97dba9edeb6fafa3665b5f8497fc9cdf8c and fixed in 5.15.180 with commit f0447ceb8a31d79bee7144f98f9a13f765531e1a
Issue introduced in 3.11 with commit e126ba97dba9edeb6fafa3665b5f8497fc9cdf8c and fixed in 6.1.134 with commit dc7139b7031d877acd73d7eff55670f22f48cd5e
Issue introduced in 3.11 with commit e126ba97dba9edeb6fafa3665b5f8497fc9cdf8c and fixed in 6.6.87 with commit 7c51a6964b45b6d40027abd77e89cef30d26dc5a
Issue introduced in 3.11 with commit e126ba97dba9edeb6fafa3665b5f8497fc9cdf8c and fixed in 6.12.23 with commit cad677085274ecf9c7565b5bfc5d2e49acbf174c
Issue introduced in 3.11 with commit e126ba97dba9edeb6fafa3665b5f8497fc9cdf8c and fixed in 6.13.11 with commit 55c65a64aefa6267b964d90e9a4039cb68ec73a5
Issue introduced in 3.11 with commit e126ba97dba9edeb6fafa3665b5f8497fc9cdf8c and fixed in 6.14.2 with commit d52636eb13ccba448a752964cc6fc49970912874
Issue introduced in 3.11 with commit e126ba97dba9edeb6fafa3665b5f8497fc9cdf8c and fixed in 6.15-rc1 with commit 5ed3b0cb3f827072e93b4c5b6e2b8106fd7cccbd
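As a rough aid for reading the list above, the following sketch
classifies a kernel release string against the versions listed in this
announcement. It is an illustration only: the enum, the table layout and
the classify() helper are hypothetical, it knows nothing about vendor
backports, and the official CVE entry remains the authoritative source.

```c
#include <stddef.h>
#include <stdio.h>

enum status { UNKNOWN, NOT_AFFECTED, VULNERABLE, FIXED, NO_FIX_LISTED };

struct fixed_in {
    int major, minor, patch;
};

/* First fixed release per stable branch, copied from the list above. */
static const struct fixed_in fixes[] = {
    { 5, 4, 292 }, { 5, 10, 236 }, { 5, 15, 180 }, { 6, 1, 134 },
    { 6, 6, 87 },  { 6, 12, 23 },  { 6, 13, 11 },  { 6, 14, 2 },
};

/* Classify a release string such as "6.6.87" or "6.14.0-rc3+". */
enum status classify(const char *release)
{
    int major = 0, minor = 0, patch = 0;
    size_t i;

    /* The patch level is optional ("6.15-rc1" parses as 6.15.0). */
    if (sscanf(release, "%d.%d.%d", &major, &minor, &patch) < 2)
        return UNKNOWN;

    /* The issue was introduced in 3.11. */
    if (major < 3 || (major == 3 && minor < 11))
        return NOT_AFFECTED;

    /* The mainline fix is in 6.15-rc1, so 6.15 and later contain it. */
    if (major > 6 || (major == 6 && minor >= 15))
        return FIXED;

    for (i = 0; i < sizeof(fixes) / sizeof(fixes[0]); i++)
        if (fixes[i].major == major && fixes[i].minor == minor)
            return patch >= fixes[i].patch ? FIXED : VULNERABLE;

    /* Affected branch with no fixed release in this announcement. */
    return NO_FIX_LISTED;
}
```

For example, the kernel in the trace above (6.14.0-rc3+) falls on the
6.14 branch below 6.14.2, so this sketch reports it as VULNERABLE.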
Please see https://www.kernel.org for a full list of kernel versions
currently supported by the kernel community.
Unaffected versions might change over time as fixes are backported to
older supported kernel versions. The official CVE entry at
https://cve.org/CVERecord/?id=CVE-2025-22086
will be updated if fixes are backported; please check that for the most
up-to-date information about this issue.
Affected files
==============
The file(s) affected by this issue are:
drivers/infiniband/hw/mlx5/cq.c
Mitigation
==========
The Linux kernel CVE team recommends that you update to the latest
stable kernel version for this, and many other bugfixes. Individual
changes are never tested alone, but rather are part of a larger kernel
release. Cherry-picking individual commits is not recommended or
supported by the Linux kernel community at all. If, however, updating to
the latest release is impossible, the individual changes to resolve this
issue can be found at these commits:
https://git.kernel.org/stable/c/3b97d77049856865ac5ce8ffbc6e716928310f7f
https://git.kernel.org/stable/c/856d9e5d72dc44eca6d5a153581c58fbd84e92e1
https://git.kernel.org/stable/c/f0447ceb8a31d79bee7144f98f9a13f765531e1a
https://git.kernel.org/stable/c/dc7139b7031d877acd73d7eff55670f22f48cd5e
https://git.kernel.org/stable/c/7c51a6964b45b6d40027abd77e89cef30d26dc5a
https://git.kernel.org/stable/c/cad677085274ecf9c7565b5bfc5d2e49acbf174c
https://git.kernel.org/stable/c/55c65a64aefa6267b964d90e9a4039cb68ec73a5
https://git.kernel.org/stable/c/d52636eb13ccba448a752964cc6fc49970912874
https://git.kernel.org/stable/c/5ed3b0cb3f827072e93b4c5b6e2b8106fd7cccbd