[<prev] [next>] [day] [month] [year] [list]
Message-ID: <2024052143-CVE-2021-47246-9896@gregkh>
Date: Tue, 21 May 2024 16:20:00 +0200
From: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
To: linux-cve-announce@...r.kernel.org
Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Subject: CVE-2021-47246: net/mlx5e: Fix page reclaim for dead peer hairpin
Description
===========
In the Linux kernel, the following vulnerability has been resolved:
net/mlx5e: Fix page reclaim for dead peer hairpin
When adding a hairpin flow, a firmware-side send queue is created for
the peer net device, which claims some host memory pages for its
internal ring buffer. If the peer net device is removed/unbound before
the hairpin flow is deleted, then the send queue is not destroyed which
leads to a stack trace on pci device remove:
[ 748.005230] mlx5_core 0000:08:00.2: wait_func:1094:(pid 12985): MANAGE_PAGES(0x108) timeout. Will cause a leak of a command resource
[ 748.005231] mlx5_core 0000:08:00.2: reclaim_pages:514:(pid 12985): failed reclaiming pages: err -110
[ 748.001835] mlx5_core 0000:08:00.2: mlx5_reclaim_root_pages:653:(pid 12985): failed reclaiming pages (-110) for func id 0x0
[ 748.002171] ------------[ cut here ]------------
[ 748.001177] FW pages counter is 4 after reclaiming all pages
[ 748.001186] WARNING: CPU: 1 PID: 12985 at drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c:685 mlx5_reclaim_startup_pages+0x34b/0x460 [mlx5_core] [ +0.002771] Modules linked in: cls_flower mlx5_ib mlx5_core ptp pps_core act_mirred sch_ingress openvswitch nsh xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_iscsi rdma_cm ib_umad ib_ipoib iw_cm ib_cm ib_uverbs ib_core overlay fuse [last unloaded: pps_core]
[ 748.007225] CPU: 1 PID: 12985 Comm: tee Not tainted 5.12.0+ #1
[ 748.001376] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
[ 748.002315] RIP: 0010:mlx5_reclaim_startup_pages+0x34b/0x460 [mlx5_core]
[ 748.001679] Code: 28 00 00 00 0f 85 22 01 00 00 48 81 c4 b0 00 00 00 31 c0 5b 5d 41 5c 41 5d 41 5e 41 5f c3 48 c7 c7 40 cc 19 a1 e8 9f 71 0e e2 <0f> 0b e9 30 ff ff ff 48 c7 c7 a0 cc 19 a1 e8 8c 71 0e e2 0f 0b e9
[ 748.003781] RSP: 0018:ffff88815220faf8 EFLAGS: 00010286
[ 748.001149] RAX: 0000000000000000 RBX: ffff8881b4900280 RCX: 0000000000000000
[ 748.001445] RDX: 0000000000000027 RSI: 0000000000000004 RDI: ffffed102a441f51
[ 748.001614] RBP: 00000000000032b9 R08: 0000000000000001 R09: ffffed1054a15ee8
[ 748.001446] R10: ffff8882a50af73b R11: ffffed1054a15ee7 R12: fffffbfff07c1e30
[ 748.001447] R13: dffffc0000000000 R14: ffff8881b492cba8 R15: 0000000000000000
[ 748.001429] FS: 00007f58bd08b580(0000) GS:ffff8882a5080000(0000) knlGS:0000000000000000
[ 748.001695] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 748.001309] CR2: 000055a026351740 CR3: 00000001d3b48006 CR4: 0000000000370ea0
[ 748.001506] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 748.001483] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 748.001654] Call Trace:
[ 748.000576] ? mlx5_satisfy_startup_pages+0x290/0x290 [mlx5_core]
[ 748.001416] ? mlx5_cmd_teardown_hca+0xa2/0xd0 [mlx5_core]
[ 748.001354] ? mlx5_cmd_init_hca+0x280/0x280 [mlx5_core]
[ 748.001203] mlx5_function_teardown+0x30/0x60 [mlx5_core]
[ 748.001275] mlx5_uninit_one+0xa7/0xc0 [mlx5_core]
[ 748.001200] remove_one+0x5f/0xc0 [mlx5_core]
[ 748.001075] pci_device_remove+0x9f/0x1d0
[ 748.000833] device_release_driver_internal+0x1e0/0x490
[ 748.001207] unbind_store+0x19f/0x200
[ 748.000942] ? sysfs_file_ops+0x170/0x170
[ 748.001000] kernfs_fop_write_iter+0x2bc/0x450
[ 748.000970] new_sync_write+0x373/0x610
[ 748.001124] ? new_sync_read+0x600/0x600
[ 748.001057] ? lock_acquire+0x4d6/0x700
[ 748.000908] ? lockdep_hardirqs_on_prepare+0x400/0x400
[ 748.001126] ? fd_install+0x1c9/0x4d0
[ 748.000951] vfs_write+0x4d0/0x800
[ 748.000804] ksys_write+0xf9/0x1d0
[ 748.000868] ? __x64_sys_read+0xb0/0xb0
[ 748.000811] ? filp_open+0x50/0x50
[ 748.000919] ? syscall_enter_from_user_mode+0x1d/0x50
[ 748.001223] do_syscall_64+0x3f/0x80
[ 748.000892] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 748.001026] RIP: 0033:0x7f58bcfb22f7
[ 748.000944] Code: 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
[ 748.003925] RSP: 002b:00007fffd7f2aaa8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 748.001732] RAX: ffffffffffffffda RBX: 000000000000000d RCX: 00007f58bcfb22f7
[ 748.001426] RDX: 000000000000000d RSI: 00007fffd7f2abc0 RDI: 0000000000000003
[ 748.001746] RBP: 00007fffd7f2abc0 R08: 0000000000000000 R09: 0000000000000001
[ 748.001631] R10: 00000000000001b6 R11: 0000000000000246 R12: 000000000000000d
[ 748.001537] R13: 00005597ac2c24a0 R14: 000000000000000d R15: 00007f58bd084700
[ 748.001564] irq event stamp: 0
[ 748.000787] hardirqs last enabled at (0): [<0000000000000000>] 0x0
[ 748.001399] hardirqs last disabled at (0): [<ffffffff813132cf>] copy_process+0x146f/0x5eb0
[ 748.001854] softirqs last enabled at (0): [<ffffffff8131330e>] copy_process+0x14ae/0x5eb0
[ 748.013431] softirqs last disabled at (0): [<0000000000000000>] 0x0
[ 748.001492] ---[ end trace a6fabd773d1c51ae ]---
Fix by destroying the send queue of a hairpin peer net device that is
being removed/unbound, which returns the allocated ring buffer pages to
the host.
The Linux kernel CVE team has assigned CVE-2021-47246 to this issue.
Affected and fixed versions
===========================
Issue introduced in 4.19 with commit 4d8fcf216c90 and fixed in 5.4.128 with commit 4b16118665e9
Issue introduced in 4.19 with commit 4d8fcf216c90 and fixed in 5.10.46 with commit be7f3f401d22
Issue introduced in 4.19 with commit 4d8fcf216c90 and fixed in 5.12.13 with commit b374c1304f6d
Issue introduced in 4.19 with commit 4d8fcf216c90 and fixed in 5.13 with commit a3e5fd9314df
Please see https://www.kernel.org for a full list of currently supported
kernel versions by the kernel community.
Unaffected versions might change over time as fixes are backported to
older supported kernel versions. The official CVE entry at
https://cve.org/CVERecord/?id=CVE-2021-47246
will be updated if fixes are backported, please check that for the most
up to date information about this issue.
Affected files
==============
The file(s) affected by this issue are:
drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
drivers/net/ethernet/mellanox/mlx5/core/transobj.c
include/linux/mlx5/transobj.h
Mitigation
==========
The Linux kernel CVE team recommends that you update to the latest
stable kernel version for this, and many other bugfixes. Individual
changes are never tested alone, but rather are part of a larger kernel
release. Cherry-picking individual commits is not recommended or
supported by the Linux kernel community at all. If however, updating to
the latest release is impossible, the individual changes to resolve this
issue can be found at these commits:
https://git.kernel.org/stable/c/4b16118665e94c90a3e84a5190486fd0e4eedd74
https://git.kernel.org/stable/c/be7f3f401d224e1efe8112b2fa8b837eeb8c5e52
https://git.kernel.org/stable/c/b374c1304f6d3d4752ad1412427b7bf02bb1fd61
https://git.kernel.org/stable/c/a3e5fd9314dfc4314a9567cde96e1aef83a7458a
Powered by blists - more mailing lists