Message-ID: <0a501939-3361-428e-97c4-6f041a9ec1f9@linux.ibm.com>
Date: Wed, 3 Jan 2024 10:33:25 +0100
From: Wenjia Zhang <wenjia@...ux.ibm.com>
To: Wen Gu <guwen@...ux.alibaba.com>, jaka@...ux.ibm.com, davem@...emloft.net,
edumazet@...gle.com, kuba@...nel.org, pabeni@...hat.com
Cc: alibuda@...ux.alibaba.com, tonylu@...ux.alibaba.com,
ubraun@...ux.vnet.ibm.com, linux-s390@...r.kernel.org,
netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH net] net/smc: fix invalid link access in dumping SMC-R connections
On 27.12.23 08:40, Wen Gu wrote:
> A crash was found when dumping SMC-R connections. It can be reproduced
> by the following steps:
>
> - environment: two RNICs on each side.
> - run SMC-R between the two sides; an SMC_LGR_SYMMETRIC type link group
> will be created.
> - set the first RNIC down on either side; the link group will then turn
> to SMC_LGR_ASYMMETRIC_LOCAL.
> - run 'smcss -R' and the crash will be triggered.
>
> BUG: kernel NULL pointer dereference, address: 0000000000000010
> #PF: supervisor read access in kernel mode
> #PF: error_code(0x0000) - not-present page
> PGD 8000000101fdd067 P4D 8000000101fdd067 PUD 10ce46067 PMD 0
> Oops: 0000 [#1] PREEMPT SMP PTI
> CPU: 3 PID: 1810 Comm: smcss Kdump: loaded Tainted: G W E 6.7.0-rc6+ #51
> RIP: 0010:__smc_diag_dump.constprop.0+0x36e/0x620 [smc_diag]
> Call Trace:
> <TASK>
> ? __die+0x24/0x70
> ? page_fault_oops+0x66/0x150
> ? exc_page_fault+0x69/0x140
> ? asm_exc_page_fault+0x26/0x30
> ? __smc_diag_dump.constprop.0+0x36e/0x620 [smc_diag]
> smc_diag_dump_proto+0xd0/0xf0 [smc_diag]
> smc_diag_dump+0x26/0x60 [smc_diag]
> netlink_dump+0x19f/0x320
> __netlink_dump_start+0x1dc/0x300
> smc_diag_handler_dump+0x6a/0x80 [smc_diag]
> ? __pfx_smc_diag_dump+0x10/0x10 [smc_diag]
> sock_diag_rcv_msg+0x121/0x140
> ? __pfx_sock_diag_rcv_msg+0x10/0x10
> netlink_rcv_skb+0x5a/0x110
> sock_diag_rcv+0x28/0x40
> netlink_unicast+0x22a/0x330
> netlink_sendmsg+0x240/0x4a0
> __sock_sendmsg+0xb0/0xc0
> ____sys_sendmsg+0x24e/0x300
> ? copy_msghdr_from_user+0x62/0x80
> ___sys_sendmsg+0x7c/0xd0
> ? __do_fault+0x34/0x1a0
> ? do_read_fault+0x5f/0x100
> ? do_fault+0xb0/0x110
> __sys_sendmsg+0x4d/0x80
> do_syscall_64+0x45/0xf0
> entry_SYSCALL_64_after_hwframe+0x6e/0x76
>
> When the first RNIC is set down, lgr->lnk[0] will be cleared and an
> asymmetric link will be allocated in lgr->lnk[SMC_LINKS_PER_LGR_MAX - 1]
> by smc_llc_alloc_alt_link(). Then, when we try to dump SMC-R connections
> in __smc_diag_dump(), the invalid lgr->lnk[0] will be accessed, resulting
> in this crash. So fix it by accessing the right link.
>
> Fixes: f16a7dd5cf27 ("smc: netlink interface for SMC sockets")
> Reported-by: henaumars <henaumars@...a.com>
> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=7616
> Signed-off-by: Wen Gu <guwen@...ux.alibaba.com>
That is a really good catch and a good description! Thank you, Wen Gu, for
fixing it!
Reviewed-and-tested-by: Wenjia Zhang <wenjia@...ux.ibm.com>
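
For anyone following the thread without the diff at hand, here is a minimal
user-space sketch of the failure mode described in the quoted commit message.
It is not the kernel code and not the patch itself. The struct and function
names (lgr_model, link_model, conn_model, fail_over_to_alt_link, dump_ibname)
are hypothetical stand-ins, and the per-connection link pointer is an
assumption of this sketch, since the description only says "accessing the
right link". The point is that a dump which hard-codes slot 0 dereferences a
cleared entry after failover, while a dump that follows the connection's
current link does not.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stands in for SMC_LINKS_PER_LGR_MAX; the value 3 is an assumption. */
#define LINKS_PER_LGR_MAX 3

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct ib_dev_model {
	char name[16];
};

struct link_model {
	int link_id;
	int ibport;
	struct ib_dev_model *ibdev;	/* NULL once the slot has been cleared */
};

struct lgr_model {
	struct link_model lnk[LINKS_PER_LGR_MAX];
};

struct conn_model {
	struct lgr_model *lgr;
	struct link_model *lnk;		/* link the connection currently uses (assumed) */
};

/*
 * Model "first RNIC set down": slot 0 is cleared, an alternate link is
 * set up in the last slot, and the connection is moved over to it.
 */
static void fail_over_to_alt_link(struct conn_model *conn,
				  struct ib_dev_model *alt_dev)
{
	struct lgr_model *lgr = conn->lgr;
	struct link_model *alt = &lgr->lnk[LINKS_PER_LGR_MAX - 1];

	memset(&lgr->lnk[0], 0, sizeof(lgr->lnk[0]));
	alt->link_id = 2;
	alt->ibport = 1;
	alt->ibdev = alt_dev;
	conn->lnk = alt;
}

/*
 * Dump helper that follows the connection's own link instead of
 * assuming slot 0 is valid. Returns NULL if no usable link exists.
 * A buggy variant would read conn->lgr->lnk[0].ibdev->name directly.
 */
static const char *dump_ibname(const struct conn_model *conn)
{
	assert(conn != NULL);
	if (!conn->lnk || !conn->lnk->ibdev)
		return NULL;
	return conn->lnk->ibdev->name;
}
```

In this model, lgr->lnk[0].ibdev is NULL after fail_over_to_alt_link(), so
reading a field through it is the kind of access that produces a NULL pointer
dereference at a small offset, as in the oops above.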