Message-ID: <f134294c-2919-6069-d362-87a84c846690@linux.ibm.com>
Date:   Sat, 27 May 2023 12:22:59 +0200
From:   Wenjia Zhang <wenjia@...ux.ibm.com>
To:     Wen Gu <guwen@...ux.alibaba.com>, kgraul@...ux.ibm.com,
        jaka@...ux.ibm.com, davem@...emloft.net, edumazet@...gle.com,
        kuba@...nel.org, pabeni@...hat.com
Cc:     linux-s390@...r.kernel.org, netdev@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH net 2/2] net/smc: Don't use RMBs not mapped to new link in
 SMCRv2 ADD LINK



On 26.05.23 13:49, Wen Gu wrote:
> We encountered a crash when using SMCRv2. It is caused by a logical
> error in smc_llc_fill_ext_v2().
> 
>   BUG: kernel NULL pointer dereference, address: 0000000000000014
>   #PF: supervisor read access in kernel mode
>   #PF: error_code(0x0000) - not-present page
>   PGD 0 P4D 0
>   Oops: 0000 [#1] PREEMPT SMP PTI
>   CPU: 7 PID: 453 Comm: kworker/7:4 Kdump: loaded Tainted: G        W   E      6.4.0-rc3+ #44
>   Workqueue: events smc_llc_add_link_work [smc]
>   RIP: 0010:smc_llc_fill_ext_v2+0x117/0x280 [smc]
>   RSP: 0018:ffffacb5c064bd88 EFLAGS: 00010282
>   RAX: ffff9a6bc1c3c02c RBX: ffff9a6be3558000 RCX: 0000000000000000
>   RDX: 0000000000000002 RSI: 0000000000000002 RDI: 000000000000000a
>   RBP: ffffacb5c064bdb8 R08: 0000000000000040 R09: 000000000000000c
>   R10: ffff9a6bc0910300 R11: 0000000000000002 R12: 0000000000000000
>   R13: 0000000000000002 R14: ffff9a6bc1c3c02c R15: ffff9a6be3558250
>   FS:  0000000000000000(0000) GS:ffff9a6eefdc0000(0000) knlGS:0000000000000000
>   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>   CR2: 0000000000000014 CR3: 000000010b078003 CR4: 00000000003706e0
>   DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>   DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>   Call Trace:
>    <TASK>
>    smc_llc_send_add_link+0x1ae/0x2f0 [smc]
>    smc_llc_srv_add_link+0x2c9/0x5a0 [smc]
>    ? cc_mkenc+0x40/0x60
>    smc_llc_add_link_work+0xb8/0x140 [smc]
>    process_one_work+0x1e5/0x3f0
>    worker_thread+0x4d/0x2f0
>    ? __pfx_worker_thread+0x10/0x10
>    kthread+0xe5/0x120
>    ? __pfx_kthread+0x10/0x10
>    ret_from_fork+0x2c/0x50
>    </TASK>
> 
> When an alternate RNIC is available in the system, SMC will try to add a new
> link based on the RNIC for resilience. All the RMBs in use will be mapped
> to the new link. Then the RMBs' MRs corresponding to the new link will be
> filled into SMCRv2 LLC ADD LINK messages.
> 
> However, smc_llc_fill_ext_v2() mistakenly accesses unused RMBs which
> haven't been mapped to the new link and have no valid MRs, thus causing
> a crash. So this patch fixes the logic.
> 
> Fixes: b4ba4652b3f8 ("net/smc: extend LLC layer for SMC-Rv2")
> Signed-off-by: Wen Gu <guwen@...ux.alibaba.com>
> ---
>   net/smc/smc_llc.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/net/smc/smc_llc.c b/net/smc/smc_llc.c
> index 8423e8e..7a8d916 100644
> --- a/net/smc/smc_llc.c
> +++ b/net/smc/smc_llc.c
> @@ -617,6 +617,8 @@ static int smc_llc_fill_ext_v2(struct smc_llc_msg_add_link_v2_ext *ext,
>   		goto out;
>   	buf_pos = smc_llc_get_first_rmb(lgr, &buf_lst);
>   	for (i = 0; i < ext->num_rkeys; i++) {
> +		while (buf_pos && !(buf_pos)->used)
> +			buf_pos = smc_llc_get_next_rmb(lgr, &buf_lst, buf_pos);
>   		if (!buf_pos)
>   			break;
>   		rmb = buf_pos;
> @@ -626,8 +628,6 @@ static int smc_llc_fill_ext_v2(struct smc_llc_msg_add_link_v2_ext *ext,
>   			cpu_to_be64((uintptr_t)rmb->cpu_addr) :
>   			cpu_to_be64((u64)sg_dma_address(rmb->sgt[lnk_idx].sgl));
>   		buf_pos = smc_llc_get_next_rmb(lgr, &buf_lst, buf_pos);
> -		while (buf_pos && !(buf_pos)->used)
> -			buf_pos = smc_llc_get_next_rmb(lgr, &buf_lst, buf_pos);
>   	}
>   	len += i * sizeof(ext->rt[0]);
>   out:
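
For readers following the diff, the change in loop order can be sketched outside the kernel. This is only a simplified model under stated assumptions: struct buf, fill_old() and fill_new() are hypothetical names that do not exist in net/smc, a NULL rkey pointer stands in for "no valid MR on the new link", and the old variant reports the bad access through a flag instead of dereferencing it. As far as the diff shows, the difference is whether unused entries are skipped before or after the current entry is read.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an RMB descriptor; not the kernel's struct. */
struct buf {
	bool used;         /* in use, so assumed mapped to the new link */
	const int *rkey;   /* NULL models "no valid MR for the new link" */
	struct buf *next;
};

/*
 * Order before the patch: read the current entry first, then skip
 * unused entries. If the very first entry is unused, it is read anyway.
 */
static int fill_old(const struct buf *pos, int *out, int max,
		    bool *hit_unmapped)
{
	int i;

	*hit_unmapped = false;
	for (i = 0; i < max; i++) {
		if (!pos)
			break;
		if (!pos->rkey) {
			/* The real code would dereference here and crash. */
			*hit_unmapped = true;
			break;
		}
		out[i] = *pos->rkey;
		pos = pos->next;
		while (pos && !pos->used)
			pos = pos->next;
	}
	return i;
}

/* Order after the patch: skip unused entries first, then read. */
static int fill_new(const struct buf *pos, int *out, int max)
{
	int i;

	for (i = 0; i < max; i++) {
		while (pos && !pos->used)
			pos = pos->next;
		if (!pos)
			break;
		out[i] = *pos->rkey;
		pos = pos->next;
	}
	return i;
}
```

In this model, a list whose first entry is unused makes fill_old() stop on that entry with hit_unmapped set, while fill_new() skips it and copies only the keys of the used entries.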

I'm wondering if this crash was introduced by the first fix patch you wrote.

Thanks,
Wenjia
