Message-ID: <8f857cf7-6e02-ac80-e873-f9c40b743566@oracle.com>
Date: Sun, 1 Mar 2020 11:55:25 -0800
From: santosh.shilimkar@...cle.com
To: Håkon Bugge <haakon.bugge@...cle.com>,
Hillf Danton <hdanton@...a.com>
Cc: syzbot <syzbot+274094e62023782eeb17@...kaller.appspotmail.com>,
davem@...emloft.net, kuba@...nel.org, linux-kernel@...r.kernel.org,
OFED mailing list <linux-rdma@...r.kernel.org>,
netdev@...r.kernel.org, rds-devel@....oracle.com,
syzkaller-bugs@...glegroups.com,
Andy Grover <andy.grover@...cle.com>
Subject: Re: general protection fault in rds_ib_add_one
On 3/1/20 9:46 AM, Håkon Bugge wrote:
>
>
>> On 25 Feb 2020, at 19:05, santosh.shilimkar@...cle.com wrote:
>>
>>
>>
>> On 2/24/20 8:47 PM, Hillf Danton wrote:
>>> On Mon, 24 Feb 2020 09:51:01 -0800 Santosh Shilimkar wrote:
>>>> On 2/24/20 2:39 AM, Hillf Danton wrote:
>>>>>
>>>>> Fall back to NUMA_NO_NODE if needed.
>>> [...]
>>>>>
>>>> This seems good. Can you please post it as a properly formatted patch?
>> Thanks !!
>>
>>> ---8<---
>>> Subject: [PATCH] net/rds: fix gpf in rds_ib_add_one
>>> From: Hillf Danton <hdanton@...a.com>
>>> The devoted syzbot posted a gpf report.
>>> general protection fault, probably for non-canonical address 0xdffffc0000000086: 0000 [#1] PREEMPT SMP KASAN
>>> KASAN: null-ptr-deref in range [0x0000000000000430-0x0000000000000437]
>>> CPU: 0 PID: 8852 Comm: syz-executor043 Not tainted 5.6.0-rc2-syzkaller #0
>>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
>>> RIP: 0010:dev_to_node include/linux/device.h:663 [inline]
>>> RIP: 0010:rds_ib_add_one+0x81/0xe50 net/rds/ib.c:140
>>> Code: b7 a8 06 00 00 4c 89 f0 48 c1 e8 03 42 80 3c 28 00 74 08 4c 89 f7 e8 0e e4 1d fa bb 30 04 00 00 49 03 1e 48 89 d8 48 c1 e8 03 <42> 8a 04 28 84 c0 0f 85 f0 0a 00 00 8b 1b 48 c7 c0 28 0c 09 89 48
>>> RSP: 0018:ffffc90003087298 EFLAGS: 00010202
>>> RAX: 0000000000000086 RBX: 0000000000000430 RCX: 0000000000000000
>>> RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000001
>>> RBP: ffffc900030872f0 R08: ffffffff87964c3c R09: ffffed1014fd109c
>>> R10: ffffed1014fd109c R11: 0000000000000000 R12: 0000000000000000
>>> R13: dffffc0000000000 R14: ffff8880a7e886a8 R15: ffff8880a7e88000
>>> FS: 0000000000c3d880(0000) GS:ffff8880aea00000(0000) knlGS:0000000000000000
>>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> CR2: 00007f0318ed0000 CR3: 00000000a3167000 CR4: 00000000001406f0
>>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>>> Call Trace:
>>> add_client_context+0x482/0x660 drivers/infiniband/core/device.c:681
>>> enable_device_and_get+0x15b/0x370 drivers/infiniband/core/device.c:1316
>>> ib_register_device+0x124d/0x15b0 drivers/infiniband/core/device.c:1382
>>> rxe_register_device+0x3f6/0x530 drivers/infiniband/sw/rxe/rxe_verbs.c:1231
>>> rxe_add+0x1373/0x14f0 drivers/infiniband/sw/rxe/rxe.c:302
>>> rxe_net_add+0x79/0xe0 drivers/infiniband/sw/rxe/rxe_net.c:539
>>> rxe_newlink+0x31/0x90 drivers/infiniband/sw/rxe/rxe.c:318
>>> nldev_newlink+0x403/0x4a0 drivers/infiniband/core/nldev.c:1538
>>> rdma_nl_rcv_msg drivers/infiniband/core/netlink.c:195 [inline]
>>> rdma_nl_rcv_skb drivers/infiniband/core/netlink.c:239 [inline]
>>> rdma_nl_rcv+0x701/0xa20 drivers/infiniband/core/netlink.c:259
>>> netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
>>> netlink_unicast+0x766/0x920 net/netlink/af_netlink.c:1328
>>> netlink_sendmsg+0xa2b/0xd40 net/netlink/af_netlink.c:1917
>>> sock_sendmsg_nosec net/socket.c:652 [inline]
>>> sock_sendmsg net/socket.c:672 [inline]
>>> ____sys_sendmsg+0x4f7/0x7f0 net/socket.c:2343
>>> ___sys_sendmsg net/socket.c:2397 [inline]
>>> __sys_sendmsg+0x1ed/0x290 net/socket.c:2430
>>> __do_sys_sendmsg net/socket.c:2439 [inline]
>>> __se_sys_sendmsg net/socket.c:2437 [inline]
>>> __x64_sys_sendmsg+0x7f/0x90 net/socket.c:2437
>>> do_syscall_64+0xf7/0x1c0 arch/x86/entry/common.c:294
>>> entry_SYSCALL_64_after_hwframe+0x49/0xbe
>>> Fix it by falling back to NUMA_NO_NODE when the IB device has no parent
>>> device while allocating memory slices for send/recv rings, at the cost
>>> of a possible dip in performance.
>>> Reported-by: syzbot <syzbot+274094e62023782eeb17@...kaller.appspotmail.com>
>>> Fixes: e4c52c98e049 ("RDS/IB: add _to_node() macros for numa and use {k,v}malloc_node()")
>>> Cc: Santosh Shilimkar <santosh.shilimkar@...cle.com>
>>> Cc: Andy Grover <andy.grover@...cle.com>
>>> Signed-off-by: Hillf Danton <hdanton@...a.com>
>>> ---
>> Acked-by: Santosh Shilimkar <santosh.shilimkar@...cle.com>
>>
>>> --- a/net/rds/ib.c
>>> +++ b/net/rds/ib.c
>>> @@ -137,7 +137,8 @@ static void rds_ib_add_one(struct ib_dev
>>> return;
>>> rds_ibdev = kzalloc_node(sizeof(struct rds_ib_device), GFP_KERNEL,
>>> - ibdev_to_node(device));
>>> + device->dev.parent ?
>>> + ibdev_to_node(device) : NUMA_NO_NODE);
>
> I would strongly advise applying this fix to the define itself, so that all 4 call sites are covered as well. That is:
>
> #define ibdev_to_node(ibdev) ((ibdev)->dev.parent ? dev_to_node((ibdev)->dev.parent) : NUMA_NO_NODE)
>
Indeed.
Hillf, can you please spin a v2 with it?
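
For reference, a minimal userspace sketch of the macro-level fallback suggested above. The struct layouts, the dev_to_node() body, and the pick_node() helper are simplified stand-ins invented for illustration, not the kernel definitions; only the names ibdev_to_node, dev_to_node and NUMA_NO_NODE come from the thread. The sketch wraps the whole expansion in parentheses so the conditional cannot be split by surrounding operators at a call site.

```c
#include <stddef.h>

/* Stand-in value; assumed here to mean "no NUMA node preference". */
#define NUMA_NO_NODE (-1)

/* Simplified stand-ins for the kernel structures (assumption). */
struct device {
	struct device *parent;
	int numa_node;
};

struct ib_device {
	struct device dev;
};

/* Stand-in for the kernel helper: dereferences dev unconditionally,
 * so passing a NULL parent would fault, as in the syzbot report. */
static inline int dev_to_node(const struct device *dev)
{
	return dev->numa_node;
}

/* Check the parent pointer before dereferencing it and fall back to
 * NUMA_NO_NODE otherwise. The outer parentheses keep the ?: expression
 * intact when the macro is used inside a larger expression. */
#define ibdev_to_node(ibdev) \
	((ibdev)->dev.parent ? dev_to_node((ibdev)->dev.parent) : NUMA_NO_NODE)

/* Hypothetical helper showing a call site; any caller of the macro
 * picks up the fallback without further changes. */
static int pick_node(const struct ib_device *ibdev)
{
	return ibdev_to_node(ibdev);
}
```

With the check inside the macro, a device registered without a parent yields NUMA_NO_NODE at every call site, not only at the one allocation touched by the v1 patch.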