lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 7 Nov 2017 08:41:29 -0800
From:   Santosh Shilimkar <santosh.shilimkar@...cle.com>
To:     Håkon Bugge <Haakon.Bugge@...cle.com>
Cc:     "David S . Miller" <davem@...emloft.net>, netdev@...r.kernel.org,
        linux-rdma@...r.kernel.org, rds-devel@....oracle.com,
        linux-kernel@...r.kernel.org, knut.omang@...cle.com,
        wei.lin.guay@...cle.com
Subject: Re: [PATCH net] rds: ib: Fix NULL pointer dereference in debug code

On 11/7/2017 7:33 AM, Håkon Bugge wrote:
> rds_ib_recv_refill() is a function that refills an IB receive
> queue. It can be called from both the CQE handler (tasklet) and a
> worker thread.
> 
> Just after the call to ib_post_recv(), a debug message is printed with
> rdsdebug():
> 
>              ret = ib_post_recv(ic->i_cm_id->qp, &recv->r_wr, &failed_wr);
>              rdsdebug("recv %p ibinc %p page %p addr %lu ret %d\n", recv,
>                       recv->r_ibinc, sg_page(&recv->r_frag->f_sg),
>                       (long) ib_sg_dma_address(
>                              ic->i_cm_id->device,
>                              &recv->r_frag->f_sg),
>                      ret);
> 
> Now consider an invocation of rds_ib_recv_refill() from the worker
> thread, which is preemptible. Further, assume that the worker thread
> is preempted between the ib_post_recv() and rdsdebug() statements.
> 
> Then, if the preemption is due to a receive CQE event, the
> rds_ib_recv_cqe_handler() will be invoked. This function processes
> receive completions, including freeing up data structures, such as the
> recv->r_frag.
> 
> In this scenario, rds_ib_recv_cqe_handler() will process the receive
> WR posted above. That implies, that the recv->r_frag has been freed
> before the above rdsdebug() statement has been executed. When it is
> later executed, we will have a NULL pointer dereference:
>
Yes. Good catch.

> [ 4088.068008] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
> [ 4088.076754] IP: rds_ib_recv_refill+0x87/0x620 [rds_rdma]
> [ 4088.082686] PGD 0 P4D 0
> [ 4088.085515] Oops: 0000 [#1] SMP
> [ 4088.089015] Modules linked in: rds_rdma(OE) rds(OE) rpcsec_gss_krb5(E) nfsv4(E) dns_resolver(E) nfs(E) fscache(E) mlx4_ib(E) ib_ipoib(E) rdma_ucm(E) ib_ucm(E) ib_uverbs(E) ib_umad(E) rdma_cm(E) ib_cm(E) iw_cm(E) ib_core(E) binfmt_misc(E) sb_edac(E) intel_powerclamp(E) coretemp(E) kvm_intel(E) kvm(E) irqbypass(E) crct10dif_pclmul(E) crc32_pclmul(E) ghash_clmulni_intel(E) pcbc(E) aesni_intel(E) crypto_simd(E) iTCO_wdt(E) glue_helper(E) iTCO_vendor_support(E) sg(E) cryptd(E) pcspkr(E) ipmi_si(E) ipmi_devintf(E) ipmi_msghandler(E) shpchp(E) ioatdma(E) i2c_i801(E) wmi(E) lpc_ich(E) mei_me(E) mei(E) mfd_core(E) nfsd(E) auth_rpcgss(E) nfs_acl(E) lockd(E) grace(E) sunrpc(E) ip_tables(E) ext4(E) mbcache(E) jbd2(E) fscrypto(E) mgag200(E) i2c_algo_bit(E) drm_kms_helper(E) syscopyarea(E) sysfillrect(E) sysimgblt(E)
> [ 4088.168486]  fb_sys_fops(E) ahci(E) ixgbe(E) libahci(E) ttm(E) mdio(E) ptp(E) pps_core(E) drm(E) sd_mod(E) libata(E) crc32c_intel(E) mlx4_core(E) i2c_core(E) dca(E) megaraid_sas(E) dm_mirror(E) dm_region_hash(E) dm_log(E) dm_mod(E) [last unloaded: rds]
> [ 4088.193442] CPU: 20 PID: 1244 Comm: kworker/20:2 Tainted: G           OE   4.14.0-rc7.master.20171105.ol7.x86_64 #1
> [ 4088.205097] Hardware name: Oracle Corporation ORACLE SERVER X5-2L/ASM,MOBO TRAY,2U, BIOS 31110000 03/03/2017
> [ 4088.216074] Workqueue: ib_cm cm_work_handler [ib_cm]
> [ 4088.221614] task: ffff885fa11d0000 task.stack: ffffc9000e598000
> [ 4088.228224] RIP: 0010:rds_ib_recv_refill+0x87/0x620 [rds_rdma]
> [ 4088.234736] RSP: 0018:ffffc9000e59bb68 EFLAGS: 00010286
> [ 4088.240568] RAX: 0000000000000000 RBX: ffffc9002115d050 RCX: ffffc9002115d050
> [ 4088.248535] RDX: ffffffffa0521380 RSI: ffffffffa0522158 RDI: ffffffffa0525580
> [ 4088.256498] RBP: ffffc9000e59bbf8 R08: 0000000000000005 R09: 0000000000000000
> [ 4088.264465] R10: 0000000000000339 R11: 0000000000000001 R12: 0000000000000000
> [ 4088.272433] R13: ffff885f8c9d8000 R14: ffffffff81a0a060 R15: ffff884676268000
> [ 4088.280397] FS:  0000000000000000(0000) GS:ffff885fbec80000(0000) knlGS:0000000000000000
> [ 4088.289434] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 4088.295846] CR2: 0000000000000020 CR3: 0000000001e09005 CR4: 00000000001606e0
> [ 4088.303816] Call Trace:
> [ 4088.306557]  rds_ib_cm_connect_complete+0xe0/0x220 [rds_rdma]
> [ 4088.312982]  ? __dynamic_pr_debug+0x8c/0xb0
> [ 4088.317664]  ? __queue_work+0x142/0x3c0
> [ 4088.321944]  rds_rdma_cm_event_handler+0x19e/0x250 [rds_rdma]
> [ 4088.328370]  cma_ib_handler+0xcd/0x280 [rdma_cm]
> [ 4088.333522]  cm_process_work+0x25/0x120 [ib_cm]
> [ 4088.338580]  cm_work_handler+0xd6b/0x17aa [ib_cm]
> [ 4088.343832]  process_one_work+0x149/0x360
> [ 4088.348307]  worker_thread+0x4d/0x3e0
> [ 4088.352397]  kthread+0x109/0x140
> [ 4088.355996]  ? rescuer_thread+0x380/0x380
> [ 4088.360467]  ? kthread_park+0x60/0x60
> [ 4088.364563]  ret_from_fork+0x25/0x30
> [ 4088.368548] Code: 48 89 45 90 48 89 45 98 eb 4d 0f 1f 44 00 00 48 8b 43 08 48 89 d9 48 c7 c2 80 13 52 a0 48 c7 c6 58 21 52 a0 48 c7 c7 80 55 52 a0 <4c> 8b 48 20 44 89 64 24 08 48 8b 40 30 49 83 e1 fc 48 89 04 24
> [ 4088.389612] RIP: rds_ib_recv_refill+0x87/0x620 [rds_rdma] RSP: ffffc9000e59bb68
> [ 4088.397772] CR2: 0000000000000020
> [ 4088.401505] ---[ end trace fe922e6ccf004431 ]---
>
Next time get rid of the timestamp in the stack trace.

> This bug was provoked by compiling rds out-of-tree with
> EXTRA_CFLAGS="-DRDS_DEBUG -DDEBUG" and inserting an artificial delay
> between the rdsdebug() and ib_ib_port_recv() statements:
> 
>     	       /* XXX when can this fail? */
> 	       ret = ib_post_recv(ic->i_cm_id->qp, &recv->r_wr, &failed_wr);
> +		if (can_wait)
> +			usleep_range(1000, 5000);
> 	       rdsdebug("recv %p ibinc %p page %p addr %lu ret %d\n", recv,
> 			recv->r_ibinc, sg_page(&recv->r_frag->f_sg),
> 			(long) ib_sg_dma_address(
> 
> The fix is simply to move the rdsdebug() statement up before the
> ib_post_recv() and remove the printing of ret, which is taken care of
> anyway by the non-debug code.
> 
> Signed-off-by: Håkon Bugge <haakon.bugge@...cle.com>
> Reviewed-by: Knut Omang <knut.omang@...cle.com>
> Reviewed-by: Wei Lin Guay <wei.lin.guay@...cle.com>
> ---
>   net/rds/ib_recv.c | 10 +++++-----
>   1 file changed, 5 insertions(+), 5 deletions(-)
> 
Acked-by: Santosh Shilimkar <santosh.shilimkar@...cle.com>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ