[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-Id: <20171110.145516.1895467386398674445.davem@davemloft.net>
Date: Fri, 10 Nov 2017 14:55:16 +0900 (KST)
From: David Miller <davem@...emloft.net>
To: Haakon.Bugge@...cle.com
Cc: santosh.shilimkar@...cle.com, netdev@...r.kernel.org,
linux-rdma@...r.kernel.org, rds-devel@....oracle.com,
linux-kernel@...r.kernel.org, knut.omang@...cle.com,
wei.lin.guay@...cle.com
Subject: Re: [PATCH net] rds: ib: Fix NULL pointer dereference in debug code
From: Håkon Bugge <Haakon.Bugge@...cle.com>
Date: Tue, 7 Nov 2017 16:33:34 +0100
> rds_ib_recv_refill() is a function that refills an IB receive
> queue. It can be called from both the CQE handler (tasklet) and a
> worker thread.
>
> Just after the call to ib_post_recv(), a debug message is printed with
> rdsdebug():
>
> ret = ib_post_recv(ic->i_cm_id->qp, &recv->r_wr, &failed_wr);
> rdsdebug("recv %p ibinc %p page %p addr %lu ret %d\n", recv,
> recv->r_ibinc, sg_page(&recv->r_frag->f_sg),
> (long) ib_sg_dma_address(
> ic->i_cm_id->device,
> &recv->r_frag->f_sg),
> ret);
>
> Now consider an invocation of rds_ib_recv_refill() from the worker
> thread, which is preemptible. Further, assume that the worker thread
> is preempted between the ib_post_recv() and rdsdebug() statements.
>
> Then, if the preemption is due to a receive CQE event, the
> rds_ib_recv_cqe_handler() will be invoked. This function processes
> receive completions, including freeing up data structures, such as the
> recv->r_frag.
>
> In this scenario, rds_ib_recv_cqe_handler() will process the receive
> WR posted above. That implies, that the recv->r_frag has been freed
> before the above rdsdebug() statement has been executed. When it is
> later executed, we will have a NULL pointer dereference:
...
> This bug was provoked by compiling rds out-of-tree with
> EXTRA_CFLAGS="-DRDS_DEBUG -DDEBUG" and inserting an artificial delay
> between the rdsdebug() and ib_ib_port_recv() statements:
>
> /* XXX when can this fail? */
> ret = ib_post_recv(ic->i_cm_id->qp, &recv->r_wr, &failed_wr);
> + if (can_wait)
> + usleep_range(1000, 5000);
> rdsdebug("recv %p ibinc %p page %p addr %lu ret %d\n", recv,
> recv->r_ibinc, sg_page(&recv->r_frag->f_sg),
> (long) ib_sg_dma_address(
>
> The fix is simply to move the rdsdebug() statement up before the
> ib_post_recv() and remove the printing of ret, which is taken care of
> anyway by the non-debug code.
>
> Signed-off-by: Håkon Bugge <haakon.bugge@...cle.com>
> Reviewed-by: Knut Omang <knut.omang@...cle.com>
> Reviewed-by: Wei Lin Guay <wei.lin.guay@...cle.com>
Applied, thank you.
Powered by blists - more mailing lists