linux-kernel - Re: [PATCH 2/9] IB: add a proper completion queue abstraction

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20151113182513.GB21808@obsidianresearch.com>
Date:	Fri, 13 Nov 2015 11:25:13 -0700
From:	Jason Gunthorpe <jgunthorpe@...idianresearch.com>
To:	Christoph Hellwig <hch@....de>
Cc:	linux-rdma@...r.kernel.org, sagig@....mellanox.co.il,
	bart.vanassche@...disk.com, axboe@...com,
	linux-scsi@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/9] IB: add a proper completion queue abstraction

On Fri, Nov 13, 2015 at 02:46:43PM +0100, Christoph Hellwig wrote:
> This adds an abstraction that allows ULP to simply pass a completion
> object and completion callback with each submitted WR and let the RDMA
> core handle the nitty gritty details of how to handle completion
> interrupts and poll the CQ.

This looks pretty nice, I'd really like to look it over carefully
after SC|15..

I know Bart and others have attempted to have switching between event
and polling driven operation, but there were problems resolving the
races. Would be nice to review that conversation.. Do you remember the
details Bart?

> +static int __ib_process_cq(struct ib_cq *cq, int budget)
> +{
> +	int i, n, completed = 0;
> +
> +	while ((n = ib_poll_cq(cq, IB_POLL_BATCH, cq->wc)) > 0) {
> +		completed += n;
> +		if (completed >= budget)
> +			break;

For instance, like this, not fulling draining the cq and then doing:

> +	completed = __ib_process_cq(cq, budget);
> +	if (completed < budget) {
> +		irq_poll_complete(&cq->iop);
> +		if (ib_req_notify_cq(cq, IB_POLL_FLAGS) > 0) {

Doesn't seem entirely right? There is no point in calling
ib_req_notify_cq if the code knows there is still stuff in the CQ and
has already, independently, arranged for ib_poll_hander to be
guarenteed called.

> +			if (!irq_poll_sched_prep(&cq->iop))
> +				irq_poll_sched(&cq->iop);

Which, it seems, is what this is doing.

Assuming irq_poll_sched is safe to call from a hard irq context, this
looks sane, at first glance.

> +	completed = __ib_process_cq(cq, IB_POLL_BUDGET_WORKQUEUE);
> +	if (completed >= IB_POLL_BUDGET_WORKQUEUE ||
> +	    ib_req_notify_cq(cq, IB_POLL_FLAGS) > 0)
> +		queue_work(ib_comp_wq, &cq->work);

Same comment here..

> +static void ib_cq_completion_workqueue(struct ib_cq *cq, void *private)
> +{
> +	queue_work(ib_comp_wq, &cq->work);

> +	switch (cq->poll_ctx) {
> +	case IB_POLL_DIRECT:
> +		cq->comp_handler = ib_cq_completion_direct;
> +		break;
> +	case IB_POLL_SOFTIRQ:
> +		cq->comp_handler = ib_cq_completion_softirq;
> +
> +		irq_poll_init(&cq->iop, IB_POLL_BUDGET_IRQ, ib_poll_handler);
> +		irq_poll_enable(&cq->iop);
> +		ib_req_notify_cq(cq, IB_CQ_NEXT_COMP);
> +		break;

I understand several drivers are not using a hard irq context for the
comp_handler call back. Is there any way to exploit that in this new
API so we don't have to do so many context switches? Ie if the driver
already is using a softirq when calling comp_handler can we somehow
just rig ib_poll_handler directly and avoid the overhead? (Future)

At first glance this seems so much saner than what we have..

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/