netdev - Re: [PATCH V1 net-next 03/10] net/mlx4_core: Use tasklet for user-space CQ completion events


Message-ID: <1418225599.27198.18.camel@edumazet-glaptop2.roam.corp.google.com>
Date:	Wed, 10 Dec 2014 07:33:19 -0800
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	Or Gerlitz <ogerlitz@...lanox.com>
Cc:	"David S. Miller" <davem@...emloft.net>, netdev@...r.kernel.org,
	Matan Barak <matanb@...lanox.com>,
	Amir Vadai <amirv@...lanox.com>, Tal Alon <talal@...lanox.com>,
	Jack Morgenstein <jackm@....mellanox.co.il>
Subject: Re: [PATCH V1 net-next 03/10] net/mlx4_core: Use tasklet for
 user-space CQ completion events

On Wed, 2014-12-10 at 15:09 +0200, Or Gerlitz wrote:
> From: Matan Barak <matanb@...lanox.com>
> 
> Previously, we've fired all our completion callbacks straight from our ISR.
> 
> Some of those callbacks were lightweight (for example, mlx4_en's and
> IPoIB napi callbacks), but some of them did more work (for example,
> the user-space RDMA stack uverbs' completion handler). Besides that,
> doing more than the minimal work in ISR is generally considered wrong,
> it could even lead to a hard lockup of the system. Since when a lot
> of completion events are generated by the hardware, the loop over those
> events could be so long, that we'll get into a hard lockup by the system
> watchdog.

...

> +#define TASKLET_THRESHOLD 1000
> +
> +void mlx4_cq_tasklet_cb(unsigned long data)
> +{
> +	unsigned long flags;
> +	unsigned int i = 0;
> +	struct mlx4_eq_tasklet *ctx = (struct mlx4_eq_tasklet *)data;
> +	struct mlx4_cq *mcq, *temp;
> +
> +	spin_lock_irqsave(&ctx->lock, flags);
> +	list_splice_tail_init(&ctx->list, &ctx->process_list);
> +	spin_unlock_irqrestore(&ctx->lock, flags);
> +
> +	list_for_each_entry_safe(mcq, temp, &ctx->process_list, tasklet_ctx.list) {
> +		list_del_init(&mcq->tasklet_ctx.list);
> +		mcq->tasklet_ctx.comp(mcq);
> +		if (atomic_dec_and_test(&mcq->refcount))
> +			complete(&mcq->free);
> +		if (++i == TASKLET_THRESHOLD)
> +			break;
> +	}
> +
> +	if (i == TASKLET_THRESHOLD)
> +		tasklet_schedule(&ctx->task);
> +}
> +

What is the maximum duration of running this loop up to 1000 times?

I suspect it might be too long, yet not necessarily detected by the
conventional watchdog.

__do_softirq() uses both a counter and a test against jiffies, with a 2
ms limit.
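
To illustrate the idea of bounding a processing loop by both an item count
and elapsed time, here is a minimal user-space sketch. It is not kernel code
and not the mlx4 implementation: now_ns(), process_bounded(), count_item(),
WORK_BUDGET and TIME_BUDGET_NS are hypothetical names chosen for this example,
and clock_gettime() stands in for the jiffies comparison. The 1000 and 2 ms
values simply mirror the numbers discussed above.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical budgets: 1000 mirrors TASKLET_THRESHOLD, 2 ms mirrors the
 * softirq time limit mentioned above. */
#define WORK_BUDGET    1000
#define TIME_BUDGET_NS 2000000LL

/* Monotonic clock in nanoseconds (user-space stand-in for jiffies). */
static int64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Hypothetical per-item handler type, standing in for a completion callback. */
typedef void (*work_fn)(void *arg);

/* Example handler: just counts how many times it was invoked. */
static void count_item(void *arg)
{
	(*(int *)arg)++;
}

/*
 * Run fn() for up to `pending` items, stopping early once either the item
 * budget or the time budget is exhausted. Returns the number of items
 * handled; if that is less than `pending`, the caller is expected to
 * reschedule the remaining work instead of looping on.
 */
static size_t process_bounded(size_t pending, work_fn fn, void *arg)
{
	int64_t deadline = now_ns() + TIME_BUDGET_NS;
	size_t done = 0;

	assert(fn != NULL);

	while (done < pending) {
		fn(arg);
		done++;
		if (done >= WORK_BUDGET || now_ns() >= deadline)
			break;
	}
	return done;
}
```

With a count-only limit, the worst-case time spent in one pass scales with
the cost of the slowest callback; adding the deadline check caps the pass
regardless of how expensive each callback turns out to be.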

Thanks.

