linux-kernel - Re: [PATCH RFC] sched/rt: preserve global runtime/period ratio in do_balance

Message-ID: <519C7AAC.1010707@linux.vnet.ibm.com>
Date:	Wed, 22 May 2013 15:58:36 +0800
From:	Michael Wang <wangyun@...ux.vnet.ibm.com>
To:	Peter Boonstoppel <pboonstoppel@...dia.com>
CC:	Ingo Molnar <mingo@...hat.com>,
	Peter Zijlstra <peterz@...radead.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Paul Walmsley <pwalmsley@...dia.com>
Subject: Re: [PATCH RFC] sched/rt: preserve global runtime/period ratio in
 do_balance_runtime()

Hi, Peter

On 05/22/2013 05:30 AM, Peter Boonstoppel wrote:
> RT throttling aims to prevent starvation of non-SCHED_FIFO threads
> when a rogue RT thread is hogging the CPU. It does so by piggybacking
> on the rt_bandwidth system and allocating at most rt_runtime per
> rt_period to SCHED_FIFO tasks (e.g. 950ms out of every second,
> allowing 'regular' tasks to run for at least 50ms every second).
> 
> However, when multiple cores are available, rt_bandwidth allows cores
> to borrow rt_runtime from one another. This means that a core with a
> rogue RT thread, consuming 100% CPU cycles, can borrow enough runtime
> from other cores to allow the RT thread to run continuously, with no
> runtime for regular tasks on this core.

IMHO, this kind of starvation should be attributed to the admin...

Reserving CPU time makes "realtime" a misnomer, and then the admin will
blame the scheduler when his RT task sees higher latency...

Regards,
Michael Wang

> 
> Although regular tasks can get scheduled on other available cores
> (which are guaranteed to have some non-RT runtime available, since they
> just lent some RT time to us), tasks that are specifically affined to
> a particular core may not be able to make progress (e.g. workqueues,
> timer functions). This can break e.g. watchdog-like functionality that
> is supposed to kill the rogue RT thread.
> 
> This patch changes do_balance_runtime() in such a way that no core can
> acquire (borrow) more runtime than the globally set rt_runtime /
> rt_period ratio. This guarantees there will always be some non-RT
> runtime available on every individual core.
> 
> Signed-off-by: Peter Boonstoppel <pboonstoppel@...dia.com>
> ---
>  kernel/sched/rt.c |   21 ++++++++++++++++++---
>  1 files changed, 18 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
> index 127a2c4..5ec4eab 100644
> --- a/kernel/sched/rt.c
> +++ b/kernel/sched/rt.c
> @@ -571,11 +571,25 @@ static int do_balance_runtime(struct rt_rq *rt_rq)
>  	struct root_domain *rd = rq_of_rt_rq(rt_rq)->rd;
>  	int i, weight, more = 0;
>  	u64 rt_period;
> +	u64 max_runtime;
> 
>  	weight = cpumask_weight(rd->span);
> 
>  	raw_spin_lock(&rt_b->rt_runtime_lock);
>  	rt_period = ktime_to_ns(rt_b->rt_period);
> +
> +	/* Don't allow more runtime than global ratio */
> +	if (global_rt_runtime() == RUNTIME_INF)
> +		max_runtime = rt_period;
> +	else
> +		max_runtime = div64_u64(global_rt_runtime() * rt_period,
> +					global_rt_period());
> +
> +	if (rt_rq->rt_runtime >= max_runtime) {
> +		raw_spin_unlock(&rt_b->rt_runtime_lock);
> +		return more;
> +	}
> +
>  	for_each_cpu(i, rd->span) {
>  		struct rt_rq *iter = sched_rt_period_rt_rq(rt_b, i);
>  		s64 diff;
> @@ -592,6 +606,7 @@ static int do_balance_runtime(struct rt_rq *rt_rq)
>  		if (iter->rt_runtime == RUNTIME_INF)
>  			goto next;
> 
> +
>  		/*
>  		 * From runqueues with spare time, take 1/n part of their
>  		 * spare time, but no more than our period.
> @@ -599,12 +614,12 @@ static int do_balance_runtime(struct rt_rq *rt_rq)
>  		diff = iter->rt_runtime - iter->rt_time;
>  		if (diff > 0) {
>  			diff = div_u64((u64)diff, weight);
> -			if (rt_rq->rt_runtime + diff > rt_period)
> -				diff = rt_period - rt_rq->rt_runtime;
> +			if (rt_rq->rt_runtime + diff > max_runtime)
> +				diff = max_runtime - rt_rq->rt_runtime;
>  			iter->rt_runtime -= diff;
>  			rt_rq->rt_runtime += diff;
>  			more = 1;
> -			if (rt_rq->rt_runtime == rt_period) {
> +			if (rt_rq->rt_runtime == max_runtime) {
>  				raw_spin_unlock(&iter->rt_runtime_lock);
>  				break;
>  			}
> 

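To make the arithmetic in the quoted patch easier to follow, here is a
minimal user-space sketch of the borrowing loop. It is not kernel code and
makes several assumptions: struct model_rq, max_runtime_ns(),
balance_runtime() and demo_cpu0_budget() are hypothetical names, locking and
the RUNTIME_INF case are left out, and all four modelled CPUs start with the
950ms/1000ms values from the patch description and no RT time consumed.

```c
#include <assert.h>
#include <stdint.h>

#define NCPU          4
#define NSEC_PER_MSEC 1000000ULL

/* Hypothetical stand-in for the rt_rq fields the patch touches. */
struct model_rq {
	uint64_t rt_runtime;	/* RT budget for this period, in ns */
	uint64_t rt_time;	/* RT time already consumed, in ns */
};

/* Per-CPU ceiling: the global runtime/period ratio scaled to 'period'. */
static uint64_t max_runtime_ns(uint64_t global_runtime,
			       uint64_t global_period, uint64_t period)
{
	return global_runtime * period / global_period;
}

/*
 * Borrow spare budget for rq[self] from the other CPUs, never exceeding
 * 'cap'. Returns 1 if anything was borrowed, 0 otherwise.
 */
static int balance_runtime(struct model_rq *rq, int self, uint64_t cap)
{
	int i, more = 0;

	/* Already at the ceiling: nothing may be borrowed. */
	if (rq[self].rt_runtime >= cap)
		return 0;

	for (i = 0; i < NCPU; i++) {
		int64_t diff;

		if (i == self)
			continue;

		diff = (int64_t)(rq[i].rt_runtime - rq[i].rt_time);
		if (diff > 0) {
			uint64_t take = (uint64_t)diff / NCPU; /* 1/n of the spare time */

			if (rq[self].rt_runtime + take > cap)
				take = cap - rq[self].rt_runtime;
			rq[i].rt_runtime -= take;
			rq[self].rt_runtime += take;
			more = 1;
			if (rq[self].rt_runtime == cap)
				break;
		}
	}
	return more;
}

/* Run one balancing pass for CPU 0 and return its resulting budget. */
static uint64_t demo_cpu0_budget(uint64_t cap)
{
	struct model_rq rq[NCPU];
	uint64_t total = 0;
	int i;

	for (i = 0; i < NCPU; i++) {
		rq[i].rt_runtime = 950 * NSEC_PER_MSEC;
		rq[i].rt_time = 0;
	}

	balance_runtime(rq, 0, cap);

	/* Borrowing only moves budget around; the sum must not change. */
	for (i = 0; i < NCPU; i++)
		total += rq[i].rt_runtime;
	assert(total == NCPU * 950 * NSEC_PER_MSEC);

	return rq[0].rt_runtime;
}
```

In this model, passing the full period (1000ms) as the cap, as the code did
before the patch, lets CPU 0 grow its budget to the whole period by taking
50ms from a neighbour. Passing the value from max_runtime_ns() instead
(950ms here) leaves CPU 0 at 950ms, so 50ms per period stays available for
non-RT tasks on that core.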