Message-ID: <YZZvMbX719ZKS0CQ@hirez.programming.kicks-ass.net>
Date: Thu, 18 Nov 2021 16:20:17 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Li Hua <hucool.lihua@...wei.com>
Cc: Ingo Molnar <mingo@...hat.com>, Juri Lelli <juri.lelli@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>,
yuehaibing@...wei.com, weiyongjun1@...wei.com,
Dietmar Eggemann <dietmar.eggemann@....com>,
Steven Rostedt <rostedt@...dmis.org>,
Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
Daniel Bristot de Oliveira <bristot@...hat.com>,
linux-kernel@...r.kernel.org, w.f@...wei.com,
cj.chengjian@...wei.com, judy.chenhui@...wei.com
Subject: Re: [PATCH -next 1/1] sched/rt: Try to restart rt period timer when
rt runtime exceeded
On Mon, Nov 15, 2021 at 01:46:28AM +0000, Li Hua wrote:
> When rt_runtime is modified from -1 to a valid control value, it can
> leave a task throttled forever. The following sequence triggers the bug:
> 1. echo -1 > /proc/sys/kernel/sched_rt_runtime_us
> 2. Run a FIFO task named A that executes while(1)
> 3. echo 950000 > /proc/sys/kernel/sched_rt_runtime_us
>
> When rt_runtime is -1, the rt period timer is not activated when task A
> is enqueued. After rt_runtime is set to 950,000 the task gets throttled,
> and it stays throttled forever because the rt period timer was never
> started.
>
> Reported-by: Hulk Robot <hulkci@...wei.com>
> Signed-off-by: Li Hua <hucool.lihua@...wei.com>
> ---
> kernel/sched/rt.c | 26 +++++++++++++++++++++++++-
> 1 file changed, 25 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
> index bb945f8faeca..630f2cbe37d0 100644
> --- a/kernel/sched/rt.c
> +++ b/kernel/sched/rt.c
> @@ -947,6 +947,23 @@ static inline int rt_se_prio(struct sched_rt_entity *rt_se)
> return rt_task_of(rt_se)->prio;
> }
>
> +static inline void try_start_rt_bandwidth(struct rt_bandwidth *rt_b)
> +{
> + raw_spin_lock(&rt_b->rt_runtime_lock);
> + if (!rt_bandwidth_enabled() || rt_b->rt_runtime == RUNTIME_INF) {
> + raw_spin_unlock(&rt_b->rt_runtime_lock);
> + return;
> + }
> +
> + if (!rt_b->rt_period_active) {
> + rt_b->rt_period_active = 1;
> + hrtimer_forward_now(&rt_b->rt_period_timer, rt_b->rt_period);
> + hrtimer_start_expires(&rt_b->rt_period_timer,
> + HRTIMER_MODE_ABS_PINNED_HARD);
> + }
> + raw_spin_unlock(&rt_b->rt_runtime_lock);
> +}
This is almost a verbatim copy of start_rt_bandwidth(); surely we can do
better.
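One way to avoid the duplication would be to split start_rt_bandwidth() into its enabled-check plus a common body that arms the timer, and let the throttle path call the same entry point. The sketch below illustrates only that shape: it is a compilable userspace stand-in, with the kernel types, locking, and hrtimer calls stubbed out as comments (the counter field is hypothetical, added just to make the behaviour observable).

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-in for struct rt_bandwidth; timer_starts is a
 * hypothetical counter replacing the real hrtimer machinery. */
struct rt_bandwidth {
	long long rt_runtime;	/* RUNTIME_INF when unlimited */
	int rt_period_active;	/* is the period timer armed? */
	int timer_starts;	/* how often the timer was armed */
};

#define RUNTIME_INF	((long long)~0ULL >> 1)

static bool rt_bandwidth_enabled(void) { return true; }

/* Common body: arm the period timer exactly once. */
static void do_start_rt_bandwidth(struct rt_bandwidth *rt_b)
{
	/* raw_spin_lock(&rt_b->rt_runtime_lock); */
	if (!rt_b->rt_period_active) {
		rt_b->rt_period_active = 1;
		/* hrtimer_forward_now() + hrtimer_start_expires() */
		rt_b->timer_starts++;
	}
	/* raw_spin_unlock(&rt_b->rt_runtime_lock); */
}

/* Single entry point: skips arming when bandwidth control is off,
 * so both the enqueue path and the throttle path can call it. */
static void start_rt_bandwidth(struct rt_bandwidth *rt_b)
{
	if (!rt_bandwidth_enabled() || rt_b->rt_runtime == RUNTIME_INF)
		return;
	do_start_rt_bandwidth(rt_b);
}
```

With this split there is no second near-identical function to keep in sync: the throttled path simply calls start_rt_bandwidth() and the "already active" check in the common body keeps repeated calls harmless.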
> +
> static int sched_rt_runtime_exceeded(struct rt_rq *rt_rq)
> {
> u64 runtime = sched_rt_runtime(rt_rq);
> @@ -1027,11 +1044,16 @@ static void update_curr_rt(struct rq *rq)
> struct rt_rq *rt_rq = rt_rq_of_se(rt_se);
>
> if (sched_rt_runtime(rt_rq) != RUNTIME_INF) {
> + int exceeded;
> +
> raw_spin_lock(&rt_rq->rt_runtime_lock);
> rt_rq->rt_time += delta_exec;
> - if (sched_rt_runtime_exceeded(rt_rq))
> + exceeded = sched_rt_runtime_exceeded(rt_rq);
> + if (exceeded)
> resched_curr(rq);
> raw_spin_unlock(&rt_rq->rt_runtime_lock);
> + if (exceeded)
> + try_start_rt_bandwidth(sched_rt_bandwidth(rt_rq));
> }
> }
> }
> @@ -2905,8 +2927,10 @@ static int sched_rt_global_validate(void)
>
> static void sched_rt_do_global(void)
> {
> + raw_spin_lock(&def_rt_bandwidth.rt_runtime_lock);
> def_rt_bandwidth.rt_runtime = global_rt_runtime();
> def_rt_bandwidth.rt_period = ns_to_ktime(global_rt_period());
> + raw_spin_unlock(&def_rt_bandwidth.rt_runtime_lock);
And that's just wrong, I think; did you test this with lockdep enabled?
IIRC this lock is irq-safe; it has to be if you're taking it from a timer
context.
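Concretely, since rt_runtime_lock is taken from hrtimer context, any sleepable-context user would at minimum need the irq-safe lock variants. A rough, not compile-tested sketch of what the hunk above would have to look like (names taken from the quoted patch; whether serializing here is needed at all is a separate question):

```c
static void sched_rt_do_global(void)
{
	unsigned long flags;

	raw_spin_lock_irqsave(&def_rt_bandwidth.rt_runtime_lock, flags);
	def_rt_bandwidth.rt_runtime = global_rt_runtime();
	def_rt_bandwidth.rt_period = ns_to_ktime(global_rt_period());
	raw_spin_unlock_irqrestore(&def_rt_bandwidth.rt_runtime_lock, flags);
}
```

Taking the plain raw_spin_lock() here, as the patch does, would let the timer interrupt on the same CPU deadlock against it, which is exactly what lockdep should flag.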