linux-kernel - Re: [PATCH v2] sched/task_group: Re-layout structure to reduce false sharing

Message-ID: <20230627101053.GX4253@hirez.programming.kicks-ass.net>
Date:   Tue, 27 Jun 2023 12:10:53 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     Aaron Lu <aaron.lu@...el.com>
Cc:     Chen Yu <yu.c.chen@...el.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Ingo Molnar <mingo@...hat.com>,
        Juri Lelli <juri.lelli@...hat.com>,
        Deng Pan <pan.deng@...el.com>, tim.c.chen@...el.com,
        linux-kernel@...r.kernel.org, tianyou.li@...el.com,
        yu.ma@...el.com, lipeng.zhu@...el.com,
        Tim Chen <tim.c.chen@...ux.intel.com>
Subject: Re: [PATCH v2] sched/task_group: Re-layout structure to reduce false
 sharing

On Mon, Jun 26, 2023 at 08:53:35PM +0800, Aaron Lu wrote:
> On Mon, Jun 26, 2023 at 03:52:17PM +0800, Chen Yu wrote:
> > Besides the cache line alignment, if the task is not a rt one,
> > why do we have to touch that, I wonder if the following change can avoid that:
> > 
> > diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> > index ec7b3e0a2b20..067f1310bad2 100644
> > --- a/kernel/sched/sched.h
> > +++ b/kernel/sched/sched.h
> > @@ -1958,8 +1958,10 @@ static inline void set_task_rq(struct task_struct *p, unsigned int cpu)
> >  #endif
> >  
> >  #ifdef CONFIG_RT_GROUP_SCHED
> > -	p->rt.rt_rq  = tg->rt_rq[cpu];
> > -	p->rt.parent = tg->rt_se[cpu];
> > +	if (p->sched_class = &rt_sched_class) {
>                            ==  :-)
> 
> > +		p->rt.rt_rq  = tg->rt_rq[cpu];
> > +		p->rt.parent = tg->rt_se[cpu];
> > +	}
> >  #endif
> >  }
> 
> If a task starts life as a SCHED_NORMAL one and is later changed to
> an RT one, then during its next ttwu(), if it didn't migrate,
> set_task_rq() will not be called and p->rt.rt_rq will stay NULL,
> which will cause a problem when this task gets enqueued as an RT
> one.
> 
> The following diff seems to cure this issue:
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index c7db597e8175..8c57148e668c 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -7801,6 +7801,20 @@ static int __sched_setscheduler(struct task_struct *p,
>  	}
>  	__setscheduler_uclamp(p, attr);
>  
> +#ifdef CONFIG_RT_GROUP_SCHED
> +	/*
> +	 * Make sure when this task becomes a rt one,
> +	 * its rt fields have valid value.
> +	 */
> +	if (rt_prio(newprio)) {
> +		struct task_group *tg = task_group(p);
> +		int cpu = cpu_of(rq);
> +
> +		p->rt.rt_rq = tg->rt_rq[cpu];
> +		p->rt.parent = tg->rt_se[cpu];
> +	}
> +#endif
> +
>  	if (queued) {
>  		/*
>  		 * We enqueue to tail when the priority of a task is
> 
> But I'm not sure if it's worth the trouble.

Not sufficient: a task can become RT through PI without passing
through __sched_setscheduler().

The common code path in this case would be check_class_changed(), which
is called for both PI and __sched_setscheduler().

Anyway, not against this per se, but RT_GROUP_SCHED is utter shite and
nobody should be using it. Also, if there's no measurable performance
gain (as stated elsewhere IIRC) we shouldn't be adding complexity.
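
To illustrate the point about check_class_changed() above, here is a
minimal userspace sketch, not kernel code. It assumes the RT fields are
filled in from one hook that both the explicit policy change and the PI
boost pass through. Every model_* name is a hypothetical stand-in
invented for this example; the real structures, locking and the actual
check_class_changed() signature are not reproduced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_CPUS 4

/* Hypothetical stand-ins for kernel types; not the real definitions. */
struct model_rt_rq { int cpu; };
struct model_rt_se { int cpu; };

struct model_task_group {
	struct model_rt_rq *rt_rq[MODEL_NR_CPUS];
	struct model_rt_se *rt_se[MODEL_NR_CPUS];
};

enum model_class { MODEL_FAIR, MODEL_RT };

struct model_task {
	enum model_class class;
	struct model_task_group *tg;
	int cpu;
	struct model_rt_rq *rt_rq;	/* models p->rt.rt_rq */
	struct model_rt_se *parent;	/* models p->rt.parent */
};

/* Models set_task_rq() with the proposed check: only RT tasks touch tg->rt_*. */
static void model_set_task_rq(struct model_task *p, int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	p->cpu = cpu;
	if (p->class == MODEL_RT) {
		p->rt_rq = p->tg->rt_rq[cpu];
		p->parent = p->tg->rt_se[cpu];
	}
}

/* Assumed common hook: fill the RT fields whenever a task becomes RT. */
static void model_check_class_changed(struct model_task *p,
				      enum model_class prev)
{
	if (prev != p->class && p->class == MODEL_RT) {
		p->rt_rq = p->tg->rt_rq[p->cpu];
		p->parent = p->tg->rt_se[p->cpu];
	}
}

/* Path 1: explicit policy change, modelling __sched_setscheduler(). */
static void model_setscheduler(struct model_task *p, enum model_class new_class)
{
	enum model_class prev = p->class;

	p->class = new_class;
	model_check_class_changed(p, prev);
}

/* Path 2: priority-inheritance boost, which bypasses path 1. */
static void model_pi_boost(struct model_task *p)
{
	enum model_class prev = p->class;

	p->class = MODEL_RT;
	model_check_class_changed(p, prev);
}

/* True when the task could be enqueued as RT without a NULL dereference. */
static bool model_rt_fields_valid(const struct model_task *p)
{
	return p->rt_rq != NULL && p->parent != NULL;
}
```

In this model a task that starts as MODEL_FAIR keeps NULL RT fields
after model_set_task_rq(), and gets valid ones from either
model_setscheduler() or model_pi_boost(), because both end in the same
hook. Whether that is worth doing in the kernel is a separate question.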