lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <xm26inxsyfed.fsf@bsegall-linux.mtv.corp.google.com>
Date:	Wed, 01 Jun 2016 10:42:34 -0700
From:	bsegall@...gle.com
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	Vincent Guittot <vincent.guittot@...aro.org>, mingo@...nel.org,
	linux-kernel@...r.kernel.org, pjt@...gle.com, yuyang.du@...el.com,
	dietmar.eggemann@....com
Subject: Re: [RFC PATCH] sched: fix hierarchical order in rq->leaf_cfs_rq_list

Peter Zijlstra <peterz@...radead.org> writes:

> On Tue, May 24, 2016 at 11:55:10AM +0200, Vincent Guittot wrote:
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 218f8e8..6d3fbf2 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -290,15 +290,31 @@ static inline void list_add_leaf_cfs_rq(struct cfs_rq *cfs_rq)
>>  		 * Ensure we either appear before our parent (if already
>>  		 * enqueued) or force our parent to appear after us when it is
>>  		 * enqueued.  The fact that we always enqueue bottom-up
>> +		 * reduces this to two cases and a specila case for the root
>
> 'special'
>
>> +		 * cfs_rq.
>>  		 */
>>  		if (cfs_rq->tg->parent &&
>>  		    cfs_rq->tg->parent->cfs_rq[cpu_of(rq_of(cfs_rq))]->on_list) {
>> +			/* Add the child just before its parent */
>> +			list_add_tail_rcu(&cfs_rq->leaf_cfs_rq_list,
>> +				&(cfs_rq->tg->parent->cfs_rq[cpu_of(rq_of(cfs_rq))]->leaf_cfs_rq_list));
>> +			rq_of(cfs_rq)->leaf_alone = &rq_of(cfs_rq)->leaf_cfs_rq_list;
>> +		} else if (!cfs_rq->tg->parent) {
>> +			/*
>> +			 * cfs_rq without parent should be
>> +			 * at the end of the list
>> +			 */
>>  			list_add_tail_rcu(&cfs_rq->leaf_cfs_rq_list,
>>  				&rq_of(cfs_rq)->leaf_cfs_rq_list);
>> +			rq_of(cfs_rq)->leaf_alone = &rq_of(cfs_rq)->leaf_cfs_rq_list;
>> +		} else {
>> +			/*
>> +			 * Our parent has not already been added so make sure
>> +			 * that it will be put after us
>> +			 */
>> +			list_add_rcu(&cfs_rq->leaf_cfs_rq_list,
>> +				rq_of(cfs_rq)->leaf_alone);
>> +			rq_of(cfs_rq)->leaf_alone = &cfs_rq->leaf_cfs_rq_list;
>>  		}
>>  
>>  		cfs_rq->on_list = 1;
>
> Paul, Ben ?
>
> This used to be critical for update_shares() (now
> update_blocked_averages), but IIRC is not critical for that since
> PELT.

Yeah, given that we no longer update_cfs_shares in that path, it
shouldn't be as important (unless new code is being added that it will
be useful for). That said, I honestly don't remember why we don't
update_cfs_shares, as it could affect the load.weight being used in a
parent's computation. Paul, do you remember? Was it just too expensive
and less necessary given the other improvements?

>
> I find its more readable with like so..
>
>
> Also; I feel the comments can use more love; my head hurts ;-)

Yeah


leaf_alone here is basically a temporary for the duration of an
enqueue_task_fair call, yes? A name suggesting that might be useful, or
else a comment mentioning that one of the first two cases will always
clear the otherwise unsafe reference before it can be a problem.

I think this also only barely works with throttling: even if the tg as a
whole is out of runtime, an individual cfs_rq can't be throttled until
just one line after list_add_cfs_rq, and we never list_del until cgroup
destruction. A throttled !on_list cfs_rq would cause us to never reset
leaf_alone, but I don't think that can quite happen.

>
> ---
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -286,35 +286,38 @@ static inline struct cfs_rq *group_cfs_r
>  static inline void list_add_leaf_cfs_rq(struct cfs_rq *cfs_rq)
>  {
>  	if (!cfs_rq->on_list) {
> +		struct rq *rq = rq_of(cfs_rq);
> +		int cpu = cpu_of(rq);
> +
>  		/*
>  		 * Ensure we either appear before our parent (if already
>  		 * enqueued) or force our parent to appear after us when it is
>  		 * enqueued.  The fact that we always enqueue bottom-up
> -		 * reduces this to two cases and a specila case for the root
> +		 * reduces this to two cases and a special case for the root
>  		 * cfs_rq.
>  		 */
>  		if (cfs_rq->tg->parent &&
> -		    cfs_rq->tg->parent->cfs_rq[cpu_of(rq_of(cfs_rq))]->on_list) {
> +		    cfs_rq->tg->parent->cfs_rq[cpu]->on_list) {
>  			/* Add the child just before its parent */
>  			list_add_tail_rcu(&cfs_rq->leaf_cfs_rq_list,
> -				&(cfs_rq->tg->parent->cfs_rq[cpu_of(rq_of(cfs_rq))]->leaf_cfs_rq_list));
> -			rq_of(cfs_rq)->leaf_alone = &rq_of(cfs_rq)->leaf_cfs_rq_list;
> +				&(cfs_rq->tg->parent->cfs_rq[cpu]->leaf_cfs_rq_list));
> +			rq->leaf_alone = &rq->leaf_cfs_rq_list;
>  		} else if (!cfs_rq->tg->parent) {
>  			/*
>  			 * cfs_rq without parent should be
>  			 * at the end of the list
>  			 */
>  			list_add_tail_rcu(&cfs_rq->leaf_cfs_rq_list,
> -				&rq_of(cfs_rq)->leaf_cfs_rq_list);
> -			rq_of(cfs_rq)->leaf_alone = &rq_of(cfs_rq)->leaf_cfs_rq_list;
> +					  &rq->leaf_cfs_rq_list);
> +			rq->leaf_alone = &rq->leaf_cfs_rq_list;
>  		} else {
>  			/*
>  			 * Our parent has not already been added so make sure
>  			 * that it will be put after us
>  			 */
>  			list_add_rcu(&cfs_rq->leaf_cfs_rq_list,
> -				rq_of(cfs_rq)->leaf_alone);
> -			rq_of(cfs_rq)->leaf_alone = &cfs_rq->leaf_cfs_rq_list;
> +				     rq->leaf_alone);
> +			rq->leaf_alone = &cfs_rq->leaf_cfs_rq_list;
>  		}
>  
>  		cfs_rq->on_list = 1;

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ