Message-ID: <20190130130620.GB3103@hirez.programming.kicks-ass.net>
Date:   Wed, 30 Jan 2019 14:06:20 +0100
From:   Peter Zijlstra <peterz@...radead.org>
To:     Vincent Guittot <vincent.guittot@...aro.org>
Cc:     linux-kernel@...r.kernel.org, mingo@...hat.com, tj@...nel.org,
        sargun@...gun.me
Subject: Re: [PATCH v2] sched/fair: Fix insertion in rq->leaf_cfs_rq_list

On Wed, Jan 30, 2019 at 02:04:10PM +0100, Peter Zijlstra wrote:
> On Wed, Jan 30, 2019 at 06:22:47AM +0100, Vincent Guittot wrote:
> 
> > The algorithm used to order cfs_rq in rq->leaf_cfs_rq_list assumes that
> > it will walk down to the root the first time a cfs_rq is used, and that we
> > will end up adding either a cfs_rq without a parent or a cfs_rq whose
> > parent is already on the list. But this is not always true in the presence
> > of throttling: a cfs_rq can be throttled even if it has never been used,
> > because other CPUs of the cgroup have already used all the bandwidth, so
> > we are not guaranteed to walk down to the root and add all cfs_rq to the list.
> > 
> > Ensure that all cfs_rq are added to the list even if they are throttled.
> 
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index e2ff4b6..826fbe5 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -352,6 +352,20 @@ static inline void list_del_leaf_cfs_rq(struct cfs_rq *cfs_rq)
> >  	}
> >  }
> >  
> > +static inline void list_add_branch_cfs_rq(struct sched_entity *se, struct rq *rq)
> > +{
> > +	struct cfs_rq *cfs_rq;
> > +
> > +	for_each_sched_entity(se) {
> > +		cfs_rq = cfs_rq_of(se);
> > +		list_add_leaf_cfs_rq(cfs_rq);
> > +
> > +		/* If parent is already in the list, we can stop */
> > +		if (rq->tmp_alone_branch == &rq->leaf_cfs_rq_list)
> > +			break;
> > +	}
> > +}
> > +
> >  /* Iterate through all leaf cfs_rq's on a runqueue: */
> >  #define for_each_leaf_cfs_rq(rq, cfs_rq) \
> >  	list_for_each_entry_rcu(cfs_rq, &rq->leaf_cfs_rq_list, leaf_cfs_rq_list)
> 
> > @@ -5179,6 +5197,9 @@ enqueue_task_fair(struct rq *rq, struct task_struct *p, int flags)
> >  
> >  	}
> >  
> > +	/* Ensure that all cfs_rq have been added to the list */
> > +	list_add_branch_cfs_rq(se, rq);
> > +
> >  	hrtick_update(rq);
> >  }
> 
> So I don't much like this; at all. But maybe I misunderstand, this is
> somewhat tricky stuff and I've not looked at it in a while.
> 
> So per normal we do:
> 
> 	enqueue_task_fair()
> 	  for_each_sched_entity() {
> 	    if (se->on_rq)
> 	      break;
> 	    enqueue_entity()
> 	      list_add_leaf_cfs_rq();
> 	  }
> 
> This ensures that all parents are already enqueued, right? Because this
> is what enqueues those parents.
> 
> And in this case you add an unconditional second
> for_each_sched_entity(); even though it is completely redundant, afaict.

Ah, it doesn't do a second iteration; it continues where the previous
two loops left off.
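
That is, roughly (sketching the current enqueue path from memory, in the
same style as above; the exact throttle checks are approximate), both
loops in enqueue_task_fair() can bail out early, and the new helper
simply walks on up from whatever se they stopped at:

	enqueue_task_fair()
	  for_each_sched_entity() {		/* first loop */
	    if (se->on_rq)
	      break;
	    enqueue_entity()
	      list_add_leaf_cfs_rq();
	    if (cfs_rq_throttled(cfs_rq))
	      break;				/* parents above never get added */
	  }
	  for_each_sched_entity() {		/* second loop */
	    if (cfs_rq_throttled(cfs_rq))
	      break;
	    ...
	  }
	  list_add_branch_cfs_rq(se, rq);	/* picks up from where se stopped */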

Still, why isn't this in unthrottle?

> The problem seems to stem from the whole throttled crud, which (also)
> breaks the above enqueue loop on throttle state, and there the parent
> can go missing.
> 
> So why doesn't this live in unthrottle_cfs_rq() ?
> 
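Something like the below is what I'd have expected instead (a rough,
untested sketch only, reusing the list_add_branch_cfs_rq() helper from
the patch; the surrounding unthrottle_cfs_rq() detail is from memory):

	void unthrottle_cfs_rq(struct cfs_rq *cfs_rq)
	{
		struct rq *rq = rq_of(cfs_rq);
		struct sched_entity *se = cfs_rq->tg->se[cpu_of(rq)];

		...

		for_each_sched_entity(se) {
			if (se->on_rq)
				enqueue = 0;

			cfs_rq = cfs_rq_of(se);
			if (enqueue)
				enqueue_entity(cfs_rq, se, ENQUEUE_WAKEUP);
			cfs_rq->h_nr_running += task_delta;

			if (cfs_rq_throttled(cfs_rq))
				break;
		}

		/*
		 * This loop can (again) bail early on a throttled parent;
		 * make sure the remainder of the branch still ends up on
		 * rq->leaf_cfs_rq_list.
		 */
		list_add_branch_cfs_rq(se, rq);

		...
	}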
