Message-ID: <20160413182025.GN8697@intel.com>
Date: Thu, 14 Apr 2016 02:20:25 +0800
From: Yuyang Du <yuyang.du@...el.com>
To: Vincent Guittot <vincent.guittot@...aro.org>
Cc: Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...nel.org>,
linux-kernel <linux-kernel@...r.kernel.org>,
Benjamin Segall <bsegall@...gle.com>,
Paul Turner <pjt@...gle.com>,
Morten Rasmussen <morten.rasmussen@....com>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Juri Lelli <juri.lelli@....com>
Subject: Re: [PATCH 4/4] sched/fair: Implement flat hierarchical structure for util_avg
Hi Vincent,
On Wed, Apr 13, 2016 at 01:27:21PM +0200, Vincent Guittot wrote:
> >> Why not using the sched_avg of the rq->cfs in order to track the
> >> utilization of the root cfs_rq instead of adding a new sched_avg into
> >> the rq ? Then you call update_cfs_rq_load_avg(rq->cfs) when you want
> >> to update/sync the utilization of the rq->cfs and for one call you
> >> will update both the load_avg and the util_avg of the root cfs instead
> >> of duplicating the sequence in _update_load_avg
> >
> > This is the approach taken by Dietmar in his patch, a fairly easy approach.
> > The problem is though, if so, we update the root cfs_rq only when it is
> > the root cfs_rq to update. A simple contrived case will make it never
> > updated except in update_blocked_averages(). My impression is that this
> > might be too much precision lost.
> >
> > And thus we take this alternative approach, and thus I revisited
> > __update_load_avg() to optimize it.
> >
> > [snip]
> >
> >> > - if (atomic_long_read(&cfs_rq->removed_util_avg)) {
> >> > - long r = atomic_long_xchg(&cfs_rq->removed_util_avg, 0);
> >> > - sa->util_avg = max_t(long, sa->util_avg - r, 0);
> >> > - sa->util_sum = max_t(s32, sa->util_sum - r * LOAD_AVG_MAX, 0);
> >> > + if (atomic_long_read(&rq->removed_util_avg)) {
> >> > + long r = atomic_long_xchg(&rq->removed_util_avg, 0);
> >> > + rq->avg.util_avg = max_t(long, rq->avg.util_avg - r, 0);
> >> > + rq->avg.util_sum = max_t(s32, rq->avg.util_sum - r * LOAD_AVG_MAX, 0);
> >>
> >> I see one potential issue here because the rq->util_avg may (surely)
> >> have been already updated and decayed during the update of a
> >> sched_entity but before we subtract the removed_util_avg
> >
> > This is the same now, because cfs_rq will be regularly updated in
> > update_blocked_averages(), which basically means cfs_rq will be newer
> > than task for sure, although task tries to catch up when removed.
>
> I don't agree with that part. Right now, we check and subtract
> removed_util_avg before calling __update_load_avg for a cfs_rq, so it
> will be removed before changing last_update_time.
Setting aside the cross-CPU issue, you are right.
> With your patch, we update rq->avg.util_avg and last_update_time
> without checking removed_util_avg.
But, yes, we do.
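
To make the ordering question concrete, here is a toy model of the two sequences being discussed: subtracting the removed utilization before the average is decayed, versus after. This is only a sketch and not kernel code. The struct, the toy_* helpers, and the halve-per-step decay are all hypothetical stand-ins for sched_avg, removed_util_avg, and __update_load_avg().

```c
/* Toy model only: hypothetical names, not the kernel's data structures. */
struct toy_avg {
	long util_avg;                  /* current utilization average */
	long removed_util_avg;          /* utilization of departed tasks, not yet subtracted */
	unsigned long last_update_time; /* time of last decay, in toy half-life units */
};

static long toy_max(long a, long b)
{
	return a > b ? a : b;
}

/* Halve once per elapsed step; a crude stand-in for the real geometric decay. */
static long toy_decay(long val, unsigned long steps)
{
	while (steps-- && val)
		val /= 2;
	return val;
}

/* Subtract the pending removed utilization, clamping at zero. */
static void toy_flush_removed(struct toy_avg *a)
{
	long r = a->removed_util_avg;

	a->removed_util_avg = 0;
	a->util_avg = toy_max(a->util_avg - r, 0);
}

/* Decay the average up to 'now' and advance last_update_time. */
static void toy_update(struct toy_avg *a, unsigned long now)
{
	a->util_avg = toy_decay(a->util_avg, now - a->last_update_time);
	a->last_update_time = now;
}

/* Sequence 1: subtract first, then decay (works on a copy of 'a'). */
long toy_subtract_then_decay(struct toy_avg a, unsigned long now)
{
	toy_flush_removed(&a);
	toy_update(&a, now);
	return a.util_avg;
}

/* Sequence 2: decay first, then subtract the undecayed removed value. */
long toy_decay_then_subtract(struct toy_avg a, unsigned long now)
{
	toy_update(&a, now);
	toy_flush_removed(&a);
	return a.util_avg;
}
```

With util_avg = 800, removed_util_avg = 300 and one elapsed step, the first sequence yields 250 and the second yields 100. In this toy model, the undecayed removed value is subtracted from an already decayed average in the second sequence, so the result is lower.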