[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CABk29Nuyp=Ba=qiJAospx-SR2ZQM9GrKW0pDUeJ3sfgNB4uLFg@mail.gmail.com>
Date: Mon, 21 Nov 2022 11:37:14 -0800
From: Josh Don <joshdon@...gle.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Chengming Zhou <zhouchengming@...edance.com>,
Ingo Molnar <mingo@...hat.com>,
Juri Lelli <juri.lelli@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Steven Rostedt <rostedt@...dmis.org>,
Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
Daniel Bristot de Oliveira <bristot@...hat.com>,
Valentin Schneider <vschneid@...hat.com>,
linux-kernel@...r.kernel.org, Tejun Heo <tj@...nel.org>,
Michal Koutný <mkoutny@...e.com>,
Christian Brauner <brauner@...nel.org>,
Zefan Li <lizefan.x@...edance.com>
Subject: Re: [PATCH v3] sched: async unthrottling for cfs bandwidth
On Mon, Nov 21, 2022 at 3:58 AM Peter Zijlstra <peterz@...radead.org> wrote:
>
> On Sun, Nov 20, 2022 at 10:22:40AM +0800, Chengming Zhou wrote:
> > > + if (cfs_rq->runtime_remaining > 0) {
> > > + if (cpu_of(rq) != this_cpu ||
> > > + SCHED_WARN_ON(local_unthrottle)) {
> > > + unthrottle_cfs_rq_async(cfs_rq);
> > > + } else {
> > > + local_unthrottle = cfs_rq;
> > > + }
> > > + } else {
> > > + throttled = true;
> > > + }
> >
> > Hello,
> >
> > I don't get the point why local unthrottle is put after all the remote cpus,
> > since this list is FIFO? (earliest throttled cfs_rq is at the head)
>
> Let the local completion time for a CPU be W. Then if we queue a remote
> work after the local synchronous work, the lower bound for total
> completion is at least 2W.
>
> OTOH, if we first queue all remote work and then process the local
> synchronous work, the lower bound for total completion is W.
>
> The practical difference is that all relevant CPUs get unthrottled
> rougly at the same point in time, unlike with the original case, where
> some CPUs have the opportunity to consume W runtime while another is
> still throttled.
Yep, this tradeoff feels "best", but there are some edge cases where
this could potentially disrupt fairness. For example, if we have
non-trivial W, a lot of cpus to iterate through for dispatching remote
unthrottle, and quota is small. Doesn't help that the timer is pinned
so that this will continually hit the same cpu. But as I alluded to, I
think the net benefit here is greater with the local unthrottling
ordered last.
Powered by blists - more mailing lists