lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 25 Nov 2022 10:12:48 +0100
From:   Peter Zijlstra <peterz@...radead.org>
To:     Josh Don <joshdon@...gle.com>
Cc:     Chengming Zhou <zhouchengming@...edance.com>,
        Ingo Molnar <mingo@...hat.com>,
        Juri Lelli <juri.lelli@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
        Daniel Bristot de Oliveira <bristot@...hat.com>,
        Valentin Schneider <vschneid@...hat.com>,
        linux-kernel@...r.kernel.org, Tejun Heo <tj@...nel.org>,
        Michal Koutný <mkoutny@...e.com>,
        Christian Brauner <brauner@...nel.org>,
        Zefan Li <lizefan.x@...edance.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Frederic Weisbecker <fweisbec@...il.com>,
        anna-maria@...utronix.de
Subject: Re: [PATCH v3] sched: async unthrottling for cfs bandwidth

On Fri, Nov 25, 2022 at 09:59:23AM +0100, Peter Zijlstra wrote:
> On Fri, Nov 25, 2022 at 09:57:09AM +0100, Peter Zijlstra wrote:
> > On Tue, Nov 22, 2022 at 11:35:48AM +0100, Peter Zijlstra wrote:
> > > On Mon, Nov 21, 2022 at 11:37:14AM -0800, Josh Don wrote:
> > > > Yep, this tradeoff feels "best", but there are some edge cases where
> > > > this could potentially disrupt fairness. For example, if we have
> > > > non-trivial W, a lot of cpus to iterate through for dispatching remote
> > > > unthrottle, and quota is small. Doesn't help that the timer is pinned
> > > > so that this will continually hit the same cpu.
> > > 
> > > We could -- if we wanted to -- manually rotate the timer around the
> > > relevant CPUs. Doing that sanely would require a bit of hrtimer surgery
> > > though I'm afraid.
> > 
> > Here; something like so should enable us to cycle the bandwidth timer.
> > Just need to figure out a way to find another CPU or something.
> 
> Some more preparation...

And then I think something like so.. That migrates the timer to the CPU
of the first throttled entry -- possibly not the best heuristic, but its
the simplest.

NOTE: none of this has seen a compiler up close.

---
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5595,13 +5595,21 @@ static bool distribute_cfs_runtime(struc
  */
 static int do_sched_cfs_period_timer(struct cfs_bandwidth *cfs_b, int overrun, unsigned long flags)
 {
-	int throttled;
+	struct cfs_rq *first_cfs_rq;
+	int throttled = 0;
+	int cpu;
 
 	/* no need to continue the timer with no bandwidth constraint */
 	if (cfs_b->quota == RUNTIME_INF)
 		goto out_deactivate;
 
-	throttled = !list_empty(&cfs_b->throttled_cfs_rq);
+	first_cfs_rq = list_first_entry_or_null(&cfs_b->throttled_cfs_rq,
+						struct cfs_rq, throttled_list);
+	if (first_cfs_rq) {
+		throttled = 1;
+		cpu = cpu_of(rq_of(first_cfs_rq));
+	}
+
 	cfs_b->nr_periods += overrun;
 
 	/* Refill extra burst quota even if cfs_b->idle */
@@ -5641,7 +5649,7 @@ static int do_sched_cfs_period_timer(str
 	 */
 	cfs_b->idle = 0;
 
-	return HRTIMER_RESTART;
+	return HRTIMER_RESTART_MIGRATE + cpu;
 
 out_deactivate:
 	return HRTIMER_NORESTART;

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ