Date:   Mon, 12 Oct 2020 15:18:17 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     Dietmar Eggemann <dietmar.eggemann@....com>
Cc:     tglx@...utronix.de, mingo@...nel.org, linux-kernel@...r.kernel.org,
        bigeasy@...utronix.de, qais.yousef@....com, swood@...hat.com,
        valentin.schneider@....com, juri.lelli@...hat.com,
        vincent.guittot@...aro.org, rostedt@...dmis.org,
        bsegall@...gle.com, mgorman@...e.de, bristot@...hat.com,
        vincent.donnefort@....com, tj@...nel.org
Subject: Re: [PATCH -v2 07/17] sched: Fix hotplug vs CPU bandwidth control

On Mon, Oct 12, 2020 at 02:52:00PM +0200, Peter Zijlstra wrote:
> On Fri, Oct 09, 2020 at 10:41:11PM +0200, Dietmar Eggemann wrote:
> > On 05/10/2020 16:57, Peter Zijlstra wrote:
> > > Since we now migrate tasks away before DYING, we should also move
> > > bandwidth unthrottle, otherwise we can gain tasks from unthrottle
> > > after we expect all tasks to be gone already.
> > > 
> > > Also; it looks like the RT balancers don't respect cpu_active() and
> > > instead rely on rq->online in part, complete this. This too requires
> > > we do set_rq_offline() earlier to match the cpu_active() semantics.
> > > (The bigger patch is to convert RT to cpu_active() entirely)
> > > 
> > > Since set_rq_online() is called from sched_cpu_activate(), place
> > > set_rq_offline() in sched_cpu_deactivate().
> 
> > [   76.215229] WARNING: CPU: 1 PID: 1913 at kernel/irq_work.c:95 irq_work_queue_on+0x108/0x110
> 
> > [   76.341076]  irq_work_queue_on+0x108/0x110
> > [   76.349185]  pull_rt_task+0x58/0x68
> > [   76.352673]  balance_rt+0x84/0x88
> 
> > balance_rt() checks via need_pull_rt_task() that rq is online but it
> > looks like that with RT_PUSH_IPI pull_rt_task() -> tell_cpu_to_push()
> > calls irq_work_queue_on() with cpu = rto_next_cpu(rq->rd) and this one
> > can be offline here as well now.
> 
> Hurmph... so if I read this right, we reach offline with overload set?
> 
> Oooh, I think I see how that happens..

I think the two hunks below, which drop the overload clearing from
rq_offline_dl() and rq_offline_rt(), need to be reverted from the patch.
Can you confirm?

--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -2326,9 +2326,6 @@ static void rq_online_dl(struct rq *rq)
 /* Assumes rq->lock is held */
 static void rq_offline_dl(struct rq *rq)
 {
-	if (rq->dl.overloaded)
-		dl_clear_overload(rq);
-
 	cpudl_clear(&rq->rd->cpudl, rq->cpu);
 	cpudl_clear_freecpu(&rq->rd->cpudl, rq->cpu);
 }
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -2245,9 +2245,6 @@ static void rq_online_rt(struct rq *rq)
 /* Assumes rq->lock is held */
 static void rq_offline_rt(struct rq *rq)
 {
-	if (rq->rt.overloaded)
-		rt_clear_overload(rq);
-
 	__disable_runtime(rq);
 
 	cpupri_set(&rq->rd->cpupri, rq->cpu, CPUPRI_INVALID);
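
For illustration only, the failure mode discussed above can be modelled in
userspace. This is a toy sketch, not kernel code: struct toy_rd, the toy_*
helpers and the bitmask layout are all hypothetical stand-ins, and it assumes
(as the report suggests) that the CPU picked for the push IPI comes from an
overload mask that is not cross-checked against the online state. If the
offline path leaves the overload bit set, the picker can hand back a CPU that
is already offline, which is the situation irq_work_queue_on() warns about.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NR_CPUS 8

/* Hypothetical stand-in for a root domain: one bit per CPU in each mask. */
struct toy_rd {
	uint32_t online_mask;	/* CPUs whose runqueue is online */
	uint32_t rto_mask;	/* CPUs currently flagged as RT-overloaded */
};

/* Flag a CPU as overloaded (toy analogue of setting overload). */
static void toy_set_overload(struct toy_rd *rd, int cpu)
{
	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	rd->rto_mask |= 1u << cpu;
}

/*
 * Take a CPU offline. With clear_overload == false this models the patch
 * with the two hunks applied (overload bit left behind); with true it
 * models the hunks reverted (overload bit dropped on the way out).
 */
static void toy_rq_offline(struct toy_rd *rd, int cpu, bool clear_overload)
{
	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	if (clear_overload)
		rd->rto_mask &= ~(1u << cpu);
	rd->online_mask &= ~(1u << cpu);
}

/*
 * Return the lowest-numbered overloaded CPU, or -1 if there is none.
 * Deliberately does not consult online_mask, mirroring the assumption
 * stated in the lead-in.
 */
static int toy_next_overloaded_cpu(const struct toy_rd *rd)
{
	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		if (rd->rto_mask & (1u << cpu))
			return cpu;
	return -1;
}

/* True if the given CPU is valid and still marked online. */
static bool toy_cpu_online(const struct toy_rd *rd, int cpu)
{
	return cpu >= 0 && cpu < TOY_NR_CPUS &&
	       (rd->online_mask & (1u << cpu));
}
```

In this model, calling toy_rq_offline(&rd, cpu, false) on an overloaded CPU
leaves toy_next_overloaded_cpu() free to return that same CPU even though
toy_cpu_online() now reports it as offline; passing true instead removes it
from the candidates.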
