Message-ID: <fc7efe6dd23ba1d25c29441fc8132ea2bbf7b5fb.camel@redhat.com>
Date:   Tue, 28 Apr 2020 17:52:13 -0500
From:   Scott Wood <swood@...hat.com>
To:     Valentin Schneider <valentin.schneider@....com>
Cc:     Steven Rostedt <rostedt@...dmis.org>,
        Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Rik van Riel <riel@...riel.com>,
        Mel Gorman <mgorman@...e.de>, linux-kernel@...r.kernel.org,
        linux-rt-users <linux-rt-users@...r.kernel.org>
Subject: Re: [RFC PATCH 3/3] sched,rt: break out of load balancing if an RT
 task appears

On Tue, 2020-04-28 at 17:33 -0500, Scott Wood wrote:
> On Tue, 2020-04-28 at 22:56 +0100, Valentin Schneider wrote:
> > On 28/04/20 06:02, Scott Wood wrote:
> > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > > index dfde7f0ce3db..e7437e4e40b4 100644
> > > --- a/kernel/sched/fair.c
> > > +++ b/kernel/sched/fair.c
> > > @@ -9394,6 +9400,10 @@ static int should_we_balance(struct lb_env *env)
> > >       struct sched_group *sg = env->sd->groups;
> > >       int cpu, balance_cpu = -1;
> > > 
> > > +	/* Run the realtime task now; load balance later. */
> > > +	if (rq_has_runnable_rt_task(env->dst_rq))
> > > +		return 0;
> > > +
> > 
> > I have a feeling this isn't very nice to CFS tasks, since we would now
> > "waste" load-balance attempts if they happen to coincide with an RT task
> > being runnable.
> > 
> > On your 72 CPUs machine, the system-wide balance happens (at best) every
> > 72ms if you have idle time, every ~2300ms otherwise (every balance
> > > CPU gets to try to balance however, so it's not as horrible as I'm
> > > making it sound). This is totally worst-case scenario territory, and
> > > you'd hope newidle_balance() could help here and there (as it isn't
> > > gated by any balance interval).
> > 
> > > Still, even for a single rq, postponing a system-wide balance for a
> > > full balance interval (i.e. ~2 secs worst case here) just because we
> > > had a single RT task running when we tried to balance seems a bit much.
> > 
> > It may be possible to hack something to detect those cases and reset the
> > interval to "now" when e.g. dequeuing the last RT task (& after having
> > previously aborted a load-balance due to RT/DL/foobar).
> 
> Yeah, some way to retry at an appropriate time after aborting a rebalance
> would be good.

Another option is to limit the bailing out to newidle balancing (as the
patchset currently stands, it isn't checking the right rq for global
balancing anyway).  On RT the softirq runs from thread context, so enabling
interrupts and (on RT) preemption should suffice to avoid latency problems
in the global rebalance.
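
As a rough illustration of the newidle-only option, the check could be gated
on the balance type. This is a standalone sketch, not a patch: the struct
definitions are simplified stand-ins for the kernel's, the rt_nr_running
field and the lb_should_bail_for_rt() name are hypothetical, and
rq_has_runnable_rt_task() is mocked here rather than taken from the series.

```c
#include <stdbool.h>

/* Simplified stand-ins for the kernel types; not the real definitions. */
enum cpu_idle_type { CPU_IDLE, CPU_NOT_IDLE, CPU_NEWLY_IDLE };

struct rq {
	unsigned int rt_nr_running;	/* hypothetical: runnable RT tasks on this rq */
};

struct lb_env {
	struct rq *dst_rq;		/* rq that would pull tasks */
	enum cpu_idle_type idle;	/* why this balance pass was started */
};

/* Mock of the helper used in the quoted diff. */
static bool rq_has_runnable_rt_task(const struct rq *rq)
{
	return rq->rt_nr_running > 0;
}

/*
 * Hypothetical helper: bail out only for newidle balancing, so that a
 * periodic (softirq-driven) balance is never skipped because an RT task
 * happened to be runnable at that moment.
 */
static bool lb_should_bail_for_rt(const struct lb_env *env)
{
	if (env->idle != CPU_NEWLY_IDLE)
		return false;

	return rq_has_runnable_rt_task(env->dst_rq);
}
```

With one runnable RT task on dst_rq, this returns true for a CPU_NEWLY_IDLE
pass and false for a CPU_NOT_IDLE pass, so the periodic balance interval is
left alone.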

-Scott

