[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20210922185724.GD3959@techsingularity.net>
Date: Wed, 22 Sep 2021 19:57:24 +0100
From: Mel Gorman <mgorman@...hsingularity.net>
To: Vincent Guittot <vincent.guittot@...aro.org>
Cc: Mike Galbraith <efault@....de>,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...nel.org>,
Valentin Schneider <valentin.schneider@....com>,
Aubrey Li <aubrey.li@...ux.intel.com>,
Barry Song <song.bao.hua@...ilicon.com>,
Srikar Dronamraju <srikar@...ux.vnet.ibm.com>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 2/2] sched/fair: Scale wakeup granularity relative to
nr_running
On Wed, Sep 22, 2021 at 08:22:43PM +0200, Vincent Guittot wrote:
> > > > > In
> > > > > your case, you want hackbench threads to not preempt each others
> > > > > because they tries to use same resources so it's probably better to
> > > > > let the current one to move forward but that's not a universal policy.
> > > > >
> > > >
> > > > No, but have you a better suggestion? hackbench might be stupid but it's
> > > > an example of where a workload can excessively preempt itself. While
> > > > overloading an entire machine is stupid, it could also potentially occurs
> > > > for applications running within a constrained cpumask.
> > >
> > > But this is property that is specific to each application. Some can
> > > have a lot of running threads but few wakes up which have to preempt
> > > current threads quickly but others just want the opposite
> > > So because it is a application specific property we should define it
> > > this way instead of trying to guess
> >
> > I'm not seeing an alternative suggestion that could be turned into
> > an implementation. The current value for sched_wakeup_granularity
> > was set 12 years ago was exposed for tuning which is no longer
> > the case. The intent was to allow some dynamic adjustment between
> > sysctl_sched_wakeup_granularity and sysctl_sched_latency to reduce
> > over-scheduling in the worst case without disabling preemption entirely
> > (which the first version did).
> >
> > Should we just ignore this problem and hope it goes away or just let
> > people keep poking silly values into debugfs via tuned?
>
> We should certainly not add a bandaid because people will continue to
> poke silly value at the end. And increasing
> sysctl_sched_wakeup_granularity based on the number of running threads
> is not the right solution. According to the description of your
> problem that the current task doesn't get enough time to move forward,
> sysctl_sched_min_granularity should be part of the solution. Something
> like below will ensure that current got a chance to move forward
>
That's a very interesting idea! I've queued it up for further testing
and as a comparison to the bandaid.
--
Mel Gorman
SUSE Labs
Powered by blists - more mailing lists