linux-kernel - Re: [PATCH 2/2] sched/fair: Scale wakeup granularity relative to nr

Message-ID: <20210922185724.GD3959@techsingularity.net>
Date:   Wed, 22 Sep 2021 19:57:24 +0100
From:   Mel Gorman <mgorman@...hsingularity.net>
To:     Vincent Guittot <vincent.guittot@...aro.org>
Cc:     Mike Galbraith <efault@....de>,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...nel.org>,
        Valentin Schneider <valentin.schneider@....com>,
        Aubrey Li <aubrey.li@...ux.intel.com>,
        Barry Song <song.bao.hua@...ilicon.com>,
        Srikar Dronamraju <srikar@...ux.vnet.ibm.com>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 2/2] sched/fair: Scale wakeup granularity relative to
 nr_running

On Wed, Sep 22, 2021 at 08:22:43PM +0200, Vincent Guittot wrote:
> > > > > In
> > > > > your case, you want hackbench threads to not preempt each other
> > > > > because they try to use the same resources, so it's probably better to
> > > > > let the current one move forward, but that's not a universal policy.
> > > > >
> > > >
> > > > No, but have you a better suggestion? hackbench might be stupid but it's
> > > > an example of where a workload can excessively preempt itself. While
> > > > overloading an entire machine is stupid, it could also potentially occur
> > > > for applications running within a constrained cpumask.
> > >
> > > But this is a property that is specific to each application. Some can
> > > have a lot of running threads but few wakeups, which have to preempt
> > > current threads quickly, but others want just the opposite.
> > > So because it is an application-specific property, we should define it
> > > this way instead of trying to guess
> >
> > I'm not seeing an alternative suggestion that could be turned into
> > an implementation. The current value for sched_wakeup_granularity
> > was set 12 years ago and was exposed for tuning, which is no longer
> > the case. The intent was to allow some dynamic adjustment between
> > sysctl_sched_wakeup_granularity and sysctl_sched_latency to reduce
> > over-scheduling in the worst case without disabling preemption entirely
> > (which the first version did).
> >
> > Should we just ignore this problem and hope it goes away or just let
> > people keep poking silly values into debugfs via tuned?
> 
> We should certainly not add a bandaid because people will continue to
> poke silly values in the end. And increasing
> sysctl_sched_wakeup_granularity based on the number of running threads
> is not the right solution. Given your description of the problem, that
> the current task doesn't get enough time to move forward,
> sysctl_sched_min_granularity should be part of the solution. Something
> like below will ensure that current gets a chance to move forward
> 

That's a very interesting idea! I've queued it up for further testing
and as a comparison to the bandaid.
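
For anyone following the thread, the two ideas being compared can be
sketched roughly as below. This is a userspace illustration only, not
the patch from this series and not Vincent's proposal. The constants,
the linear scaling by nr_running, and every function name here are
assumptions made up for the example; only the concepts (wakeup
granularity, minimum granularity, and latency as an upper bound) come
from the discussion above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical values in nanoseconds; not read from any real kernel. */
static const uint64_t wakeup_gran_ns = 1000000ULL; /* base wakeup granularity */
static const uint64_t min_gran_ns    =  750000ULL; /* minimum granularity */
static const uint64_t latency_ns     = 6000000ULL; /* target latency, used as a cap */

/*
 * Idea A (the "bandaid"): grow the wakeup granularity as more tasks are
 * runnable, but never beyond the latency target. The linear scaling is
 * an assumption chosen for simplicity.
 */
static uint64_t scaled_wakeup_gran(unsigned int nr_running)
{
    uint64_t gran = wakeup_gran_ns * (nr_running ? nr_running : 1);

    return gran < latency_ns ? gran : latency_ns;
}

/* The waking task preempts only if it lags by more than the scaled granularity. */
static bool may_preempt_scaled(unsigned int nr_running, int64_t vdiff_ns)
{
    return vdiff_ns > (int64_t)scaled_wakeup_gran(nr_running);
}

/*
 * Idea B: keep the granularity fixed, but refuse wakeup preemption until
 * the current task has run for at least the minimum granularity.
 */
static bool may_preempt_min_gran(uint64_t curr_ran_ns, int64_t vdiff_ns)
{
    if (curr_ran_ns < min_gran_ns)
        return false; /* current has not had its chance to move forward yet */

    return vdiff_ns > (int64_t)wakeup_gran_ns;
}
```

With these made-up numbers, a 2ms vruntime lag preempts under idea A
when one task is runnable but not when four are, while under idea B it
preempts only once the current task has run for at least 0.75ms.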

-- 
Mel Gorman
SUSE Labs