linux-kernel - Re: [PATCH v2] sched: async unthrottling for cfs bandwidth

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <Y2Gf3zxJqxRnkVyf@slm.duckdns.org>
Date:   Tue, 1 Nov 2022 12:38:23 -1000
From:   Tejun Heo <tj@...nel.org>
To:     Josh Don <joshdon@...gle.com>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>,
        Juri Lelli <juri.lelli@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
        Daniel Bristot de Oliveira <bristot@...hat.com>,
        Valentin Schneider <vschneid@...hat.com>,
        linux-kernel@...r.kernel.org,
        Joel Fernandes <joel@...lfernandes.org>,
        Michal Koutný <mkoutny@...e.com>,
        Christian Brauner <brauner@...nel.org>,
        Zefan Li <lizefan.x@...edance.com>
Subject: Re: [PATCH v2] sched: async unthrottling for cfs bandwidth

Hello,

(cc'ing Michal, Christian and Li for context)

On Tue, Nov 01, 2022 at 02:59:56PM -0700, Josh Don wrote:
> > If most of the problems were with cpu bw control, fixing that should do for
> > the time being. Otherwise, we'll have to think about finishing kernfs
> > locking granularity improvements and doing something similar to cgroup
> > locking too.
> 
> Oh we've easily hit stalls measured in multiple seconds. We
> extensively use cpu.idle to group batch tasks. One of the memory
> bandwidth mitigations implemented in userspace is cpu jailing, which
> can end up pushing lots and lots of these batch threads onto a small
> number of cpus. 5ms min gran * 200 threads is already one second :)

Ah, I see.

> We're in the process of transitioning to using bw instead for this
> instead in order to maintain parallelism. Fixing bw is definitely
> going to be useful, but I'm afraid we'll still likely have some issues
> from low throughput for non-bw reasons (some of which we can't
> directly control, since arbitrary jobs can spin up and configure their
> hierarchy/threads in antagonistic ways, in effect pushing out the
> latency of some of their threads).

Yeah, thanks for the explanation. Making the lock more granular is tedious
but definitely doable. I don't think I can work on it in the near future but
will keep it on mind. If anyone's interested in attacking it, please be my
guest.

Thanks.

-- 
tejun