linux-kernel - Re: [RFC PATCH v3 10/10] sched/fair: Throttle CFS tasks on return to userspace

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <xm26a5i7zi7g.fsf@google.com>
Date: Tue, 23 Jul 2024 18:34:11 -0700
From: Benjamin Segall <bsegall@...gle.com>
To: Valentin Schneider <vschneid@...hat.com>
Cc: linux-kernel@...r.kernel.org,  rcu@...r.kernel.org,  Peter Zijlstra
 <peterz@...radead.org>,  Ingo Molnar <mingo@...hat.com>,  Juri Lelli
 <juri.lelli@...hat.com>,  Vincent Guittot <vincent.guittot@...aro.org>,
  Dietmar Eggemann <dietmar.eggemann@....com>,  Steven Rostedt
 <rostedt@...dmis.org>,  Mel Gorman <mgorman@...e.de>,  Phil Auld
 <pauld@...hat.com>,  Clark Williams <williams@...hat.com>,  Tomas Glozar
 <tglozar@...hat.com>,  "Paul E. McKenney" <paulmck@...nel.org>,  Frederic
 Weisbecker <frederic@...nel.org>,  Neeraj Upadhyay
 <neeraj.upadhyay@...nel.org>,  Joel Fernandes <joel@...lfernandes.org>,
  Josh Triplett <josh@...htriplett.org>,  Boqun Feng
 <boqun.feng@...il.com>,  Mathieu Desnoyers
 <mathieu.desnoyers@...icios.com>,  Lai Jiangshan <jiangshanlai@...il.com>,
  Zqiang <qiang.zhang1211@...il.com>,  Alexander Gordeev
 <agordeev@...ux.ibm.com>,  Catalin Marinas <catalin.marinas@....com>,
  Arnd Bergmann <arnd@...db.de>,  Guo Ren <guoren@...nel.org>,  Palmer
 Dabbelt <palmer@...osinc.com>,  Andrew Morton <akpm@...ux-foundation.org>,
  Oleg Nesterov <oleg@...hat.com>,  Jens Axboe <axboe@...nel.dk>
Subject: Re: [RFC PATCH v3 10/10] sched/fair: Throttle CFS tasks on return
 to userspace

Valentin Schneider <vschneid@...hat.com> writes:

> On 18/07/24 17:25, Benjamin Segall wrote:
>> Valentin Schneider <vschneid@...hat.com> writes:
>>
>>> I've tested a 10ms runtime / 100ms period cgroup with an always running
>>> task: upstream gets a "clean" periodic pattern of 10ms runtime every 100ms,
>>> whereas this gets something more like 40ms runtime every 400ms.
>>
>> Hmm, this seems a little odd since TWA_RESUME does a kick_process.
>
> I didn't ponder too much on the workload used here, but the point I wanted
> to bring up is: if you give a cgroup X amount of runtime, it may still
> consume more than that within a single period because execution in
> kernelspace isn't immediately stopped/throttled.
>
> It means the "standard" bandwidth control behaviour becomes a bit more
> bursty.

Yeah, more bursty behavior when doing cpu-burning syscalls is expected.
With the check on exit to user I wouldn't expect anything worse than the
duration of the syscall though, so it depends on what your test was.

>>> +
>>> +	/*
>>> +	 * Account tasks woken up in children; by this point all direct children
>>> +	 * have been visited.
>>> +	 */
>>> +	task_delta += cfs_rq->unthrottled_h_nr_running;
>>> +	idle_task_delta += cfs_rq->unthrottled_idle_h_nr_running;
>>> +
>>> +	cfs_rq->h_nr_running += task_delta;
>>> +	cfs_rq->idle_h_nr_running += idle_task_delta;
>>> +
>>> +	/*
>>> +	 * unthrottle_cfs_rq() needs a value to up-propagate above the
>>> +	 * freshly unthrottled cfs_rq.
>>> +	 */
>>> +	cfs_rq->unthrottled_h_nr_running = task_delta;
>>> +	cfs_rq->unthrottled_idle_h_nr_running = idle_task_delta;
>>
>> I think this should have no effect, right?
>
> Hm so my thoughts here are:
> The walk_tg_tree_from(tg_unthrottle_up) will update *nr_running counts up
> to the cfs_rq->tg->se[cpu_of(rq)]. However if that cfs_rq isn't the root
> one, we'll need the for_each_sched_entity() loop further down
> unthrottle_cfs_rq() to update the upper part of the hierarchy. The values
> that will be up-propagated there are the ones being saved here.

I'm pretty sure this comment was left over from when I didn't understand
what they were being used for. I'm pretty sure I remember intending to
remove it.