Message-ID: <20241114142810.794657-1-juri.lelli@redhat.com>
Date: Thu, 14 Nov 2024 14:28:08 +0000
From: Juri Lelli <juri.lelli@...hat.com>
To: Waiman Long <longman@...hat.com>,
	Tejun Heo <tj@...nel.org>,
	Johannes Weiner <hannes@...xchg.org>,
	Michal Koutny <mkoutny@...e.com>,
	Ingo Molnar <mingo@...hat.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Vincent Guittot <vincent.guittot@...aro.org>,
	Dietmar Eggemann <dietmar.eggemann@....com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Ben Segall <bsegall@...gle.com>,
	Mel Gorman <mgorman@...e.de>,
	Valentin Schneider <vschneid@...hat.com>,
	Phil Auld <pauld@...hat.com>
Cc: Qais Yousef <qyousef@...alina.io>,
	Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
	"Joel Fernandes (Google)" <joel@...lfernandes.org>,
	Suleiman Souhlal <suleiman@...gle.com>,
	Aashish Sharma <shraash@...gle.com>,
	Shin Kawamura <kawasin@...gle.com>,
	Vineeth Remanan Pillai <vineeth@...byteword.org>,
	linux-kernel@...r.kernel.org,
	cgroups@...r.kernel.org,
	Juri Lelli <juri.lelli@...hat.com>
Subject: [PATCH v2 0/2] Fix DEADLINE bandwidth accounting in root domain changes and hotplug

Hello!

This is v2 of a patch series [3] that addresses two issues affecting
DEADLINE bandwidth accounting during non-destructive changes to root
domains and hotplug operations. The series is based on top of Waiman's
"cgroup/cpuset: Remove redundant rebuild_sched_domains_locked() calls"
series [1], which is now merged into cgroups/for-6.13 (commit
c4c9cebe2fb9). The discussion that eventually led to these two series
can be found at [2].

Waiman reported that v1 still failed to make his test_cpuset_prs.sh
happy, so I had to change both patches a little. It now seems to pass on
my runs.

Patch 01/02 deals with non-destructive root domain changes. With respect
to v1 we now always restore dl_server contributions, considering the root
domain span and the active cpus mask (otherwise accounting on the default
root domain would end up being incorrect).
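
As a rough illustration of that idea only (this is not the code in the
patch), the sketch below sums dl_server bandwidth over the CPUs that are
both in the root domain span and active. The function name, the 64-bit
bitmap representation of the masks and the per-CPU bandwidth array are
all assumptions made for this example; the kernel uses cpumasks and
per-runqueue state instead.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX_CPUS 64u

/*
 * Hypothetical helper, not kernel code: sum the dl_server bandwidth of
 * every CPU that is both in the root domain span and currently active.
 * Masks are plain 64-bit bitmaps (bit N set means CPU N is included).
 */
static uint64_t sketch_restore_server_bw(uint64_t rd_span, uint64_t active_mask,
					 const uint64_t *server_bw,
					 unsigned int nr_cpus)
{
	uint64_t eligible = rd_span & active_mask;
	uint64_t total_bw = 0;
	unsigned int cpu;

	/* The bitmap model in this sketch cannot describe more than 64 CPUs. */
	assert(nr_cpus <= SKETCH_MAX_CPUS);

	for (cpu = 0; cpu < nr_cpus; cpu++) {
		/* Skip CPUs outside the span or currently inactive. */
		if (eligible & (UINT64_C(1) << cpu))
			total_bw += server_bw[cpu];
	}

	return total_bw;
}
```

In this model, a CPU that belongs to the span but is inactive contributes
nothing, which is the situation the span/active-mask check is meant to
cover.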

Patch 02/02 deals with hotplug. With respect to v1 I added special
casing when total_bw = 0 (so no DEADLINE tasks to consider) and when a
root domain is left with no cpus due to hotplug.
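
The two special cases can be pictured with a simplified model like the
one below. It is a sketch under stated assumptions, not the logic in the
patch: the struct, its fields and the function name are hypothetical,
per-CPU capacity is assumed uniform, and refusing the operation when the
last CPU would go away with bandwidth still allocated is an assumption of
this example rather than something stated above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a root domain's DEADLINE state. */
struct rd_sketch {
	uint64_t total_bw;	/* bandwidth currently allocated in the domain */
	uint64_t bw_per_cpu;	/* bandwidth one CPU can offer (assumed uniform) */
	unsigned int nr_cpus;	/* active CPUs before the hotplug operation */
};

/* Returns true if, in this model, one CPU may be taken offline. */
static bool sketch_can_offline_cpu(const struct rd_sketch *rd)
{
	assert(rd);

	/* Special case 1: no allocated bandwidth, so nothing to check. */
	if (rd->total_bw == 0)
		return true;

	/*
	 * Special case 2: the domain would be left with no CPUs.
	 * Assumption of this sketch: refuse while bandwidth is allocated.
	 */
	if (rd->nr_cpus <= 1)
		return false;

	/* General case: allocated bandwidth must fit on the remaining CPUs. */
	return rd->total_bw <= rd->bw_per_cpu * (rd->nr_cpus - 1);
}
```

Handling the two special cases first keeps the general comparison from
being evaluated against an empty domain or for a domain with nothing
allocated.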

In all honesty, I still see intermittent issues that, however, seem to be
related to the dance we do in sched_cpu_deactivate(), where we first
turn everything related to a cpu/rq off and revert that if
cpuset_cpu_inactive() reveals failing DEADLINE checks. But, since these
seem to be orthogonal to the original discussion we started from, I
wanted to send this out as a hopefully meaningful update/improvement
since yesterday. Will continue looking into this.

Please go forth and test/review.

Series also available at

git@...hub.com:jlelli/linux.git upstream/dl-server-apply

Best,
Juri

[1] https://lore.kernel.org/lkml/20241110025023.664487-1-longman@redhat.com/
[2] https://lore.kernel.org/lkml/20241029225116.3998487-1-joel@joelfernandes.org/
[3] v1 - https://lore.kernel.org/lkml/20241113125724.450249-1-juri.lelli@redhat.com/

Juri Lelli (2):
  sched/deadline: Restore dl_server bandwidth on non-destructive root
    domain changes
  sched/deadline: Correctly account for allocated bandwidth during
    hotplug

 kernel/sched/core.c     |  2 +-
 kernel/sched/deadline.c | 65 +++++++++++++++++++++++++++++++++--------
 kernel/sched/sched.h    |  2 +-
 kernel/sched/topology.c |  8 +++--
 4 files changed, 60 insertions(+), 17 deletions(-)

-- 
2.47.0

