Message-ID: <Z7WsRvsVCWu_By1c@jlelli-thinkpadt14gen4.remote.csb>
Date: Wed, 19 Feb 2025 11:02:46 +0100
From: Juri Lelli <juri.lelli@...hat.com>
To: Dietmar Eggemann <dietmar.eggemann@....com>
Cc: Jon Hunter <jonathanh@...dia.com>,
Christian Loehle <christian.loehle@....com>,
Thierry Reding <treding@...dia.com>,
Waiman Long <longman@...hat.com>, Tejun Heo <tj@...nel.org>,
Johannes Weiner <hannes@...xchg.org>,
Michal Koutny <mkoutny@...e.com>, Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Vincent Guittot <vincent.guittot@...aro.org>,
Steven Rostedt <rostedt@...dmis.org>,
Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
Valentin Schneider <vschneid@...hat.com>,
Phil Auld <pauld@...hat.com>, Qais Yousef <qyousef@...alina.io>,
Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
"Joel Fernandes (Google)" <joel@...lfernandes.org>,
Suleiman Souhlal <suleiman@...gle.com>,
Aashish Sharma <shraash@...gle.com>,
Shin Kawamura <kawasin@...gle.com>,
Vineeth Remanan Pillai <vineeth@...byteword.org>,
linux-kernel@...r.kernel.org, cgroups@...r.kernel.org,
"linux-tegra@...r.kernel.org" <linux-tegra@...r.kernel.org>
Subject: Re: [PATCH v2 3/2] sched/deadline: Check bandwidth overflow earlier
for hotplug
On 19/02/25 10:29, Dietmar Eggemann wrote:
...
> I did now.
Thanks!
> Patch-wise I have:
>
> (1) Putting 'fair_server's __dl_server_[de|at]tach_root() under if
> '(cpumask_test_cpu(rq->cpu, [old_rd->online|cpu_active_mask]))' in
> rq_attach_root()
>
> https://lkml.kernel.org/r/Z7RhNmLpOb7SLImW@jlelli-thinkpadt14gen4.remote.csb
>
> (2) Create __dl_server_detach_root() and call it in rq_attach_root()
>
> https://lkml.kernel.org/r/Z4fd_6M2vhSMSR0i@jlelli-thinkpadt14gen4.remote.csb
>
> plus debug patch:
>
> https://lkml.kernel.org/r/Z6M5fQB9P1_bDF7A@jlelli-thinkpadt14gen4.remote.csb
>
> plus additional debug.
So you don't have the patch that makes us ignore special tasks while
rebuilding domains?
https://lore.kernel.org/all/Z6spnwykg6YSXBX_@jlelli-thinkpadt14gen4.remote.csb/
Could you please double check again against the experimental/dl-debug
branch of git@...hub.com:jlelli/linux.git?
> The suspend issue still persists.
>
> My hunch is that it's rather an issue with having 0 CPUs left in DEF
> while deactivating the last isol CPU (CPU3) so we set overflow = 1 w/o
> calling __dl_overflow(). We want to account fair_server_bw=52428
> against 0 CPUs.
>
> l B B l l l
>
> ^^^
> isolcpus=[3,4]
>
>
> cpumask_and(mask, rd->span, cpu_active_mask)
>
> mask = [3-5] & [0-3] = [3] -> dl_bw_cpus(3) = 1
>
> ---
>
> dl_bw_deactivate() called cpu=5
>
> dl_bw_deactivate() called cpu=4
>
> dl_bw_deactivate() called cpu=3
>
> dl_bw_cpus() cpu=6 rd->span=3-5 cpu_active_mask=0-3 cpus=1 type=DEF
> ^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^
> cpumask_subset(rd->span, cpu_active_mask) is false
>
> for_each_cpu_and(i, rd->span, cpu_active_mask)
> cpus++ <-- cpus is 1 !!!
>
> dl_bw_manage: cpu=3 cap=0 fair_server_bw=52428 total_bw=104856 dl_bw_cpus=1 type=DEF span=3-5
^^^^^^
This still looks wrong: with a single CPU remaining we should only have
the corresponding dl server bandwidth present (unless there is some
other DL task running).
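To make the mismatch concrete, here is a rough userspace sketch (not
kernel code) of the arithmetic behind the numbers in the trace above. It
assumes the fair server runs with a 50 ms runtime every 1 s period and
that bandwidth is a fixed-point ratio with a 20-bit shift; to_ratio()
below is a simplified stand-in for the kernel helper of the same name,
and cpus_in_both() is a hypothetical helper that models the
span/active-mask intersection with plain bitmasks (bit i == CPU i).

```c
#include <stdint.h>

#define BW_SHIFT 20	/* assumed fixed-point shift for DL bandwidth */

/*
 * Simplified stand-in for to_ratio(): runtime / period expressed in
 * units of 2^-BW_SHIFT. The RUNTIME_INF special case is left out.
 */
static uint64_t to_ratio(uint64_t period_ns, uint64_t runtime_ns)
{
	if (period_ns == 0)
		return 0;

	return (runtime_ns << BW_SHIFT) / period_ns;
}

/*
 * Hypothetical helper: count the CPUs present in both masks, i.e. what
 * the for_each_cpu_and(i, rd->span, cpu_active_mask) loop counts.
 */
static unsigned int cpus_in_both(uint64_t span, uint64_t active)
{
	uint64_t both = span & active;
	unsigned int cpus = 0;

	while (both) {
		both &= both - 1;	/* clear the lowest set bit */
		cpus++;
	}

	return cpus;
}
```

Under those assumptions, to_ratio(1000000000, 50000000) gives 52428,
matching fair_server_bw, and cpus_in_both(0x38, 0x0f) (span=3-5,
active=0-3) gives 1. total_bw=104856 is exactly 2 * 52428, so the root
domain is still carrying two fair servers' worth of bandwidth for a
single active CPU.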
If you already had the patch ignoring sugov bandwidth in your set, could
you please share the full dmesg?
Thanks!