Message-ID: <91010eff-cbac-4772-8b95-1ce9bb56d9e0@redhat.com>
Date: Wed, 13 Nov 2024 13:19:17 -0500
From: Waiman Long <llong@...hat.com>
To: Juri Lelli <juri.lelli@...hat.com>, Waiman Long <llong@...hat.com>
Cc: Tejun Heo <tj@...nel.org>, Johannes Weiner <hannes@...xchg.org>,
 Michal Koutny <mkoutny@...e.com>, Ingo Molnar <mingo@...hat.com>,
 Peter Zijlstra <peterz@...radead.org>,
 Vincent Guittot <vincent.guittot@...aro.org>,
 Dietmar Eggemann <dietmar.eggemann@....com>,
 Steven Rostedt <rostedt@...dmis.org>, Ben Segall <bsegall@...gle.com>,
 Mel Gorman <mgorman@...e.de>, Valentin Schneider <vschneid@...hat.com>,
 Qais Yousef <qyousef@...alina.io>,
 Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
 "Joel Fernandes (Google)" <joel@...lfernandes.org>,
 Suleiman Souhlal <suleiman@...gle.com>, Aashish Sharma <shraash@...gle.com>,
 Shin Kawamura <kawasin@...gle.com>,
 Vineeth Remanan Pillai <vineeth@...byteword.org>,
 linux-kernel@...r.kernel.org, cgroups@...r.kernel.org
Subject: Re: [PATCH 2/2] sched/deadline: Correctly account for allocated
 bandwidth during hotplug


On 11/13/24 1:11 PM, Juri Lelli wrote:
> On 13/11/24 11:50, Waiman Long wrote:
>> On 11/13/24 11:42 AM, Waiman Long wrote:
>>> On 11/13/24 11:40 AM, Juri Lelli wrote:
>>>> On 13/11/24 11:06, Waiman Long wrote:
>>>>
>>>> ...
>>>>
>>>>> This part can still cause a failure in one of the test cases in my
>>>>> cpuset partition test script. In this particular case, the CPU to be
>>>>> offlined is an isolated CPU with scheduling disabled. As a result,
>>>>> total_bw is 0 and the __dl_overflow() test failed. Is there a way to
>>>>> skip the __dl_overflow() test for isolated CPUs? Can we use a zero
>>>>> total_bw as a proxy for that?
>>>> Can you please share the repro script? Would like to check locally what
>>>> is going on.
>>> Just run tools/testing/selftests/cgroup/test_cpuset_prs.sh.
>> The failing test is
>>
>>          # Remote partition offline tests
>>          " C0-3:S+ C1-3:S+ C2-3     .    X2-3   X2-3 X2-3:P2:O2=0 .   0 A1:0-1,A2:1,A3:3 A1:P0,A3:P2 2-3"
>>
>> You can remove all the previous lines in the TEST_MATRIX to get to the
>> failing test case immediately, eliminating unnecessary noise in your
>> testing.
> So, IIUC this test is doing the following
>
> # echo +cpuset >cgroup/cgroup.subtree_control
> # mkdir cgroup/A1
> # echo 0-3 >cgroup/A1/cpuset.cpus
> # echo +cpuset >cgroup/A1/cgroup.subtree_control
> # mkdir cgroup/A1/A2
> # echo 1-3 >cgroup/A1/A2/cpuset.cpus
> # echo +cpuset >cgroup/A1/A2/cgroup.subtree_control
> # mkdir cgroup/A1/A2/A3
> # echo 2-3 >cgroup/A1/A2/A3/cpuset.cpus
> # echo 2-3 >cgroup/A1/cpuset.cpus.exclusive
> # echo 2-3 >cgroup/A1/A2/cpuset.cpus.exclusive
> # echo 2-3 >cgroup/A1/A2/A3/cpuset.cpus.exclusive
> # echo isolated >cgroup/A1/A2/A3/cpuset.cpus.partition
>
> With the last command, we get to one root domain with span: 0-1,4-7 (in
> my setup with 8 CPUs) and no root domain for 2,3, since they are
> isolated.
>
> The test then tries to hotplug CPU 2, but fails to do so, hence the
> reported error.
>
> total_bw for CPU 2 and CPU 3 is indeed 0, and I guess we could special
> case this as you suggest (nothing to really worry about if we don't have
> DEADLINE tasks affined to these CPUs). But I would have expected the
> fair server contribution to still show up in total_bw, so this is
> something I need to check.

Thanks for looking into this. The test script creates a lot of
different corner cases to test the correctness of the cpuset partition
code. Hopefully that will help you improve the DL code to handle
these corner cases better.
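
To make the total_bw == 0 special case concrete, here is a rough
user-space sketch, not the kernel code. It assumes the admission test
has the general shape "capacity < total_bw - old_bw + new_bw" in
unsigned 64-bit arithmetic, and that the offline path subtracts a
per-CPU contribution (for example a fair server's bandwidth) that was
never added for an isolated CPU. Under those assumptions the
subtraction wraps around and the check reports an overflow. All names
here (struct dl_bw_model, model_overflow(), model_can_offline()) are
hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of bandwidth accounting; not the kernel's struct dl_bw. */
struct dl_bw_model {
	uint64_t cap_bw;   /* bandwidth the remaining CPUs can provide */
	uint64_t total_bw; /* bandwidth currently accounted as allocated */
};

/*
 * Assumed shape of the admission test: replace old_bw with new_bw and
 * report an overflow if the result exceeds the available capacity.
 * With total_bw == 0 and old_bw > 0 the unsigned subtraction wraps.
 */
static bool model_overflow(const struct dl_bw_model *b,
			   uint64_t old_bw, uint64_t new_bw)
{
	return b->cap_bw < b->total_bw - old_bw + new_bw;
}

/*
 * Decide whether a CPU contributing cpu_bw may go offline. When
 * skip_if_empty is set, a zero total_bw is used as a proxy for
 * "nothing is accounted here" and the overflow test is skipped.
 */
static bool model_can_offline(const struct dl_bw_model *b,
			      uint64_t cpu_bw, bool skip_if_empty)
{
	if (skip_if_empty && b->total_bw == 0)
		return true;
	return !model_overflow(b, cpu_bw, 0);
}
```

In this model, a domain with total_bw == 0 is refused when
skip_if_empty is false and allowed when it is true, while domains
with non-zero total_bw behave the same either way. Whether the real
hotplug path fails for this exact reason is something the DL code
would have to confirm.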

Cheers,
Longman

