Message-ID: <4822d111-b02d-469a-a457-46392c35021f@redhat.com>
Date: Tue, 3 Sep 2024 21:23:53 -0400
From: Waiman Long <longman@...hat.com>
To: Frederic Weisbecker <frederic@...nel.org>
Cc: Ingo Molnar <mingo@...hat.com>, Peter Zijlstra <peterz@...radead.org>,
 Juri Lelli <juri.lelli@...hat.com>,
 Vincent Guittot <vincent.guittot@...aro.org>,
 Dietmar Eggemann <dietmar.eggemann@....com>,
 Steven Rostedt <rostedt@...dmis.org>, Ben Segall <bsegall@...gle.com>,
 Mel Gorman <mgorman@...e.de>, Valentin Schneider <vschneid@...hat.com>,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/3] sched/isolation: Add HK_FLAG_SCHED to nohz_full

On 9/3/24 17:32, Frederic Weisbecker wrote:
> On Tue, Sep 03, 2024 at 09:24:08AM -0400, Waiman Long wrote:
>> On 9/3/24 09:10, Frederic Weisbecker wrote:
>>> On Sun, Aug 18, 2024 at 07:45:18PM -0400, Waiman Long wrote:
>>>> The HK_FLAG_SCHED/HK_TYPE_SCHED flag is defined and is also used
>>>> in kernel/sched/fair.c since commit de201559df87 ("sched/isolation:
>>>> Introduce housekeeping flags"). However, the corresponding cpumask isn't
>>>> currently updated anywhere. So the mask is always cpu_possible_mask.
>>>>
>>>> Add it in nohz_full setup so that nohz_full CPUs will now be removed
>>>> from HK_TYPE_SCHED cpumask.
>>>>
>>>> Signed-off-by: Waiman Long <longman@...hat.com>
>>>> ---
>>>>    kernel/sched/isolation.c | 2 +-
>>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
>>>> index 5891e715f00d..a514994af319 100644
>>>> --- a/kernel/sched/isolation.c
>>>> +++ b/kernel/sched/isolation.c
>>>> @@ -196,7 +196,7 @@ static int __init housekeeping_nohz_full_setup(char *str)
>>>>    	unsigned long flags;
>>>>    	flags = HK_FLAG_TICK | HK_FLAG_WQ | HK_FLAG_TIMER | HK_FLAG_RCU |
>>>> -		HK_FLAG_MISC | HK_FLAG_KTHREAD;
>>>> +		HK_FLAG_MISC | HK_FLAG_KTHREAD | HK_FLAG_SCHED;
>>>>    	return housekeeping_setup(str, flags);
>>>>    }
>>> find_new_ilb() already has HK_FLAG_MISC to prevent an isolated CPU
>>> from being elected as an ilb. So I think we should simply remove HK_FLAG_SCHED.
>> There is a check for HK_TYPE_SCHED in nohz_balance_enter_idle() and
>> nohz_newidle_balance(), though it is essentially a no-op as the cpumask has
>> all the CPUs. If we remove HK_TYPE_SCHED, the question now will be whether
>> we should remove the checks at these 2 functions or change them to
>> HK_TYPE_MISC.
> Just remove those two. They are dead code and the nohz_full handling
> of load balancing needs a rethink anyway.
OK, I will modify the patch to remove the dead code.
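
To illustrate why those two checks are effectively no-ops today, here is a
minimal user-space sketch. It is a toy model, not the kernel implementation:
hk_model, hk_setup() and hk_cpu() are names made up for this example, CPUs are
modeled as bits in a 32-bit word, and only two housekeeping types are shown.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy user-space model, NOT kernel code: one bit per CPU, 8 CPUs total. */
#define ALL_CPUS 0xffu

enum hk_type { HK_TYPE_MISC, HK_TYPE_SCHED, HK_TYPE_MAX };

/* Hypothetical stand-in for the housekeeping state. */
struct hk_model {
	uint32_t flags;             /* types that were configured at boot */
	uint32_t mask[HK_TYPE_MAX]; /* housekeeping CPUs per configured type */
};

/* Record the housekeeping CPUs for every type whose bit is set in 'flags'. */
static void hk_setup(struct hk_model *hk, uint32_t flags, uint32_t isolated)
{
	for (int t = 0; t < HK_TYPE_MAX; t++) {
		if (flags & (1u << t))
			hk->mask[t] = ALL_CPUS & ~isolated;
	}
	hk->flags |= flags;
}

/* A type that was never configured treats every CPU as a housekeeping CPU. */
static bool hk_cpu(const struct hk_model *hk, int cpu, enum hk_type type)
{
	if (!(hk->flags & (1u << type)))
		return true;
	return (hk->mask[type] & (1u << cpu)) != 0;
}
```

In this model, if setup only ever passes the HK_TYPE_MISC bit, then
hk_cpu(..., HK_TYPE_SCHED) is true for every CPU, including the isolated ones,
so an early return guarded by it can never trigger.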
>
> After discussing with Peter lately, the rules should be:
>
> 1) If a nohz_full CPU is part of a multi-CPU domain, then it should
>     be part of load balancing. Peter even says that nohz_full should be
>     forbidden in this case, because the tick plays a role in the
>     load balancing.

My understanding is that most users will use nohz_full together with
isolcpus, so nohz_full CPUs are also isolated and not in a sched domain.
There may still be users setting nohz_full without isolcpus, but that
should be relatively rare.

Anyway, all these nohz_full/kernel_noise settings will only apply to CPUs
in isolated cpuset partitions, which will not be in a sched domain.

>
> 2) Otherwise, if CPU is not part of a domain or it is the only CPU of all its
>     domains, then it can be out of the load balancing machinery.
I am aware that a single-CPU domain is the same as being isolated with
no load balancing.
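
The two rules can be sketched as a small predicate. This is an illustration
only, under the assumption that a CPU is described by how many sched domains it
belongs to and the size of the largest one; struct cpu_desc and
may_skip_load_balancing() are hypothetical names, not kernel APIs.

```c
#include <stdbool.h>

/* Hypothetical description of one CPU, for illustration only. */
struct cpu_desc {
	int ndomains;        /* number of sched domains the CPU belongs to */
	int max_domain_span; /* CPU count of its largest domain, 0 if none */
};

/* Return true if the CPU may stay out of the load balancing machinery. */
static bool may_skip_load_balancing(const struct cpu_desc *c)
{
	/* Rule 2: no domain at all, or the only CPU in all of its domains. */
	if (c->ndomains == 0 || c->max_domain_span <= 1)
		return true;

	/* Rule 1: shares a domain with other CPUs, so it must take part. */
	return false;
}
```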
>
> I'm a bit scared about rule 1) because I know there are existing users of
> nohz_full on multi-CPU domains... So I feel a bit trapped.

As stated before, this is not a common use case.

The isolcpus boot option is deprecated, as stated in
kernel-parameters.txt. My plan is to deprecate nohz_full as well once we
are able to make dynamic CPU isolation via cpuset work almost as well
as isolcpus + nohz_full.

Cheers,
Longman

