Message-ID: <163e1980-41ff-4a5f-9d93-431e65fd3a9d@bursov.com>
Date: Thu, 28 Mar 2024 19:10:41 +0200
From: Vitalii Bursov <vitaly@...sov.com>
To: Vincent Guittot <vincent.guittot@...aro.org>
Cc: Ingo Molnar <mingo@...hat.com>, Peter Zijlstra <peterz@...radead.org>,
Juri Lelli <juri.lelli@...hat.com>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Steven Rostedt <rostedt@...dmis.org>, Ben Segall <bsegall@...gle.com>,
Mel Gorman <mgorman@...e.de>, Daniel Bristot de Oliveira
<bristot@...hat.com>, Valentin Schneider <vschneid@...hat.com>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/1] sched/fair: allow disabling newidle_balance with
sched_relax_domain_level
On 28.03.24 18:48, Vincent Guittot wrote:
> On Thu, 28 Mar 2024 at 17:27, Vitalii Bursov <vitaly@...sov.com> wrote:
>>
>>
>> On 28.03.24 16:43, Vincent Guittot wrote:
>>> On Thu, 28 Mar 2024 at 01:31, Vitalii Bursov <vitaly@...sov.com> wrote:
>>>>
>>>> Change relax_domain_level checks so that it would be possible
>>>> to exclude all domains from newidle balancing.
>>>>
>>>> This matches the behavior described in the documentation:
>>>> -1 no request. use system default or follow request of others.
>>>> 0 no search.
>>>> 1 search siblings (hyperthreads in a core).
>>>>
>>>> "2" enables levels 0 and 1, level_max excludes the last (level_max)
>>>> level, and level_max+1 includes all levels.
>>>
>>> I was about to say that max+1 is useless because it's the same as -1
>>> but it's not exactly the same because it can supersede the system wide
>>> default_relax_domain_level. I wonder if one should be able to enable
>>> more levels than what the system has set by default.
>>
>> I don't know if such systems exist, but cpusets.rst suggests that
>> increasing it beyond the default value is possible:
>>> If your situation is:
>>>
>>> - The migration costs between each cpu can be assumed considerably
>>> small(for you) due to your special application's behavior or
>>> special hardware support for CPU cache etc.
>>> - The searching cost doesn't have impact(for you) or you can make
>>> the searching cost enough small by managing cpuset to compact etc.
>>> - The latency is required even it sacrifices cache hit rate etc.
>>> then increasing 'sched_relax_domain_level' would benefit you.
>
> Fair enough. The doc should be updated as we can now clear the flags
> but not set them
>
SD_BALANCE_NEWIDLE is always set by default in sd_init() and cleared
in set_domain_attribute() depending on default_relax_domain_level
("relax_domain_level" kernel parameter) and cgroup configuration
if it's present.
So it should work both ways: the flag is cleared when the relax
level is decreased and left set when it is increased,
shouldn't it?
Also, after a closer look at set_domain_attribute(), it looks like
default_relax_domain_level is -1 on all systems, so if cgroup does
not set relax level, it won't clear any flags, which probably means
that level_max+1 is redundant today.
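To make the flow above concrete, here is a minimal user-space model of
it. This is only a sketch, not the kernel code: the model_* names and
the flag value are made up for illustration, and the ">=" comparison is
my reading of the patch description ("2" enables levels 0 and 1).

```c
#include <stdbool.h>

/* Hypothetical flag value for this sketch; the kernel defines its own. */
#define MODEL_SD_BALANCE_NEWIDLE 0x1u

struct model_domain {
	int level;		/* 0 is the lowest domain level */
	unsigned int flags;
};

/* Models the sd_init() step: the flag always starts out set. */
static void model_sd_init(struct model_domain *sd, int level)
{
	sd->level = level;
	sd->flags = MODEL_SD_BALANCE_NEWIDLE;
}

/*
 * Models the set_domain_attribute() step. A negative cpuset request
 * falls back to the system default; if that is negative too, nothing
 * is cleared. Otherwise every level at or above the request loses
 * the flag.
 */
static void model_set_domain_attribute(struct model_domain *sd,
					int cpuset_level, int default_level)
{
	int request = cpuset_level < 0 ? default_level : cpuset_level;

	if (request < 0)
		return;
	if (sd->level >= request)
		sd->flags &= ~MODEL_SD_BALANCE_NEWIDLE;
}

/* Returns true if newidle balancing stays enabled at the given level. */
static bool model_newidle_enabled(int level, int cpuset_level,
				  int default_level)
{
	struct model_domain sd;

	model_sd_init(&sd, level);
	model_set_domain_attribute(&sd, cpuset_level, default_level);
	return (sd.flags & MODEL_SD_BALANCE_NEWIDLE) != 0;
}
```

In this model a cpuset request of 0 clears the flag on every level, a
request of -1 with a default of -1 clears nothing, and a cpuset request
above a non-negative default keeps the flag on levels the default alone
would have cleared.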