[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <08e39be6-2815-4385-7703-7acc93f85c7f@arm.com>
Date: Thu, 7 Sep 2023 12:34:02 +0200
From: Pierre Gondois <pierre.gondois@....com>
To: Shrikanth Hegde <sshegde@...ux.vnet.ibm.com>,
Tim Chen <tim.c.chen@...ux.intel.com>
Cc: dietmar.eggemann@....com, vincent.guittot@...aro.org,
peterz@...radead.org, mingo@...hat.com, vschneid@...hat.com,
linux-kernel@...r.kernel.org, ionela.voinescu@....com,
quentin.perret@....com, srikar@...ux.vnet.ibm.com,
mgorman@...hsingularity.net, mingo@...nel.org, yu.c.chen@...el.com
Subject: Re: [PATCH v2] sched/topology: remove sysctl_sched_energy_aware
depending on the architecture
Hello Shrikanth,
On 9/6/23 13:35, Shrikanth Hegde wrote:
>
>
> On 9/5/23 7:33 PM, Pierre Gondois wrote:
>> Hello Shrikanth,
>> I tried the patch (on a platform using the cppc_cpufreq driver). The
>> platform
>> normally has EAS enabled, but the patch removed the sched_energy_aware
>> sysctl.
>> It seemed the following happened (in the below order):
>>
>> 1. sched_energy_aware_sysctl_init()
>> Doesn't set sysctl_sched_energy_aware as cpufreq_freq_invariance isn't set
>> and arch_scale_freq_invariant() returns false
>>
>> 2. cpufreq_register_driver()
>> Sets cpufreq_freq_invariance during cpufreq initialization
>> sched_energy_set()
>>
>> 3. sched_energy_set()
>> Is called with has_eas=0 since build_perf_domains() doesn't see the
>> platform
>> as EAS compatible. Indeed sysctl_sched_energy_aware=0.
>> So with sysctl_sched_energy_aware=0 and has_eas=0, sched_energy_aware
>> sysctl
>> is not enabled even though EAS should be possible.
>>
>>
>> On 9/1/23 08:52, Shrikanth Hegde wrote:
>>> Currently sysctl_sched_energy_aware doesn't alter the said behaviour on
>>> some of the architectures. IIUC its meant to either force rebuild the
>>> perf domains or cleanup the perf domains by echoing 1 or 0 respectively.
>>
>> There is a definition of the sysctl at:
>> Documentation/admin-guide/sysctl/kernel.rst::sched_energy_aware
> [...]
>>>
>>>
>>> +static unsigned int sysctl_sched_energy_aware;
>>> +static struct ctl_table_header *sysctl_eas_header;
>>
>> The variables around the presence/absence of EAS are:
>> - sched_energy_present:
>> EAS is up and running
>>
>> - sysctl_sched_energy_aware:
>> The user wants to use EAS (or not). Doesn't mean EAS can run on the
>> platform.
>>
>> - sched_energy_set/partition_sched_domains_locked's "has_eas":
>> Local variable. Represent whether EAS can run on the platform.
>>
>> IMO it would be simpler to (un)register sched_energy_aware sysctl
>> in partition_sched_domains_locked(), based on the value of "has_eas".
>> This would allow to let all the logic as it is right now, inside
>> build_perf_domains(), and then advertise sched_energy_aware sysctl
>> if EAS can run on the platform.
>> sched_energy_aware_sysctl_init() would be deleted then.
>>
>>
> yes. that is true. and there is no variable which holds the info if the system
> is capable of EAS.
>
> Retrospecting, the reason for starting this patch series was this,
> sysctl_sched_energy_aware didnt make sense on power10 platform since it
> has SMT and symmetric CPU capacities. with current code writing 1 to
> it cause rebuild of sched domains but EAS wouldn't be possible.
>
>
> Possible Approaches:
>
> 1.
> Make this sysctl write as NOP if the platform doesn't has EAS capabilities at
> the moment. Do those checks in sched_energy_aware_handler before handling the
> change in value. Return EINVAL. And Update sysctl description that on such
> platforms value change is NOP. Relatively simpler change.
>
> 2.
> Current patch approach, remove the sysctl completely on non supported
> architectures and re-enable it if the system becomes capable of doing EAS.
> With the current patch, instead of using sched_energy_update, use another
> variable called sched_energy_change_in_sysctl(maybe different name). I think
> that would handle all the cases. Another variable can be avoided by encoding
> the info in sysctl_sched_energy_aware itself in the handler call, since it
> takes only 1 or 0 as the value. upper bits are free to use. update the sysctl
> as well with this behavior. plus minor cleanup to remove the init of sysctl.
>
FWIW I think it makes more sense to remove the sysctl if EAS isn't possible on
the platform, as this patch intends to do.
From a code perspective I m not sure I follow exactly your intend. I can test
your v3 if necessary,
Regards,
Pierre
Powered by blists - more mailing lists