lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <xhsmha5tjvfgm.mognet@vschneid.remote.csb>
Date:   Mon, 18 Sep 2023 14:22:17 +0200
From:   Valentin Schneider <vschneid@...hat.com>
To:     Shrikanth Hegde <sshegde@...ux.vnet.ibm.com>
Cc:     dietmar.eggemann@....com, linux-kernel@...r.kernel.org,
        ionela.voinescu@....com, quentin.perret@....com,
        srikar@...ux.vnet.ibm.com, mgorman@...hsingularity.net,
        mingo@...nel.org, pierre.gondois@....com, yu.c.chen@...el.com,
        tim.c.chen@...ux.intel.com, mingo@...hat.com, peterz@...radead.org,
        vincent.guittot@...aro.org
Subject: Re: [PATCH v3] sched/topology: remove sysctl_sched_energy_aware
 depending on the architecture

On 15/09/23 23:40, Shrikanth Hegde wrote:
> On 9/15/23 5:30 PM, Valentin Schneider wrote:
>> On 14/09/23 23:26, Shrikanth Hegde wrote:
>>> On 9/14/23 9:51 PM, Valentin Schneider wrote:
>>>> On 13/09/23 17:18, Shrikanth Hegde wrote:
>>>>> sysctl_sched_energy_aware is available for the admin to disable/enable
>>>>> energy aware scheduling(EAS). EAS is enabled only if few conditions are
>>>>> met by the platform. They are, asymmetric CPU capacity, no SMT,
>>>>> valid cpufreq policy, frequency invariant load tracking. It is possible
>>>>> platform when booting may not have EAS capability, but can do that after.
>>>>> For example, changing/registering the cpufreq policy.
>>>>>
>>>>> At present, though platform doesn't support EAS, this sysctl is still
>>>>> present and it ends up calling rebuild of sched domain on write to 1 and
>>>>> NOP when writing to 0. That is confusing and un-necessary.
>>>>>
>>>>
>>>
>>> Hi Valentin, Thanks for taking a look at this patch.
>>>
>>>> But why would you write to it in the first place? Or do you mean to use
>>>> this as an indicator for userspace that EAS is supported?
>>>>
>>>
>>> Since this sysctl is present and its value being 1, it gives the
>>> impression to the user that EAS is supported when it is not.
>>> So its an attempt to correct that part.
>>>
>> 
>> Ah, I see. Then how about just making the sysctl return 0 when EAS isn't
>> supported? And on top of it, prevent all writes when EAS isn't supported
>> (perf domains cannot be built, so there would be no point in forcing a
>> rebuild that will do nothing).
>
> Yes. That's another way. Thats what I had as possible approach in 
> https://lore.kernel.org/lkml/d2c945d6-c4f0-a096-0623-731b11484f51@linux.vnet.ibm.com/
>

Thanks for the link; and apologies for bringing up topics that have been
discussed already.

>
>
>> 
>> I can never remember how to properly use the sysctl API, so that's a very
>> crude implementation, but something like so?
>> 
>> ---
>> 
>> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
>> index 05a5bc678c089..dadfc5afc4121 100644
>> --- a/kernel/sched/topology.c
>> +++ b/kernel/sched/topology.c
>> @@ -230,9 +230,28 @@ static int sched_energy_aware_handler(struct ctl_table *table, int write,
>>  	if (write && !capable(CAP_SYS_ADMIN))
>>  		return -EPERM;
>> 
>> +	if (!sched_energy_enabled()) {
>
> Use of sched_energy_enabled won't work as Pierre has indicated. 
>
> Instead this can be done by adding those checks in a helper function to
> do similar checks as done build_perf_domains. 
>
> I can send v4 with this approach if it makes more sense. Please let me know.
>

So what I'm thinking is the standard approach seems to be to keep the knobs
visible, but change how reads/writes to them are handled.

For instance, SMT support has

  /sys/devices/system/cpu/smt
    /control
    /active

And a system with CONFIG_HOTPLUG_SMT=y but no actual hardware SMT will
have:

    /control = notsupported
    /active  = 0

So IMO it would make sense to keep sched_energy_aware around, but make it
read 0 and prevent writes for systems that have the software support
compiled but don't have the actual hardware support.

In a pinch it also helps to know if CONFIG_ENERGY_MODEL was selected,
though that's obvious enough with CONFIG_SCHED_DEBUG=y.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ