[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <f35dc0e3-c85e-30da-6119-fcf398b4b428@arm.com>
Date: Wed, 12 Feb 2020 19:24:13 +0000
From: Suzuki K Poulose <suzuki.poulose@....com>
To: Ionela.Voinescu@....com
Cc: catalin.marinas@....com, will@...nel.org, mark.rutland@....com,
maz@...nel.org, sudeep.holla@....com, lukasz.luba@....com,
valentin.schneider@....com, rjw@...ysocki.net,
peterz@...radead.org, mingo@...hat.com, vincent.guittot@...aro.org,
viresh.kumar@...aro.org, linux-arm-kernel@...ts.infradead.org,
linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-pm@...r.kernel.org
Subject: Re: [PATCH v3 1/7] arm64: add support for the AMU extension v1
Hi Ionela,
On 02/12/2020 06:20 PM, Ionela Voinescu wrote:
> Hi Suzuki,
>
> On Wednesday 12 Feb 2020 at 16:20:56 (+0000), Suzuki Kuruppassery Poulose wrote:
>>>>> +static bool has_amu(const struct arm64_cpu_capabilities *cap,
>>>>> + int __unused)
>>>>> +{
>>>>> + /*
>>>>> + * The AMU extension is a non-conflicting feature: the kernel can
>>>>> + * safely run a mix of CPUs with and without support for the
>>>>> + * activity monitors extension. Therefore, if not disabled through
>>>>> + * the kernel command line early parameter, enable the capability
>>>>> + * to allow any late CPU to use the feature.
>>>>> + *
>>>>> + * With this feature enabled, the cpu_enable function will be called
>>>>> + * for all CPUs that match the criteria, including secondary and
>>>>> + * hotplugged, marking this feature as present on that respective CPU.
>>>>> + * The enable function will also print a detection message.
>>>>> + */
>>>>> +
>>>>> + if (!disable_amu && !zalloc_cpumask_var(&amu_cpus, GFP_KERNEL)) {
>>>>
>>>> This looks problematic. Don't we end up in allocating the memory during
>>>> "each CPU" check and thus leaking memory ? Do we really need to allocate
>>>> this dynamically ?
>>>>
>>>
>>> Yes, it does make some assumptions. Given that the AMU capability is
>>> a WEAK_LOCAL_CPU_FEATURE I relied on the match function being called
>>> only once, when the return value is true. If the return value is false,
>>
>> That is not correct. A WEAK_LOCAL_CPU_FEATURE is still SCOPE_LOCAL_CPU,
>> implies it is run on all the booting CPUs (including the hotplugged
>> ones). The WEAK is there to imply that its "permitted" or "optional"
>> for a hotplugged CPU. So, eventually you will re-allocate this variable
>> every single time a CPU turns up, where you could also loose the current
>> state.
>>
>
>>> which will result in it being called multiple times, it's either because
>>> disable_amu == false, or it has become false due to a previous failed
>>> allocation, in which case a new allocation will not be attempted.
>
> First of all, I agree with you that this should be corrected.
>
> But for completion (and my education) I retraced my steps in regards
> to my assumption above. While cpu_enable is called for all CPUs - boot,
> secondary, hotplugged, the matches function (in this case has_amu) is
> not always called for all CPUs, and that's where the confusion came
> from.
>
> Looking over the update_cpu_capabilities function, that's called from
> both setup_boot_cpu_capabilities and check_local_cpu_capabilities
> (secondary CPUs) for SCOPE_LOCAL_CPU:
>
> -----
> static void update_cpu_capabilities(u16 scope_mask)
> {
> int i;
> const struct arm64_cpu_capabilities *caps;
>
> scope_mask &= ARM64_CPUCAP_SCOPE_MASK;
> for (i = 0; i < ARM64_NCAPS; i++) {
> caps = cpu_hwcaps_ptrs[i];
> if (!caps || !(caps->type & scope_mask) ||
> cpus_have_cap(caps->capability) ||
> !caps->matches(caps, cpucap_default_scope(caps)))
> continue;
>
> --> The matches function is only called if !cpus_have_cap
Agreed. Your analysis is correct. This was done as a micro
optimization(!) as it is pointless to check if something should be set,
when it is already set.
>
>
> if (caps->desc)
> pr_info("detected: %s\n", caps->desc);
> cpus_set_cap(caps->capability);
>
> --> If matches returns true we mark it as present in cpu_hwcaps.
>
> if ((scope_mask & SCOPE_BOOT_CPU) && (caps->type & SCOPE_BOOT_CPU))
> set_bit(caps->capability, boot_capabilities);
> }
> }
> ---
>
> Therefore caps->matches (in this case has_amu) will only be called as
> long as it returns false. This is where my assumption above came from.
> Also, this is the reason it was working nicely in my testing, as I did
> not test hotplug this time.
>
> Where the has_amu code breaks is when we end up calling
> verify_local_cpu_capabilities instead of update_cpu_capabilities after
> sys_caps_initialised, which will happen for hotplugged CPUs.
> In that case we call caps->matches for all CPUs. Also, if anyone in the
> future ends up calling this_cpu_has_cap for the AMU capability.
True.
>
> I will fix this.
Cheers
Suzuki
Powered by blists - more mailing lists