[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <801be69a-0881-aa84-16fb-4b5782d95860@redhat.com>
Date: Thu, 17 Sep 2020 14:43:24 -0400
From: Nitesh Narayan Lal <nitesh@...hat.com>
To: Jesse Brandeburg <jesse.brandeburg@...el.com>
Cc: linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
linux-pci@...r.kernel.org, frederic@...nel.org,
mtosatti@...hat.com, sassmann@...hat.com,
jeffrey.t.kirsher@...el.com, jacob.e.keller@...el.com,
jlelli@...hat.com, hch@...radead.org, bhelgaas@...gle.com,
mike.marciniszyn@...el.com, dennis.dalessandro@...el.com,
thomas.lendacky@....com, jerinj@...vell.com,
mathias.nyman@...el.com, jiri@...dia.com,
"Luis Claudio R. Goncalves" <lgoncalv@...hat.com>
Subject: Re: [RFC][Patch v1 1/3] sched/isolation: API to get num of
hosekeeping CPUs
On 9/17/20 2:18 PM, Jesse Brandeburg wrote:
> Nitesh Narayan Lal wrote:
>
>> Introduce a new API num_housekeeping_cpus(), that can be used to retrieve
>> the number of housekeeping CPUs by reading an atomic variable
>> __num_housekeeping_cpus. This variable is set from housekeeping_setup().
>>
>> This API is introduced for the purpose of drivers that were previously
>> relying only on num_online_cpus() to determine the number of MSIX vectors
>> to create. In an RT environment with large isolated but a fewer
>> housekeeping CPUs this was leading to a situation where an attempt to
>> move all of the vectors corresponding to isolated CPUs to housekeeping
>> CPUs was failing due to per CPU vector limit.
>>
>> If there are no isolated CPUs specified then the API returns the number
>> of all online CPUs.
>>
>> Signed-off-by: Nitesh Narayan Lal <nitesh@...hat.com>
>> ---
>> include/linux/sched/isolation.h | 7 +++++++
>> kernel/sched/isolation.c | 23 +++++++++++++++++++++++
>> 2 files changed, 30 insertions(+)
> I'm not a scheduler expert, but a couple comments follow.
>
>> diff --git a/include/linux/sched/isolation.h b/include/linux/sched/isolation.h
>> index cc9f393e2a70..94c25d956d8a 100644
>> --- a/include/linux/sched/isolation.h
>> +++ b/include/linux/sched/isolation.h
>> @@ -25,6 +25,7 @@ extern bool housekeeping_enabled(enum hk_flags flags);
>> extern void housekeeping_affine(struct task_struct *t, enum hk_flags flags);
>> extern bool housekeeping_test_cpu(int cpu, enum hk_flags flags);
>> extern void __init housekeeping_init(void);
>> +extern unsigned int num_housekeeping_cpus(void);
>>
>> #else
>>
>> @@ -46,6 +47,12 @@ static inline bool housekeeping_enabled(enum hk_flags flags)
>> static inline void housekeeping_affine(struct task_struct *t,
>> enum hk_flags flags) { }
>> static inline void housekeeping_init(void) { }
>> +
>> +static unsigned int num_housekeeping_cpus(void)
>> +{
>> + return num_online_cpus();
>> +}
>> +
>> #endif /* CONFIG_CPU_ISOLATION */
>>
>> static inline bool housekeeping_cpu(int cpu, enum hk_flags flags)
>> diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
>> index 5a6ea03f9882..7024298390b7 100644
>> --- a/kernel/sched/isolation.c
>> +++ b/kernel/sched/isolation.c
>> @@ -13,6 +13,7 @@ DEFINE_STATIC_KEY_FALSE(housekeeping_overridden);
>> EXPORT_SYMBOL_GPL(housekeeping_overridden);
>> static cpumask_var_t housekeeping_mask;
>> static unsigned int housekeeping_flags;
>> +static atomic_t __num_housekeeping_cpus __read_mostly;
>>
>> bool housekeeping_enabled(enum hk_flags flags)
>> {
>> @@ -20,6 +21,27 @@ bool housekeeping_enabled(enum hk_flags flags)
>> }
>> EXPORT_SYMBOL_GPL(housekeeping_enabled);
>>
>> +/*
> use correct kdoc style, and you get free documentation from your source
> (you're so close!)
>
> should be (note the first line and the function title line change to
> remove parens:
> /**
> * num_housekeeping_cpus - Read the number of housekeeping CPUs.
> *
> * This function returns the number of available housekeeping CPUs
> * based on __num_housekeeping_cpus which is of type atomic_t
> * and is initialized at the time of the housekeeping setup.
> */
My bad, I missed that.
Thanks for pointing it out.
>
>> + * num_housekeeping_cpus() - Read the number of housekeeping CPUs.
>> + *
>> + * This function returns the number of available housekeeping CPUs
>> + * based on __num_housekeeping_cpus which is of type atomic_t
>> + * and is initialized at the time of the housekeeping setup.
>> + */
>> +unsigned int num_housekeeping_cpus(void)
>> +{
>> + unsigned int cpus;
>> +
>> + if (static_branch_unlikely(&housekeeping_overridden)) {
>> + cpus = atomic_read(&__num_housekeeping_cpus);
>> + /* We should always have at least one housekeeping CPU */
>> + BUG_ON(!cpus);
> you need to crash the kernel because of this? maybe a WARN_ON? How did
> the global even get set to the bad value? It's going to blame the poor
> caller for this in the trace, but the caller likely had nothing to do
> with setting the value incorrectly!
Yes, ideally this should not be triggered, but if somehow it does then we have
a bug and that needs to be fixed. That's probably the only reason why I chose
BUG_ON.
But, I am not entirely against the usage of WARN_ON either, because we get a
stack trace anyways.
I will see if anyone else has any other concerns on this patch and then I can
post the next version.
>
>> + return cpus;
>> + }
>> + return num_online_cpus();
>> +}
>> +EXPORT_SYMBOL_GPL(num_housekeeping_cpus);
--
Thanks
Nitesh
Download attachment "signature.asc" of type "application/pgp-signature" (834 bytes)
Powered by blists - more mailing lists