lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 17 Sep 2020 14:43:24 -0400
From:   Nitesh Narayan Lal <nitesh@...hat.com>
To:     Jesse Brandeburg <jesse.brandeburg@...el.com>
Cc:     linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
        linux-pci@...r.kernel.org, frederic@...nel.org,
        mtosatti@...hat.com, sassmann@...hat.com,
        jeffrey.t.kirsher@...el.com, jacob.e.keller@...el.com,
        jlelli@...hat.com, hch@...radead.org, bhelgaas@...gle.com,
        mike.marciniszyn@...el.com, dennis.dalessandro@...el.com,
        thomas.lendacky@....com, jerinj@...vell.com,
        mathias.nyman@...el.com, jiri@...dia.com,
        "Luis Claudio R. Goncalves" <lgoncalv@...hat.com>
Subject: Re: [RFC][Patch v1 1/3] sched/isolation: API to get num of
 hosekeeping CPUs


On 9/17/20 2:18 PM, Jesse Brandeburg wrote:
> Nitesh Narayan Lal wrote:
>
>> Introduce a new API num_housekeeping_cpus(), that can be used to retrieve
>> the number of housekeeping CPUs by reading an atomic variable
>> __num_housekeeping_cpus. This variable is set from housekeeping_setup().
>>
>> This API is introduced for the purpose of drivers that were previously
>> relying only on num_online_cpus() to determine the number of MSIX vectors
>> to create. In an RT environment with large isolated but a fewer
>> housekeeping CPUs this was leading to a situation where an attempt to
>> move all of the vectors corresponding to isolated CPUs to housekeeping
>> CPUs was failing due to per CPU vector limit.
>>
>> If there are no isolated CPUs specified then the API returns the number
>> of all online CPUs.
>>
>> Signed-off-by: Nitesh Narayan Lal <nitesh@...hat.com>
>> ---
>>  include/linux/sched/isolation.h |  7 +++++++
>>  kernel/sched/isolation.c        | 23 +++++++++++++++++++++++
>>  2 files changed, 30 insertions(+)
> I'm not a scheduler expert, but a couple comments follow.
>
>> diff --git a/include/linux/sched/isolation.h b/include/linux/sched/isolation.h
>> index cc9f393e2a70..94c25d956d8a 100644
>> --- a/include/linux/sched/isolation.h
>> +++ b/include/linux/sched/isolation.h
>> @@ -25,6 +25,7 @@ extern bool housekeeping_enabled(enum hk_flags flags);
>>  extern void housekeeping_affine(struct task_struct *t, enum hk_flags flags);
>>  extern bool housekeeping_test_cpu(int cpu, enum hk_flags flags);
>>  extern void __init housekeeping_init(void);
>> +extern unsigned int num_housekeeping_cpus(void);
>>  
>>  #else
>>  
>> @@ -46,6 +47,12 @@ static inline bool housekeeping_enabled(enum hk_flags flags)
>>  static inline void housekeeping_affine(struct task_struct *t,
>>  				       enum hk_flags flags) { }
>>  static inline void housekeeping_init(void) { }
>> +
>> +static unsigned int num_housekeeping_cpus(void)
>> +{
>> +	return num_online_cpus();
>> +}
>> +
>>  #endif /* CONFIG_CPU_ISOLATION */
>>  
>>  static inline bool housekeeping_cpu(int cpu, enum hk_flags flags)
>> diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
>> index 5a6ea03f9882..7024298390b7 100644
>> --- a/kernel/sched/isolation.c
>> +++ b/kernel/sched/isolation.c
>> @@ -13,6 +13,7 @@ DEFINE_STATIC_KEY_FALSE(housekeeping_overridden);
>>  EXPORT_SYMBOL_GPL(housekeeping_overridden);
>>  static cpumask_var_t housekeeping_mask;
>>  static unsigned int housekeeping_flags;
>> +static atomic_t __num_housekeeping_cpus __read_mostly;
>>  
>>  bool housekeeping_enabled(enum hk_flags flags)
>>  {
>> @@ -20,6 +21,27 @@ bool housekeeping_enabled(enum hk_flags flags)
>>  }
>>  EXPORT_SYMBOL_GPL(housekeeping_enabled);
>>  
>> +/*
> use correct kdoc style, and you get free documentation from your source
> (you're so close!)
>
> should be (note the first line and the function title line change to
> remove parens:
> /**
>  * num_housekeeping_cpus - Read the number of housekeeping CPUs.
>  *
>  * This function returns the number of available housekeeping CPUs
>  * based on __num_housekeeping_cpus which is of type atomic_t
>  * and is initialized at the time of the housekeeping setup.
>  */

My bad, I missed that.
Thanks for pointing it out.

>
>> + * num_housekeeping_cpus() - Read the number of housekeeping CPUs.
>> + *
>> + * This function returns the number of available housekeeping CPUs
>> + * based on __num_housekeeping_cpus which is of type atomic_t
>> + * and is initialized at the time of the housekeeping setup.
>> + */
>> +unsigned int num_housekeeping_cpus(void)
>> +{
>> +	unsigned int cpus;
>> +
>> +	if (static_branch_unlikely(&housekeeping_overridden)) {
>> +		cpus = atomic_read(&__num_housekeeping_cpus);
>> +		/* We should always have at least one housekeeping CPU */
>> +		BUG_ON(!cpus);
> you need to crash the kernel because of this? maybe a WARN_ON? How did
> the global even get set to the bad value? It's going to blame the poor
> caller for this in the trace, but the caller likely had nothing to do
> with setting the value incorrectly!

Yes, ideally this should not be triggered, but if somehow it does then we have
a bug and that needs to be fixed. That's probably the only reason why I chose
BUG_ON.
But, I am not entirely against the usage of WARN_ON either, because we get a
stack trace anyways.
I will see if anyone else has any other concerns on this patch and then I can
post the next version.

>
>> +		return cpus;
>> +	}
>> +	return num_online_cpus();
>> +}
>> +EXPORT_SYMBOL_GPL(num_housekeeping_cpus);
-- 
Thanks
Nitesh



Download attachment "signature.asc" of type "application/pgp-signature" (834 bytes)

Powered by blists - more mailing lists