linux-kernel - Re: [RFC PATCH v2 6/7] lib/persubnode: Introducing a simple per-subnode APIs


Message-ID: <578538FF.7040306@hpe.com>
Date:	Tue, 12 Jul 2016 14:37:51 -0400
From:	Waiman Long <waiman.long@....com>
To:	Boqun Feng <boqun.feng@...il.com>
CC:	Alexander Viro <viro@...iv.linux.org.uk>, Jan Kara <jack@...e.com>,
	Jeff Layton <jlayton@...chiereds.net>,
	"J. Bruce Fields" <bfields@...ldses.org>,
	Tejun Heo <tj@...nel.org>,
	Christoph Lameter <cl@...ux-foundation.org>,
	<linux-fsdevel@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
	Ingo Molnar <mingo@...hat.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Andi Kleen <andi@...stfloor.org>,
	Dave Chinner <dchinner@...hat.com>,
	Scott J Norton <scott.norton@....com>,
	Douglas Hatch <doug.hatch@....com>
Subject: Re: [RFC PATCH v2 6/7] lib/persubnode: Introducing a simple per-subnode
 APIs

On 07/11/2016 11:14 PM, Boqun Feng wrote:
> On Mon, Jul 11, 2016 at 01:32:11PM -0400, Waiman Long wrote:
>> +/*
>> + * Initialize the subnodes
>> + *
>> + * All the sibling CPUs will be in the same subnode. On top of that, we will
>> + * put at most 2 sibling groups into the same subnode. The percpu
>> + * topology_sibling_cpumask() and topology_core_cpumask() are used for
>> + * grouping CPUs into subnodes. The subnode ID is the CPU number of the
>> + * first CPU in the subnode.
>> + */
>> +static int __init subnode_init(void)
>> +{
>> +	int cpu;
>> +	int nr_subnodes = 0;
>> +	const int subnode_nr_cpus = 2;
>> +
>> +	/*
>> +	 * Some of the bits in the subnode_mask will be cleared as we proceed.
>> +	 */
>> +	for_each_cpu(cpu, subnode_mask) {
>> +		int ccpu, scpu;
>> +		int cpucnt = 0;
>> +
>> +		cpumask_var_t core_mask = topology_core_cpumask(cpu);
>> +		cpumask_var_t sibling_mask;
>> +
>> +		/*
>> +		 * Put subnode_nr_cpus of CPUs and their siblings into each
>> +		 * subnode.
>> +		 */
>> +		for_each_cpu_from(cpu, ccpu, core_mask) {
>> +			sibling_mask = topology_sibling_cpumask(ccpu);
>> +			for_each_cpu_from(ccpu, scpu, sibling_mask) {
>> +				/*
>> +				 * Clear the bits of the higher CPUs.
>> +				 */
>> +				if (scpu > cpu)
>> +					cpumask_clear_cpu(scpu, subnode_mask);
> Do we also need to clear the 'core_mask' here? Consider a core consisting
> of two sibling groups, each consisting of two CPUs. At the beginning of
> the outer loop (for_each_cpu_from(cpu, ccpu, core_mask)):
>
> 'core_mask' is 0b1111
>
> so at the beginning of the inner loop first time:
>
> 'ccpu' is 0, therefore 'sibling_mask' is 0b1100, in this loop we set the
> 'cpu_subnode_id' of cpu 0 and 1 to 0.
>
> at the beginning of the inner loop second time:
>
> 'ccpu' is 1 because we don't clear cpu 1 from 'core_mask'. Therefore
> sibling_mask is still 0b1100, so in this loop we still do the setting on
> 'cpu_subnode_id' of cpu 0 and 1.
>
> Am I missing something here?
>

You are right. The current code works in my tests because the two sibling
CPUs of a core occupy a lower and a higher number, such as (0, 72) on a
72-core system. It may not work with other sibling CPU assignments.

The core_mask, however, is global data that we must not modify, so I will
make the following change instead:

diff --git a/lib/persubnode.c b/lib/persubnode.c
index 9febe7c..d1c8c29 100644
--- a/lib/persubnode.c
+++ b/lib/persubnode.c
@@ -94,6 +94,8 @@ static int __init subnode_init(void)
                  * subnode.
                  */
                 for_each_cpu_from(cpu, ccpu, core_mask) {
+                       if (!cpumask_test_cpu(ccpu, subnode_mask))
+                               continue;       /* Skip allocated CPU */
                         sibling_mask = topology_sibling_cpumask(ccpu);
                         for_each_cpu_from(ccpu, scpu, sibling_mask) {
                                 /*
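
To illustrate the effect of the check, here is a minimal user-space sketch, not the kernel code. It assumes plain unsigned bitmasks in place of cpumasks, a hypothetical 4-CPU core_mask, and two hypothetical sibling layouts. The cpucnt accounting (one count per sibling group, stop at subnode_nr_cpus) is also an assumption, since that part of subnode_init() is not shown in the quoted patch. The names build_subnodes, adjacent_siblings and split_siblings are made up for this example.

```c
#define NR_CPUS 4

/* Hypothetical topology: all four CPUs share one core_mask. */
static const unsigned core_mask = 0xF;

/* Siblings numbered adjacently: groups {0,1} and {2,3}. */
static const unsigned adjacent_siblings[NR_CPUS] = { 0x3, 0x3, 0xC, 0xC };
/* Siblings numbered far apart, like (0, 72): groups {0,2} and {1,3}. */
static const unsigned split_siblings[NR_CPUS] = { 0x5, 0xA, 0x5, 0xA };

static int subnode_id[NR_CPUS];

/*
 * Group CPUs into subnodes and return how many subnodes were created.
 * skip_allocated enables the proposed "skip allocated CPU" check.
 */
static int build_subnodes(const unsigned *sibling_of, int skip_allocated)
{
	unsigned subnode_mask = 0xF;	/* bits are cleared as we proceed */
	const int subnode_nr_cpus = 2;	/* sibling groups per subnode */
	int nr_subnodes = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		int cpucnt = 0;

		if (!(subnode_mask & (1u << cpu)))
			continue;	/* already placed in an earlier subnode */

		for (int ccpu = cpu; ccpu < NR_CPUS; ccpu++) {
			if (!(core_mask & (1u << ccpu)))
				continue;
			/* Proposed fix: do not revisit an allocated CPU. */
			if (skip_allocated && !(subnode_mask & (1u << ccpu)))
				continue;

			for (int scpu = ccpu; scpu < NR_CPUS; scpu++) {
				if (!(sibling_of[ccpu] & (1u << scpu)))
					continue;
				if (scpu > cpu)
					subnode_mask &= ~(1u << scpu);
				subnode_id[scpu] = cpu;
			}
			/* Assumed accounting: one count per sibling group. */
			if (++cpucnt == subnode_nr_cpus)
				break;
		}
		nr_subnodes++;
	}
	return nr_subnodes;
}
```

Under these assumptions, build_subnodes(adjacent_siblings, 0) counts the {0,1} group twice and ends up with two subnodes, while build_subnodes(adjacent_siblings, 1) puts all four CPUs into subnode 0. With split_siblings, both variants produce a single subnode, which would explain why the problem did not show up in my tests.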

Thanks for catching this bug.

Cheers,
Longman