Message-ID: <d8da7d88bf91470cb1bc90630d6a7aff@baidu.com>
Date: Thu, 21 Aug 2025 12:19:04 +0000
From: "Li,Rongqing" <lirongqing@...du.com>
To: Valentin Schneider <vschneid@...hat.com>, "mingo@...hat.com"
	<mingo@...hat.com>, "peterz@...radead.org" <peterz@...radead.org>,
	"juri.lelli@...hat.com" <juri.lelli@...hat.com>, "vincent.guittot@...aro.org"
	<vincent.guittot@...aro.org>, "dietmar.eggemann@....com"
	<dietmar.eggemann@....com>, "rostedt@...dmis.org" <rostedt@...dmis.org>,
	"bsegall@...gle.com" <bsegall@...gle.com>, "mgorman@...e.de"
	<mgorman@...e.de>, "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>
Subject: RE: Re: [PATCH] sched/fair: Optimize CPU iteration using
 for_each_cpu_and[not]



> On 15/08/25 09:15, lirongqing wrote:
> > From: Li RongQing <lirongqing@...du.com>
> >
> > Replace open-coded CPU iteration patterns with more efficient
> > for_each_cpu_and() and for_each_cpu_andnot() macros in three locations.
> >
> > This change both simplifies the code and provides minor performance
> > improvements by using the more specialized iteration macros.
> >
> 
> TBF I'm not sure it does improve anything for the SMT cases considering we
> don't see much more than SMT8.
> 

I ran the simple test below on a 128-CPU, SMT-2 machine, and the results show that for_each_cpu_andnot() is faster (elapsed local_clock() time, in ns):

for_each_cpu + if()    vs   for_each_cpu_andnot()
5026373                vs   3398283
4034229                vs   2711302



#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/cpumask.h>
#include <linux/sched/clock.h>

static int test_init(void)
{

        int cpu, sibling;
        int i = 0;
        int loop = 1000;
        u64 now;

        now = local_clock();

        while (loop--) {
                for (cpu = 0; cpu < 128; cpu++) {
                        for_each_cpu(sibling, cpu_smt_mask(cpu)) {
                                if (cpu == sibling)
                                        continue;
                                i++;
                        }
                }
        }
        /* elapsed ns, then the sibling count so the loop result is used */
        printk("%llu %d\n", local_clock() - now, i);

        i = 0;
        loop = 1000;

        now = local_clock();
        while (loop--) {
                for (cpu = 0; cpu < 128; cpu++) {
                        for_each_cpu_andnot(sibling, cpu_smt_mask(cpu), cpumask_of(cpu)) {
                                i++;
                        }
                }
        }

        /* elapsed ns, then the sibling count so the loop result is used */
        printk("%llu %d\n", local_clock() - now, i);


        return -1;
}

module_init(test_init);
MODULE_LICENSE("GPL");
MODULE_INFO(livepatch, "Y");
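
The two loop shapes timed above can also be modelled in user space. The following is only a sketch under stated assumptions: mask_t, count_siblings_skip() and count_siblings_andnot() are hypothetical names, a single 64-bit word stands in for a cpumask, and nothing here is the kernel's implementation. It shows why the and-not form does less work per mask: the CPU's own bit is cleared once up front, so the loop body never has to test for it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a cpumask: one 64-bit word, bit n = CPU n. */
typedef uint64_t mask_t;

/* Pattern 1: walk every set bit and skip the CPU itself inside the body. */
int count_siblings_skip(mask_t smt_mask, unsigned int cpu)
{
	int n = 0;

	assert(cpu < 64);
	for (unsigned int sibling = 0; sibling < 64; sibling++) {
		if (!(smt_mask & ((mask_t)1 << sibling)))
			continue;	/* not in the SMT mask */
		if (sibling == cpu)
			continue;	/* per-iteration test for "self" */
		n++;
	}
	return n;
}

/* Pattern 2: clear the CPU's own bit once, then walk only what remains. */
int count_siblings_andnot(mask_t smt_mask, unsigned int cpu)
{
	mask_t rest;
	int n = 0;

	assert(cpu < 64);
	rest = smt_mask & ~((mask_t)1 << cpu);
	while (rest) {
		rest &= rest - 1;	/* drop the lowest set bit */
		n++;
	}
	return n;
}
```

Both functions return the same count for any mask and CPU; they differ only in how many iterations and branches are needed to get there.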

Thanks

-Li





> The task_numa_find_cpu() one I do agree makes things better.
> 
> > Signed-off-by: Li RongQing <lirongqing@...du.com>
> 
> Reviewed-by: Valentin Schneider <vschneid@...hat.com>

