Date:   Sun, 20 Feb 2022 14:33:51 +1300
From:   Barry Song <21cnbao@...il.com>
To:     21cnbao@...il.com
Cc:     linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
        linuxarm@...wei.com, maz@...nel.org, song.bao.hua@...ilicon.com,
        tglx@...utronix.de, will@...nel.org
Subject: Re: [PATCH] irqchip/gic-v3: use dsb(ishst) to synchronize data to smp before issuing ipi

> So there is not much difference between vanilla and patched kernel.

Sorry, let me correct that.

I realized I should write some data before sending the IPI, so I have changed
the module as below:

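#include <linux/cache.h>
#include <linux/init.h>
#include <linux/ktime.h>
#include <linux/module.h>
#include <linux/printk.h>
#include <linux/smp.h>
#include <linux/timekeeping.h>

/* One cache line per variable, so each store dirties a separate line. */
volatile int data0 ____cacheline_aligned;
volatile int data1 ____cacheline_aligned;
volatile int data2 ____cacheline_aligned;
volatile int data3 ____cacheline_aligned;
volatile int data4 ____cacheline_aligned;
volatile int data5 ____cacheline_aligned;
volatile int data6 ____cacheline_aligned;

/* Empty IPI handler: only the send/complete round trip is measured. */
static void ipi_latency_func(void *val)
{
}

static int __init ipi_latency_init(void)
{
        ktime_t stime, etime, delta;
        int cpu, i;
        /* Used for reporting only; raw_ avoids the preemptible-context check. */
        int start = raw_smp_processor_id();

        stime = ktime_get();
        for (i = 0; i < 1000; i++)
                /* Hard-coded range: assumes CPUs 0-95 are all online. */
                for (cpu = 0; cpu < 96; cpu++) {
                        /* Write seven cache lines before each IPI. */
                        data0 = data1 = data2 = data3 = data4 = data5 = data6 = cpu;
                        /* wait=1: return only after the handler ran on the target. */
                        smp_call_function_single(cpu, ipi_latency_func, NULL, 1);
                }
        etime = ktime_get();

        delta = ktime_sub(etime, stime);

        /* Total elapsed time in ns for 1000 rounds over all 96 CPUs. */
        pr_info("%s ipi from cpu%d to cpu0-95 delta of 1000times:%lld\n",
                __func__, start, ktime_to_ns(delta));

        return 0;
}
module_init(ipi_latency_init);

static void __exit ipi_latency_exit(void)
{
}
module_exit(ipi_latency_exit);

MODULE_DESCRIPTION("IPI benchmark");
MODULE_LICENSE("GPL");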

After that, I can see a small difference (roughly 2%) between the patched kernel and vanilla:

vanilla:
[  375.220131] ipi_latency_init ipi from cpu0 to cpu0-95 delta of 1000times:126757449
[  375.382596] ipi_latency_init ipi from cpu0 to cpu0-95 delta of 1000times:126784249
[  375.537975] ipi_latency_init ipi from cpu0 to cpu0-95 delta of 1000times:126177703
[  375.686823] ipi_latency_init ipi from cpu0 to cpu0-95 delta of 1000times:127022281
[  375.849967] ipi_latency_init ipi from cpu0 to cpu0-95 delta of 1000times:126184883
[  375.999173] ipi_latency_init ipi from cpu0 to cpu0-95 delta of 1000times:127374585
[  376.149565] ipi_latency_init ipi from cpu0 to cpu0-95 delta of 1000times:125778089
[  376.298743] ipi_latency_init ipi from cpu0 to cpu0-95 delta of 1000times:126974441
[  376.451125] ipi_latency_init ipi from cpu0 to cpu0-95 delta of 1000times:127357625
[  376.606006] ipi_latency_init ipi from cpu0 to cpu0-95 delta of 1000times:126228184

[  371.405378] ipi_latency_init ipi from cpu48 to cpu0-95 delta of 1000times:151851181
[  371.591642] ipi_latency_init ipi from cpu48 to cpu0-95 delta of 1000times:151568608
[  371.767906] ipi_latency_init ipi from cpu48 to cpu0-95 delta of 1000times:151853441
[  371.944031] ipi_latency_init ipi from cpu48 to cpu0-95 delta of 1000times:152065453
[  372.114085] ipi_latency_init ipi from cpu48 to cpu0-95 delta of 1000times:146122093
[  372.291345] ipi_latency_init ipi from cpu48 to cpu0-95 delta of 1000times:151379636
[  372.459812] ipi_latency_init ipi from cpu48 to cpu0-95 delta of 1000times:151854411
[  372.629708] ipi_latency_init ipi from cpu48 to cpu0-95 delta of 1000times:145750720
[  372.807574] ipi_latency_init ipi from cpu48 to cpu0-95 delta of 1000times:151629448
[  372.994979] ipi_latency_init ipi from cpu48 to cpu0-95 delta of 1000times:151050253

patched kernel:
[  105.598815] ipi_latency_init ipi from cpu0 to cpu0-95 delta of 1000times:124467401
[  105.748368] ipi_latency_init ipi from cpu0 to cpu0-95 delta of 1000times:123474209
[  105.900400] ipi_latency_init ipi from cpu0 to cpu0-95 delta of 1000times:123558497
[  106.043890] ipi_latency_init ipi from cpu0 to cpu0-95 delta of 1000times:122993951
[  106.191845] ipi_latency_init ipi from cpu0 to cpu0-95 delta of 1000times:122984223
[  106.348215] ipi_latency_init ipi from cpu0 to cpu0-95 delta of 1000times:123323609
[  106.501448] ipi_latency_init ipi from cpu0 to cpu0-95 delta of 1000times:124507583
[  106.656358] ipi_latency_init ipi from cpu0 to cpu0-95 delta of 1000times:123386963
[  106.804367] ipi_latency_init ipi from cpu0 to cpu0-95 delta of 1000times:123340664
[  106.956331] ipi_latency_init ipi from cpu0 to cpu0-95 delta of 1000times:123285324

[  108.930802] ipi_latency_init ipi from cpu48 to cpu0-95 delta of 1000times:143616067
[  109.094750] ipi_latency_init ipi from cpu48 to cpu0-95 delta of 1000times:148969821
[  109.267428] ipi_latency_init ipi from cpu48 to cpu0-95 delta of 1000times:149648418
[  109.443274] ipi_latency_init ipi from cpu48 to cpu0-95 delta of 1000times:149448903
[  109.621760] ipi_latency_init ipi from cpu48 to cpu0-95 delta of 1000times:147882917
[  109.794611] ipi_latency_init ipi from cpu48 to cpu0-95 delta of 1000times:148700282
[  109.975197] ipi_latency_init ipi from cpu48 to cpu0-95 delta of 1000times:149050595
[  110.141543] ipi_latency_init ipi from cpu48 to cpu0-95 delta of 1000times:143566604
[  110.315213] ipi_latency_init ipi from cpu48 to cpu0-95 delta of 1000times:149202898
[  110.491008] ipi_latency_init ipi from cpu48 to cpu0-95 delta of 1000times:148958261

As you can see, when cpu0 is the source, vanilla takes 125xxxxxx-127xxxxxx ns,
while the patched kernel takes 122xxxxxx-124xxxxxx ns.

Thanks
Barry
