Message-Id: <20220220013351.311430-1-21cnbao@gmail.com>
Date: Sun, 20 Feb 2022 14:33:51 +1300
From: Barry Song <21cnbao@...il.com>
To: 21cnbao@...il.com
Cc: linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
linuxarm@...wei.com, maz@...nel.org, song.bao.hua@...ilicon.com,
tglx@...utronix.de, will@...nel.org
Subject: Re: [PATCH] irqchip/gic-v3: use dsb(ishst) to synchronize data to smp before issuing ipi
> So there is not much difference between vanilla and patched kernel.
Sorry, let me correct that.
I realized the benchmark should write some data before sending each IPI, so I
have changed the module as below:
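#include <linux/cache.h>
#include <linux/init.h>
#include <linux/module.h>
#include <linux/printk.h>
#include <linux/smp.h>
#include <linux/timekeeping.h>

/* Each variable is cache-line aligned, so every store below hits a separate line. */
volatile int data0 ____cacheline_aligned;
volatile int data1 ____cacheline_aligned;
volatile int data2 ____cacheline_aligned;
volatile int data3 ____cacheline_aligned;
volatile int data4 ____cacheline_aligned;
volatile int data5 ____cacheline_aligned;
volatile int data6 ____cacheline_aligned;

/* Empty IPI handler: only the send/wait round trip is being timed. */
static void ipi_latency_func(void *val)
{
}

static int __init ipi_latency_init(void)
{
	ktime_t stime, etime, delta;
	int cpu, i;
	/*
	 * Module init runs preemptible, so use the raw_ variant. Assumption:
	 * the loading task is pinned to one CPU, so the value stays valid.
	 */
	int start = raw_smp_processor_id();

	stime = ktime_get();
	for (i = 0; i < 1000; i++)
		/* the test targets cpu0-95 */
		for (cpu = 0; cpu < 96; cpu++) {
			/* write some data before sending the IPI */
			data0 = data1 = data2 = data3 = data4 = data5 = data6 = cpu;
			/* wait == 1: return only after the target has run the handler */
			smp_call_function_single(cpu, ipi_latency_func, NULL, 1);
		}
	etime = ktime_get();
	delta = ktime_sub(etime, stime);
	pr_info("%s ipi from cpu%d to cpu0-95 delta of 1000times:%lld\n",
		__func__, start, ktime_to_ns(delta));
	return 0;
}
module_init(ipi_latency_init);

static void __exit ipi_latency_exit(void)
{
}
module_exit(ipi_latency_exit);

MODULE_DESCRIPTION("IPI benchmark");
MODULE_LICENSE("GPL");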
After that, I can see a difference of roughly 2% between the patched kernel and vanilla:
vanilla:
[ 375.220131] ipi_latency_init ipi from cpu0 to cpu0-95 delta of 1000times:126757449
[ 375.382596] ipi_latency_init ipi from cpu0 to cpu0-95 delta of 1000times:126784249
[ 375.537975] ipi_latency_init ipi from cpu0 to cpu0-95 delta of 1000times:126177703
[ 375.686823] ipi_latency_init ipi from cpu0 to cpu0-95 delta of 1000times:127022281
[ 375.849967] ipi_latency_init ipi from cpu0 to cpu0-95 delta of 1000times:126184883
[ 375.999173] ipi_latency_init ipi from cpu0 to cpu0-95 delta of 1000times:127374585
[ 376.149565] ipi_latency_init ipi from cpu0 to cpu0-95 delta of 1000times:125778089
[ 376.298743] ipi_latency_init ipi from cpu0 to cpu0-95 delta of 1000times:126974441
[ 376.451125] ipi_latency_init ipi from cpu0 to cpu0-95 delta of 1000times:127357625
[ 376.606006] ipi_latency_init ipi from cpu0 to cpu0-95 delta of 1000times:126228184
[ 371.405378] ipi_latency_init ipi from cpu48 to cpu0-95 delta of 1000times:151851181
[ 371.591642] ipi_latency_init ipi from cpu48 to cpu0-95 delta of 1000times:151568608
[ 371.767906] ipi_latency_init ipi from cpu48 to cpu0-95 delta of 1000times:151853441
[ 371.944031] ipi_latency_init ipi from cpu48 to cpu0-95 delta of 1000times:152065453
[ 372.114085] ipi_latency_init ipi from cpu48 to cpu0-95 delta of 1000times:146122093
[ 372.291345] ipi_latency_init ipi from cpu48 to cpu0-95 delta of 1000times:151379636
[ 372.459812] ipi_latency_init ipi from cpu48 to cpu0-95 delta of 1000times:151854411
[ 372.629708] ipi_latency_init ipi from cpu48 to cpu0-95 delta of 1000times:145750720
[ 372.807574] ipi_latency_init ipi from cpu48 to cpu0-95 delta of 1000times:151629448
[ 372.994979] ipi_latency_init ipi from cpu48 to cpu0-95 delta of 1000times:151050253
patched kernel:
[ 105.598815] ipi_latency_init ipi from cpu0 to cpu0-95 delta of 1000times:124467401
[ 105.748368] ipi_latency_init ipi from cpu0 to cpu0-95 delta of 1000times:123474209
[ 105.900400] ipi_latency_init ipi from cpu0 to cpu0-95 delta of 1000times:123558497
[ 106.043890] ipi_latency_init ipi from cpu0 to cpu0-95 delta of 1000times:122993951
[ 106.191845] ipi_latency_init ipi from cpu0 to cpu0-95 delta of 1000times:122984223
[ 106.348215] ipi_latency_init ipi from cpu0 to cpu0-95 delta of 1000times:123323609
[ 106.501448] ipi_latency_init ipi from cpu0 to cpu0-95 delta of 1000times:124507583
[ 106.656358] ipi_latency_init ipi from cpu0 to cpu0-95 delta of 1000times:123386963
[ 106.804367] ipi_latency_init ipi from cpu0 to cpu0-95 delta of 1000times:123340664
[ 106.956331] ipi_latency_init ipi from cpu0 to cpu0-95 delta of 1000times:123285324
[ 108.930802] ipi_latency_init ipi from cpu48 to cpu0-95 delta of 1000times:143616067
[ 109.094750] ipi_latency_init ipi from cpu48 to cpu0-95 delta of 1000times:148969821
[ 109.267428] ipi_latency_init ipi from cpu48 to cpu0-95 delta of 1000times:149648418
[ 109.443274] ipi_latency_init ipi from cpu48 to cpu0-95 delta of 1000times:149448903
[ 109.621760] ipi_latency_init ipi from cpu48 to cpu0-95 delta of 1000times:147882917
[ 109.794611] ipi_latency_init ipi from cpu48 to cpu0-95 delta of 1000times:148700282
[ 109.975197] ipi_latency_init ipi from cpu48 to cpu0-95 delta of 1000times:149050595
[ 110.141543] ipi_latency_init ipi from cpu48 to cpu0-95 delta of 1000times:143566604
[ 110.315213] ipi_latency_init ipi from cpu48 to cpu0-95 delta of 1000times:149202898
[ 110.491008] ipi_latency_init ipi from cpu48 to cpu0-95 delta of 1000times:148958261
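As you can see, with cpu0 as the source, vanilla takes 125xxxxxx-127xxxxxx ns
while the patched kernel takes 122xxxxxx-124xxxxxx ns. With cpu48 as the source,
vanilla takes 145xxxxxx-152xxxxxx ns and the patched kernel 143xxxxxx-149xxxxxx ns.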
Thanks
Barry