Date:   Mon, 11 Dec 2017 17:30:25 +0100
From:   Christian Borntraeger <borntraeger@...ibm.com>
To:     Yury Norov <ynorov@...iumnetworks.com>
Cc:     linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
        linux-arm-kernel@...ts.infradead.org,
        Andrew Morton <akpm@...ux-foundation.org>,
        Ashish Kalra <Ashish.Kalra@...ium.com>,
        Christoffer Dall <christoffer.dall@...aro.org>,
        Geert Uytterhoeven <geert@...ux-m68k.org>,
        Linu Cherian <Linu.Cherian@...ium.com>,
        Sunil Goutham <Sunil.Goutham@...ium.com>
Subject: Re: [PATCH] IPI performance benchmark



On 12/11/2017 03:55 PM, Yury Norov wrote:
> On Mon, Dec 11, 2017 at 03:35:02PM +0100, Christian Borntraeger wrote:
>>
>>
>> On 12/11/2017 03:16 PM, Yury Norov wrote:
>>> This benchmark sends many IPIs in different modes and measures
>>> time for IPI delivery (first column), and total time, ie including
>>> time to acknowledge the receive by sender (second column).
>>>
>>> The scenarios are:
>>> Dry-run:	do everything except actually sending IPI. Useful
>>> 		to estimate system overhead.
>>> Self-IPI:	Send IPI to self CPU.
>>> Normal IPI:	Send IPI to some other CPU.
>>> Broadcast IPI:	Send broadcast IPI to all online CPUs.
>>>
>>> For virtualized guests, sending and receiving IPIs cause guest exits.
>>> I used this test to measure the performance impact on the KVM subsystem of
>>> Christoffer Dall's series "Optimize KVM/ARM for VHE systems".
>>>
>>> https://www.spinics.net/lists/kvm/msg156755.html
>>>
>>> The test machine is a ThunderX2 with 112 online CPUs. Below are the results
>>> normalized to the host dry-run time. Smaller is better.
>>>
>>> Host, v4.14:
>>> Dry-run:	  0	    1
>>> Self-IPI:         9	   18
>>> Normal IPI:      81	  110
>>> Broadcast IPI:    0	 2106
>>>
>>> Guest, v4.14:
>>> Dry-run:          0	    1
>>> Self-IPI:        10	   18
>>> Normal IPI:     305	  525
>>> Broadcast IPI:    0	 9729
>>>
>>> Guest, v4.14 + VHE:
>>> Dry-run:          0	    1
>>> Self-IPI:         9	   18
>>> Normal IPI:     176	  343
>>> Broadcast IPI:    0	 9885
[...]
>>> +static int __init init_bench_ipi(void)
>>> +{
>>> +	ktime_t ipi, total;
>>> +	int ret;
>>> +
>>> +	ret = bench_ipi(NTIMES, DRY_RUN, &ipi, &total);
>>> +	if (ret)
>>> +		pr_err("Dry-run FAILED: %d\n", ret);
>>> +	else
>>> +		pr_err("Dry-run:       %18llu, %18llu ns\n", ipi, total);
>>
>> you do not use NTIMES here to calculate the average value. Is that intended?
> 
> I think it's clearer to represent all results as multiples of the dry-run
> time, like I did in the patch description. So on the kernel side I expose the raw
> data and calculate the final values after the tests finish.

I think it is highly confusing that the output in the patch description does not
match the output of the real module. Can you at least make those match?
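E.g. the module itself could normalize against the dry-run result, something
along these lines (completely untested, and I am guessing the NORMAL_IPI mode
name from the elided part of your patch, so treat it as a sketch only):

static int __init init_bench_ipi(void)
{
	ktime_t dry_ipi, dry_total, ipi, total;
	int ret;

	/* Measure the dry-run baseline first. */
	ret = bench_ipi(NTIMES, DRY_RUN, &dry_ipi, &dry_total);
	if (ret) {
		pr_err("Dry-run FAILED: %d\n", ret);
		return ret;
	}
	pr_err("Dry-run:       %18llu, %18llu ns\n", dry_ipi, dry_total);

	/* NORMAL_IPI is a guessed mode name; use whatever your patch defines. */
	ret = bench_ipi(NTIMES, NORMAL_IPI, &ipi, &total);
	if (ret)
		pr_err("Normal IPI FAILED: %d\n", ret);
	else if (dry_total)
		/* Print both columns as multiples of the dry-run total,
		 * matching the tables in the patch description. */
		pr_err("Normal IPI:    %18llu, %18llu x dry-run\n",
		       div64_u64(ipi, dry_total),
		       div64_u64(total, dry_total));

	return 0;
}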
> 
> If you think that average values are preferable, I can do that in v2.

The raw numbers are probably fine, but then you might want to print the number of
loop iterations in the output.
If we want to do something fancy, we could do a combination of a smaller inner
loop doing the test, then an outer loop redoing the inner loop, and then you
can do some min/max/average calculation.
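Something along these lines, completely untested and with made-up INNER/OUTER
knobs, just to illustrate (bench_ipi() being the one from your patch):

#define INNER	1000
#define OUTER	10

static void __init bench_ipi_stats(int mode, const char *name)
{
	u64 min_ns = U64_MAX, max_ns = 0, sum_ns = 0;
	ktime_t ipi, total;
	int i, ret;

	for (i = 0; i < OUTER; i++) {
		/* The inner loop does the actual measurement. */
		ret = bench_ipi(INNER, mode, &ipi, &total);
		if (ret) {
			pr_err("%s FAILED: %d\n", name, ret);
			return;
		}
		if ((u64)total < min_ns)
			min_ns = total;
		if ((u64)total > max_ns)
			max_ns = total;
		sum_ns += total;
	}
	pr_err("%s: min %llu max %llu avg %llu ns (%d runs of %d IPIs)\n",
	       name, min_ns, max_ns, div_u64(sum_ns, OUTER), OUTER, INNER);
}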
