linux-kernel - Re: [PATCH v2] IPI performance benchmark

Message-Id: <20171219155141.889253fe797ca838da71e88f@linux-foundation.org>
Date:   Tue, 19 Dec 2017 15:51:41 -0800
From:   Andrew Morton <akpm@...ux-foundation.org>
To:     Yury Norov <ynorov@...iumnetworks.com>
Cc:     linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
        linux-arm-kernel@...ts.infradead.org,
        Ashish Kalra <Ashish.Kalra@...ium.com>,
        Christoffer Dall <christoffer.dall@...aro.org>,
        Geert Uytterhoeven <geert@...ux-m68k.org>,
        Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>,
        Linu Cherian <Linu.Cherian@...ium.com>,
        Shih-Wei Li <shihwei@...columbia.edu>,
        Sunil Goutham <Sunil.Goutham@...ium.com>,
        Ingo Molnar <mingo@...e.hu>,
        Peter Zijlstra <a.p.zijlstra@...llo.nl>
Subject: Re: [PATCH v2] IPI performance benchmark

On Tue, 19 Dec 2017 11:50:10 +0300 Yury Norov <ynorov@...iumnetworks.com> wrote:

> This benchmark sends many IPIs in different modes and measures the
> time for IPI delivery (first column) and the total time, i.e. including
> the time for the sender to receive the acknowledgement (second column).
> 
> The scenarios are:
> Dry-run:	do everything except actually sending IPI. Useful
> 		to estimate system overhead.
> Self-IPI:	Send IPI to self CPU.
> Normal IPI:	Send IPI to some other CPU.
> Broadcast IPI:	Send broadcast IPI to all online CPUs.
> Broadcast lock:	Send broadcast IPI to all online CPUs and force them
>                 to acquire/release a spinlock.
> 
> The raw output looks like this:
> [  155.363374] Dry-run:                         0,            2999696 ns
> [  155.429162] Self-IPI:                 30385328,           65589392 ns
> [  156.060821] Normal IPI:              566914128,          631453008 ns
> [  158.384427] Broadcast IPI:                   0,         2323368720 ns
> [  160.831850] Broadcast lock:                  0,         2447000544 ns
> 
> For virtualized guests, sending and receiving IPIs causes a guest exit.
> I used this test to measure performance impact on KVM subsystem of
> Christoffer Dall's series "Optimize KVM/ARM for VHE systems" [1].
> 
> Test machine is ThunderX2, 112 online CPUs. Below are the results
> normalized to the host dry-run time; broadcast lock results are omitted.
> Smaller is better.
> 
> Host, v4.14:
> Dry-run:          0       1
> Self-IPI:         9      18
> Normal IPI:      81     110
> Broadcast IPI:    0    2106
> 
> Guest, v4.14:
> Dry-run:          0       1
> Self-IPI:        10      18
> Normal IPI:     305     525
> Broadcast IPI:    0    9729
> 
> Guest, v4.14 + [1]:
> Dry-run:          0       1
> Self-IPI:         9      18
> Normal IPI:     176     343
> Broadcast IPI:    0    9885
> 
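The normalization described above (each reading divided by the dry-run total time) can be sketched in userspace C as follows. This is only an illustration, not code from the patch: the names ipi_result, normalize() and print_normalized() are hypothetical, and truncating integer division is an assumption, since the posting does not say how the tables were rounded.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* One benchmark row: delivery time and total time, both in nanoseconds. */
struct ipi_result {
	const char *name;
	uint64_t delivery_ns;
	uint64_t total_ns;
};

/*
 * Hypothetical helper: express a raw reading as a multiple of the
 * dry-run total time. Truncating integer division is an assumption;
 * the posting does not say how its tables were rounded.
 */
static uint64_t normalize(uint64_t raw_ns, uint64_t dry_run_total_ns)
{
	assert(dry_run_total_ns != 0);
	return raw_ns / dry_run_total_ns;
}

/* Print one row in the same two-column layout as the quoted tables. */
static void print_normalized(const struct ipi_result *r,
			     uint64_t dry_run_total_ns)
{
	printf("%-16s%5llu %7llu\n", r->name,
	       (unsigned long long)normalize(r->delivery_ns, dry_run_total_ns),
	       (unsigned long long)normalize(r->total_ns, dry_run_total_ns));
}
```

Fed the raw sample quoted above (dry-run total 2999696 ns), the Self-IPI row comes out as 10 and 21. That does not match any of the tables exactly, so the raw sample is presumably from a different run.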

That looks handy.  Peter and Ingo might be interested.

I wonder if it should be in kernel/.  Perhaps it's better to accumulate
these things in lib/test_*.c, rather than cluttering up other top-level
directories.

> +static ktime_t __init send_ipi(int flags)
> +{
> +	ktime_t time = 0;
> +	DEFINE_SPINLOCK(lock);

I have some vague historical memory that an on-stack spinlock can cause
problems, perhaps with debugging code.  Can't remember, maybe I dreamed it.