linux-kernel - Re: [RFC PATCH 3/3 -v2] x86,smp: auto tune spinlock backoff delay factor

Message-ID: <CANN689HBx357M+7ge3SQ_xtnyiTKPY=1v0oR+DS9EBiak-2BQg@mail.gmail.com>
Date:	Sat, 29 Dec 2012 02:27:29 -0800
From:	Michel Lespinasse <walken@...gle.com>
To:	Eric Dumazet <eric.dumazet@...il.com>
Cc:	Rik van Riel <riel@...hat.com>,
	Steven Rostedt <rostedt@...dmis.org>,
	linux-kernel@...r.kernel.org, aquini@...hat.com,
	lwoodman@...hat.com, jeremy@...p.org,
	Jan Beulich <JBeulich@...ell.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Tom Herbert <therbert@...gle.com>
Subject: Re: [RFC PATCH 3/3 -v2] x86,smp: auto tune spinlock backoff delay factor

On Wed, Dec 26, 2012 at 11:10 AM, Eric Dumazet <eric.dumazet@...il.com> wrote:
> I did some tests with your patches with following configuration :
>
> tc qdisc add dev eth0 root htb r2q 1000 default 3
> (to force a contention on qdisc lock, even with a multi queue net
> device)
>
> and 24 concurrent "netperf -t UDP_STREAM -H other_machine -- -m 128"
>
> Machine : 2 Intel(R) Xeon(R) CPU X5660  @ 2.80GHz
> (24 threads), and a fast NIC (10Gbps)
>
> Resulting in a 13 % regression (676 Mbits -> 595 Mbits)

I've been trying to run this workload on a similar machine, but I am
getting some confusing results:

With 24 concurrent netperf -t UDP_STREAM -H $target -- -m 128 -R 1, I
am seeing non-trivial run-to-run performance variation: about 5% on
the v3.7 baseline, but much larger after applying Rik's 3 patches.
My last few runs gave me results of 890.92, 1073.74, 963.13, 1234.41,
754.18, 893.82. This is generally better than what I'm getting with
baseline, but the variance is huge (which is somewhat surprising given
that Rik's patches don't have the issue of hash collisions). Also,
this is significant in that I am not seeing the regression you were
observing with just these 3 patches.
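
To put a number on that spread, here is a minimal sketch that summarizes
the six results quoted above. The units are assumed to be Mbit/s (the
message does not say), and the choice of sample standard deviation and
coefficient of variation as the summary is an illustration, not
something taken from the original measurements.

```python
import statistics

# Throughput of the six runs quoted above (units assumed to be Mbit/s).
runs = [890.92, 1073.74, 963.13, 1234.41, 754.18, 893.82]

mean = statistics.mean(runs)
stdev = statistics.stdev(runs)            # sample standard deviation
cv = stdev / mean                         # coefficient of variation
spread = (max(runs) - min(runs)) / mean   # full range relative to the mean

print(f"mean={mean:.2f} stdev={stdev:.2f} cv={cv:.1%} range={spread:.1%}")
```

With only six samples these figures are rough, but they are enough to
compare against the roughly 5% variation reported for the baseline.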

If I add a 1 second delay in the netperf command line (netperf -t
UDP_STREAM -s 1 -H lpk18 -- -m 128 -R 1), I am seeing a very steady
660 Mbps result, but then I don't see any benefit from applying Rik's
patches. I have no explanation for these results, but I am getting
them very consistently...
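
For anyone trying to reproduce the two variants, the following is a
rough sketch of a launcher that starts the concurrent netperf instances.
The netperf options are copied from the command lines above; the helper
names netperf_cmd and run_concurrent are hypothetical, and the script
assumes netperf is installed locally and a netserver is reachable on
the target host.

```python
import subprocess

def netperf_cmd(target, msg_size=128, start_delay=None):
    """Build the netperf argv used above; start_delay maps to '-s'."""
    cmd = ["netperf", "-t", "UDP_STREAM"]
    if start_delay is not None:
        cmd += ["-s", str(start_delay)]
    cmd += ["-H", target, "--", "-m", str(msg_size), "-R", "1"]
    return cmd

def run_concurrent(target, count=24, **kwargs):
    """Start `count` netperf processes at once and collect their output."""
    procs = [
        subprocess.Popen(netperf_cmd(target, **kwargs),
                         stdout=subprocess.PIPE, text=True)
        for _ in range(count)
    ]
    return [p.communicate()[0] for p in procs]

if __name__ == "__main__":
    # Show the two command lines being compared, without running them.
    print(" ".join(netperf_cmd("lpk18")))
    print(" ".join(netperf_cmd("lpk18", start_delay=1)))
```

Calling run_concurrent("lpk18") corresponds to the first variant and
run_concurrent("lpk18", start_delay=1) to the second; parsing the
throughput out of the netperf output is left out of this sketch.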

> In this workload we have at least two contended spinlocks, with
> different delays. (spinlocks are not held for the same duration)

Just to confirm, I believe you are referring to qdisc->q.lock and
qdisc->busylock?

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.