Date:	Thu, 03 Jan 2013 10:17:48 -0800
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	Michel Lespinasse <walken@...gle.com>
Cc:	Rik van Riel <riel@...hat.com>,
	Steven Rostedt <rostedt@...dmis.org>,
	linux-kernel@...r.kernel.org, aquini@...hat.com,
	lwoodman@...hat.com, jeremy@...p.org,
	Jan Beulich <JBeulich@...ell.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Tom Herbert <therbert@...gle.com>
Subject: Re: [RFC PATCH 3/3 -v2] x86,smp: auto tune spinlock backoff delay factor

On Sat, 2012-12-29 at 02:27 -0800, Michel Lespinasse wrote:
> On Wed, Dec 26, 2012 at 11:10 AM, Eric Dumazet <eric.dumazet@...il.com> wrote:
> > I did some tests with your patches with the following configuration:
> >
> > tc qdisc add dev eth0 root htb r2q 1000 default 3
> > (to force contention on the qdisc lock, even with a multi-queue net
> > device)
> >
> > and 24 concurrent "netperf -t UDP_STREAM -H other_machine -- -m 128"
> >
> > Machine: 2 Intel(R) Xeon(R) CPU X5660 @ 2.80GHz
> > (24 threads), and a fast NIC (10Gbps)
> >
> > Resulting in a 13% regression (676 Mbits -> 595 Mbits)
> 
> I've been trying to use this workload on a similar machine. I am
> getting some confusing results however:
> 
> With 24 concurrent netperf -t UDP_STREAM -H $target -- -m 128 -R 1, I
> am seeing some non-trivial run-to-run performance variation - about 5%
> in v3.7 baseline, but very significant after applying Rik's 3 patches.
> My last few runs gave me results of 890.92, 1073.74, 963.13, 1234.41,
> 754.18, 893.82. This is generally better than what I'm getting with
> baseline, but the variance is huge (which is somewhat surprising given
> that Rik's patches don't have the issue of hash collisions).

You mean that with Rik's patch there definitely is an issue: it uses a
single bucket (one delay value shared by all locks), so the chances of
collisions are high.
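
To illustrate what I mean by a single bucket, here is a rough userspace
sketch (not Rik's actual patch; the table size, hash and delay numbers
below are made up). With one shared slot, two locks with different hold
times keep clobbering each other's tuned delay; hashing the lock address
into a small table usually keeps them apart:

#include <stdio.h>
#include <stdint.h>

#define DELAY_BUCKETS 32			/* made-up size, for illustration */

static unsigned int delay_single;		/* one bucket shared by every lock */
static unsigned int delay_table[DELAY_BUCKETS];	/* one slot per hash bucket */

/* Cheap address hash; a real patch might use hash_32() instead. */
static unsigned int bucket_of(const void *lock)
{
	uintptr_t h = (uintptr_t)lock;

	h ^= h >> 7;
	return (unsigned int)(h % DELAY_BUCKETS);
}

int main(void)
{
	int lock_a, lock_b;		/* stand-ins for two distinct spinlocks */

	/* Single bucket: the value tuned for lock_a is immediately
	 * overwritten when lock_b is tuned -- a guaranteed "collision". */
	delay_single = 50;
	delay_single = 800;

	/* Hashed: distinct addresses usually land in distinct slots,
	 * so each lock keeps its own tuned delay (unless they collide). */
	delay_table[bucket_of(&lock_a)] = 50;
	delay_table[bucket_of(&lock_b)] = 800;

	printf("shared slot: %u\n", delay_single);
	printf("lock_a slot %u: %u\n", bucket_of(&lock_a),
	       delay_table[bucket_of(&lock_a)]);
	printf("lock_b slot %u: %u\n", bucket_of(&lock_b),
	       delay_table[bucket_of(&lock_b)]);
	return 0;
}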


Since your numbers are so random, I suspect you might be hitting another
limit.

My tests involved a NIC with 24 transmit queues, to take the per-TX-queue
lock out of the benchmark equation.

My guess is that you are using a NIC with 4 or 8 TX queues.

"ethtool -S eth0" would probably give some hints.

>  Also,
> this is significant in that I am not seeing the regression you were
> observing with just these 3 patches.
> 
> If I add a 1 second delay in the netperf command line (netperf -t
> UDP_STREAM -s 1 -H lpk18 -- -m 128 -R 1), I am seeing a very constant
> 660 Mbps result, but then I don't see any benefit from applying Rik's
> patches. I have no explanation for these results, but I am getting
> them very consistently...
> 
> > In this workload we have at least two contended spinlocks, with
> > different delays. (spinlocks are not held for the same duration)
> 
> Just to confirm, I believe you are referring to qdisc->q.lock and
> qdisc->busylock ?
> 

Yes.
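
For reference, the locking in __dev_xmit_skb() looks roughly like this
(simplified and from memory, not the exact net/core/dev.c source). It
shows why the two locks have such different hold times: busylock is only
taken by senders that find the qdisc already running, and is dropped as
soon as the caller becomes the one draining the queue, while q.lock is
held across the enqueue and the dequeue loop:

	spinlock_t *root_lock = qdisc_lock(q);	/* qdisc->q.lock */
	bool contended = qdisc_is_running(q);
	int rc;

	/* Contended senders serialize on busylock first, so the CPU
	 * draining the qdisc can re-take q.lock quickly. */
	if (unlikely(contended))
		spin_lock(&q->busylock);

	spin_lock(root_lock);
	rc = q->enqueue(skb, q) & NET_XMIT_MASK;
	if (qdisc_run_begin(q)) {
		if (unlikely(contended)) {
			spin_unlock(&q->busylock);
			contended = false;
		}
		__qdisc_run(q);		/* dequeue loop, entered with q.lock held */
	}
	spin_unlock(root_lock);

	if (unlikely(contended))
		spin_unlock(&q->busylock);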


