lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <5228C559.1090509@hp.com>
Date:	Thu, 05 Sep 2013 13:54:33 -0400
From:	Waiman Long <waiman.long@...com>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
CC:	Heiko Carstens <heiko.carstens@...ibm.com>,
	Tony Luck <tony.luck@...el.com>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] lockref: remove cpu_relax() again

On 09/05/2013 11:31 AM, Linus Torvalds wrote:
> On Thu, Sep 5, 2013 at 6:18 AM, Heiko Carstens
> <heiko.carstens@...ibm.com>  wrote:
>> *If* however the cpu_relax() makes sense on other platforms maybe we could
>> add something like we have already with "arch_mutex_cpu_relax()":
> I actually think it won't.
>
> The lockref cmpxchg isn't waiting for something to change - it only
> loops _if_ something has changed, and rather than cpu_relax(), we most
> likely want to try to take advantage of the fact that we have the
> changed data in our exclusive cacheline, and try to get our ref update
> out as soon as possible.
>
> IOW, the lockref loop is not an idle loop like a spinlock "wait for
> lock to be released", it's very much an active loop of "oops,
> something changed".
>
> And there can't be any livelock, since by definition somebody else
> _did_ make progress. In fact, adding the cpu_relax() probably just
> makes things much less fair - once somebody else raced on you, the
> cpu_relax() now makes it more likely that _another_ cpu does so too.
>
> That said, let's see Tony's numbers are. On x86, it doesn't seem to
> matter, but as Tony noticed, the variability can be quite high (for
> me, the numbers tend to be quite stable when running the test program
> multiple times in a loop, but then variation between boots or after
> having done something else can be quite big - I suspect the cache
> access patterns end up varying wildly with different dentry layout and
> hash chain depth).
>
>                Linus
I have found that having a cpu_relax() at the bottom of the while
loop in CMPXCHG_LOOP() macro does help performance in some case on
x86-64 processors. I saw no noticeable difference for the AIM7's
short workload. On the high_systime workload, however, I saw about 5%
better performance with cpu_relax(). Below were the perf profile of
the 2 high_systime runs at 1500 users on a 80-core DL980 with HT off.

Without cpu_relax():

      4.60%     ls  [kernel.kallsyms]     [k] _raw_spin_lock
                   |--48.50%-- lockref_put_or_lock
                   |--46.97%-- lockref_get_or_lock
                   |--1.04%-- lockref_get

      2.95%     reaim  [kernel.kallsyms]     [k] _raw_spin_lock
                   |--29.70%-- lockref_put_or_lock
                   |--28.87%-- lockref_get_or_lock
                   |--0.63%-- lockref_get


With cpu_relax():

      1.67%     reaim  [kernel.kallsyms]     [k] _raw_spin_lock
                   |--14.80%-- lockref_put_or_lock
                   |--14.04%-- lockref_get_or_lock

      1.36%     ls  [kernel.kallsyms]     [k] _raw_spin_lock
                   |--44.94%-- lockref_put_or_lock
                   |--43.12%-- lockref_get_or_lock

So I would suggest having some kind of conditional cpu_relax() in
the loop.

-Longman



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ