Date:	Mon, 30 Sep 2013 22:41:17 -0400
From:	Waiman Long <waiman.long@...com>
To:	Tim Chen <tim.c.chen@...ux.intel.com>
CC:	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...nel.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Ingo Molnar <mingo@...e.hu>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Rik van Riel <riel@...hat.com>,
	Peter Hurley <peter@...leysoftware.com>,
	Davidlohr Bueso <davidlohr.bueso@...com>,
	Alex Shi <alex.shi@...el.com>,
	Andrea Arcangeli <aarcange@...hat.com>,
	Matthew R Wilcox <matthew.r.wilcox@...el.com>,
	Dave Hansen <dave.hansen@...el.com>,
	Michel Lespinasse <walken@...gle.com>,
	Andi Kleen <andi@...stfloor.org>,
	"Chandramouleeswaran, Aswin" <aswin@...com>,
	"Norton, Scott J" <scott.norton@...com>
Subject: Re: [PATCH, v2] anon_vmas: Convert the rwsem to an rwlock_t

On 09/30/2013 03:47 PM, Tim Chen wrote:
>> My qrwlock patch doesn't enable qrwlock by default. You have to use
>> menuconfig to enable it explicitly. Did you do that when you built
>> the test kernel? I am thinking of enabling it explicitly for x86 if
>> the anon-vma lock is converted back to an rwlock.
>>
> Yes, I have explicitly enabled it during my testing.
>
> Thanks.
> Tim
>
Thanks for the info.

I tested Ingo's 2nd patch myself with the qrwlock patch on an 8-node
machine running a 3.12.0-rc2 kernel. The AIM7 high_systime results
(at 1500 users) were:

Anon-vmas lock      JPM       %Change
--------------      ---       -------
    rwsem          148265        -
    rwlock         238715       +61%
    qrwlock        242048       +63%
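
(The %Change column is relative to the rwsem baseline, e.g. for
rwlock: (238715 - 148265)/148265 ~= +61%.)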

So the queued rwlock was only slightly faster than the plain rwlock in
this case. Below is the perf profile with rwlock:

  18.20%   reaim  [kernel.kallsyms]  [k] __write_lock_failed
   9.36%   reaim  [kernel.kallsyms]  [k] _raw_spin_lock_irqsave
   2.91%   reaim  [kernel.kallsyms]  [k] mspin_lock
   2.73%   reaim  [kernel.kallsyms]  [k] anon_vma_interval_tree_insert
   2.23%      ls  [kernel.kallsyms]  [k] _raw_spin_lock_irqsave
   1.29%   reaim  [kernel.kallsyms]  [k] __read_lock_failed
   1.21%    true  [kernel.kallsyms]  [k] _raw_spin_lock_irqsave
   1.14%   reaim  [kernel.kallsyms]  [k] zap_pte_range
   1.13%   reaim  [kernel.kallsyms]  [k] _raw_spin_lock
   1.04%   reaim  [kernel.kallsyms]  [k] mutex_spin_on_owner

The perf profile with qrwlock:

  10.57%   reaim  [kernel.kallsyms]  [k] _raw_spin_lock_irqsave
   7.98%   reaim  [kernel.kallsyms]  [k] queue_write_lock_slowpath
   5.83%   reaim  [kernel.kallsyms]  [k] mspin_lock
   2.86%      ls  [kernel.kallsyms]  [k] _raw_spin_lock_irqsave
   2.71%   reaim  [kernel.kallsyms]  [k] anon_vma_interval_tree_insert
   1.52%    true  [kernel.kallsyms]  [k] _raw_spin_lock_irqsave
   1.51%   reaim  [kernel.kallsyms]  [k] queue_read_lock_slowpath
   1.35%   reaim  [kernel.kallsyms]  [k] mutex_spin_on_owner
   1.12%   reaim  [kernel.kallsyms]  [k] zap_pte_range
   1.06%   reaim  [kernel.kallsyms]  [k] perf_event_aux_ctx
   1.01%   reaim  [kernel.kallsyms]  [k] perf_event_aux

In the qrwlock kernel, less time was spent in the rwlock slowpaths,
about half (9.5% of samples in the two slowpath functions versus
19.5% with rwlock). However, more time was now spent spinning on
spinlocks and mutexes. Another observation is that no noticeable idle
time was reported, whereas the system could be half idle with rwsem.
There was also a lot less idle-balancing activity.

The qrwlock is fair with respect to writers, so its performance may
not be as good as that of the fully unfair rwlock. However, queuing
greatly reduces cache-contention traffic, which improves performance.
It is the interplay of these two factors that determines how much
performance benefit we see. Another factor is that once there is less
contention on the anon-vma locks, other areas of contention will show
up.
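
To illustrate the queuing idea, here is a rough userspace sketch using
C11 atomics. It is not the actual kernel patch: a plain test-and-set
spinlock stands in for the MCS wait queue, names like sketch_qrwlock
are made up, and memory ordering is simplified to the
sequentially-consistent defaults. Writer state lives in the low 9 bits
of "cnts" and the reader count above them; contended lockers line up
on wait_q, so only one waiter at a time hammers the shared "cnts"
word.

#include <stdatomic.h>

#define QW_LOCKED   0x0ffU        /* a writer holds the lock        */
#define QW_WAITING  0x100U        /* a writer is queued and waiting */
#define QW_WMASK    0x1ffU        /* any writer activity            */
#define QR_BIAS     (1U << 9)     /* one reader                     */

struct sketch_qrwlock {
	atomic_uint cnts;         /* readers<<9 | writer state      */
	atomic_flag wait_q;       /* stand-in for the MCS queue     */
};
/* init: struct sketch_qrwlock l = { .wait_q = ATOMIC_FLAG_INIT }; */

static void queue_spin(atomic_flag *q)
{
	while (atomic_flag_test_and_set(q))
		;                 /* wait for our turn in the queue */
}

static void sketch_read_lock(struct sketch_qrwlock *l)
{
	unsigned int c = atomic_fetch_add(&l->cnts, QR_BIAS);

	if (!(c & QW_WMASK))
		return;           /* fast path: no writer activity  */

	/* Slow path: back out, join the queue, retry in FIFO order. */
	atomic_fetch_sub(&l->cnts, QR_BIAS);
	queue_spin(&l->wait_q);
	atomic_fetch_add(&l->cnts, QR_BIAS);
	while (atomic_load(&l->cnts) & QW_LOCKED)
		;                 /* wait for the writer to go away */
	atomic_flag_clear(&l->wait_q);
}

static void sketch_read_unlock(struct sketch_qrwlock *l)
{
	atomic_fetch_sub(&l->cnts, QR_BIAS);
}

static void sketch_write_lock(struct sketch_qrwlock *l)
{
	unsigned int c = 0;

	/* Fast path: grab the lock if nobody holds or wants it. */
	if (atomic_compare_exchange_strong(&l->cnts, &c, QW_LOCKED))
		return;

	/* Slow path: queue up, then make fast-path readers back off. */
	queue_spin(&l->wait_q);
	atomic_fetch_add(&l->cnts, QW_WAITING);

	/* Wait until we are the only locker left, then claim the lock. */
	for (;;) {
		c = QW_WAITING;
		if (atomic_compare_exchange_weak(&l->cnts, &c, QW_LOCKED))
			break;
	}
	atomic_flag_clear(&l->wait_q);
}

static void sketch_write_unlock(struct sketch_qrwlock *l)
{
	atomic_fetch_sub(&l->cnts, QW_LOCKED);
}

In the real patch each MCS waiter spins on its own cache line, which
is what cuts the contention traffic; the spinlock stand-in above keeps
the FIFO behavior but not that property.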

With qrwlock, the spinlock contention was:

  10.57%   reaim  [kernel.kallsyms]     [k] _raw_spin_lock_irqsave
              |--58.70%-- release_pages
              |--38.42%-- pagevec_lru_move_fn
              |--0.64%-- get_page_from_freelist
              |--0.64%-- __page_cache_release
               --1.60%-- [...]

   2.86%      ls  [kernel.kallsyms]     [k] _raw_spin_lock_irqsave
                 |--52.73%-- pagevec_lru_move_fn
                 |--46.25%-- release_pages
                  --1.02%-- [...]

   1.52%    true  [kernel.kallsyms]     [k] _raw_spin_lock_irqsave
               |--53.76%-- pagevec_lru_move_fn
               |--43.95%-- release_pages
               |--1.02%-- __page_cache_release

-Longman
