Message-ID: <d012c68e-4c43-04af-3505-4c980e2d00a5@redhat.com>
Date:   Mon, 15 Apr 2019 09:43:00 -0400
From:   Waiman Long <longman@...hat.com>
To:     Ingo Molnar <mingo@...nel.org>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>,
        Will Deacon <will.deacon@....com>,
        Thomas Gleixner <tglx@...utronix.de>,
        linux-kernel@...r.kernel.org, x86@...nel.org,
        Davidlohr Bueso <dave@...olabs.net>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Tim Chen <tim.c.chen@...ux.intel.com>,
        huang ying <huang.ying.caritas@...il.com>
Subject: Re: [PATCH-tip v3 02/14] locking/rwsem: Make owner available even if
 !CONFIG_RWSEM_SPIN_ON_OWNER

On 04/12/2019 10:24 PM, Waiman Long wrote:
> On 04/12/2019 02:05 PM, Waiman Long wrote:
>
>>>  [locking/rwsem] adc32e8877: will-it-scale.per_thread_ops -21.0% regression
>> Will look into that also.
> I can reproduce the regression on the same Skylake system.
>
> The results of the page_fault1 will-it-scale test are as follows:
>
>  Threads       K2       K3       K4       K5
>  -------       --       --       --       --
>       20  5549772  5550332  5463961  5400064
>       40  9540445 10286071  9705062  7706082
>       60  8187245  8212307  7777247  6647705
>       89  8390758  9619271  9019454  7124407
>
> So the wake-all-reader patch is good for this benchmark. Performance
> dropped a bit with the reader-spin-on-writer patch and got even worse
> with the writer-spin-on-reader patch.
>
> I looked at the perf output; rwsem contention accounted for less than
> 1% of the total CPU cycles. So I believe the regression was caused by
> the behavioral change introduced by the two reader optimistic spinning
> patches. These patches make writers less preferred than before, and I
> think the performance of this microbenchmark may depend more on
> writer performance.
>
> Looking at the lock event counts for K5:
>
>  rwsem_opt_fail=253647
>  rwsem_opt_nospin=8776
>  rwsem_opt_rlock=259941
>  rwsem_opt_wlock=2543
>  rwsem_rlock=237747
>  rwsem_rlock_fail=0
>  rwsem_rlock_fast=0
>  rwsem_rlock_handoff=0
>  rwsem_sleep_reader=237747
>  rwsem_sleep_writer=23098
>  rwsem_wake_reader=6033
>  rwsem_wake_writer=47032
>  rwsem_wlock=15890
>  rwsem_wlock_fail=10
>  rwsem_wlock_handoff=3991
>
> For K4, it was
>
>  rwsem_opt_fail=479626
>  rwsem_opt_rlock=8877
>  rwsem_opt_wlock=114
>  rwsem_rlock=453874
>  rwsem_rlock_fail=0
>  rwsem_rlock_fast=1234
>  rwsem_rlock_handoff=0
>  rwsem_sleep_reader=453058
>  rwsem_sleep_writer=25836
>  rwsem_wake_reader=11054
>  rwsem_wake_writer=71568
>  rwsem_wlock=24515
>  rwsem_wlock_fail=3
>  rwsem_wlock_handoff=5245
>
> It can be seen that a lot more readers got the lock via optimistic
> spinning. One possibility is that reader optimistic spinning causes
> readers to spread out into more lock acquisition groups than they
> otherwise would. The K3 results show that grouping more readers into
> one lock acquisition group helps to improve performance for this
> microbenchmark. I will need to run more tests to find the root cause
> of this regression. It is not an easy problem to solve.
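
For those following the thread who are less familiar with optimistic
spinning, below is a minimal userspace sketch of the idea under
discussion. This is NOT the kernel's rwsem implementation: the
toy_rwsem structure, the owner_running flag, and every policy detail
are invented purely for illustration. The point is only that a reader
which fails to take the lock keeps spinning while the write owner is
still running on a CPU, instead of paying the sleep/wakeup cost
immediately.

/*
 * Toy model of reader optimistic spinning -- not the kernel's rwsem
 * code. count > 0 means that many readers hold the lock; count == -1
 * means a writer holds it.
 */
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

struct toy_rwsem {
	atomic_int count;           /* > 0: readers, -1: writer owned */
	atomic_bool owner_running;  /* is the write owner on a CPU?   */
};

static struct toy_rwsem sem;

static bool try_read_lock(struct toy_rwsem *s)
{
	int c = atomic_load(&s->count);

	/* Succeed only while no writer holds the lock. */
	return c >= 0 && atomic_compare_exchange_weak(&s->count, &c, c + 1);
}

static void read_lock(struct toy_rwsem *s)
{
	while (!try_read_lock(s)) {
		/*
		 * Optimistic spin: while the writer is still running it
		 * should release soon, so spinning beats a sleep/wakeup
		 * round trip. Once the owner is off CPU, a real
		 * implementation would queue and sleep; this toy merely
		 * yields.
		 */
		if (!atomic_load(&s->owner_running))
			sched_yield();  /* stand-in for going to sleep */
	}
}

static void read_unlock(struct toy_rwsem *s)
{
	atomic_fetch_sub(&s->count, 1);
}

static void *writer(void *arg)
{
	int expected = 0;

	(void)arg;
	while (!atomic_compare_exchange_weak(&sem.count, &expected, -1))
		expected = 0;   /* spin until we own the lock for write */
	atomic_store(&sem.owner_running, true);

	for (volatile int i = 0; i < 1000000; i++)
		;               /* short write-side critical section */

	atomic_store(&sem.owner_running, false);
	atomic_store(&sem.count, 0);    /* release the write lock */
	return NULL;
}

int main(void)
{
	pthread_t w;

	pthread_create(&w, NULL, writer, NULL);
	read_lock(&sem);        /* may optimistically spin on the writer */
	puts("reader acquired the lock");
	read_unlock(&sem);
	pthread_join(&w, NULL);
	return 0;
}

Compile with gcc -pthread. What matters is the control flow in
read_lock(), not the simplistic lock itself.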

Just an update on my will-it-scale regression investigation. I have
tried various ways to tune the rwsem code to get more performance out
of this benchmark. I got some minor improvements, but nothing major. So
it looks like there are some workloads whose performance is hurt by
reader optimistic spinning, and this benchmark is one of them. Now I am
testing a patch that adaptively disables reader optimistic spinning; it
shows great promise, as I was able to bring back a major portion of the
lost performance. I will try to make the patch more aggressive to see if
it can bring back most of the lost performance.
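
The adaptive patch itself is not shown in this thread, so the snippet
below is only a guess at the general shape such a heuristic could take:
keep reader spinning enabled while it pays off often enough, and
disable it once failed spins dominate. Every name and threshold here
(spin_success, spin_fail, the 75% cutoff) is invented for illustration;
the real patch may use a completely different policy.

/*
 * Hypothetical sketch of adaptively disabling reader optimistic
 * spinning. Keep spinning while it succeeds often enough; back off
 * once at least 75% of recent spins have failed.
 */
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

static atomic_int spin_success;   /* spins that acquired the lock  */
static atomic_int spin_fail;      /* spins that gave up and slept  */

/* Invented policy: disable once failures reach 75% of all samples. */
static bool reader_spin_allowed(void)
{
	int ok = atomic_load(&spin_success);
	int bad = atomic_load(&spin_fail);

	if (ok + bad < 64)              /* not enough samples yet */
		return true;
	return bad * 4 < (ok + bad) * 3;
}

static void record_spin(bool acquired)
{
	atomic_fetch_add(acquired ? &spin_success : &spin_fail, 1);
}

int main(void)
{
	/* Simulate a workload where spinning mostly fails. */
	for (int i = 0; i < 100; i++)
		record_spin(i % 5 == 0);  /* 20% success rate */

	printf("reader spinning %s\n",
	       reader_spin_allowed() ? "enabled" : "disabled");
	return 0;
}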

Cheers,
Longman
