Message-ID: <abe71078-91b1-e4dd-878c-4a53d9349ecc@redhat.com>
Date: Thu, 25 Jul 2019 12:26:37 -0400
From: Waiman Long <longman@...hat.com>
To: Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>,
Will Deacon <will.deacon@....com>,
Thomas Gleixner <tglx@...utronix.de>,
Borislav Petkov <bp@...en8.de>,
"H. Peter Anvin" <hpa@...or.com>
Cc: linux-kernel@...r.kernel.org, x86@...nel.org,
Davidlohr Bueso <dave@...olabs.net>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Tim Chen <tim.c.chen@...ux.intel.com>,
huang ying <huang.ying.caritas@...il.com>
Subject: Re: [PATCH-tip] locking/rwsem: Make handoff writer optimistically
spin on owner
On 6/25/19 10:39 AM, Waiman Long wrote:
> When the handoff bit is set by a writer, no task other than the
> setting writer itself is allowed to acquire the lock. If the
> to-be-handoff'ed writer goes to sleep, there will be a wakeup latency
> period during which the lock is free, but no one can acquire it. That
> is less than ideal.
>
> To reduce that latency, the handoff writer will now optimistically spin
> on the owner if the owner happens to be an on-cpu writer. It will spin
> until the owner releases the lock, at which point the to-be-handoff'ed
> writer can acquire the lock immediately without any delay. Of course,
> if the owner is not an on-cpu writer, the to-be-handoff'ed writer will
> have to sleep anyway.
>
> The optimistic spinning code is also modified to not stop spinning
> when the handoff bit is set. This prevents an occasional setting of
> the handoff bit from causing a bunch of optimistic spinners to enter
> the wait queue, which would significantly reduce throughput.
>
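> As a rough userspace sketch of the idea (illustrative only; the
> toy_lock type and helper names below are made up and are much simpler
> than the real rwsem code), the designated waiter spins while the lock
> is still owned and then retries the trylock instead of going to sleep
> right away:
>
> #include <stdatomic.h>
> #include <stdbool.h>
> #include <sched.h>
>
> struct toy_lock {
> 	_Atomic(void *) owner;		/* NULL when the lock is free */
> };
>
> static bool toy_trylock(struct toy_lock *lock, void *me)
> {
> 	void *expected = NULL;
>
> 	return atomic_compare_exchange_strong(&lock->owner, &expected, me);
> }
>
> /*
>  * Called by the waiter that has set the handoff flag: instead of
>  * sleeping while the lock is still held, spin until the owner field
>  * clears and then try to take the lock immediately.
>  */
> static bool handoff_spin_then_trylock(struct toy_lock *lock, void *me)
> {
> 	while (atomic_load(&lock->owner) != NULL)
> 		sched_yield();	/* stands in for cpu_relax() + on-cpu check */
>
> 	return toy_trylock(lock, me);
> }
>
> In the actual patch below, the spinning is done by rwsem_spin_on_owner()
> and is only kept up while the owner is a writer running on a CPU.
>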
> On a 1-socket 22-core 44-thread Skylake system, the AIM7 shared_memory
> workload was run with 7000 users. The throughput (jobs/min) of the
> following kernels was as follows:
>
> 1) 5.2-rc6
> - 8,092,486
> 2) 5.2-rc6 + tip's rwsem patches
> - 7,567,568
> 3) 5.2-rc6 + tip's rwsem patches + this patch
> - 7,954,545
>
> According to perf-record(1), the %cpu time used by
> rwsem_down_write_slowpath(), rwsem_down_write_failed() and their callees
> for the 3 kernels was 1.70%, 5.46% and 2.08%, respectively.
>
> Signed-off-by: Waiman Long <longman@...hat.com>
> ---
> kernel/locking/rwsem.c | 32 ++++++++++++++++++++++++++------
> 1 file changed, 26 insertions(+), 6 deletions(-)
>
> diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
> index 37524a47f002..5e022fbdb2bb 100644
> --- a/kernel/locking/rwsem.c
> +++ b/kernel/locking/rwsem.c
> @@ -720,11 +720,12 @@ rwsem_spin_on_owner(struct rw_semaphore *sem, unsigned long nonspinnable)
>
> rcu_read_lock();
> for (;;) {
> - if (atomic_long_read(&sem->count) & RWSEM_FLAG_HANDOFF) {
> - state = OWNER_NONSPINNABLE;
> - break;
> - }
> -
> + /*
> + * When a waiting writer sets the handoff flag, it may spin
> + * on the owner as well. Once that writer acquires the lock,
> + * we can spin on it. So we don't need to quit even when the
> + * handoff bit is set.
> + */
> new = rwsem_owner_flags(sem, &new_flags);
> if ((new != owner) || (new_flags != flags)) {
> state = rwsem_owner_state(new, new_flags, nonspinnable);
> @@ -970,6 +971,13 @@ static inline bool rwsem_reader_phase_trylock(struct rw_semaphore *sem,
> {
> return false;
> }
> +
> +static inline int
> +rwsem_spin_on_owner(struct rw_semaphore *sem, unsigned long nonspinnable)
> +{
> + return 0;
> +}
> +#define OWNER_NULL 1
> #endif
>
> /*
> @@ -1190,6 +1198,18 @@ rwsem_down_write_slowpath(struct rw_semaphore *sem, int state)
>
> raw_spin_unlock_irq(&sem->wait_lock);
>
> + /*
> + * After setting the handoff bit and failing to acquire
> + * the lock, attempt to spin on owner to accelerate lock
> + * transfer. If the previous owner is an on-cpu writer and it
> + * has just released the lock, OWNER_NULL will be returned.
> + * In this case, we attempt to acquire the lock again
> + * without sleeping.
> + */
> + if ((wstate == WRITER_HANDOFF) &&
> + (rwsem_spin_on_owner(sem, 0) == OWNER_NULL))
> + goto trylock_again;
> +
> /* Block until there are no active lockers. */
> for (;;) {
> if (signal_pending_state(state, current))
> @@ -1224,7 +1244,7 @@ rwsem_down_write_slowpath(struct rw_semaphore *sem, int state)
> break;
> }
> }
> -
> +trylock_again:
> raw_spin_lock_irq(&sem->wait_lock);
> }
> __set_current_state(TASK_RUNNING);
Any comments on this patch?
Cheers,
Longman