[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <abe71078-91b1-e4dd-878c-4a53d9349ecc@redhat.com>
Date:   Thu, 25 Jul 2019 12:26:37 -0400
From:   Waiman Long <longman@...hat.com>
To:     Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>,
        Will Deacon <will.deacon@....com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Borislav Petkov <bp@...en8.de>,
        "H. Peter Anvin" <hpa@...or.com>
Cc:     linux-kernel@...r.kernel.org, x86@...nel.org,
        Davidlohr Bueso <dave@...olabs.net>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Tim Chen <tim.c.chen@...ux.intel.com>,
        huang ying <huang.ying.caritas@...il.com>
Subject: Re: [PATCH-tip] locking/rwsem: Make handoff writer optimistically
 spin on owner
On 6/25/19 10:39 AM, Waiman Long wrote:
> When the handoff bit is set by a writer, no other tasks other than
> the setting writer itself is allowed to acquire the lock. If the
> to-be-handoff'ed writer goes to sleep, there will be a wakeup latency
> period where the lock is free, but no one can acquire it. That is less
> than ideal.
>
> To reduce that latency, the handoff writer will now optimistically spin
> on the owner if it happens to be a on-cpu writer. It will spin until
> it releases the lock and the to-be-handoff'ed writer can then acquire
> the lock immediately without any delay. Of course, if the owner is not
> a on-cpu writer, the to-be-handoff'ed writer will have to sleep anyway.
>
> The optimistic spinning code is also modified to not stop spinning
> when the handoff bit is set. This will prevent an occasional setting of
> handoff bit from causing a bunch of optimistic spinners from entering
> into the wait queue causing significant reduction in throughput.
>
> On a 1-socket 22-core 44-thread Skylake system, the AIM7 shared_memory
> workload was run with 7000 users. The throughput (jobs/min) of the
> following kernels were as follows:
>
>  1) 5.2-rc6
>     - 8,092,486
>  2) 5.2-rc6 + tip's rwsem patches
>     - 7,567,568
>  3) 5.2-rc6 + tip's rwsem patches + this patch
>     - 7,954,545
>
> Using perf-record(1), the %cpu time used by rwsem_down_write_slowpath(),
> rwsem_down_write_failed() and their callees for the 3 kernels were 1.70%,
> 5.46% and 2.08% respectively.
>
> Signed-off-by: Waiman Long <longman@...hat.com>
> ---
>  kernel/locking/rwsem.c | 32 ++++++++++++++++++++++++++------
>  1 file changed, 26 insertions(+), 6 deletions(-)
>
> diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
> index 37524a47f002..5e022fbdb2bb 100644
> --- a/kernel/locking/rwsem.c
> +++ b/kernel/locking/rwsem.c
> @@ -720,11 +720,12 @@ rwsem_spin_on_owner(struct rw_semaphore *sem, unsigned long nonspinnable)
>  
>  	rcu_read_lock();
>  	for (;;) {
> -		if (atomic_long_read(&sem->count) & RWSEM_FLAG_HANDOFF) {
> -			state = OWNER_NONSPINNABLE;
> -			break;
> -		}
> -
> +		/*
> +		 * When a waiting writer set the handoff flag, it may spin
> +		 * on the owner as well. Once that writer acquires the lock,
> +		 * we can spin on it. So we don't need to quit even when the
> +		 * handoff bit is set.
> +		 */
>  		new = rwsem_owner_flags(sem, &new_flags);
>  		if ((new != owner) || (new_flags != flags)) {
>  			state = rwsem_owner_state(new, new_flags, nonspinnable);
> @@ -970,6 +971,13 @@ static inline bool rwsem_reader_phase_trylock(struct rw_semaphore *sem,
>  {
>  	return false;
>  }
> +
> +static inline int
> +rwsem_spin_on_owner(struct rw_semaphore *sem, unsigned long nonspinnable)
> +{
> +	return 0;
> +}
> +#define OWNER_NULL	1
>  #endif
>  
>  /*
> @@ -1190,6 +1198,18 @@ rwsem_down_write_slowpath(struct rw_semaphore *sem, int state)
>  
>  		raw_spin_unlock_irq(&sem->wait_lock);
>  
> +		/*
> +		 * After setting the handoff bit and failing to acquire
> +		 * the lock, attempt to spin on owner to accelerate lock
> +		 * transfer. If the previous owner is a on-cpu writer and it
> +		 * has just released the lock, OWNER_NULL will be returned.
> +		 * In this case, we attempt to acquire the lock again
> +		 * without sleeping.
> +		 */
> +		if ((wstate == WRITER_HANDOFF) &&
> +		    (rwsem_spin_on_owner(sem, 0) == OWNER_NULL))
> +			goto trylock_again;
> +
>  		/* Block until there are no active lockers. */
>  		for (;;) {
>  			if (signal_pending_state(state, current))
> @@ -1224,7 +1244,7 @@ rwsem_down_write_slowpath(struct rw_semaphore *sem, int state)
>  				break;
>  			}
>  		}
> -
> +trylock_again:
>  		raw_spin_lock_irq(&sem->wait_lock);
>  	}
>  	__set_current_state(TASK_RUNNING);
Any comment on this patch?
Cheers,
Longman
Powered by blists - more mailing lists
 
