Message-ID: <8ee10a0a-b257-4141-b955-0097b43405e6@redhat.com>
Date: Thu, 27 Mar 2025 23:58:51 -0400
From: Waiman Long <llong@...hat.com>
To: Cruz Zhao <CruzZhao@...ux.alibaba.com>, peterz@...radead.org,
 mingo@...hat.com, boqun.feng@...il.com, will@...nel.org
Cc: linux-kernel@...r.kernel.org
Subject: Re: [PATCH] percpu_rwsem: let percpu_rwsem writer get rwsem faster


On 3/27/25 11:05 PM, Cruz Zhao wrote:
> In the scenario where a large number of containers are created
> at the same time, many tasks are spawned in a short period and
> their PIDs are written into cgroup.procs.
>
> copy_process() takes the cgroup_threadgroup_rwsem read lock,
> while cgroup_procs_write() takes the cgroup_threadgroup_rwsem
> write lock. Because readers pre-increment the read_count and
> only then check for writers, a writer may starve, especially
> under a steady stream of readers.
>
> To alleviate this problem, add one more check for waiting
> writers before incrementing the read_count, so that writers
> can take the lock faster.
>
> Signed-off-by: Cruz Zhao <CruzZhao@...ux.alibaba.com>
> ---
>   kernel/locking/percpu-rwsem.c | 5 +++++
>   1 file changed, 5 insertions(+)
>
> diff --git a/kernel/locking/percpu-rwsem.c b/kernel/locking/percpu-rwsem.c
> index 6083883c4fe0..66bf18c28b43 100644
> --- a/kernel/locking/percpu-rwsem.c
> +++ b/kernel/locking/percpu-rwsem.c
> @@ -47,6 +47,11 @@ EXPORT_SYMBOL_GPL(percpu_free_rwsem);
>   
>   static bool __percpu_down_read_trylock(struct percpu_rw_semaphore *sem)
>   {
> +	if (unlikely(atomic_read_acquire(&sem->block))) {
> +		rcuwait_wake_up(&sem->writer);
> +		return false;
> +	}
> +
>   	this_cpu_inc(*sem->read_count);
>   
>   	/*

The specific sequence of events is there for a reason. If we disturb 
the sequence like that, there is a possibility that a percpu_up_write() 
may miss a waiting reader, for example. So a more careful analysis has 
to be done.
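
For context, here is a rough userspace model of the ordering at play
(illustrative only, using C11 atomics; read_count and block mirror the
kernel's names, everything else is simplified). The reader publishes
its increment first and only then checks block, while the writer does
the mirror image, so at least one of the two sides is guaranteed to
observe the other. An early check of block before the increment is
exactly the kind of change that has to be re-verified against this
pairing:

/* Userspace model of the percpu-rwsem reader fast path; NOT kernel
 * code. The two atomics stand in for the per-CPU counters and
 * sem->block.
 */
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

static atomic_int read_count;	/* stand-in for the summed per-CPU counters */
static atomic_int block;	/* stand-in for sem->block */

static bool reader_trylock(void)
{
	atomic_fetch_add_explicit(&read_count, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* pairs with the writer's fence */
	if (!atomic_load_explicit(&block, memory_order_acquire))
		return true;	/* critical section entered */
	atomic_fetch_sub_explicit(&read_count, 1, memory_order_relaxed);
	/* In the kernel, this is where the reader prods the sleeping writer. */
	return false;
}

static bool writer_can_proceed(void)
{
	atomic_store_explicit(&block, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* pairs with the reader's fence */
	return atomic_load_explicit(&read_count, memory_order_acquire) == 0;
}

int main(void)
{
	/* Single-threaded smoke test of the protocol shape only. */
	printf("reader got lock: %d\n", reader_trylock());	   /* 1: no writer yet */
	printf("writer can proceed: %d\n", writer_can_proceed());  /* 0: reader holds it */
	return 0;
}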

BTW, how much performance benefit did you gain by making this change? We 
certainly need to see some performance metrics.

The design of percpu rwsem prefers readers and has much less read-side 
overhead than a regular rwsem. It also assumes that writers come along 
only once in a while. Where being fair to writers matters, a regular 
rwsem is used instead.
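
As a concrete illustration of that trade-off, a minimal sketch
(percpu_down_read()/percpu_up_read(), DEFINE_STATIC_PERCPU_RWSEM() and
DECLARE_RWSEM() are the real kernel APIs; the semaphore names and the
two functions are made-up examples):

#include <linux/percpu-rwsem.h>
#include <linux/rwsem.h>

/* Read-mostly data where writers show up once in a while: */
DEFINE_STATIC_PERCPU_RWSEM(my_percpu_sem);

static void read_mostly_path(void)
{
	percpu_down_read(&my_percpu_sem);	/* cheap: per-CPU increment */
	/* ... read shared state ... */
	percpu_up_read(&my_percpu_sem);
}

/* Mixed read/write traffic where writer starvation matters: */
static DECLARE_RWSEM(my_rwsem);

static void mixed_path(void)
{
	down_write(&my_rwsem);	/* writers queue and get their turn */
	/* ... modify shared state ... */
	up_write(&my_rwsem);
}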

Cheers,
Longman

