Message-ID: <98bbac96-99df-46de-9066-2f8315c17eb7@redhat.com>
Date: Thu, 20 Nov 2025 11:43:49 +0100
From: Paolo Abeni <pabeni@...hat.com>
To: Allison Henderson <achender@...nel.org>, netdev@...r.kernel.org
Cc: edumazet@...gle.com, rds-devel@....oracle.com, kuba@...nel.org,
 horms@...nel.org, linux-rdma@...r.kernel.org
Subject: Re: [PATCH net-next v2 2/2] net/rds: Give each connection its own
 workqueue

On 11/17/25 9:23 PM, Allison Henderson wrote:
> From: Allison Henderson <allison.henderson@...cle.com>
> 
> RDS was written to require ordered workqueues for "cp->cp_wq":
> Work is executed in the order scheduled, one item at a time.
> 
> If these workqueues are shared across connections,
> then work executed on behalf of one connection blocks work
> scheduled for a different and unrelated connection.
> 
> Luckily we don't need to share these workqueues.
> While it obviously makes sense to limit the number of
> workers (processes) that ought to be allocated on a system,
> a workqueue that doesn't have a rescue worker attached
> has a tiny footprint compared to the connection as a whole:
> a workqueue costs ~900 bytes, including the workqueue_struct,
> pool_workqueue, workqueue_attrs, wq_node_nr_active and the
> node_nr_active flex array, whereas an RDS/IB connection
> totals ~5 MBytes.

The above accounting still looks incorrect to me. AFAICS
pool_workqueue/cpu_pwq is per-CPU data; on recent hosts it will
require 64K or more.

Also, it looks like there would be a WQ per path, up to 8 WQs per
connection.
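
As a rough illustration (both figures below are assumptions for the
sake of the arithmetic, not measurements: ~256 bytes of pool_workqueue
state per CPU, and a 256-CPU host):

	256 bytes * 256 CPUs                 ~= 64 KiB per workqueue
	64 KiB * 8 paths (RDS_MPATH_WORKERS) ~= 512 KiB per connection

Still small next to the ~5 MB quoted per RDS/IB connection, but a long
way from the ~900 bytes in the changelog.
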
> So we're getting a significant performance gain
> (90% of connections fail over under 3 seconds vs. 40%)
> for a less than 0.02% overhead.
> 
> RDS doesn't even benefit from the additional rescue workers:
> of all the reasons that RDS blocks workers, allocation under
> memory pressure is the least of our concerns. And even if RDS
> were stalling due to the memory-reclaim process, the work
> executed by the rescue workers is highly unlikely to free up
> any memory. If anything, they might try to allocate even more.
> 
> By giving each connection its own workqueues, we allow RDS
> to better utilize the unbound workers that the system
> has available.
> 
> Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@...cle.com>
> Signed-off-by: Allison Henderson <allison.henderson@...cle.com>
> ---
>  net/rds/connection.c | 13 ++++++++++++-
>  1 file changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/net/rds/connection.c b/net/rds/connection.c
> index dc7323707f450..dcb554e10531f 100644
> --- a/net/rds/connection.c
> +++ b/net/rds/connection.c
> @@ -269,7 +269,15 @@ static struct rds_connection *__rds_conn_create(struct net *net,
>  		__rds_conn_path_init(conn, &conn->c_path[i],
>  				     is_outgoing);
>  		conn->c_path[i].cp_index = i;
> -		conn->c_path[i].cp_wq = rds_wq;
> +		conn->c_path[i].cp_wq = alloc_ordered_workqueue(
> +						"krds_cp_wq#%lu/%d", 0,
> +						rds_conn_count, i);

This has a reasonable chance of failure under memory pressure; what
about falling back to rds_wq instead of shutting down the
connection entirely?
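
A minimal sketch of that fallback, assuming the same identifiers as in
the hunk above (untested; the teardown path would then also need to
skip destroy_workqueue() when cp_wq is the shared rds_wq):

		conn->c_path[i].cp_wq =
			alloc_ordered_workqueue("krds_cp_wq#%lu/%d", 0,
						rds_conn_count, i);
		/* Allocation can fail under memory pressure; fall back
		 * to the shared rds_wq rather than failing the whole
		 * connection setup.
		 */
		if (!conn->c_path[i].cp_wq)
			conn->c_path[i].cp_wq = rds_wq;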

/P

