Message-ID: <560D82C0.4020006@oracle.com>
Date: Thu, 1 Oct 2015 12:00:16 -0700
From: "santosh.shilimkar@...cle.com" <santosh.shilimkar@...cle.com>
To: David Laight <David.Laight@...LAB.COM>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Cc: "linux-rdma@...r.kernel.org" <linux-rdma@...r.kernel.org>,
"davem@...emloft.net" <davem@...emloft.net>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"ssantosh@...nel.org" <ssantosh@...nel.org>
Subject: Re: [PATCH v2 00/14] RDS: connection scalability and performance
improvements
On 10/1/15 9:19 AM, David Laight wrote:
> From: Santosh Shilimkar
>> Sent: 30 September 2015 18:24
> ...
>> This is being addressed by simply using per bucket rw lock which makes the
>> locking simple and very efficient. The hash table size is still an issue and
>> I plan to address it by using re-sizable hash tables as suggested on the list.
>
> If the hash chains are short do you need the expense of a rw lock
> for each chain?
Chains can be really long on larger systems with many databases.
> A simple spinlock may be faster.
>
> If you use the hash chain lock for the reference count on the hashed
> objects you should be able to release the lock before locking the
> object itself.
>
Because of the shared socket nature of RDS, the chain needs to be
protected against parallel add/removal/lookup accesses. Hashing is
really only used to get to the bucket which holds the hlist.

Just to give a bit of history, the RDS bind code has evolved over
a few years. It started with an rb tree and a global rw lock,
which wasn't very efficient. Then it was converted to an rcu
hlist with a spin lock to make the lookups faster. But that
scheme also blew up on larger systems with truckloads of
sockets, with read/write lock failures because of excessive
contention, as high as almost 25% of system load. The per bucket
lock actually solved most of those issues. The bucket or chain table
increase (1K to 8K) was a relatively smaller gain, though it still
helped to bring the contention down to a nominal 1 or 2%.

I am still getting my head around the rhashtable plumbing for
this use case. I will CC you when I post the RFC patch for the
rhashtable conversion. Thanks for your comments so far.

Regards,
Santosh