Message-ID: <d403d2ea-8124-9cec-eb45-7e61b9b68f21@oracle.com>
Date: Thu, 13 Jan 2022 21:51:41 +1100
From: Imran Khan <imran.f.khan@...cle.com>
To: Greg KH <gregkh@...uxfoundation.org>, Tejun Heo <tj@...nel.org>
Cc: linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH v2 1/2] kernfs: use kernfs_node specific mutex and
 spinlock.

Hello,
On 13/1/22 7:48 pm, Greg KH wrote:
> On Wed, Jan 12, 2022 at 10:08:55AM -1000, Tejun Heo wrote:
>> Hello,
>>
>> On Tue, Jan 11, 2022 at 10:42:31AM +1100, Imran Khan wrote:
>>> The database application has a health monitoring component which
>>> regularly collects stats from sysfs. With a small number of databases
>>> this was not an issue, but recently several customers did some
>>> consolidation and ended up running hundreds of databases on the same
>>> server, and in those setups the contention became more and more
>>> evident. As more customers consolidate, we have started to see more
>>> occurrences of this issue, and its severity depends on the number of
>>> databases running on the server.
>>>
>>> I will have to reach out to the application team to get a list of all
>>> sysfs files being accessed, but one of them is
>>> "/sys/class/infiniband/<device>/ports/<port number>/gids/<gid index>".
>>
>> I can imagine a similar scenario w/ cgroups with heavy stacking - each
>> application fetches its own stats regularly, which isn't a problem in
>> isolation, but once you put thousands of them on a machine the shared
>> lock can get pretty hot. The cgroup scenario is probably even more
>> convincing in that they'd be hitting different files, yet the lock
>> still gets hot because it is shared across all of them.
>>
>> Greg, I think the call for better scalability of read operations is
>> reasonably justified, especially for heavy workload stacking, which is a
>> valid use case and likely to become more prevalent. Given the
>> requirements, hashed locking seems like the best solution here. It
>> doesn't cause noticeable space overhead and is pretty easy to scale.
>> What do you think?
>
> I have no objection to changes that remove the lock contention, as long
> as they do not add huge additional memory requirements, like the
> original submission here did. If using hashed locks is the solution,
> wonderful!
>
Thanks, Tejun and Greg, for the suggestions and reviews so far.
I have sent v3 of this change at [1]. v3 uses hashed locks, and the
overall increase in memory footprint will be at most around 100K (with
the maximum hash table size). I will await your feedback on it.
[1]:
https://lore.kernel.org/lkml/20220113104259.1584491-1-imran.f.khan@oracle.com/
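
For anyone following along, the hashed-locking idea looks roughly like
the sketch below. This is only an illustration with made-up names
(NR_KERNFS_LOCKS, kernfs_lock_ptr, kernfs_locks_init) and a made-up
table size, not the actual v3 code: the kernfs_node address is hashed
to pick one mutex out of a small fixed table, so unrelated nodes
usually take different locks.

#include <linux/hash.h>
#include <linux/kernfs.h>
#include <linux/log2.h>
#include <linux/mutex.h>

/* Example table size only; the real patch chooses its own size. */
#define NR_KERNFS_LOCKS 64

static struct mutex kernfs_locks[NR_KERNFS_LOCKS];

/* Hash the node address to select one lock from the fixed table. */
static struct mutex *kernfs_lock_ptr(struct kernfs_node *kn)
{
	int idx = hash_ptr(kn, ilog2(NR_KERNFS_LOCKS));

	return &kernfs_locks[idx];
}

static void __init kernfs_locks_init(void)
{
	int i;

	for (i = 0; i < NR_KERNFS_LOCKS; i++)
		mutex_init(&kernfs_locks[i]);
}

With a table like this, the space cost is just
NR_KERNFS_LOCKS * sizeof(struct mutex), so even a much larger table
stays within the ~100K figure above, while readers of different nodes
rarely contend on the same mutex.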
Thanks,
-- Imran