[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <3f340003-2339-85a9-7d3a-51a488dcc720@oracle.com>
Date: Mon, 3 Jan 2022 20:07:09 +1100
From: Imran Khan <imran.f.khan@...cle.com>
To: Greg KH <gregkh@...uxfoundation.org>
Cc: tj@...nel.org, linux-kernel@...r.kernel.org,
akpm@...ux-foundation.org
Subject: Re: [RFC PATCH] kernfs: use kernfs_node specific mutex and spinlock.
Hi Greg,
On 24/12/21 7:40 pm, Greg KH wrote:
> On Fri, Dec 24, 2021 at 09:52:51AM +1100, Imran Khan wrote:
>> Hi everyone,
>>
>> On 16/12/21 2:06 am, Imran Khan wrote:
>>> Right now a global mutex (kernfs_open_file_mutex) protects list of
>>> kernfs_open_file instances corresponding to a sysfs attribute, so even
>>> if different tasks are opening or closing different sysfs files they
>>> can contend on osq_lock of this mutex. The contention is more apparent
>>> in large scale systems with few hundred CPUs where most of the CPUs have
>>> running tasks that are opening, accessing or closing sysfs files at any
>>> point of time. Since each list of kernfs_open_file belongs to a
>>> kernfs_open_node instance which in turn corresponds to one kernfs_node,
>>> move global kernfs_open_file_mutex within kernfs_node so that it does
>>> not block access to kernfs_open_file lists corresponding to other
>>> kernfs_node.
>>>
>>> Also since kernfs_node->attr.open points to kernfs_open_node instance
>>> corresponding to the kernfs_node, we can use a kernfs_node specific
>>> spinlock in place of current global spinlock i.e kernfs_open_node_lock.
>>> So make this spinlock local to kernfs_node instance as well.
>>>
>>> Signed-off-by: Imran Khan <imran.f.khan@...cle.com>
>>> ---
>>> I have kept this patch as RFC, as I am not sure if I have overlooked any
>>> scenario(s) where these global locks are needed.
>>>
>>
>> Could someone please provide some feedback about this change? Also if
>> there is any issues in this change, can I make these locks per-fs as has
>> been done in [1].
>>
>> [1] https://urldefense.com/v3/__https://lore.kernel.org/lkml/YZvV0ESA*zHHqHBU@google.com/__;Kw!!ACWV5N9M2RV99hQ!ZNLlzuX1cVFEAE5Ila2y8AzhvA3xI4HG4q13ZdcaQN__JaPLy6yuzdV0lypeVEIOHA$
>
> Please test this using some tests to verify that sysfs is still working
> properly and that this actually takes less time overall. In the
> conversations about the last time kernfs was changed, there were lots of
> discussions about proving that it actually mattered.
>
Thanks for getting back on this.
Yes sysfs and cgroup are working with this change.
I verified the change:
1. Using LTP sysfs tests
2. Doing CPU hotplugging and reading CPU topology info from sysfs in
parallel. I was getting correct topology information or "No such file or
directory error"
If you could suggest me some further tests, I can test with those as well.
As far as overall time taken was concerned, I did not see any
improvement with my test application (I am running 200 instances of it
on a system with 384 CPUs).
The main loop of this application is as follows (One can use any other
sysfs hierarchy as well):
for (int loop = 0; loop <100 ; loop++)
{
for (int port_num = 1; port_num < 2; port_num++)
{
for (int gid_index = 0; gid_index < 254; gid_index++ )
{
char ret_buf[64], ret_buf_lo[64];
char gid_file_path[1024];
int ret_len, ret_fd;
ssize_t ret_rd;
unsigned int i, saved_errno;
memset(ret_buf, 0, sizeof(ret_buf));
memset(gid_file_path, 0, sizeof(gid_file_path));
ret_len = snprintf(gid_file_path, sizeof(gid_file_path),
"/sys/class/infiniband/%s/ports/%d/gids/%d",
dev_name, port_num, gid_index);
ret_fd = open(gid_file_path, O_RDONLY | O_CLOEXEC);
if (ret_fd < 0)
{
printf("Failed to open %s\n", gid_file_path);
continue;
}
/* Read the GID */
ret_rd = read(ret_fd, ret_buf, 40);
if (ret_rd == -1)
{
printf("Failed to read from file %s, errno: %u\n",
gid_file_path, saved_errno);
continue;
}
close(ret_fd);
}
}
}
The patch just moved the contention from osq_lock (corresponding to
kernfs_open_file_mutex) to read-write semaphore (kernfs_rwsem). I have
tried to address the kernfs_rwsem contention in v2 of this patch set at [1].
v2 of the patch set, reduces the test execution time to half (From ~36
secs to ~18 secs)
and contention around kernfs_rwsem lock is reduced to 1/3rd of earlier case.
8.61% 8.55% showgids [kernel.kallsyms] [k] down_read
7.80% 7.75% showgids [kernel.kallsyms] [k] up_read
I will await feedback regarding v2 of this patchset.
[1]
https://lore.kernel.org/lkml/20220103084544.1109829-1-imran.f.khan@oracle.com/
> thanks,
>
> greg k-h
Powered by blists - more mailing lists