Message-ID: <CAOQ4uxjLaGyOUd5GOV8oHwBY=nGGtgk4=5bRxmHTr5VsocrhiA@mail.gmail.com>
Date: Tue, 14 Jul 2020 16:10:33 +0300
From: Amir Goldstein <amir73il@...il.com>
To: Francesco Ruggeri <fruggeri@...sta.com>
Cc: linux-kernel <linux-kernel@...r.kernel.org>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>,
Jan Kara <jack@...e.cz>
Subject: Re: soft lockup in fanotify_read
On Tue, Jul 14, 2020 at 5:54 AM Francesco Ruggeri <fruggeri@...sta.com> wrote:
>
> We are getting this soft lockup in fanotify_read.
> The reason is that this code does not seem to scale to cases where there
> are big bursts of events generated by fanotify_handle_event.
> fanotify_read acquires group->notification_lock for each event.
> fanotify_handle_event uses the lock to add one event, which also involves
> fanotify_merge, which scans the whole list trying to find an event to
> merge the new one with.
Yes, that is a terribly inefficient merge algorithm.
If it helps, I am carrying a quick brown-paper-bag fix for this issue in my tree:
@@ -65,6 +74,8 @@ static int fanotify_merge(struct list_head *list,
struct fsnotify_event *event)
 {
 	struct fsnotify_event *test_event;
 	struct fanotify_event *new;
+	int limit = 128;
+	int i = 0;
 	pr_debug("%s: list=%p event=%p\n", __func__, list, event);
 	new = FANOTIFY_E(event);
@@ -78,6 +89,9 @@ static int fanotify_merge(struct list_head *list,
struct fsnotify_event *event)
 		return 0;
 	list_for_each_entry_reverse(test_event, list, list) {
+		/* Event merges are expensive so should be limited */
+		if (++i > limit)
+			break;
 		if (should_merge(test_event, event)) {
It's somewhere down my TODO list to fix this properly with a hash table.
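To make the effect of the limit concrete, here is a minimal user-space sketch of a bounded reverse-scan merge. It is not the kernel code: struct demo_event, struct demo_queue, demo_should_merge() and the "same object id" merge rule are hypothetical stand-ins chosen for illustration, and the array-backed queue replaces the kernel's linked list. The point it shows is that the work done per queued event is capped at `limit` entries, however long the queue grows.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a queued event; not the kernel structure. */
struct demo_event {
	int object_id;     /* what the event refers to (assumed merge key) */
	unsigned int mask; /* OR of event bits */
};

/* Hypothetical array-backed queue, oldest event first. */
struct demo_queue {
	struct demo_event *events;
	size_t len;
	size_t cap;
};

/* Simplified merge rule (an assumption): same object means mergeable. */
static bool demo_should_merge(const struct demo_event *old,
			      const struct demo_event *ev)
{
	return old->object_id == ev->object_id;
}

/*
 * Walk the queue from newest to oldest, examining at most `limit` entries.
 * On a match, fold the new mask into the queued event and return true.
 * The number of entries examined is stored in *scanned.
 */
static bool demo_merge(struct demo_queue *q, const struct demo_event *ev,
		       size_t limit, size_t *scanned)
{
	size_t seen = 0;
	bool merged = false;

	for (size_t i = q->len; i > 0 && seen < limit; i--) {
		seen++;
		if (demo_should_merge(&q->events[i - 1], ev)) {
			q->events[i - 1].mask |= ev->mask;
			merged = true;
			break;
		}
	}
	*scanned = seen;
	return merged;
}

/* Merge if possible, otherwise append. Returns 0 on success, -1 if full. */
static int demo_queue_event(struct demo_queue *q, const struct demo_event *ev,
			    size_t limit, size_t *scanned)
{
	if (demo_merge(q, ev, limit, scanned))
		return 0;
	if (q->len == q->cap)
		return -1;
	q->events[q->len++] = *ev;
	return 0;
}
```

The trade-off is that an event whose merge candidate sits further back than `limit` entries is queued as a duplicate instead of being merged. A hash table keyed on the merge criteria would avoid both the long scan and the missed merges.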
> In our case fanotify_read is invoked with a buffer big enough for 200
> events, and what happens is that every time fanotify_read dequeues an
> event and releases the lock, fanotify_handle_event adds several more,
> scanning a longer and longer list. This causes fanotify_read to wait
> longer and longer for the lock, and the soft lockup happens before
> fanotify_read can reach 200 events.
> Is it intentional for fanotify_read to acquire the lock for each event,
> rather than batching together a user buffer worth of events?
I think it is meant to allow multiple reader threads to read events
fairly, but I am not sure.
Even if it were fine to read a batch of events on every spinlock acquire,
making the code in the fanotify_read() loop behave well when an event
fails after a bunch of good events have been read looks challenging,
but I didn't try. Anyway, the root cause of the issue seems to be the
inefficient merge, not the spinlock being taken once per event read.
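To illustrate why the error case is awkward, here is a rough user-space sketch of batched dequeueing. Everything in it is hypothetical (struct demo_fifo, demo_read_batch(), the demo_copy() callback), no real locking is done, and it is not a proposal for fanotify_read() itself. It takes a batch in one step, copies the items out, and puts the unread remainder back if a copy fails.

```c
#include <stddef.h>

/* Hypothetical FIFO of pending event ids; oldest is items[head]. */
struct demo_fifo {
	int *items;
	size_t head; /* index of the oldest pending item */
	size_t tail; /* one past the newest pending item */
};

/* Hypothetical copy-out target: fails on one chosen item. */
struct demo_sink {
	int fail_item; /* item whose copy is made to fail */
	size_t copied; /* number of items copied so far */
};

/* Stand-in for copying an event to the user buffer; 0 means success. */
static int demo_copy(int item, void *ctx)
{
	struct demo_sink *s = ctx;

	if (item == s->fail_item)
		return -1;
	s->copied++;
	return 0;
}

/*
 * Claim up to `max` items in one step (where the lock would be held),
 * copy them out with the lock dropped, and return how many were delivered.
 * If a copy fails, the failed item and everything after it are put back.
 */
static size_t demo_read_batch(struct demo_fifo *q, size_t max,
			      int (*copy)(int item, void *ctx), void *ctx)
{
	/* The lock would be taken here. */
	size_t avail = q->tail - q->head;
	size_t n = avail < max ? avail : max;
	size_t first = q->head;

	q->head += n;
	/* The lock would be dropped here. */

	size_t done = 0;

	while (done < n && copy(q->items[first + done], ctx) == 0)
		done++;

	if (done < n) {
		/*
		 * Put the unread items back. Rewinding head like this is
		 * only correct with a single reader: with several readers,
		 * another one may have claimed later items in the meantime.
		 */
		q->head = first + done;
	}
	return done;
}
```

The rewind step is the weak spot. With one reader it restores the queue exactly, but with concurrent readers it could hand out events twice or out of order, which is part of what makes batching harder than it first appears.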
Thanks,
Amir.