Message-ID: <cc2ca6ec-191f-4b30-85f4-b3fc6ddd7323@linux.alibaba.com>
Date: Tue, 4 Jun 2024 20:24:06 +0800
From: Jingbo Xu <jefflexu@...ux.alibaba.com>
To: Bernd Schubert <bernd.schubert@...tmail.fm>,
Miklos Szeredi <miklos@...redi.hu>
Cc: "linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
lege.wang@...uarmicro.com, "Matthew Wilcox (Oracle)" <willy@...radead.org>,
"linux-mm@...ck.org" <linux-mm@...ck.org>
Subject: Re: [HELP] FUSE writeback performance bottleneck

On 6/4/24 5:32 PM, Bernd Schubert wrote:
>
>
> On 6/4/24 09:36, Jingbo Xu wrote:
>>
>>
>> On 6/4/24 3:27 PM, Miklos Szeredi wrote:
>>> On Tue, 4 Jun 2024 at 03:57, Jingbo Xu <jefflexu@...ux.alibaba.com> wrote:
>>>
>>>> IIUC, there are two scenarios that may cause deadlock:
>>>> 1) the fuse server needs memory allocation when processing FUSE_WRITE
>>>> requests, which in turn triggers direct memory reclaim, which then
>>>> waits on the in-flight FUSE writeback - deadlock here
>>>
>>> Yep, see the folio_wait_writeback() call deep in the guts of direct
>>> reclaim, which sleeps until the PG_writeback flag is cleared. If that
>>> happens to be triggered by the writeback in question, then that's a
>>> deadlock.
>>>
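For reference, the path in question looks roughly like this (a rough
sketch of the direct reclaim side in mm/vmscan.c; the exact conditions
under which the wait is taken vary by kernel version):

  shrink_folio_list()                    /* direct reclaim */
      folio_test_writeback(folio)        /* FUSE writeback in flight */
          folio_wait_writeback(folio)    /* sleeps until PG_writeback clears */

If the allocation that entered reclaim came from the fuse server while
it was servicing the very WRITE that set PG_writeback, neither side can
make progress.
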
>>>> 2) a process that triggers direct memory reclaim or calls sync(2) may
>>>> hang there forever if the fuse server is buggy or malicious and thus
>>>> hangs when processing FUSE_WRITE requests
>>>
>>> Ah, yes, sync(2) is also an interesting case. We don't want unpriv
>>> fuse servers to be able to block sync(2), which means that sync(2)
>>> won't actually guarantee a synchronization of fuse's dirty pages. I
>>> don't think there's even a theoretical solution to that, but
>>> apparently nobody cares...
>>
>> Okay, if the temp page design is unavoidable, then I don't know whether
>> there is any approach (in the FUSE or VFS layer) that could help offload
>> the page copy. At least we don't want the writeback performance to be
>> limited by the single writeback kworker - that was the initial motivation
>> for this thread.
>>
>
> Offloading it to another thread is just a workaround, though maybe a
> temporary solution.

If we could break the limit that there is only a single (writeback)
kworker per bdi... Apparently that's much more complicated. Just a
brainstorming idea...

I agree it's a tough thing.
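
FWIW, to make the "offload to another thread" idea a bit more concrete,
below is a minimal sketch that moves the temp-page copy onto an unbound
workqueue, so the copies are no longer serialized on the single
writeback kworker. All the fuse_copy_* names are made up for
illustration; today the copy happens inline via copy_highpage() in the
writeback path.

#include <linux/workqueue.h>
#include <linux/highmem.h>
#include <linux/slab.h>

/* Hypothetical sketch only: struct and function names are invented. */
struct fuse_copy_work {
	struct work_struct work;
	struct page *src;	/* page under PG_writeback */
	struct page *tmp;	/* pre-allocated temp page */
};

static void fuse_copy_worker(struct work_struct *work)
{
	struct fuse_copy_work *cw =
		container_of(work, struct fuse_copy_work, work);

	copy_highpage(cw->tmp, cw->src);	/* same helper used today */
	/* ... then queue the WRITE request carrying cw->tmp ... */
	kfree(cw);
}

/* Called from the writeback path instead of copying inline: */
static void fuse_queue_copy(struct fuse_copy_work *cw)
{
	INIT_WORK(&cw->work, fuse_copy_worker);
	queue_work(system_unbound_wq, &cw->work);
}

This only parallelizes the memcpy, of course; it doesn't remove the
temp-page copy itself, so it remains a workaround as Bernd says.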
--
Thanks,
Jingbo