linux-kernel - Re: [HELP] FUSE writeback performance bottleneck


Message-ID: <cc2ca6ec-191f-4b30-85f4-b3fc6ddd7323@linux.alibaba.com>
Date: Tue, 4 Jun 2024 20:24:06 +0800
From: Jingbo Xu <jefflexu@...ux.alibaba.com>
To: Bernd Schubert <bernd.schubert@...tmail.fm>,
 Miklos Szeredi <miklos@...redi.hu>
Cc: "linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
 "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
 lege.wang@...uarmicro.com, "Matthew Wilcox (Oracle)" <willy@...radead.org>,
 "linux-mm@...ck.org" <linux-mm@...ck.org>
Subject: Re: [HELP] FUSE writeback performance bottleneck



On 6/4/24 5:32 PM, Bernd Schubert wrote:
> 
> 
> On 6/4/24 09:36, Jingbo Xu wrote:
>>
>>
>> On 6/4/24 3:27 PM, Miklos Szeredi wrote:
>>> On Tue, 4 Jun 2024 at 03:57, Jingbo Xu <jefflexu@...ux.alibaba.com> wrote:
>>>
>>>> IIUC, there are two sources that may cause deadlock:
>>>> 1) the fuse server needs memory allocation when processing FUSE_WRITE
>>>> requests, which in turn triggers direct memory reclaim, and FUSE
>>>> writeback then - deadlock here
>>>
>>> Yep, see the folio_wait_writeback() call deep in the guts of direct
>>> reclaim, which sleeps until the PG_writeback flag is cleared.  If that
>>> happens to be triggered by the writeback in question, then that's a
>>> deadlock.
>>>
>>>> 2) a process that triggers direct memory reclaim or calls sync(2) may
>>>> hang there forever, if the fuse server is buggy or malicious and thus
>>>> hangs there when processing FUSE_WRITE requests
>>>
>>> Ah, yes, sync(2) is also an interesting case.   We don't want unpriv
>>> fuse servers to be able to block sync(2), which means that sync(2)
>>> won't actually guarantee a synchronization of fuse's dirty pages.  I
>>> don't think there's even a theoretical solution to that, but
>>> apparently nobody cares...
>>
>> Okay, if the temp page design is unavoidable, then I don't know if there
>> is any approach (in the FUSE or VFS layer) that helps offload the page
>> copy.  At least we don't want the writeback performance to be limited by
>> the single writeback kworker.  This was also the initial aim of this thread.
>>
> 
> Offloading it to another thread is just a workaround, though maybe a
> temporary solution.

If only we could lift the limit of a single (writeback) kworker per
bdi... Apparently that's much more complicated.  Just a brainstorming
idea...

I agree it's a tough thing.
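
For readers following the thread, here is a toy user-space model (Python,
not kernel code) of the trade-off discussed above: why copying to a temp
page keeps direct reclaim from waiting on the fuse server, and where the
extra page copy comes from.  Every name in it (Page, InFlightWrite,
start_writeback_direct, start_writeback_temp, reclaim_would_block) is made
up for illustration, and the boolean flag only stands in for PG_writeback.
It is a sketch of the idea under those assumptions, not how FUSE is
implemented.

```python
# Toy model only; all names are hypothetical and nothing here is kernel API.
from dataclasses import dataclass


@dataclass
class Page:
    data: bytes
    writeback: bool = False  # stands in for the PG_writeback flag


@dataclass
class InFlightWrite:
    payload: Page  # the page a (possibly stuck) fuse server would read from


def start_writeback_direct(page: Page) -> InFlightWrite:
    """No temp page: the page cache page stays under writeback
    until the fuse server replies to the write request."""
    page.writeback = True
    return InFlightWrite(payload=page)


def start_writeback_temp(page: Page) -> InFlightWrite:
    """Temp page: copy the data, then end writeback on the original
    right away, so only the private copy waits for the server."""
    page.writeback = True
    tmp = Page(data=bytes(page.data), writeback=True)  # the extra copy
    page.writeback = False
    return InFlightWrite(payload=tmp)


def reclaim_would_block(page: Page) -> bool:
    """Models the wait in direct reclaim: it sleeps while the flag is set."""
    return page.writeback


if __name__ == "__main__":
    a = Page(b"dirty data")
    start_writeback_direct(a)
    print("direct, reclaim blocks:", reclaim_would_block(a))

    b = Page(b"dirty data")
    start_writeback_temp(b)
    print("temp,   reclaim blocks:", reclaim_would_block(b))
```

In this model a server that never replies keeps the page blocked only in
the direct case.  The price in the temp case is the copy made inside
start_writeback_temp(), which is the work one would like to spread beyond
a single writeback kworker.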

-- 
Thanks,
Jingbo