Message-ID: <000000000000019d7e05fa808b1f@google.com>
Date: Sat, 29 Apr 2023 14:47:58 -0700
From: syzbot <syzbot+ecab51a4a5b9f26eeaa1@...kaller.appspotmail.com>
To: tytso@....edu
Cc: adilger.kernel@...ger.ca, linux-ext4@...r.kernel.org,
tytso@....edu, syzkaller-bugs@...glegroups.com
Subject: Re: [syzbot] WARNING in ext4_dirty_folio
> #syz set subsystems: mm
Your commands are accepted, but please keep syzkaller-bugs@...glegroups.com mailing list in CC next time. It serves as a history of what happened with each bug report. Thank you.
>
> On Wed, Jun 08, 2022 at 04:36:20AM -0700, syzbot wrote:
>> syzbot has found a reproducer for the following issue on:
>>
>> HEAD commit: cf67838c4422 selftests net: fix bpf build error
>> git tree: net
>> console+strace: https://syzkaller.appspot.com/x/log.txt?x=123c2173f00000
>> kernel config: https://syzkaller.appspot.com/x/.config?x=fc5a30a131480a80
>> dashboard link: https://syzkaller.appspot.com/bug?extid=ecab51a4a5b9f26eeaa1
>> compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
>> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1342d5abf00000
>> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=11ecafebf00000
>
> The root cause of this failure is a fundamental bug / design flaw in
> get_user_pages and related functions, which file system developers
> have been complaining about for literally **years**. See the recent
> discussion at [1] and going back earlier to 2018[2][3] and 2019[4].
>
> [1] https://lore.kernel.org/all/6b73e692c2929dc4613af711bdf92e2ec1956a66.1682638385.git.lstoakes@gmail.com/
> [2] https://lwn.net/Articles/753027/
> [3] https://lwn.net/Articles/774411/
> [4] https://lwn.net/Articles/784574/
>
> I'm going to reassign this to the mm subsystem, since there's not much
> we can do on the file system end. The WARNING is considered a good
> thing because users can see silent data corruption/loss if they use
> process_vm_writev() or RDMA to write to memory backed by a file. And
> while most users at large hyperscale scientific compute farms probably
> won't be paying attention to the system logs, at least we've done
> something to warn them.
>
> Fortunately data corruption is rare (but when it happens it could
> really screw with your results!), but if they are doing some large
> scale simulation to evaluate the safety of nuclear weapons (for
> example), it would be nice if they got at least some hint.
>
> There is a potential solution discussed at [1], but there is pushback
> since it could break users by disallowing the thing that might cause
> data corruption. While breaking user applications is bad, turning a
> possible silent data corruption into a very visible, hard failure is
> arguably a good thing....
>
> - Ted