Message-ID: <CAKEwX=Nzipr_nWTVPsUC_JKzoKu=6CX7rMSRv+T1YoQ8WheuWw@mail.gmail.com>
Date: Sun, 4 Feb 2024 19:48:38 -0800
From: Nhat Pham <nphamcs@...il.com>
To: Chengming Zhou <chengming.zhou@...ux.dev>
Cc: syzbot <syzbot+17a611d10af7d18a7092@...kaller.appspotmail.com>,
akpm@...ux-foundation.org, hannes@...xchg.org, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, syzkaller-bugs@...glegroups.com, yosryahmed@...gle.com
Subject: Re: [syzbot] [mm?] WARNING in zswap_folio_swapin

On Sat, Feb 3, 2024 at 6:59 PM Chengming Zhou <chengming.zhou@...ux.dev> wrote:
>
> On 2024/2/4 09:28, Nhat Pham wrote:
> > On Sat, Feb 3, 2024 at 12:37 PM syzbot
> > <syzbot+17a611d10af7d18a7092@...kaller.appspotmail.com> wrote:
> >>
> >> Hello,
> >>
> >> syzbot found the following issue on:
> >>
> >> HEAD commit: 861c0981648f Merge tag 'jfs-6.8-rc3' of github.com:kleikam..
> >> git tree: upstream
> >> console output: https://syzkaller.appspot.com/x/log.txt?x=174537bbe80000
> >> kernel config: https://syzkaller.appspot.com/x/.config?x=b168fa511db3ca08
> >> dashboard link: https://syzkaller.appspot.com/bug?extid=17a611d10af7d18a7092
> >> compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
> >> userspace arch: i386
> >>
> >> Unfortunately, I don't have any reproducer for this issue yet.
> >>
> >> Downloadable assets:
> >> disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/7bc7510fe41f/non_bootable_disk-861c0981.raw.xz
> >> vmlinux: https://storage.googleapis.com/syzbot-assets/b2b204c7b4a0/vmlinux-861c0981.xz
> >> kernel image: https://storage.googleapis.com/syzbot-assets/170ec316e557/bzImage-861c0981.xz
> >>
> >> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> >> Reported-by: syzbot+17a611d10af7d18a7092@...kaller.appspotmail.com
> >>
> >> kcov_ioctl+0x4f/0x720 kernel/kcov.c:704
> >> __do_compat_sys_ioctl+0x2bf/0x330 fs/ioctl.c:971
> >> do_syscall_32_irqs_on arch/x86/entry/common.c:165 [inline]
> >> __do_fast_syscall_32+0x79/0x110 arch/x86/entry/common.c:321
> >> page has been migrated, last migrate reason: compaction
> >> ------------[ cut here ]------------
> >> WARNING: CPU: 2 PID: 5104 at include/linux/memcontrol.h:775 folio_lruvec include/linux/memcontrol.h:775 [inline]
> >> WARNING: CPU: 2 PID: 5104 at include/linux/memcontrol.h:775 zswap_folio_swapin+0x47d/0x5a0 mm/zswap.c:381
> >> Modules linked in:
> >> CPU: 2 PID: 5104 Comm: syz-fuzzer Not tainted 6.8.0-rc2-syzkaller-00031-g861c0981648f #0
> >> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
> >> RIP: 0010:folio_lruvec include/linux/memcontrol.h:775 [inline]
> >
> > Hmm looks like it's this line:
> > VM_WARN_ON_ONCE_FOLIO(!memcg && !mem_cgroup_disabled(), folio);
> >
> > Looks like memcg was cleared from the folio. Haven't looked too
> > closely yet, but this (and the "page has been migrated" line above)
> > suggests maybe there is some migration business going on -
> > mem_cgroup_migrate() clears the old folio's memcg_data (via
> > old->memcg_data = 0).
>
> Yeah, I think it's this case.
>
> >
> > Here's my theory (which could be wrong - someone please fact-check
> > me): swap_read_folio(), which precedes zswap_folio_swapin(), unlocks
>
> And another case is !page_allocated, the returned folio is unlocked, right?

I think you're correct. That said, it's probably fine to keep the
protection size if we find the folio in the swapcache anyway - IIUC,
we are not performing a swapin in that case (since !page_allocated
means swap_read_folio() is not called), and a swapin is the scenario
that the heuristic cares about :)

IOW, something like this:

	if (unlikely(page_allocated)) {
		zswap_folio_swapin(folio);
		swap_read_folio(folio, false, NULL);
	}

makes sense to me, both from the correctness POV and the heuristics POV.

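
To illustrate why the ordering matters, here is a rough userspace toy model (not kernel code, and not a patch). Every toy_* name below is hypothetical and only stands in for the real function of a similar name. It also assumes, as suggested elsewhere in this thread, that holding the folio lock is enough to keep a concurrent migration from clearing the memcg:

```c
/* Userspace toy model of the ordering issue; NOT kernel code.
 * All toy_* names are hypothetical stand-ins. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_folio {
	bool locked;       /* stands in for the folio lock */
	const char *memcg; /* stands in for folio->memcg_data; NULL == cleared */
};

/* Models a concurrent migration. Assumption: it cannot proceed while the
 * folio is locked. On success it clears the old folio's memcg, mirroring
 * old->memcg_data = 0 in mem_cgroup_migrate(). */
static bool toy_migrate(struct toy_folio *old)
{
	assert(old != NULL);
	if (old->locked)
		return false;
	old->memcg = NULL;
	return true;
}

/* Models swap_read_folio(): the folio ends up unlocked once the read is done. */
static void toy_swap_read_folio(struct toy_folio *f)
{
	assert(f != NULL);
	f->locked = false;
}

/* Models the check reached via folio_lruvec(): returns true if the
 * "no memcg" warning would fire. */
static bool toy_zswap_folio_swapin(const struct toy_folio *f)
{
	assert(f != NULL);
	return f->memcg == NULL;
}

/* Current order: read first (which unlocks), then the swapin hook.
 * A migration can slip in between the two calls. */
static bool toy_warns_read_first(void)
{
	struct toy_folio f = { .locked = true, .memcg = "cg" };

	toy_swap_read_folio(&f);
	toy_migrate(&f); /* succeeds: folio is already unlocked */
	return toy_zswap_folio_swapin(&f);
}

/* Proposed order: swapin hook first, while the folio is still locked. */
static bool toy_warns_swapin_first(void)
{
	struct toy_folio f = { .locked = true, .memcg = "cg" };
	bool warned;

	toy_migrate(&f); /* refused: folio is locked */
	warned = toy_zswap_folio_swapin(&f);
	toy_swap_read_folio(&f);
	return warned;
}
```

In this model, toy_warns_read_first() reports the warning and toy_warns_swapin_first() does not, which matches the reasoning above. It says nothing about whether the real locking rules hold; that part still needs checking against the actual migration path.
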
>
> > the folio. Could this be sufficient to allow for migration? If this is
>
> IMHO, folio locked is sufficient to avoid concurrent memcg migration.
>
> > the case, all we need to do is move this to above swap_read_folio(),
> > while the folio is still locked. __read_swap_cache_async() already
> > charges the folio to a memcg, so no need to wait till after
> > swap_read_folio() anyway.
>
> Should we call zswap_folio_swapin() in the !page_allocated case?
>
> Thanks.