Message-ID: <20200729192151.eyghcfysfzaf2ijg@box>
Date:   Wed, 29 Jul 2020 22:21:51 +0300
From:   "Kirill A. Shutemov" <kirill@...temov.name>
To:     Matthew Wilcox <willy@...radead.org>
Cc:     Hillf Danton <hdanton@...a.com>,
        "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        syzbot <syzbot+c48f34012b06c4ac67dd@...kaller.appspotmail.com>,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        syzkaller-bugs@...glegroups.com,
        Mike Kravetz <mike.kravetz@...cle.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Jens Axboe <axboe@...nel.dk>,
        Markus Elfring <Markus.Elfring@....de>
Subject: Re: kernel BUG at include/linux/swapops.h:LINE!

On Mon, Jul 27, 2020 at 01:03:10PM +0100, Matthew Wilcox wrote:
> On Mon, Jul 27, 2020 at 01:31:40PM +0300, Kirill A. Shutemov wrote:
> > On Sun, Jul 26, 2020 at 05:49:04PM +0100, Matthew Wilcox wrote:
> > > On Fri, Jul 24, 2020 at 02:13:11PM +0300, Kirill A. Shutemov wrote:
> > > > On Thu, Jul 23, 2020 at 03:37:44PM +0800, Hillf Danton wrote:
> > > > > 
> > > > > On Tue, 21 Jul 2020 14:11:31 +0300 Kirill A. Shutemov wrote:
> > > > > > On Mon, Jul 20, 2020 at 04:51:44PM -0700, Andrew Morton wrote:
> > > > > > > On Sun, 19 Jul 2020 14:10:19 -0700 syzbot wrote:
> > > > > > > 
> > > > > > > > syzbot has found a reproducer for the following issue on:
> > > > > > > > 
> > > > > > > > HEAD commit:    4c43049f Add linux-next specific files for 20200716
> > > > > > > > git tree:       linux-next
> > > > > > > > console output: https://syzkaller.appspot.com/x/log.txt?x=12c56087100000
> > > > > > > > kernel config:  https://syzkaller.appspot.com/x/.config?x=2c76d72659687242
> > > > > > > > dashboard link: https://syzkaller.appspot.com/bug?extid=c48f34012b06c4ac67dd
> > > > > > > > compiler:       gcc (GCC) 10.1.0-syz 20200507
> > > > > > > > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=1344abeb100000
> > > > > > > > 
> > > > > > > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > > > > > > Reported-by: syzbot+c48f34012b06c4ac67dd@...kaller.appspotmail.com
> > > > > > > 
> > > > > > > Thanks.
> > > > > > > 
> > > > > > > __handle_mm_fault
> > > > > > >   ->pmd_migration_entry_wait
> > > > > > >     ->migration_entry_to_page
> > > > > > > 
> > > > > > > stumbled onto an unlocked page.
> > > > > > > 
> > > > > > > I don't immediately see a cause.  Perhaps Matthew's "THP prep patches",
> > > > > > > perhaps something else.
> > > > > > > 
> > > > > > > Is it possible to perform a bisection?
> > > > > > 
> > > > > > Maybe it's related to the new lock_page_async()?
> > > > > 
> > > > > > Or is there possibly a window after copy_huge_pmd() in which the src pmd
> > > > > > migration entry is removed and the page unlocked, but the dst entry is not?
> > > > 
> > > > No.
> > > > 
> > > > > copy_huge_pmd() runs with exclusive mmap_lock on the source side and
> > > > > the destination side is not running yet.
> > > 
> > > The one I'm hitting is huge related though.
> > > 
> > > I added this debug:
> > > 
> > > +++ b/include/linux/swapops.h
> > > @@ -165,8 +165,9 @@ static inline struct page *device_private_entry_to_page(swp_entry_t entry)
> > >  #ifdef CONFIG_MIGRATION
> > >  static inline swp_entry_t make_migration_entry(struct page *page, int write)
> > >  {
> > > -       BUG_ON(!PageLocked(compound_head(page)));
> > > +       VM_BUG_ON_PAGE(!PageLocked(page), page);
> > >  
> > > +if (PageCompound(page)) printk("pfn %lx order %d\n", page_to_pfn(page), thp_order(thp_head(page)));
> > >         return swp_entry(write ? SWP_MIGRATION_WRITE : SWP_MIGRATION_READ,
> > >                         page_to_pfn(page));
> > >  }
> > > @@ -194,7 +195,11 @@ static inline struct page *migration_entry_to_page(swp_entry_t entry)
> > >          * Any use of migration entries may only occur while the
> > >          * corresponding page is locked
> > >          */
> > > -       BUG_ON(!PageLocked(compound_head(p)));
> > > +       if (!PageLocked(p)) {
> > > +               dump_page(p, "not locked");
> > > +               printk("swap entry %d.%lx\n", swp_type(entry), swp_offset(entry));
> > > +               BUG();
> > > +       }
> > >         return p;
> > >  }
> > >  
> > > 
> > > and got useful output (while running generic/086):
> > > 
> > > 1457 086 (20181): drop_caches: 3
> > > 1457 page:00000000a216ae9a refcount:2 mapcount:0 mapping:000000009ba7bfed index:0x2227 pfn:0x229e7
> > > 1457 aops:def_blk_aops ino:0
> > > 1457 flags: 0x4000000000002030(lru|active|private)
> > > 1457 raw: 4000000000002030 fffff5b4416b5a48 fffff5b4408a7988 ffff9e9c34848578
> > > 1457 raw: 0000000000002227 ffff9e9bd18f0d00 00000002ffffffff 0000000000000000
> > > 1457 page dumped because: not locked
> > > 1457 swap entry 30.229e7
> > > 1457 ------------[ cut here ]------------
> > > 1457 kernel BUG at include/linux/swapops.h:201!
> > > 1457 invalid opcode: 0000 [#1] SMP PTI
> > > 1457 CPU: 3 PID: 646 Comm: check Kdump: loaded Tainted: G        W         5.8.0-rc6-00067-gd8b18bdf9870-dirty #355
> > > 1457 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014
> > > 1457 RIP: 0010:__migration_entry_wait+0x109/0x110
> > > [...]
> > > 
> > > Looking back in the trace, I see:
> > > 
> > > ...
> > > 1457 pfn 229e5 order 9
> > > 1457 pfn 229e6 order 9
> > > 1457 pfn 229e7 order 9
> > > 1457 pfn 229e8 order 9
> > > 1457 pfn 229e9 order 9
> > > ...
> > > 
> > > so I would say we have a refcount problem.  I've probably made it worse by
> > > creating more THPs, but I don't think I'm the originator of the problem.
> > > 
> > > I know very little about the migration code today.  I suspect I'm going
> > > to have to learn about it next week.
> > 
> > > It would be interesting to know if the migration entries ever got removed
> > > for that pfn. I mean, whether remove_migration_pte() got called for it.
> > > 
> > > It could be an rmap issue too. Maybe it misses the PMD in
> > > remove_migration_ptes() or something.
> 
> It's not mapped with a PMD.  I tweaked my debugging slightly:
> 
>  static inline swp_entry_t make_migration_entry(struct page *page, int write)
>  {
> -       BUG_ON(!PageLocked(compound_head(page)));
> +       VM_BUG_ON_PAGE(!PageLocked(page), page);
>  
> +if (PageHead(page)) dump_page(page, "make entry");
> +if (PageTail(page)) printk("pfn %lx order %d\n", page_to_pfn(page), thp_order(thp_head(page)));
> 
> 1523 page:0000000006f62206 refcount:490 mapcount:1 mapping:0000000000000000 index:0x562b12a00 pfn:0x1dc00
> 1523 head:0000000006f62206 order:9 compound_mapcount:0 compound_pincount:0
> 1523 anon flags: 0x400000000009003d(locked|uptodate|dirty|lru|active|head|swapbacked)
> 1523 raw: 400000000009003d ffffecfd41301308 ffffecfd41b08008 ffff9e9971c00059
> 1523 raw: 0000000562b12a00 0000000000000000 000001ea00000000 0000000000000000
> 1523 page dumped because: make entry
> 1523 pfn 1dc01 order 9
> 1523 pfn 1dc02 order 9
> 1523 pfn 1dc03 order 9
> ...
> 
> Notice that it's an anonymous page, so it's not related to my work.

I don't have much hope, but could you check whether the patch below blows
up?

Could you share the setup you use to trigger the issue? I want to try it
myself.

diff --git a/mm/migrate.c b/mm/migrate.c
index 40cd7016ae6f..c3148e1261d0 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -215,6 +215,7 @@ static bool remove_migration_pte(struct page *page, struct vm_area_struct *vma,
 	pte_t pte;
 	swp_entry_t entry;
 
+	VM_BUG_ON_PAGE(PageTail(pvmw.page), pvmw.page);
 	VM_BUG_ON_PAGE(PageTail(page), page);
 	while (page_vma_mapped_walk(&pvmw)) {
 		if (PageKsm(page))
-- 
 Kirill A. Shutemov
