Message-ID: <7af02ceb-563a-4bad-84ee-620aaa513bed@redhat.com>
Date: Mon, 6 Oct 2025 09:55:27 +0200
From: David Hildenbrand <david@...hat.com>
To: Catalin Marinas <catalin.marinas@....com>,
 syzbot <syzbot+d1974fc28545a3e6218b@...kaller.appspotmail.com>
Cc: linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
 syzkaller-bugs@...glegroups.com, will@...nel.org
Subject: Re: [syzbot] [arm?] WARNING in copy_highpage

>> Modules linked in:
>> CPU: 1 UID: 0 PID: 25189 Comm: syz.2.7336 Not tainted syzkaller #0 PREEMPT
>> Hardware name: linux,dummy-virt (DT)
>> pstate: 00402009 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>> pc : copy_highpage+0x150/0x334 arch/arm64/mm/copypage.c:55
>> lr : copy_highpage+0xb4/0x334 arch/arm64/mm/copypage.c:25
>> sp : ffff800088053940
>> x29: ffff800088053940 x28: ffffc1ffc0acf800 x27: ffff800088053b10
>> x26: ffffc1ffc0acf808 x25: ffffc1ffc037b1c0 x24: ffffc1ffc037b1c0
>> x23: ffffc1ffc0acf800 x22: ffffc1ffc0acf800 x21: fff000002b3e0000
>> x20: fff000000dec7000 x19: ffffc1ffc037b1c0 x18: 0000000000000000
>> x17: fff07ffffcffa000 x16: ffff800080008000 x15: 0000000000000001
>> x14: 0000000000000000 x13: 0000000000000003 x12: 000000000006d9ad
>> x11: 0000000000000000 x10: 0000000000000010 x9 : 0000000000000000
>> x8 : 0000000000000000 x7 : 0000000000000000 x6 : 0000000000000000
>> x5 : ffff800088053b18 x4 : ffff80008032df94 x3 : 00000000ff000000
>> x2 : 01ffc00003000001 x1 : 01ffc00003000001 x0 : 01ffc00003000001
>> Call trace:
>>   try_page_mte_tagging arch/arm64/include/asm/mte.h:93 [inline] (P)
>>   copy_highpage+0x150/0x334 arch/arm64/mm/copypage.c:55 (P)
>>   copy_mc_highpage include/linux/highmem.h:383 [inline]
>>   folio_mc_copy+0x44/0x6c mm/util.c:740
>>   __migrate_folio.constprop.0+0xc4/0x23c mm/migrate.c:851
>>   migrate_folio+0x1c/0x2c mm/migrate.c:882
>>   move_to_new_folio+0x58/0x144 mm/migrate.c:1097
>>   migrate_folio_move mm/migrate.c:1370 [inline]
>>   migrate_folios_move mm/migrate.c:1719 [inline]
>>   migrate_pages_batch+0xaf4/0x1024 mm/migrate.c:1966
>>   migrate_pages_sync mm/migrate.c:2023 [inline]
>>   migrate_pages+0xb9c/0xcdc mm/migrate.c:2105
>>   do_mbind+0x20c/0x4a4 mm/mempolicy.c:1539
>>   kernel_mbind mm/mempolicy.c:1682 [inline]
>>   __do_sys_mbind mm/mempolicy.c:1756 [inline]
> 
> I don't think we ever stressed MTE with mbind before. I have a suspicion
> this problem has been around for some time.
> 
> My reading of do_mbind() is that it ends up allocating pages for
> migrating into via alloc_migration_target_by_mpol() ->
> folio_alloc_mpol(). Pages returned should be untagged and uninitialised
> unless the PG_* flags have not been cleared on a prior free. Or
> migrate_pages_batch() somehow reuses some pages instead of reallocating.

Staring at __migrate_folio(), I assume we can end up successfully 
calling folio_mc_copy(), but then failing in __folio_migrate_mapping().

Seems to be as easy as failing the folio_ref_freeze() in 
__folio_migrate_mapping().

We return -EAGAIN in that case, making the caller retry and stumble into 
an already-tagged page (with the same source/destination parameters, 
IIRC).

So this is likely just us redoing the copy on retry, after the previous 
migration attempt failed past the copy step.
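
To make the suspected sequence concrete, here is a small userspace toy
model of it. This is not kernel code: every model_* name is made up for
illustration, the "tagged" bool merely stands in for the MTE tagged page
flag, and the warning counter stands in for the WARN in copy_highpage().
It assumes the copy happens before the mapping step and that the retry
reuses the same destination.

```c
/* Userspace toy model, NOT kernel code. All model_* names are hypothetical. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>

#define MODEL_PAGE_SIZE 64

struct model_folio {
	unsigned char data[MODEL_PAGE_SIZE];
	bool mte_tagged;	/* stands in for the MTE "tagged" page flag */
	int extra_refs;		/* unexpected references that make the freeze fail */
};

int model_warnings;		/* counts simulated WARN hits */

/* Models copy_highpage(): copy data and, for a tagged source, tag dst. */
static void model_copy_highpage(struct model_folio *dst,
				const struct model_folio *src)
{
	memcpy(dst->data, src->data, MODEL_PAGE_SIZE);
	if (src->mte_tagged) {
		if (dst->mte_tagged)
			model_warnings++;	/* dst was expected to be untagged */
		dst->mte_tagged = true;
	}
}

/* Models the ref-freeze step: fails while extra references are held. */
static int model_migrate_mapping(const struct model_folio *src)
{
	return src->extra_refs ? -EAGAIN : 0;
}

/* Assumed ordering: copy first, then the mapping step. */
static int model_migrate_folio(struct model_folio *dst,
			       struct model_folio *src)
{
	model_copy_highpage(dst, src);
	return model_migrate_mapping(src);
}

/* Caller-side retry with the same src/dst pair. */
int model_migrate_with_retry(struct model_folio *dst,
			     struct model_folio *src, int max_tries)
{
	int rc = -EAGAIN;

	assert(max_tries > 0);
	for (int i = 0; i < max_tries && rc == -EAGAIN; i++) {
		rc = model_migrate_folio(dst, src);
		if (rc == -EAGAIN)
			src->extra_refs = 0;	/* transient reference goes away */
	}
	return rc;
}
```

With a tagged source that has one transient extra reference, the first
attempt tags the destination and returns -EAGAIN, and the second attempt
bumps model_warnings because the destination is already tagged.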

Could it happen that we are calling it with a different 
source/destination combination the second time? I don't think so, but I 
am not 100% sure.

The most reliable fix would be to un-tag the destination in case 
folio_mc_copy() succeeded but __folio_migrate_mapping() failed.

I'm also wondering whether we can simply perform the copy after the 
__folio_migrate_mapping() call: the src folio is locked and unmapped, 
nobody can really modify it. Same for the dst folio.
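
The two options above (un-tag on failure, or copy only after the mapping
step) can be compared in a userspace toy model. Again, this is only a
sketch: all model_* names are hypothetical, clearing a bool stands in
for whatever un-tagging would actually require, and the model assumes
the copy itself cannot fail, which may not hold for folio_mc_copy().

```c
/* Userspace toy model, NOT kernel code. All model_* names are hypothetical. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>

#define MODEL_PAGE_SIZE 64

struct model_folio {
	unsigned char data[MODEL_PAGE_SIZE];
	bool mte_tagged;	/* stands in for the MTE "tagged" page flag */
	int extra_refs;		/* unexpected references that make the freeze fail */
};

int model_warnings;		/* counts simulated WARN hits */

/* Copy data and tags; warn if the destination is already tagged. */
static void model_copy(struct model_folio *dst, const struct model_folio *src)
{
	assert(dst != src);
	memcpy(dst->data, src->data, MODEL_PAGE_SIZE);
	if (src->mte_tagged) {
		if (dst->mte_tagged)
			model_warnings++;
		dst->mte_tagged = true;
	}
}

/* Models the ref-freeze step: fails while extra references are held. */
static int model_freeze(const struct model_folio *src)
{
	return src->extra_refs ? -EAGAIN : 0;
}

/* Option A: keep the copy-first order, undo the tagging on failure. */
int model_migrate_untag_on_failure(struct model_folio *dst,
				   struct model_folio *src)
{
	int rc;

	model_copy(dst, src);
	rc = model_freeze(src);
	if (rc)
		dst->mte_tagged = false;	/* hypothetical un-tag step */
	return rc;
}

/* Option B: do the mapping step first, copy only once it succeeded. */
int model_migrate_copy_last(struct model_folio *dst,
			    struct model_folio *src)
{
	int rc = model_freeze(src);

	if (rc)
		return rc;	/* dst untouched, nothing to undo */
	model_copy(dst, src);
	return 0;
}
```

In this model both variants leave the destination untagged after a
failed attempt, so a retry with the same pair does not warn. Option B
has nothing to undo, but it only works if the copy can safely come after
the mapping step.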

-- 
Cheers

David / dhildenb

