Message-ID: <8da2b3bf-b9bf-44e3-88ff-750dc91c2388@redhat.com>
Date: Sun, 14 Jul 2024 20:27:25 +0200
From: David Hildenbrand <david@...hat.com>
To: David Wang <00107082@....com>, Peter Xu <peterx@...hat.com>
Cc: linux-kernel@...r.kernel.org, linux-mm@...ck.org,
 Andrew Morton <akpm@...ux-foundation.org>,
 Alex Williamson <alex.williamson@...hat.com>,
 Jason Gunthorpe <jgg@...dia.com>, Al Viro <viro@...iv.linux.org.uk>,
 Dave Hansen <dave.hansen@...ux.intel.com>, Andy Lutomirski
 <luto@...nel.org>, Peter Zijlstra <peterz@...radead.org>,
 Thomas Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>,
 Borislav Petkov <bp@...en8.de>, "Kirill A . Shutemov"
 <kirill@...temov.name>, x86@...nel.org, Yan Zhao <yan.y.zhao@...el.com>,
 Kevin Tian <kevin.tian@...el.com>, Pei Li <peili.dev@...il.com>,
 Bert Karwatzki <spasswolf@....de>,
 Sergey Senozhatsky <senozhatsky@...omium.org>
Subject: Re: [PATCH] mm/x86/pat: Only untrack the pfn range if unmap region

On 14.07.24 12:59, David Wang wrote:
> 
> At 2024-07-12 22:42:44, "Peter Xu" <peterx@...hat.com> wrote:
>> NOTE: I massaged the commit message comparing to the rfc post [1], the
>> patch itself is untouched.  Also removed rfc tag, and added more people
>> into the loop. Please kindly help test this patch if you have a reproducer,
>> as I can't reproduce it myself even with the syzbot reproducer on top of
>> mm-unstable.  Instead of further check on the reproducer, I decided to send
>> this out first as we have a bunch of reproducers on the list now..
>> ---
>> mm/memory.c | 5 ++---
>> 1 file changed, 2 insertions(+), 3 deletions(-)
>>
>> diff --git a/mm/memory.c b/mm/memory.c
>> index 4bcd79619574..f57cc304b318 100644
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -1827,9 +1827,6 @@ static void unmap_single_vma(struct mmu_gather *tlb,
>> 	if (vma->vm_file)
>> 		uprobe_munmap(vma, start, end);
>>
>> -	if (unlikely(vma->vm_flags & VM_PFNMAP))
>> -		untrack_pfn(vma, 0, 0, mm_wr_locked);
>> -
>> 	if (start != end) {
>> 		if (unlikely(is_vm_hugetlb_page(vma))) {
>> 			/*
>> @@ -1894,6 +1891,8 @@ void unmap_vmas(struct mmu_gather *tlb, struct ma_state *mas,
>> 		unsigned long start = start_addr;
>> 		unsigned long end = end_addr;
>> 		hugetlb_zap_begin(vma, &start, &end);
>> +		if (unlikely(vma->vm_flags & VM_PFNMAP))
>> +			untrack_pfn(vma, 0, 0, mm_wr_locked);
>> 		unmap_single_vma(tlb, vma, start, end, &details,
>> 				 mm_wr_locked);
>> 		hugetlb_zap_end(vma, &details);
>> -- 
>> 2.45.0
> 
> Hi,
> 
> Today, I notice a kernel warning with this patch.
> 
> 
> [Sun Jul 14 16:51:38 2024] OOM killer enabled.
> [Sun Jul 14 16:51:38 2024] Restarting tasks ... done.
> [Sun Jul 14 16:51:38 2024] random: crng reseeded on system resumption
> [Sun Jul 14 16:51:38 2024] PM: suspend exit
> [Sun Jul 14 16:51:38 2024] ------------[ cut here ]------------
> [Sun Jul 14 16:51:38 2024] WARNING: CPU: 1 PID: 2484 at arch/x86/mm/pat/memtype.c:1002 untrack_pfn+0x10c/0x120

We fail to find what we need in the page tables, indicating that the 
page tables might have been modified / torn down in the meantime.

Likely we have a previous call to unmap_single_vma() that modifies the 
page tables, and unmaps present PFNs.

PAT is incompatible with that: it relies on information from the page
tables to know what it has to undo during munmap(), or what it has to do
during fork().
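
To make that dependency concrete, here is a toy userspace model. It is not
kernel code: every toy_* name is made up for illustration, and it assumes
the lookup starts at the first page of the mapping. It only shows why
untracking fails once the present PFNs have already been zapped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define TOY_NPAGES 8

/* Hypothetical stand-in for a VM_PFNMAP VMA: one page-table slot per page. */
struct toy_vma {
	bool present[TOY_NPAGES];
	unsigned long pfn[TOY_NPAGES];
};

/* Map the whole range to consecutive PFNs, as a full remap would. */
void toy_remap(struct toy_vma *vma, unsigned long first_pfn)
{
	for (size_t i = 0; i < TOY_NPAGES; i++) {
		vma->present[i] = true;
		vma->pfn[i] = first_pfn + i;
	}
}

/* Clear the entries in [start, end), like an earlier zap of present PFNs. */
void toy_zap(struct toy_vma *vma, size_t start, size_t end)
{
	assert(start <= end && end <= TOY_NPAGES);
	for (size_t i = start; i < end; i++)
		vma->present[i] = false;
}

/* Assumption: the lookup only consults the first page of the mapping. */
bool toy_lookup_first(const struct toy_vma *vma, unsigned long *pfn)
{
	if (!vma->present[0])
		return false;
	*pfn = vma->pfn[0];
	return true;
}

/* Untracking needs the lookup to succeed; otherwise all we can do is warn. */
bool toy_untrack(const struct toy_vma *vma)
{
	unsigned long pfn;

	if (!toy_lookup_first(vma, &pfn)) {
		fprintf(stderr, "toy WARN: no PFN found, cannot untrack\n");
		return false;
	}
	return true;
}
```

In this model, toy_untrack() succeeds right after toy_remap(), and fails
once toy_zap() has cleared the first entry. A fork()-time copy that uses
the same lookup would fail in the same way.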

The splat from the previous discussion [1]:

   follow_phys arch/x86/mm/pat/memtype.c:957 [inline]
   get_pat_info+0xf2/0x510 arch/x86/mm/pat/memtype.c:991
   untrack_pfn+0xf7/0x4d0 arch/x86/mm/pat/memtype.c:1104
   unmap_single_vma+0x1bd/0x2b0 mm/memory.c:1819
   zap_page_range_single+0x326/0x560 mm/memory.c:1920
   unmap_mapping_range_vma mm/memory.c:3684 [inline]
   unmap_mapping_range_tree mm/memory.c:3701 [inline]
   unmap_mapping_pages mm/memory.c:3767 [inline]
   unmap_mapping_range+0x1ee/0x280 mm/memory.c:3804
   truncate_pagecache+0x53/0x90 mm/truncate.c:731
   simple_setattr+0xf2/0x120 fs/libfs.c:886
   notify_change+0xec6/0x11f0 fs/attr.c:499
   do_truncate+0x15c/0x220 fs/open.c:65
   handle_truncate fs/namei.c:3308 [inline]

indicates that file truncation ends up messing with a PFNMAP mapping
that has PAT set. That is ... weird. I would have thought that file
truncation would never really touch a PFNMAP mapping.

Does this only happen with an out-of-tree (OOT) driver that does weird
truncate stuff on files that have a PFNMAP mapping?

[1] https://lore.kernel.org/all/3879ee72-84de-4d2a-93a8-c0b3dc3f0a4c@redhat.com/

> [Sun Jul 14 16:51:38 2024] Modules linked in: snd_seq_dummy(E) snd_hrtimer(E) snd_seq(E) ctr(E) ccm(E) nf_conntrack_netlink(E) xfrm_user(E) xfrm_algo(E) xt_addrtype(E) br_netfilter(E) xt_CHECKSUM(E) xt_MASQUERADE(E) xt_conntrack(E) ipt_REJECT(E) nf_reject_ipv4(E) xt_tcpudp(E) nft_compat(E) nft_chain_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) nf_tables(E) nfnetlink(E) bridge(E) stp(E) llc(E) overlay(E) binfmt_misc(E) nls_ascii(E) nls_cp437(E) vfat(E) fat(E) amd_atl(E) intel_rapl_msr(E) intel_rapl_common(E) nvidia_drm(POE) nvidia_modeset(POE) edac_mce_amd(E) kvm_amd(E) snd_hda_codec_realtek(E) kvm(E) iwlmvm(E) snd_hda_codec_generic(E) crct10dif_pclmul(E) snd_hda_scodec_component(E) snd_hda_codec_hdmi(E) ghash_clmulni_intel(E) sha512_ssse3(E) mac80211(E) sha512_generic(E) snd_hda_intel(E) nvidia(POE) sha256_ssse3(E) snd_intel_dspcfg(E) ppdev(E) sha1_ssse3(E) libarc4(E) snd_hda_codec(E) snd_usb_audio(E) snd_usbmidi_lib(E) uvcvideo(E) snd_hda_core(E) iwlwifi(E) aesni_intel(E) snd_rawmidi(E) snd_pcsp(E)
> [Sun Jul 14 16:51:38 2024]  snd_hwdep(E) snd_seq_device(E) crypto_simd(E) videobuf2_vmalloc(E) snd_pcm(E) cryptd(E) uvc(E) videobuf2_memops(E) videobuf2_v4l2(E) snd_timer(E) rapl(E) cfg80211(E) k10temp(E) wmi_bmof(E) sp5100_tco(E) acpi_cpufreq(E) ccp(E) snd(E) videodev(E) drm_kms_helper(E) videobuf2_common(E) rfkill(E) video(E) rng_core(E) mc(E) soundcore(E) joydev(E) parport_pc(E) parport(E) sg(E) evdev(E) msr(E) loop(E) fuse(E) drm(E) efi_pstore(E) dm_mod(E) configfs(E) ip_tables(E) x_tables(E) autofs4(E) ext4(E) crc16(E) mbcache(E) jbd2(E) btrfs(E) blake2b_generic(E) efivarfs(E) raid10(E) raid456(E) async_raid6_recov(E) async_memcpy(E) async_pq(E) async_xor(E) async_tx(E) raid1(E) raid0(E) md_mod(E) hid_generic(E) usbhid(E) hid(E) sd_mod(E) ahci(E) libahci(E) xhci_pci(E) nvme(E) libata(E) crc32_pclmul(E) nvme_core(E) xhci_hcd(E) t10_pi(E) crc32c_intel(E) i2c_piix4(E) r8169(E) crc64_rocksoft(E) realtek(E) scsi_mod(E) usbcore(E) scsi_common(E) usb_common(E) wmi(E) gpio_amdpt(E) gpio_generic(E) button(E)
> [Sun Jul 14 16:51:38 2024] CPU: 1 PID: 2484 Comm: gnome-shell Tainted: P           OE      6.10.0-rc7-linan-1 #283
> [Sun Jul 14 16:51:38 2024] Hardware name: Micro-Star International Co., Ltd. MS-7B89/B450M MORTAR MAX (MS-7B89), BIOS 2.80 06/10/2020
> [Sun Jul 14 16:51:38 2024] RIP: 0010:untrack_pfn+0x10c/0x120
> [Sun Jul 14 16:51:38 2024] Code: e2 01 74 22 8b 98 e0 00 00 00 3b 5d 2c 74 ac 48 8b 7d 30 e8 66 e1 bc 00 89 5d 2c 48 8b 7d 30 e8 0a 6c 09 00 eb 95 0f 0b eb da <0f> 0b eb 95 e8 db b6 bb 00 66 66 2e 0f 1f 84 00 00 00 00 00 90 90
> [Sun Jul 14 16:51:38 2024] RSP: 0018:ffffae5b4ab1fbe8 EFLAGS: 00010202
> [Sun Jul 14 16:51:38 2024] RAX: 0000000000000028 RBX: 0000000000000000 RCX: 0000000000000000
> [Sun Jul 14 16:51:38 2024] RDX: 0000000000000001 RSI: 000fffffffe00000 RDI: ffff91d5be99ea80
> [Sun Jul 14 16:51:38 2024] RBP: ffff91d5c44fbe70 R08: 00007f2e5ff32000 R09: 0000000000000001
> [Sun Jul 14 16:51:38 2024] R10: ffff91d5b7ad6d1c R11: 00007f2e5ff35fff R12: 00007f2e5ff32000
> [Sun Jul 14 16:51:38 2024] R13: 0000000000000000 R14: ffffae5b4ab1fde8 R15: ffff91d5c44fbe70
> [Sun Jul 14 16:51:38 2024] FS:  00007f2e5ff59dc0(0000) GS:ffff91d84ec80000(0000) knlGS:0000000000000000
> [Sun Jul 14 16:51:38 2024] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [Sun Jul 14 16:51:38 2024] CR2: 00007fe71316b08c CR3: 000000018468e000 CR4: 0000000000350ef0
> [Sun Jul 14 16:51:38 2024] Call Trace:
> [Sun Jul 14 16:51:38 2024]  <TASK>
> [Sun Jul 14 16:51:38 2024]  ? __warn+0x7c/0x120
> [Sun Jul 14 16:51:38 2024]  ? untrack_pfn+0x10c/0x120
> [Sun Jul 14 16:51:38 2024]  ? report_bug+0x18d/0x1c0
> [Sun Jul 14 16:51:38 2024]  ? handle_bug+0x3c/0x80
> [Sun Jul 14 16:51:38 2024]  ? exc_invalid_op+0x13/0x60
> [Sun Jul 14 16:51:38 2024]  ? asm_exc_invalid_op+0x16/0x20
> [Sun Jul 14 16:51:38 2024]  ? untrack_pfn+0x10c/0x120
> [Sun Jul 14 16:51:38 2024]  ? untrack_pfn+0x53/0x120
> [Sun Jul 14 16:51:38 2024]  unmap_vmas+0x115/0x1a0
> [Sun Jul 14 16:51:38 2024]  unmap_region+0xd4/0x150
> [Sun Jul 14 16:51:38 2024]  ? mas_nomem+0x14/0x80
> [Sun Jul 14 16:51:38 2024]  ? srso_return_thunk+0x5/0x5f
> [Sun Jul 14 16:51:38 2024]  ? mas_store_gfp+0x54/0x110
> [Sun Jul 14 16:51:38 2024]  do_vmi_align_munmap+0x2d4/0x530
> [Sun Jul 14 16:51:38 2024]  do_vmi_munmap+0xda/0x190
> [Sun Jul 14 16:51:38 2024]  __vm_munmap+0xa0/0x160
> [Sun Jul 14 16:51:38 2024]  __x64_sys_munmap+0x17/0x20
> [Sun Jul 14 16:51:38 2024]  do_syscall_64+0x4b/0x110
> [Sun Jul 14 16:51:38 2024]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [Sun Jul 14 16:51:38 2024] RIP: 0033:0x7f2e647208f7
> [Sun Jul 14 16:51:38 2024] Code: 00 00 00 48 8b 15 09 05 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 b8 0b 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d d9 04 0d 00 f7 d8 64 89 01 48
> [Sun Jul 14 16:51:38 2024] RSP: 002b:00007ffd289f0a48 EFLAGS: 00000246 ORIG_RAX: 000000000000000b
> [Sun Jul 14 16:51:38 2024] RAX: ffffffffffffffda RBX: 00007f2e5ff31000 RCX: 00007f2e647208f7
> [Sun Jul 14 16:51:38 2024] RDX: 0000000000000000 RSI: 0000000000001000 RDI: 00007f2e5ff31000
> [Sun Jul 14 16:51:38 2024] RBP: 0000557d5a9330a0 R08: 00000000c1d00028 R09: 00000000beef0100
> [Sun Jul 14 16:51:38 2024] R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000
> [Sun Jul 14 16:51:38 2024] R13: 0000000000000001 R14: 0000000000000002 R15: 0000557d5a8408c0
> [Sun Jul 14 16:51:38 2024]  </TASK>
> [Sun Jul 14 16:51:38 2024] ---[ end trace 0000000000000000 ]---
> [Sun Jul 14 16:51:39 2024] ------------[ cut here ]------------
> [Sun Jul 14 16:51:39 2024] WARNING: CPU: 1 PID: 2272 at arch/x86/mm/pat/memtype.c:1002 track_pfn_copy+0x94/0xa0
> [Sun Jul 14 16:51:39 2024] Modules linked in: snd_seq_dummy(E) snd_hrtimer(E) snd_seq(E) ctr(E) ccm(E) nf_conntrack_netlink(E) xfrm_user(E) xfrm_algo(E) xt_addrtype(E) br_netfilter(E) xt_CHECKSUM(E) xt_MASQUERADE(E) xt_conntrack(E) ipt_REJECT(E) nf_reject_ipv4(E) xt_tcpudp(E) nft_compat(E) nft_chain_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) nf_tables(E) nfnetlink(E) bridge(E) stp(E) llc(E) overlay(E) binfmt_misc(E) nls_ascii(E) nls_cp437(E) vfat(E) fat(E) amd_atl(E) intel_rapl_msr(E) intel_rapl_common(E) nvidia_drm(POE) nvidia_modeset(POE) edac_mce_amd(E) kvm_amd(E) snd_hda_codec_realtek(E) kvm(E) iwlmvm(E) snd_hda_codec_generic(E) crct10dif_pclmul(E) snd_hda_scodec_component(E) snd_hda_codec_hdmi(E) ghash_clmulni_intel(E) sha512_ssse3(E) mac80211(E) sha512_generic(E) snd_hda_intel(E) nvidia(POE) sha256_ssse3(E) snd_intel_dspcfg(E) ppdev(E) sha1_ssse3(E) libarc4(E) snd_hda_codec(E) snd_usb_audio(E) snd_usbmidi_lib(E) uvcvideo(E) snd_hda_core(E) iwlwifi(E) aesni_intel(E) snd_rawmidi(E) snd_pcsp(E)
> [Sun Jul 14 16:51:39 2024]  snd_hwdep(E) snd_seq_device(E) crypto_simd(E) videobuf2_vmalloc(E) snd_pcm(E) cryptd(E) uvc(E) videobuf2_memops(E) videobuf2_v4l2(E) snd_timer(E) rapl(E) cfg80211(E) k10temp(E) wmi_bmof(E) sp5100_tco(E) acpi_cpufreq(E) ccp(E) snd(E) videodev(E) drm_kms_helper(E) videobuf2_common(E) rfkill(E) video(E) rng_core(E) mc(E) soundcore(E) joydev(E) parport_pc(E) parport(E) sg(E) evdev(E) msr(E) loop(E) fuse(E) drm(E) efi_pstore(E) dm_mod(E) configfs(E) ip_tables(E) x_tables(E) autofs4(E) ext4(E) crc16(E) mbcache(E) jbd2(E) btrfs(E) blake2b_generic(E) efivarfs(E) raid10(E) raid456(E) async_raid6_recov(E) async_memcpy(E) async_pq(E) async_xor(E) async_tx(E) raid1(E) raid0(E) md_mod(E) hid_generic(E) usbhid(E) hid(E) sd_mod(E) ahci(E) libahci(E) xhci_pci(E) nvme(E) libata(E) crc32_pclmul(E) nvme_core(E) xhci_hcd(E) t10_pi(E) crc32c_intel(E) i2c_piix4(E) r8169(E) crc64_rocksoft(E) realtek(E) scsi_mod(E) usbcore(E) scsi_common(E) usb_common(E) wmi(E) gpio_amdpt(E) gpio_generic(E) button(E)
> [Sun Jul 14 16:51:39 2024] CPU: 1 PID: 2272 Comm: Xorg Tainted: P        W  OE      6.10.0-rc7-linan-1 #283
> [Sun Jul 14 16:51:39 2024] Hardware name: Micro-Star International Co., Ltd. MS-7B89/B450M MORTAR MAX (MS-7B89), BIOS 2.80 06/10/2020
> [Sun Jul 14 16:51:39 2024] RIP: 0010:track_pfn_copy+0x94/0xa0
> [Sun Jul 14 16:51:39 2024] Code: ff ff ff eb b4 48 89 ee 48 8b 44 24 10 48 8b 3c 24 b9 01 00 00 00 4c 29 e6 48 8d 54 24 08 48 89 44 24 08 e8 fe fc ff ff eb 8f <0f> 0b eb d0 e8 73 b9 bb 00 0f 1f 00 90 90 90 90 90 90 90 90 90 90
> [Sun Jul 14 16:51:39 2024] RSP: 0018:ffffae5b4a04fb68 EFLAGS: 00010202
> [Sun Jul 14 16:51:39 2024] RAX: 0000000000000028 RBX: ffff91d546ae1d10 RCX: 0000000000000000
> [Sun Jul 14 16:51:39 2024] RDX: 0000000000000001 RSI: 000fffffffe00000 RDI: ffff91d5b969c700
> [Sun Jul 14 16:51:39 2024] RBP: 00007fe71316e000 R08: ffff91d639b0b9a0 R09: 00007fe71316e000
> [Sun Jul 14 16:51:39 2024] R10: 00007fe71316dfff R11: 00007fe71316efff R12: 00007fe71316d000
> [Sun Jul 14 16:51:39 2024] R13: ffff91d543702f40 R14: ffff91d639b0b9a0 R15: 00007fe71316e000
> [Sun Jul 14 16:51:39 2024] FS:  00007fe7124f8ac0(0000) GS:ffff91d84ec80000(0000) knlGS:0000000000000000
> [Sun Jul 14 16:51:39 2024] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [Sun Jul 14 16:51:39 2024] CR2: 000055d6ab0453c0 CR3: 0000000179626000 CR4: 0000000000350ef0
> [Sun Jul 14 16:51:39 2024] Call Trace:
> [Sun Jul 14 16:51:39 2024]  <TASK>
> [Sun Jul 14 16:51:39 2024]  ? __warn+0x7c/0x120
> [Sun Jul 14 16:51:39 2024]  ? track_pfn_copy+0x94/0xa0

Same thing (follow-up error): during fork() we don't know what to do,
because the page tables were already modified and we can no longer find
the information we need to handle that PFNMAP mapping.

-- 
Cheers,

David / dhildenb
