Message-ID: <60e29e62-4864-4393-b899-01489ee73b91@redhat.com>
Date: Thu, 26 Sep 2024 12:48:19 +0200
From: David Hildenbrand <david@...hat.com>
To: Peter Xu <peterx@...hat.com>
Cc: syzbot <syzbot+bf2c35fa302ebe3c7471@...kaller.appspotmail.com>,
 akpm@...ux-foundation.org, bp@...en8.de, dave.hansen@...ux.intel.com,
 hpa@...or.com, jgg@...pe.ca, leitao@...ian.org,
 linux-kernel@...r.kernel.org, linux-mm@...ck.org, mingo@...hat.com,
 rppt@...nel.org, syzkaller-bugs@...glegroups.com, tglx@...utronix.de,
 x86@...nel.org
Subject: Re: [syzbot] [mm?] WARNING in copy_huge_pmd

On 25.09.24 18:59, Peter Xu wrote:
> On Tue, Sep 24, 2024 at 04:45:00PM +0200, David Hildenbrand wrote:
>> On 23.09.24 14:18, syzbot wrote:
>>> Hello,
>>>
>>> syzbot found the following issue on:
>>>
>>> HEAD commit:    88264981f208 Merge tag 'sched_ext-for-6.12' of git://git.k..
>>> git tree:       upstream
>>> console+strace: https://syzkaller.appspot.com/x/log.txt?x=16c36c27980000
>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=e851828834875d6f
>>> dashboard link: https://syzkaller.appspot.com/bug?extid=bf2c35fa302ebe3c7471
>>> compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
>>> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=12773080580000
>>> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=16ed5e9f980000
>>>
>>> Downloadable assets:
>>> disk image: https://storage.googleapis.com/syzbot-assets/0e011ac37c93/disk-88264981.raw.xz
>>> vmlinux: https://storage.googleapis.com/syzbot-assets/f5c65577e19e/vmlinux-88264981.xz
>>> kernel image: https://storage.googleapis.com/syzbot-assets/984d963c8ea1/bzImage-88264981.xz
>>>
>>> The issue was bisected to:
>>>
>>> commit 75182022a0439788415b2dd1db3086e07aa506f7
>>> Author: Peter Xu <peterx@...hat.com>
>>> Date:   Mon Aug 26 20:43:51 2024 +0000
>>>
>>>       mm/x86: support large pfn mappings
>>>
>>> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=17df9c27980000
>>> final oops:     https://syzkaller.appspot.com/x/report.txt?x=143f9c27980000
>>> console output: https://syzkaller.appspot.com/x/log.txt?x=103f9c27980000
>>>
>>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>>> Reported-by: syzbot+bf2c35fa302ebe3c7471@...kaller.appspotmail.com
>>> Fixes: 75182022a043 ("mm/x86: support large pfn mappings")
>>>
>>> ------------[ cut here ]------------
>>> WARNING: CPU: 1 PID: 5508 at mm/huge_memory.c:1602 copy_huge_pmd+0x102c/0x1c60 mm/huge_memory.c:1602
>>
>> This is the
>>
>> VM_WARN_ON_ONCE(is_cow_mapping(src_vma->vm_flags) && pmd_write(pmd))
>>
>> So we have a special-marked PMD in a COW mapping.
>>
>> The reproducer seems to involve fuse, but not sure if that makes a
>> difference here.
> 
> That chunk of code seems to be there only to make sure the test won't get
> blocked by a stuck FUSE-based fs, by writing to the "abort" file:
> 
>        snprintf(abort, sizeof(abort), "/sys/fs/fuse/connections/%s/abort",
>                 ent->d_name);
>        int fd = open(abort, O_WRONLY);
>        if (fd == -1) {
>          continue;
>        }
>        if (write(fd, abort, 1) < 0) {
>        }
>        close(fd);
> 
> So far looks not relevant to this issue indeed.
> 
> Unfortunately I cannot reproduce it even with the reproducer.  So this one
> is a bit tricky..
> 
> What confuses me yet is how that special bit is set, if it's only used so
> far with vfio-pci, and this test doesn't seem to have it involved.
> 
> The test keeps spawning processes, then threads, doing concurrent accesses
> with a few operations (madvise, mremap, migrate_pages, munmap, etc.) on the
> pre-mapped areas, but none of them seem to create new memory that could
> hint at how the special bit can start to occur.
> 
> I wonder if some of these operations can race in a way that mm can wrongly
> create the special bit (along with it being writable).. and then it could
> be a historical bug, only caught by this patchset due to the newly added
> WARN_ON_ONCE somehow, then it could mean that it's not the WRITE bit that
> is unintended, but the SPECIAL bit altogether.

I assume you are missing a check for present/non-swap PMDs. Suppose you
have a migration entry that happens to use the special bit -- which is
perfectly fine for a non-present entry -- your code would treat it as a
present PMD with the special bit set.

Maybe for the time being something like:

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 0580ac9e47b9..e55efcad1e6c 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1586,7 +1586,7 @@ int copy_huge_pmd(struct mm_struct *dst_mm, struct mm_struct *src_mm,
         int ret = -ENOMEM;

         pmd = pmdp_get_lockless(src_pmd);
-       if (unlikely(pmd_special(pmd))) {
+       if (unlikely(pmd_present(pmd) && pmd_special(pmd))) {
                 dst_ptl = pmd_lock(dst_mm, dst_pmd);
                 src_ptl = pmd_lockptr(src_mm, src_pmd);
                 spin_lock_nested(src_ptl, SINGLE_DEPTH_NESTING);


-- 
Cheers,

David / dhildenb

