Message-ID: <20200410181452.GC3172@xz-x1>
Date:   Fri, 10 Apr 2020 14:14:52 -0400
From:   Peter Xu <peterx@...hat.com>
To:     Matthew Wilcox <willy@...radead.org>
Cc:     Hillf Danton <hdanton@...a.com>, kernel test robot <lkp@...el.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Linux Memory Management List <linux-mm@...ck.org>,
        linux-kernel@...r.kernel.org
Subject: Re: f45ec5ff16 ("userfaultfd: wp: support swap and page migration"):
 [  140.777858] BUG: Bad rss-counter state mm:b278fc66 type:MM_ANONPAGES
 val:1

On Fri, Apr 10, 2020 at 08:38:05AM -0700, Matthew Wilcox wrote:
> On Fri, Apr 10, 2020 at 11:32:34AM -0400, Peter Xu wrote:
> > I'm still trying to digest on what's happened... It would be good too
> > if more information on the test could be given, e.g., what is the
> > behavior of trinity-c2. A reproducer is of course even better.
> 
> Trinity is a syscall fuzzer.  Don't expect what it's doing to make any
> sense, it's just executing syscalls at random.

OK thanks.

Though I just noticed that the original report actually came with some
attachments, which I totally missed initially.  There's the config file,
showing that we have:

CONFIG_MIGRATION=y
CONFIG_MEMORY_FAILURE=y
CONFIG_DEVICE_PRIVATE=n
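
For anyone who wants to double check these against the attached config,
here is a minimal Python sketch.  It assumes the usual .config layout,
where a disabled option shows up as "# CONFIG_FOO is not set" rather
than "CONFIG_FOO=n"; the helper name parse_kconfig() is made up for
illustration.

```python
import re

# "CONFIG_FOO=y" style lines (value may also be m, a number or a string).
_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
# Disabled options are normally written as a comment in .config files.
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_kconfig(text):
    """Return {option: value} for a .config body; disabled options map to 'n'."""
    opts = {}
    for raw in text.splitlines():
        line = raw.strip()
        m = _SET.match(line)
        if m:
            opts[m.group(1)] = m.group(2)
            continue
        m = _UNSET.match(line)
        if m:
            opts[m.group(1)] = "n"
    return opts


if __name__ == "__main__":
    # Illustrative input only, mirroring the three options quoted above.
    sample = (
        "CONFIG_MIGRATION=y\n"
        "CONFIG_MEMORY_FAILURE=y\n"
        "# CONFIG_DEVICE_PRIVATE is not set\n"
    )
    for name, value in sorted(parse_kconfig(sample).items()):
        print(name, value)
```

Options that do not appear in the file at all are simply absent from the
returned dict, so a lookup with .get() can tell "disabled" apart from
"not mentioned".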

There's even a reproducer.  However, the reproducer script failed at
wget until I fixed it by using:

initrd=openwrt-trinity-i386.cgz

to replace:

initrd=openwrt-i386-trinity.cgz

Then I could download the initrd and boot the VM with a decent QEMU.
However, I didn't see any test running after the VM booted, and it
rebooted/shut down after 100 sec without triggering any error (I believe
rc.local tries to run something under /etc/kernel-tests/, but I'm not
sure it's running the right thing).

If there's any way to reproduce this (I believe there is, since the
original report even includes a bisection; I just don't know how...),
I'm thinking maybe we can dump every swp entry change made in
change_pte_range(), which is the only place in the commit that I think
could be related to this, to see whether there's anything suspicious.

8<----------------------------------------------
diff --git a/mm/mprotect.c b/mm/mprotect.c
index 1d823b050329..1b6daf7d03aa 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -173,6 +173,8 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
                                newpte = pte_swp_clear_uffd_wp(newpte);
 
                        if (!pte_same(oldpte, newpte)) {
+                               pr_info("%s: Update swp entry, 0x%lx -> 0x%lx\n",
+                                       __func__, pte_val(oldpte), pte_val(newpte));
                                set_pte_at(vma->vm_mm, addr, pte, newpte);
                                pages++;
                        }
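
If the debug patch above does produce output, the log could get noisy.
As a rough sketch (not something I have run against a real trace), the
Python below pulls the "Update swp entry" lines out of a dmesg dump and
XORs the old and new values, so that it is easy to see which bits
changed.  The regex only assumes the pr_info() format string from the
diff; the function names and the sample lines are made up for
illustration.

```python
import re
from collections import Counter

# Matches the pr_info() format from the debug patch:
#   "%s: Update swp entry, 0x%lx -> 0x%lx\n"
SWP_UPDATE = re.compile(
    r"(?P<func>\w+): Update swp entry, "
    r"0x(?P<old>[0-9a-fA-F]+) -> 0x(?P<new>[0-9a-fA-F]+)"
)


def parse_swp_updates(lines):
    """Yield (old, new, changed_bits) for every matching log line."""
    for line in lines:
        m = SWP_UPDATE.search(line)
        if m is None:
            continue  # unrelated kernel message
        old = int(m.group("old"), 16)
        new = int(m.group("new"), 16)
        yield old, new, old ^ new


def summarize(lines):
    """Count how often each changed-bits mask shows up."""
    return Counter(mask for _, _, mask in parse_swp_updates(lines))


if __name__ == "__main__":
    # Hypothetical log lines; the values are invented, not from a real run.
    sample = [
        "[  140.000001] change_pte_range: Update swp entry, 0x1234 -> 0x1236",
        "[  140.000002] some unrelated message",
        "[  140.000003] change_pte_range: Update swp entry, 0x5670 -> 0x5672",
    ]
    for mask, count in summarize(sample).items():
        print(f"changed bits 0x{mask:x}: {count} update(s)")
```

A mask that shows up only once or twice among many identical ones would
be the first thing to look at.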

-- 
Peter Xu
