Message-ID: <YzMjxY5O6Hf/IPTx@monkey>
Date: Tue, 27 Sep 2022 09:24:37 -0700
From: Mike Kravetz <mike.kravetz@...cle.com>
To: Peter Xu <peterx@...hat.com>
Cc: Hugh Dickins <hughd@...gle.com>,
Axel Rasmussen <axelrasmussen@...gle.com>,
Yang Shi <shy828301@...il.com>,
Matthew Wilcox <willy@...radead.org>,
syzbot <syzbot+152d76c44ba142f8992b@...kaller.appspotmail.com>,
akpm@...ux-foundation.org, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, llvm@...ts.linux.dev, nathan@...nel.org,
ndesaulniers@...gle.com, songmuchun@...edance.com,
syzkaller-bugs@...glegroups.com, trix@...hat.com
Subject: Re: [syzbot] general protection fault in PageHeadHuge
On 09/25/22 20:11, Peter Xu wrote:
> On Sat, Sep 24, 2022 at 12:01:16PM -0700, Mike Kravetz wrote:
> > On 09/24/22 11:06, Peter Xu wrote:
> > >
> > > Sorry I forgot to reply on this one.
> > >
> > > I didn't try linux-next, but I can easily reproduce this with mm-unstable
> > > already, and I verified that Hugh's patch fixes the problem for shmem.
> > >
> > > When I was testing I found hugetlb selftest is broken too but with some
> > > other errors:
> > >
> > > $ sudo ./userfaultfd hugetlb 100 10
> > > ...
> > > bounces: 6, mode: racing ver read, ERROR: unexpected write fault (errno=0, line=779)
> > >
> > > The failing check was making sure all MISSING events are not triggered by
> > > writes, but frankly I don't really know why it's required, and that check
> > > existed since the 1st commit when test was introduced.
> > >
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c47174fc362a089b1125174258e53ef4a69ce6b8
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/testing/selftests/vm/userfaultfd.c?id=c47174fc362a089b1125174258e53ef4a69ce6b8#n291
> > >
> > > And obviously some recent hugetlb-related change caused that to happen.
> > >
> > > Dropping that check can definitely work, but I'll have a closer look soon
> > > too to make sure I didn't miss something. Mike, please also let me know if
> > > you are aware of this problem.
> > >
> >
> > Peter, I am not aware of this problem. I really should make running ALL
> > hugetlb tests part of my regular routine.
> >
> > If you do not beat me to it, I will take a look in the next few days.
>
> Just to update - my bisection points to 00cdec99f3eb ("hugetlbfs: revert
> use i_mmap_rwsem to address page fault/truncate race", 2022-09-21).
>
> I don't understand how they are related so far, though. It should be a
> timing thing because the failure cannot be reproduced on a VM but only on
> the host, and it can also pass sometimes even on the host but rarely.
Thanks Peter!
After your analysis, I also started looking at this.
- I did reproduce a few times in a VM
- On BM (a laptop) I could reproduce but it would take several (10's of) runs
> Logically all the uffd messages in the stress test should be generated by
> the locking thread, upon:
>
> pthread_mutex_lock(area_mutex(area_dst, page_nr));
I personally find that test program hard to understand/follow: it takes me
a day or so to work out what it is doing, and then I immediately lose
context when I stop looking at it. :(
So, as you mention below, the program depends on pthread_mutex_lock()
doing a read fault before a write.
> I thought a common scheme for lock() fast path should already be an
> userspace cmpxchg() and that should be a write fault already.
>
> For example, I did some stupid hack on the test and I can trigger the write
> check fault with anonymous easily with an explicit cmpxchg on byte offset 128:
>
> diff --git a/tools/testing/selftests/vm/userfaultfd.c b/tools/testing/selftests/vm/userfaultfd.c
> index 74babdbc02e5..a7d6938d4553 100644
> --- a/tools/testing/selftests/vm/userfaultfd.c
> +++ b/tools/testing/selftests/vm/userfaultfd.c
> @@ -637,6 +637,10 @@ static void *locking_thread(void *arg)
> } else
> page_nr += 1;
> page_nr %= nr_pages;
> + char *ptr = area_dst + (page_nr * page_size) + 128;
> + char _old = 0, new = 1;
> + (void)__atomic_compare_exchange_n(ptr, &_old, new, false,
> + __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
> pthread_mutex_lock(area_mutex(area_dst, page_nr));
> count = *area_count(area_dst, page_nr);
> if (count != count_verify[page_nr])
>
> I'll need some more time thinking about it before I send a patch to drop
> the write check..
I did another stupid hack, and duplicated the statement:
count = *area_count(area_dst, page_nr);
before the,
pthread_mutex_lock(area_mutex(area_dst, page_nr));
This should guarantee a read fault independent of what pthread_mutex_lock
does. However, it still results in the occasional "ERROR: unexpected write
fault". So, something else if happening. I will continue to experiment
and think about this.
--
Mike Kravetz