Message-ID: <alpine.LFD.2.20.1602241307130.1533@schleppi>
Date: Wed, 24 Feb 2016 13:11:58 +0100 (CET)
From: Sebastian Ott <sebott@...ux.vnet.ibm.com>
To: Martin Schwidefsky <schwidefsky@...ibm.com>
cc: "Kirill A. Shutemov" <kirill@...temov.name>,
Gerald Schaefer <gerald.schaefer@...ibm.com>,
Christian Borntraeger <borntraeger@...ibm.com>,
"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
linux-mm@...ck.org, linux-kernel@...r.kernel.org,
"Aneesh Kumar K.V" <aneesh.kumar@...ux.vnet.ibm.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Michael Ellerman <mpe@...erman.id.au>,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
Paul Mackerras <paulus@...ba.org>,
linuxppc-dev@...ts.ozlabs.org,
Catalin Marinas <catalin.marinas@....com>,
Will Deacon <will.deacon@....com>,
linux-arm-kernel@...ts.infradead.org,
Heiko Carstens <heiko.carstens@...ibm.com>,
linux-s390@...r.kernel.org
Subject: Re: [BUG] random kernel crashes after THP rework on s390 (maybe also
on PowerPC and ARM)
On Wed, 24 Feb 2016, Martin Schwidefsky wrote:
> On Tue, 23 Feb 2016 22:33:45 +0300
> "Kirill A. Shutemov" <kirill@...temov.name> wrote:
>
> > On Tue, Feb 23, 2016 at 07:19:07PM +0100, Gerald Schaefer wrote:
> > > I'll check with Martin, maybe it is actually trivial, then we can
> > > do a quick test to rule that one out.
> >
> > Oh. I found a bug in __split_huge_pmd_locked(). Although, not sure if it's
> > _the_ bug.
> >
> > pmdp_invalidate() is called for the wrong address :-/
> > I guess that can be destructive on the architecture, right?
> >
> > Could you check this?
> >
> > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> > index 1c317b85ea7d..4246bc70e55a 100644
> > --- a/mm/huge_memory.c
> > +++ b/mm/huge_memory.c
> > @@ -2865,7 +2865,7 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
> > pgtable = pgtable_trans_huge_withdraw(mm, pmd);
> > pmd_populate(mm, &_pmd, pgtable);
> >
> > - for (i = 0; i < HPAGE_PMD_NR; i++, haddr += PAGE_SIZE) {
> > + for (i = 0; i < HPAGE_PMD_NR; i++) {
> > pte_t entry, *pte;
> > /*
> > * Note that NUMA hinting access restrictions are not
> > @@ -2886,9 +2886,9 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
> > }
> > if (dirty)
> > SetPageDirty(page + i);
> > - pte = pte_offset_map(&_pmd, haddr);
> > + pte = pte_offset_map(&_pmd, haddr + i * PAGE_SIZE);
> > BUG_ON(!pte_none(*pte));
> > - set_pte_at(mm, haddr, pte, entry);
> > + set_pte_at(mm, haddr + i * PAGE_SIZE, pte, entry);
> > atomic_inc(&page[i]._mapcount);
> > pte_unmap(pte);
> > }
> > @@ -2938,7 +2938,7 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
> > pmd_populate(mm, pmd, pgtable);
> >
> > if (freeze) {
> > - for (i = 0; i < HPAGE_PMD_NR; i++, haddr += PAGE_SIZE) {
> > + for (i = 0; i < HPAGE_PMD_NR; i++) {
> > page_remove_rmap(page + i, false);
> > put_page(page + i);
> > }
>
> Test is running and it looks good so far. For the final assessment I defer
> to Gerald and Sebastian.
>
Yes, that one worked. My test system has been running make -j10 && make clean
in a loop for 4 hours now. Thanks!
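
For anyone reading this in the archive, here is a minimal user-space sketch of
why the old loop was a problem. It is not the kernel code: split_buggy() and
split_fixed() are hypothetical stand-ins, and the PAGE_SIZE and HPAGE_PMD_NR
values are assumptions chosen only for illustration (they differ between
architectures). The point is that advancing haddr in the loop header leaves it
pointing past the huge page, so any later call that takes haddr, such as
pmdp_invalidate(), would get the wrong address.

```c
/* Illustrative values only; the real ones are architecture-specific. */
#define PAGE_SIZE     4096UL
#define HPAGE_PMD_NR  512UL

/*
 * Hypothetical stand-in for the old loop: haddr is advanced as a side
 * effect of the loop header, so it no longer names the start of the
 * huge page once the loop is done.
 */
static unsigned long split_buggy(unsigned long haddr)
{
	unsigned long i;

	for (i = 0; i < HPAGE_PMD_NR; i++, haddr += PAGE_SIZE) {
		/* per-page work would use haddr here */
	}
	return haddr;	/* what a later invalidate-style call would see */
}

/*
 * Hypothetical stand-in for the patched loop: the per-page address is
 * computed from i, and haddr itself is never modified.
 */
static unsigned long split_fixed(unsigned long haddr)
{
	unsigned long i;

	for (i = 0; i < HPAGE_PMD_NR; i++) {
		unsigned long addr = haddr + i * PAGE_SIZE;

		(void)addr;	/* per-page work would use addr here */
	}
	return haddr;	/* still the start of the huge page */
}
```

With these assumed values, split_buggy() returns an address exactly one huge
page beyond the one passed in, while split_fixed() returns its argument
unchanged.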
Sebastian