Message-ID: <CA+55aFzZ7PND2Xvz9wB1jaCmp0rBMTSmJtKiFwSeOWy9iLSd8Q@mail.gmail.com>
Date:   Fri, 29 Jun 2018 14:01:46 -0700
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     Larry Finger <Larry.Finger@...inger.net>
Cc:     Matthew Wilcox <willy@...radead.org>,
        "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
        Vlastimil Babka <vbabka@...e.cz>,
        Christoph Lameter <cl@...ux.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Jerome Glisse <jglisse@...hat.com>,
        Lai Jiangshan <jiangshanlai@...il.com>,
        Martin Schwidefsky <schwidefsky@...ibm.com>,
        Pekka Enberg <penberg@...nel.org>,
        Randy Dunlap <rdunlap@...radead.org>,
        Andrey Ryabinin <aryabinin@...tuozzo.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Benjamin Herrenschmidt <benh@...nel.crashing.org>,
        Paul Mackerras <paulus@...ba.org>,
        Michael Ellerman <mpe@...erman.id.au>,
        ppc-dev <linuxppc-dev@...ts.ozlabs.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [Update] Regression in 4.18 - 32-bit PowerPC crashes on boot -
 bisected to commit 1d40a5ea01d5

On Fri, Jun 29, 2018 at 1:42 PM Larry Finger <Larry.Finger@...inger.net> wrote:
>
> I have more information regarding this BUG. Line 700 of page-flags.h is the
> macro PAGE_TYPE_OPS(Table, table). For further debugging, I manually expanded
> the macro, and found that the bug line is VM_BUG_ON_PAGE(!PageTable(page), page)
> in routine __ClearPageTable(), which is called from pgtable_page_dtor() in
> include/linux/mm.h. I also added a printk call to PageTable() that logs
> page->page_type. The routine was called twice. The first had page_type of
> 0xfffffbff, which would have been expected for a page table. The second call had
> 0xffffffff, which led to the BUG.

So it looks to me like the tear-down of the page tables first found a
page that is indeed a page table, and cleared the page table bit
(well, it set it - the bits are reversed).

Then it took an exception (that "interrupt: 700") and that causes
do_exit() again, and it tries to free the same page table - and now
it's no longer marked as a page table, because it already went through
the __ClearPageTable() dance once.

So on the second path through, it catches that "the bit already said
it wasn't a page table" and does the BUG.
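
[Editorial illustration, not part of the original message.] The double
tear-down described above can be modelled in user space. This is only a
sketch: the PAGE_TYPE_BASE and PG_table values are recalled from
include/linux/page-flags.h around v4.18 and should be checked against the
tree being debugged, and struct page_model and the lower-case function names
are hypothetical stand-ins for struct page and the PAGE_TYPE_OPS(Table, table)
helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values (recalled from v4.18 page-flags.h; verify before relying on them). */
#define PAGE_TYPE_BASE 0xf0000000u
#define PG_table       0x00000400u

/* Hypothetical stand-in for struct page: only the field that matters here. */
struct page_model {
    uint32_t page_type;   /* 0xffffffff means "no type set" */
};

/* The bits are reversed: a type is "set" when its bit is CLEAR
 * and the base bits are still all ones. */
static bool page_is_table(const struct page_model *p)
{
    return (p->page_type & (PAGE_TYPE_BASE | PG_table)) == PAGE_TYPE_BASE;
}

/* Models __SetPageTable(): marking the page clears the PG_table bit. */
static void set_page_table(struct page_model *p)
{
    assert((p->page_type & PAGE_TYPE_BASE) == PAGE_TYPE_BASE);
    p->page_type &= ~PG_table;
}

/* Models __ClearPageTable(): returns false where the kernel would trip
 * VM_BUG_ON_PAGE(!PageTable(page), page); otherwise sets the bit again. */
static bool clear_page_table(struct page_model *p)
{
    if (!page_is_table(p))
        return false;
    p->page_type |= PG_table;
    return true;
}
```

Starting from page_type 0xffffffff, set_page_table() yields 0xfffffbff (the
first value Larry logged). The first clear_page_table() succeeds and restores
0xffffffff (the second logged value), so a second clear_page_table() on the
same page fails, which corresponds to the BUG in the report.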

But the real question is what the problem was the *first* time around.
I assume that has scrolled off the screen? This part:

  _exception_pkey+0x58/0x128
  ret_from_except_full+0x0/0x4
  --- interrupt: 700 at free_pgd_range+0x19c/0x30c
       LR = free_pgd_range+0x19c/0x30c
  free_pgtables+0xa/0xb
  exit_mmap+0xf4/0x16c
  mmput+0x64/0xf0

Does reverting that commit 1d40a5ea01d5 make everything work for you?
Because if so, judging by the deafening silence on this so far, I
think that's what we should do.

That said, can some ppc person who knows the 32-bit ppc code and maybe
knows what that "interrupt: 700" means talk about that oddity in the
trace, please?

                    Linus
