Message-ID: <CAHk-=whz_8tRNGCr09X59nMW3JBzFLE-g-F-brxd+AkK+RceCw@mail.gmail.com>
Date:   Wed, 30 Mar 2022 13:05:00 -0700
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     Steven Rostedt <rostedt@...dmis.org>
Cc:     LKML <linux-kernel@...r.kernel.org>, Zi Yan <ziy@...dia.com>,
        Mel Gorman <mgorman@...hsingularity.net>,
        David Hildenbrand <david@...hat.com>,
        Vlastimil Babka <vbabka@...e.cz>,
        Mike Rapoport <rppt@...ux.ibm.com>,
        Oscar Salvador <osalvador@...e.de>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Linux-MM <linux-mm@...ck.org>
Subject: Re: [BUG] Crash on x86_32 for: mm: page_alloc: avoid merging
 non-fallbackable pageblocks with others

On Wed, Mar 30, 2022 at 12:42 PM Steven Rostedt <rostedt@...dmis.org> wrote:
>
> I started testing new patches and it crashed when doing the x86-32 test on
> boot up.
>
> Initializing HighMem for node 0 (000375fe:0021ee00)
> BUG: kernel NULL pointer dereference, address: 00000878
> #PF: supervisor read access in kernel mode
> #PF: error_code(0x0000) - not-present page
> *pdpt = 0000000000000000 *pde = f0000000f000eef3
> Oops: 0000 [#1] PREEMPT SMP PTI
> CPU: 0 PID: 0 Comm: swapper Not tainted 5.17.0-test+ #469
> Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
> EIP: get_pfnblock_flags_mask+0x2c/0x36
> Code: 6d ea ff 55 89 e5 56 89 ce 53 8b 18 89 d8 c1 eb 1e e8 f7 fb ff ff 69 db c0 02 00 00 89 c1 89 c2 c1 ea 05 8b 83 7c d7 79 c1 5b <8b> 04 90 d3 e8 21 f0 5e 5d c3 55 89 e5 57 56 89 d6 53 89 c3 64 a1

The whole function is in that Code: thing, and it decodes to:

   0:   55                    push   %ebp
   1:   89 e5                 mov    %esp,%ebp
   3:   56                    push   %esi
   4:   89 ce                 mov    %ecx,%esi
   6:   53                    push   %ebx
   7:   8b 18                 mov    (%eax),%ebx
   9:   89 d8                 mov    %ebx,%eax
   b:   c1 eb 1e              shr    $0x1e,%ebx
   e:   e8 f7 fb ff ff        call   0xfffffc0a
  13:   69 db c0 02 00 00     imul   $0x2c0,%ebx,%ebx
  19:   89 c1                 mov    %eax,%ecx
  1b:   89 c2                 mov    %eax,%edx
  1d:   c1 ea 05              shr    $0x5,%edx
  20:   8b 83 7c d7 79 c1     mov    -0x3e862884(%ebx),%eax
  26:   5b                    pop    %ebx
  27:*  8b 04 90              mov    (%eax,%edx,4),%eax   <-- trapping instruction
  2a:   d3 e8                 shr    %cl,%eax
  2c:   21 f0                 and    %esi,%eax
  2e:   5e                    pop    %esi
  2f:   5d                    pop    %ebp
  30:   c3                    ret

with '%eax' being NULL, and %edx being 0x21e.

(The call seems to be to 'pfn_to_bitidx().isra.0' if my compiler does
similar code generation, so it's the out-of-lined part of pfn_to_bitidx()
despite it being marked inline)

So that oops is that

        word = bitmap[word_bitidx];

line, with 'bitmap' being NULL (and %edx contains 'word_bitidx').
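
The register values can be checked against the fault address with a little arithmetic. The following is a minimal user-space sketch, not the kernel source: effective_address() and flags_from_bitmap() are hypothetical names, and the model assumes 32-bit bitmap words as on x86_32. With a NULL base and an index of 0x21e scaled by 4, the load touches 0x878, which is the address the oops reports.

```c
#include <stddef.h>
#include <stdint.h>

/* Effective address of `mov (%eax,%edx,4),%eax`: base + index * scale. */
static uint32_t effective_address(uint32_t base, uint32_t index, uint32_t scale)
{
	return base + index * scale;
}

/*
 * User-space model of the lookup (assumption: 32-bit words).
 * 'bitidx' stands in for the value returned by the out-of-line
 * pfn_to_bitidx() call seen in the disassembly.
 */
static uint32_t flags_from_bitmap(const uint32_t *bitmap, uint32_t bitidx,
				  uint32_t mask)
{
	uint32_t word_bitidx = bitidx >> 5;   /* shr $0x5,%edx */
	uint32_t word = bitmap[word_bitidx];  /* the load that traps when bitmap is NULL */

	/* shr %cl,%eax only uses the low 5 bits of the count on x86 */
	return (word >> (bitidx & 31)) & mask;
}
```

Calling effective_address(0, 0x21e, 4) gives 0x878. flags_from_bitmap() must only be called with a valid bitmap; passing NULL reproduces the same kind of fault in user space.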

Looking around, your 'config-bad' doesn't even have
CONFIG_MEMORY_ISOLATION enabled, and so I suspect the culprit is this
part of the change:

-               if (unlikely(has_isolate_pageblock(zone))) {

which used to always be false for that config, and now the code is
suddenly enabled.
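
Why that guard was always false can be sketched as follows. This is a simplified user-space model under stated assumptions, not the kernel code: struct zone_model and the two runs_guarded_block_*() helpers are hypothetical, and the "after" helper simply encodes the suspicion above that the guarded code now runs unconditionally. The stub pattern itself mirrors how such helpers are commonly compiled out when a config option is off.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the one field of struct zone that matters here. */
struct zone_model {
	unsigned long nr_isolate_pageblock;
};

#ifdef CONFIG_MEMORY_ISOLATION
static inline bool has_isolate_pageblock(const struct zone_model *zone)
{
	return zone->nr_isolate_pageblock != 0;
}
#else
/* Without CONFIG_MEMORY_ISOLATION the helper is a constant false. */
static inline bool has_isolate_pageblock(const struct zone_model *zone)
{
	(void)zone;
	return false;
}
#endif

/* Before the change: the block runs only if the zone has isolated pageblocks. */
static bool runs_guarded_block_before(const struct zone_model *zone)
{
	return has_isolate_pageblock(zone);
}

/* After the change (assumption): the guard is gone, so the block always runs. */
static bool runs_guarded_block_after(const struct zone_model *zone)
{
	(void)zone;
	return true;
}
```

Built without CONFIG_MEMORY_ISOLATION defined, runs_guarded_block_before() is false for every zone while runs_guarded_block_after() is true, so a code path that was dead on this config becomes live.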

Alternatively, that code just can't deal with highmem properly.

But I didn't really analyze things, I'm mainly doing pattern matching here.

Zi Yan - and all the people who ack'ed and reviewed this - please take
a deeper look..

                Linus
