Message-ID: <CA+55aFwuuAQdoBx_R4CaHJp1ZdRTAwG8n1ZfiKmpZUwwZ9iUkw@mail.gmail.com>
Date:	Thu, 6 Dec 2012 08:10:13 -0800
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Henrik Rydberg <rydberg@...omail.se>
Cc:	Jan Kara <jack@...e.cz>, Mel Gorman <mgorman@...e.de>,
	linux-mm <linux-mm@...ck.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: Oops in 3.7-rc8 isolate_free_pages_block()

Ok, so it's isolate_freepages_block+0x88, and as Jan Kara already
guessed from just the offset, that is indeed likely the PageBuddy()
test.

On Thu, Dec 6, 2012 at 7:22 AM, Henrik Rydberg <rydberg@...omail.se> wrote:
>
>  http://bitmath.org/test/oops-3.7-rc8.jpg
>
> ffffffff810a6d6a:       eb 1c                   jmp    ffffffff810a6d88 <isolate_freepages_block+0x88>
> ffffffff810a6d6c:       0f 1f 40 00             nopl   0x0(%rax)

On the first entry to the loop, we jump *into* the loop, over the end
condition (the compiler has basically turned the loop around), and we
jump directly to the faulting instruction. Looking at the register state, though, we're
not at the first iteration of the loop, so we don't have to worry
about that case. The loop itself then starts with:

> ffffffff810a6d70:       48 83 c5 01             add    $0x1,%rbp
> ffffffff810a6d74:       48 83 c3 40             add    $0x40,%rbx

The above is the "blockpfn++, cursor++" part of the loop, while the
test below is the loop condition ("blockpfn < end_pfn"):

> ffffffff810a6d78:       49 39 ed                cmp    %rbp,%r13
> ffffffff810a6d7b:       0f 86 cf 00 00 00       jbe    ffffffff810a6e50 <isolate_freepages_block+0x150>

From your image, %rbp is 0x070000 and %r13 is 0x0702f9.
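
As a small check of the comparison above, here is a sketch in C of what the cmp/jbe pair decides. The function names are made up for illustration; the only inputs taken from the oops are the two register values. In AT&T syntax "cmp %rbp,%r13" computes %r13 - %rbp, and "jbe" leaves the loop when %r13 <= %rbp as unsigned values.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Models "cmp %rbp,%r13; jbe <exit>": the loop is left when
 * end_pfn <= blockpfn (unsigned), so it keeps going while
 * blockpfn < end_pfn.  Names are illustrative, not kernel symbols.
 */
static bool loop_continues(uint64_t blockpfn, uint64_t end_pfn)
{
    return blockpfn < end_pfn;
}

/* Number of pfns still to be visited, or 0 once the loop would exit. */
static uint64_t pfns_remaining(uint64_t blockpfn, uint64_t end_pfn)
{
    return loop_continues(blockpfn, end_pfn) ? end_pfn - blockpfn : 0;
}
```

With %rbp = 0x070000 and %r13 = 0x0702f9 the condition still holds, so as far as the loop is concerned there are 0x2f9 more pfns to walk.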

The "pfn_valid_within()" test is a no-op because we don't have holes
in zones on x86, so then we have

                if (!valid_page)
                        valid_page = page;

which generates a test+cmove:

> ffffffff810a6d81:       4d 85 e4                test   %r12,%r12
> ffffffff810a6d84:       4c 0f 44 e3             cmove  %rbx,%r12

(which is how we can tell we're not at the beginning: 'valid_page' is
0xffffea0001bfbe40, while the current page is 0xffffea0001c00000).
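
To relate those two page pointers to pfns, the arithmetic can be sketched as below. This assumes the x86-64 vmemmap layout, where the struct page array starts at 0xffffea0000000000, and a 0x40-byte struct page (which matches the "add $0x40,%rbx" above). The helper name is hypothetical; it is not the kernel's page_to_pfn().

```c
#include <stdint.h>

/* Assumption: x86-64 SPARSEMEM_VMEMMAP, struct page array base. */
#define VMEMMAP_BASE_SKETCH   UINT64_C(0xffffea0000000000)
/* From the disassembly: the cursor advances by 0x40 per page. */
#define PAGE_STRUCT_SIZE      UINT64_C(0x40)

/* Convert a struct page address to its pfn under the assumptions above. */
static uint64_t page_addr_to_pfn_sketch(uint64_t page_addr)
{
    return (page_addr - VMEMMAP_BASE_SKETCH) / PAGE_STRUCT_SIZE;
}
```

Under those assumptions the current page 0xffffea0001c00000 is pfn 0x70000, which agrees with %rbp, and 'valid_page' 0xffffea0001bfbe40 is pfn 0x6fef9, so the loop has advanced 0x107 pages past it.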

.. and finally the oopsing instruction from PageBuddy(), which is the
read of 'page->_mapcount':

> ffffffff810a6d88:       8b 43 18                mov    0x18(%rbx),%eax
> ffffffff810a6d8b:       83 f8 80                cmp    $0xffffff80,%eax
> ffffffff810a6d8e:       75 e0                   jne    ffffffff810a6d70 <isolate_freepages_block+0x70>
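
The three instructions above amount to a load of a 32-bit field at offset 0x18 and a compare against -128 (0xffffff80 sign-extended). A stand-in sketch in C follows. The struct is only padding around that one field and is not the real struct page layout; the names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* The value the disassembly compares against: 0xffffff80 == -128. */
#define BUDDY_MAPCOUNT_SKETCH (-128)

/* Stand-in for struct page: only the field the oopsing load touches. */
struct page_sketch {
    unsigned char before[0x18];                 /* opaque here */
    int32_t mapcount;                           /* read by mov 0x18(%rbx),%eax */
    unsigned char after[0x40 - 0x18 - sizeof(int32_t)];
};

_Static_assert(offsetof(struct page_sketch, mapcount) == 0x18,
               "mapcount must sit at the offset used by the load");
_Static_assert(sizeof(struct page_sketch) == 0x40,
               "stride must match add $0x40,%rbx");

/* Models the test: cmp $0xffffff80,%eax; jne <next iteration>. */
static bool page_is_buddy_sketch(const struct page_sketch *page)
{
    return page->mapcount == BUDDY_MAPCOUNT_SKETCH;
}
```

The load itself is where the oops happens: if the cursor points outside the mapped struct page array, the dereference faults before any comparison is made.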

So yeah, that loop has apparently wandered into la-la-land. end_pfn
must be somehow wrong.
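
For reference, the shape of the compiled loop as read from the disassembly can be modelled roughly as follows. This is a simplified sketch, not the kernel source: the struct, the function name, the counting of buddy pages and the range check before the first entry are all assumptions made so the example is self-contained. What it does take from the disassembly is the entry jump into the middle of the loop and the order of increment, end test, valid_page update and _mapcount read.

```c
#include <stddef.h>
#include <stdint.h>

#define BUDDY_MAPCOUNT_MODEL (-128)

/* Hypothetical stand-in for struct page with just the tested field. */
struct page_model {
    int32_t mapcount;
};

/*
 * Walks [blockpfn, end_pfn) and counts pages whose mapcount marks them
 * as buddy pages.  The first entry jumps straight to the mapcount read,
 * skipping the increment and the end test, as the "jmp ...+0x88" does.
 */
static size_t scan_block_sketch(const struct page_model *cursor,
                                unsigned long blockpfn,
                                unsigned long end_pfn,
                                const struct page_model **valid_page)
{
    size_t nr_buddy = 0;

    if (blockpfn >= end_pfn)        /* assumed check before first entry */
        return 0;
    if (!*valid_page)               /* first iteration's valid_page update */
        *valid_page = cursor;
    goto check_buddy;               /* jump *into* the loop */

    for (;;) {
        blockpfn++;                 /* add $0x1,%rbp  */
        cursor++;                   /* add $0x40,%rbx */
        if (end_pfn <= blockpfn)    /* cmp %rbp,%r13; jbe exit */
            break;
        if (!*valid_page)           /* test %r12,%r12; cmove %rbx,%r12 */
            *valid_page = cursor;
check_buddy:
        if (cursor->mapcount != BUDDY_MAPCOUNT_MODEL)
            continue;               /* jne back to the increment */
        nr_buddy++;
    }
    return nr_buddy;
}
```

In this model nothing bounds the cursor except end_pfn, so a wrong end_pfn walks the cursor straight off the end of whatever array it started in.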

Mel, does any of this ring a bell? (Andrew also added to the cc, since
the patches came through him.)

                  Linus