lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Tue, 21 Jan 2014 18:47:07 -0800
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	Andrea Arcangeli <aarcange@...hat.com>,
	"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
Cc:	Steven Noonan <steven@...inklabs.net>,
	Linux Kernel mailing List <linux-kernel@...r.kernel.org>,
	Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>,
	Mel Gorman <mgorman@...e.de>, Rik van Riel <riel@...hat.com>,
	Alex Thorlton <athorlton@....com>,
	Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [BISECTED] Linux 3.12.7 introduces page map handling regression

On Tue, Jan 21, 2014 at 5:49 PM, Greg Kroah-Hartman
<gregkh@...uxfoundation.org> wrote:
>
> Odds are this also shows up in 3.13, right?

Probably. I don't have a Xen PV setup to test with (and very little
interest in setting one up).. And I have a suspicion that it might not
be so much about Xen PV, as perhaps about the kind of hardware.

I suspect the issue has something to do with the magic _PAGE_NUMA
tie-in with _PAGE_PRESENT. And then mprotect(PROT_NONE) ends up
removing the _PAGE_PRESENT bit, and now the crazy numa code is
confused.

The whole _PAGE_NUMA thing is a f*cking horrible hack, and shares the
bit with _PAGE_PROTNONE, which is why it then has that tie-in to
_PAGE_PRESENT.

Adding Andrea to the Cc, because he's the author of that horridness.
Putting Steven's test-case here as an attachement for Andrea, maybe
that makes him go "Ahh, yes, silly case".

Also added Kirill, because he was involved the last _PAGE_NUMA debacle.

Andrea, you can find the thread on lkml, but it boils down to commit
1667918b6483 (backported to 3.12.7 as 3d792d616ba4) breaking the
attached test-case (but apparently only under Xen PV). There it
apparently causes a "BUG: Bad page map .." error.

And I suspect this is another of those "this bug is only visible on
real numa machines, because _PAGE_NUMA isn't actually ever set
otherwise". That has pretty much guaranteed that it gets basically
zero testing, which is not a great idea when coupled with that subtle
sharing of the _PAGE_PROTNONE bit..

It may be that the whole "Xen PV" thing is a red herring, and that
Steven only sees it on that one machine because the one he runs as a
PV guest under is a real NUMA machine, and all the other machines he
has tried it on haven't been numa. So it *may* be that that "only
under Xen PV" is a red herring. But that's just a possible guess.

Christ, how I hate that _PAGE_NUMA bit. Andrea: the fact that it gets
no testing on any normal machines is a major problem. If it was simple
and straightforward and the code was "obviously correct", it wouldn't
be such a problem, but the _PAGE_NUMA code definitely does not fall
under that "simple and obviously correct" heading.

Guys, any ideas?

                Linus

View attachment "t.c" of type "text/x-csrc" (532 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ