lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Wed, 1 Oct 2014 09:18:25 -0700
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Hugh Dickins <hughd@...gle.com>
Cc:	Dave Jones <davej@...hat.com>, Al Viro <viro@...iv.linux.org.uk>,
	Linux Kernel <linux-kernel@...r.kernel.org>,
	Rik van Riel <riel@...hat.com>,
	Ingo Molnar <mingo@...hat.com>,
	Michel Lespinasse <walken@...gle.com>,
	"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
	Mel Gorman <mgorman@...e.de>,
	Sasha Levin <sasha.levin@...cle.com>
Subject: Re: pipe/page fault oddness.

On Wed, Oct 1, 2014 at 9:01 AM, Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
>
> We need to get rid of it, and just make it the same as pte_protnone().
> And then the real protnone is in the vma flags, and if you actually
> ever get to a pte that is marked protnone, you know it's a numa page.

So I'd really suggest we do exactly that. Get rid of "pte_numa()"
entirely, get rid of "_PAGE_[BIT_]NUMA" entirely, and instead add a
"pte_protnone()" helper to check for the "protnone" case (which on x86
is testing the _PAGE_PROTNONE bit, and on most other architectures is
just testing that the page has no access rights).

Then we throw away "pte_mknuma()" and "pte_mknonnuma()" entirely,
because they are brainless sh*t, and we just use

    ptent = ptep_modify_prot_start(mm, addr, pte);
    ptent = pte_modify(ptent, newprot);
    ptep_modify_prot_commit(mm, addr, pte, ptent);

reliably instead (where for the mknuma case "newprot" is PROT_NONE,
and for mknonnuma() it is vma->vm_page_prot. Yes, that means that you
have to pass in the vma to those functions, but that just makes sense
anyway.

And if that means that we lose the numa flag on mprotect etc, nobody sane cares.

Seriously, why can't we just do this, and throw away all the crap that
is "numa special case". This would make all the random games in
change_pte_range() just go away entirely, because the whole NUMA thing
really wouldn't be a special case for the pte AT ALL any more. All it
would be is that a pte could be marked PROT_NONE even if the
vma->vm_flags aren't.

Please, please, please? The current _PAGE_NUMA really is a horrible
horrible thing, and may well be the source of this bug.

The fact that it took DaveJ a long time to trigger his lockup would be
entirely consistent with "you have to split a PROTNONE large page due
to memory pressure", so the problem with our current pte_mknuma() that
Hugh points out looks entirely possible to me.

Now, there may be some reason why it can't happen, but even in the
absense of this bug, I really think that _PAGE_NUMA has been a huge
mistake from day one.

                  Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ