Message-ID: <CA+55aFxn_T=UgBUwqkRJvALygrNORaB6ox3nowvHpV6yFBfDoA@mail.gmail.com>
Date:	Fri, 21 Nov 2014 10:22:07 -0800
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Andy Lutomirski <luto@...capital.net>
Cc:	Steven Rostedt <rostedt@...dmis.org>, Tejun Heo <tj@...nel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Arnaldo Carvalho de Melo <acme@...stprotocols.net>,
	Peter Zijlstra <peterz@...radead.org>,
	Frederic Weisbecker <fweisbec@...il.com>,
	Don Zickus <dzickus@...hat.com>, Dave Jones <davej@...hat.com>,
	"the arch/x86 maintainers" <x86@...nel.org>
Subject: Re: frequent lockups in 3.18rc4

On Fri, Nov 21, 2014 at 9:22 AM, Andy Lutomirski <luto@...capital.net> wrote:
>
> Both mystify me.  Why does the 32-bit version walk down the hierarchy
> at all instead of just touching the top level?

Quite frankly, I think it's just due to historical reasons, and should
be removed.

But the historical reasons are that with the aliasing of the PUD and
PMD entries in the PGD, it's all fairly confusing. So I think we only
used to do the top level, but then when we expanded from two levels to
three, that "top level" became the pmd, and then when we expanded from
three to four, the pmd was actually two levels down. So it's all
basically mindless work.
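
To make the aliasing concrete, here is a minimal user-space sketch of how folded levels behave on a two-level (non-PAE) layout. It is modelled on the way the kernel's generic folding headers wrap one level's entry type inside the next, but everything below (the uint32_t entry type, walk_to_pmd(), demo_pgd[]) is a hypothetical stand-in for illustration, not kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed non-PAE 32-bit x86 geometry: top 10 address bits index the PGD. */
#define PGDIR_SHIFT   22
#define PTRS_PER_PGD  1024

typedef struct { uint32_t val; } pgd_t;
typedef struct { pgd_t pgd; } pud_t;   /* folded: a PUD is just its PGD entry */
typedef struct { pud_t pud; } pmd_t;   /* folded: a PMD is just its PUD entry */

/* Hypothetical page directory used only by this model. */
static pgd_t demo_pgd[PTRS_PER_PGD];

static unsigned pgd_index(uint32_t addr)
{
    return (addr >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1);
}

static pgd_t *pgd_offset(pgd_t *base, uint32_t addr)
{
    return base + pgd_index(addr);
}

/* With a folded level, "descending" hands back the same slot, retyped. */
static pud_t *pud_offset(pgd_t *pgd, uint32_t addr)
{
    (void)addr;
    return (pud_t *)pgd;
}

static pmd_t *pmd_offset(pud_t *pud, uint32_t addr)
{
    (void)addr;
    return (pmd_t *)pud;
}

/* Walk pgd -> pud -> pmd and return the slot that was finally reached. */
static void *walk_to_pmd(pgd_t *base, uint32_t addr)
{
    pgd_t *pgd = pgd_offset(base, addr);
    pud_t *pud = pud_offset(pgd, addr);
    pmd_t *pmd = pmd_offset(pud, addr);

    /* All three "levels" alias the same top-level entry. */
    assert((void *)pmd == (void *)pgd);
    return pmd;
}
```

In this model, walking down to the pmd and touching it is the same as touching the top-level entry directly, which is why the extra walk buys nothing on a two-level layout.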

So I do think we could simplify and unify things.

In 32-bit mode, we actually have two different cases:

 - in PAE, there's the magic top-level 4-entry PGD that always *has*
to be present (the P bit isn't actually checked by hardware)

    As a result, in PAE mode, the top PGD entries always exist and
are always prepopulated, and for the kernel area (including obviously
the vmalloc space) they always point to the init_pgd[] entry.

    Ergo, in PAE mode, I don't think we should ever hit this case in
the first place.

 - in non-PAE mode, we should just copy the top-level entry, and return.

And in 64-bit mode, we only have the "copy the top-level entry" case.
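
As a rough illustration of why the three cases differ, the sketch below computes the top-level index for a given address under each layout. The shift and entry counts are the usual x86 values (22/1024 for non-PAE, 30/4 for PAE, 39/512 for 4-level 64-bit); the struct and function names are hypothetical, and the 64-bit sample address assumes the traditional non-randomized vmalloc base.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical descriptor for a top-level page-table layout. */
struct pgd_layout {
    unsigned shift;     /* address bits below the top-level index */
    unsigned entries;   /* number of top-level entries (power of two) */
};

static const struct pgd_layout x86_32_nonpae = { 22, 1024 };
static const struct pgd_layout x86_32_pae    = { 30, 4 };
static const struct pgd_layout x86_64_4level = { 39, 512 };

/* Index of the top-level entry that covers addr under the given layout. */
static unsigned top_index(struct pgd_layout l, uint64_t addr)
{
    return (unsigned)((addr >> l.shift) & (l.entries - 1));
}
```

With only four top-level entries in PAE, every kernel address under a 3G/1G split lands in entry 3, so there is nothing left to fill in lazily. In the other two layouts the kernel half spans many top-level entries, and a newly populated one can be missing from a given process's table.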

So I think we should

 (a) remove the 32-bit vs 64-bit difference, because that's not actually valid

 (b) make it a PAE vs non-PAE difference

 (c) the PAE case is a no-op

 (d) the non-PAE case would look something like this:

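    static noinline int vmalloc_fault(unsigned long address)
    {
        unsigned index;
        pgd_t *pgd_dst, pgd_entry;

        /* Make sure we are in vmalloc area: */
        if (!(address >= VMALLOC_START && address < VMALLOC_END))
            return -1;

        /* Look up the reference entry in the kernel's own page table: */
        index = pgd_index(address);
        pgd_entry = init_mm.pgd[index];
        if (!pgd_present(pgd_entry))
            return -1;

        /* Find the page directory the CPU is using right now, via CR3: */
        pgd_dst = __va(PAGE_MASK & read_cr3());

        /* Already populated, so a missing top-level entry is not the cause: */
        if (pgd_present(pgd_dst[index]))
            return -1;

        /* Copy just the top-level entry: */
        ACCESS_ONCE(pgd_dst[index]) = pgd_entry;
        return 0;
    }
    NOKPROBE_SYMBOL(vmalloc_fault);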

and it's done.

Would anybody be willing to actually *test* something like the above?
The above may compile, but that's all the "testing" it got.
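
One cheap way to sanity-check the control flow before booting anything is a user-space model. The sketch below is only that: two plain arrays stand in for init_mm.pgd and for the table CR3 points at, the vmalloc window and the present bit are made-up constants, and model_vmalloc_fault() is a hypothetical name. It says nothing about locking, TLB behaviour, or the real pgd_t layout.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed non-PAE geometry; the vmalloc window is invented for the model. */
#define PGDIR_SHIFT       22
#define PTRS_PER_PGD      1024
#define PAGE_PRESENT_BIT  0x1u
#define VMALLOC_START     0xF8000000u
#define VMALLOC_END       0xFF000000u

typedef struct { uint32_t pgd; } pgd_t;

static pgd_t init_pgd[PTRS_PER_PGD];   /* stands in for init_mm.pgd */
static pgd_t task_pgd[PTRS_PER_PGD];   /* stands in for the table behind CR3 */

static unsigned pgd_index(uint32_t addr)
{
    return (addr >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1);
}

static int pgd_present(pgd_t entry)
{
    return (entry.pgd & PAGE_PRESENT_BIT) != 0;
}

/* Same decision sequence as the proposed non-PAE vmalloc_fault(). */
static int model_vmalloc_fault(uint32_t address)
{
    unsigned index;
    pgd_t entry;

    /* Only addresses inside the vmalloc window are handled. */
    if (!(address >= VMALLOC_START && address < VMALLOC_END))
        return -1;

    /* The reference table must have the entry, or the fault is genuine. */
    index = pgd_index(address);
    entry = init_pgd[index];
    if (!pgd_present(entry))
        return -1;

    /* If the task already has the entry, something else went wrong. */
    if (pgd_present(task_pgd[index]))
        return -1;

    /* Copy the top-level entry and report the fault as fixed. */
    task_pgd[index] = entry;
    return 0;
}
```

The model returns 0 only on the first fault after the reference table gains an entry. A second fault on the same address returns -1, matching the pgd_present(pgd_dst[index]) check in the proposal.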

                    Linus