[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CALCETrVKZ-0+m9AOGS5PLKKRfY59OckPyNon1zrWmkOZbjvaEA@mail.gmail.com>
Date: Tue, 13 Jun 2017 08:54:08 -0700
From: Andy Lutomirski <luto@...nel.org>
To: Mikulas Patocka <mpatocka@...hat.com>
Cc: Andy Lutomirski <luto@...nel.org>, Bernhard Held <berny156@....de>,
Toshi Kani <toshi.kani@...com>, Borislav Petkov <bp@...en8.de>,
Andrew Morton <akpm@...ux-foundation.org>,
Brian Gerst <brgerst@...il.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
"H. Peter Anvin" <hpa@...or.com>, Ingo Molnar <mingo@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Peter Zijlstra <peterz@...radead.org>,
"Luis R. Rodriguez" <mcgrof@...e.com>,
Denys Vlasenko <dvlasenk@...hat.com>,
Josh Poimboeuf <jpoimboe@...hat.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2] X86: don't report PAT on CPUs that don't support it
On Tue, Jun 6, 2017 at 4:21 PM, Mikulas Patocka <mpatocka@...hat.com> wrote:
>
>
> On Tue, 6 Jun 2017, Andy Lutomirski wrote:
>
>> On Tue, Jun 6, 2017 at 3:49 PM, Mikulas Patocka <mpatocka@...hat.com> wrote:
>> >
>> >
>> > On Sun, 28 May 2017, Andy Lutomirski wrote:
>> >
>> >> On Sun, May 28, 2017 at 11:18 AM, Bernhard Held <berny156@....de> wrote:
>> >> > Hi,
>> >> >
>> >> > this patch breaks the boot of my kernel. The last message is "Booting
>> >> > the kernel.".
>> >> >
>> >> > My setup might be unusual: I'm running a Xenon E5450 (LGA 771) in a
>> >> > Gigbayte G33-DS3R board (LGA 775). The BIOS is patched with the
>> >> > microcode of the E5450 and recognizes the CPU.
>> >> >
>> >> > Please find below the dmesg of a the latest kernel w/o the PAT-patch.
>> >> > I'm happy to provide more information or to test patches.
>> >>
>> >> I think this patch is bogus. pat_enabled() sure looks like it's
>> >> supposed to return true if PAT is *enabled*, and these days PAT is
>> >> "enabled" even if there's no HW PAT support. Even if the patch were
>> >> somehow correct, it should have been split up into two patches, one to
>> >> change pat_enabled() and one to use this_cpu_has().
>> >>
>> >> Ingo, I'd suggest reverting the patch, cc-ing stable on the revert so
>> >> -stable knows not to backport it, and starting over with the fix.
>> >> >From very brief inspection, the right fix is to make sure that
>> >> pat_init(), or at least init_cache_modes(), gets called on the
>> >> affected CPUs.
>> >>
>> >> --Andy
>> >
>> > Hi
>> >
>> > Here I send the second version of the patch. It drops the change from
>> > boot_cpu_has(X86_FEATURE_PAT) to this_cpu_has(X86_FEATURE_PAT) (that
>> > caused kernel to be unbootable for some people).
>> >
>> > Another change is that setup_arch() calls init_cache_modes() if PAT is
>> > disabled, so that init_cache_modes() is always called.
>> >
>> > Mikulas
>> >
>> >
>> >
>> > From: Mikulas Patocka <mpatocka@...hat.com>
>> >
>> > In the file arch/x86/mm/pat.c, there's a variable __pat_enabled. The
>> > variable is set to 1 by default and the function pat_init() sets
>> > __pat_enabled to 0 if the CPU doesn't support PAT.
>> >
>> > However, on AMD K6-3 CPU, the processor initialization code never calls
>> > pat_init() and so __pat_enabled stays 1 and the function pat_enabled()
>> > returns true, even though the K6-3 CPU doesn't support PAT.
>> >
>> > The result of this bug is that this warning is produced when attemting to
>> > start the Xserver and the Xserver doesn't start (fork() returns ENOMEM).
>> > Another symptom of this bug is that the framebuffer driver doesn't set the
>> > K6-3 MTRR registers.
>> >
>> > This patch changes pat_enabled() so that it returns true only if pat
>> > initialization was actually done.
>>
>> Why? Shouldn't calling init_cache_modes() be sufficient?
>>
>> --Andy
>
> See the function arch_phys_wc_add():
>
> if (pat_enabled() || !mtrr_enabled())
> return 0; /* Success! (We don't need to do anything.) */
> ret = mtrr_add(base, size, MTRR_TYPE_WRCOMB, true);
>
> - if pat_enabled() returns true, that function doesn't set MTRRs.
> pat_enabled() must return false on systems without PAT, so that MTRRs are
> set.
It still sounds to me like there are two bugs here that should be
treated separately.
Bug 1: A warning fires. Have you figured out why the warning fires?
Bug 2: arch_phys_wc_add() appears to be checking the wrong condition.
How about checking the right condition? It doesn't actually want to
know if PAT is enabled -- it wants to know if the PAT contains a
usable WC entry. Something like pat_has_wc() would be better, I
think.
--Andy
Powered by blists - more mailing lists