[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <fa6f1dd4-02b6-779e-2ee4-12644d1b3c3f@suse.com>
Date: Fri, 15 Jul 2022 06:22:52 +0200
From: Juergen Gross <jgross@...e.com>
To: Chuck Zmudzinski <brchuckz@...scape.net>,
Chuck Zmudzinski <brchuckz@....com>,
linux-kernel@...r.kernel.org
Cc: Dave Hansen <dave.hansen@...ux.intel.com>,
Andy Lutomirski <luto@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
x86@...nel.org, "H. Peter Anvin" <hpa@...or.com>,
Dan Williams <dan.j.williams@...el.com>,
"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
Tom Lendacky <thomas.lendacky@....com>,
Jane Chu <jane.chu@...cle.com>,
Tianyu Lan <Tianyu.Lan@...rosoft.com>,
Randy Dunlap <rdunlap@...radead.org>,
Sean Christopherson <seanjc@...gle.com>,
Jan Beulich <jbeulich@...e.com>,
xen-devel@...ts.xenproject.org, stable@...r.kernel.org
Subject: Re: [PATCH v2] Subject: x86/PAT: Report PAT on CPUs that support PAT
without MTRR
On 15.07.22 04:19, Chuck Zmudzinski wrote:
> On 7/14/2022 1:40 AM, Juergen Gross wrote:
>> On 13.07.22 03:36, Chuck Zmudzinski wrote:
>>> The commit 99c13b8c8896d7bcb92753bf
>>> ("x86/mm/pat: Don't report PAT on CPUs that don't support it")
>>> incorrectly failed to account for the case in init_cache_modes() when
>>> CPUs do support PAT and falsely reported PAT to be disabled when in
>>> fact PAT is enabled. In some environments, notably in Xen PV domains,
>>> MTRR is disabled but PAT is still enabled, and that is the case
>>> that the aforementioned commit failed to account for.
>>>
>>> As an unfortunate consequnce, the pat_enabled() function currently does
>>> not correctly report that PAT is enabled in such environments. The fix
>>> is implemented in init_cache_modes() by setting pat_bp_enabled to true
>>> in init_cache_modes() for the case that commit 99c13b8c8896d7bcb92753bf
>>> ("x86/mm/pat: Don't report PAT on CPUs that don't support it") failed
>>> to account for.
>>>
>>> This approach arranges for pat_enabled() to return true in the Xen PV
>>> environment without undermining the rest of PAT MSR management logic
>>> that considers PAT to be disabled: Specifically, no writes to the PAT
>>> MSR should occur.
>>>
>>> This patch fixes a regression that some users are experiencing with
>>> Linux as a Xen Dom0 driving particular Intel graphics devices by
>>> correctly reporting to the Intel i915 driver that PAT is enabled where
>>> previously it was falsely reporting that PAT is disabled. Some users
>>> are experiencing system hangs in Xen PV Dom0 and all users on Xen PV
>>> Dom0 are experiencing reduced graphics performance because the keying of
>>> the use of WC mappings to pat_enabled() (see arch_can_pci_mmap_wc())
>>> means that in particular graphics frame buffer accesses are quite a bit
>>> less performant than possible without this patch.
>>>
>>> Also, with the current code, in the Xen PV environment, PAT will not be
>>> disabled if the administrator sets the "nopat" boot option. Introduce
>>> a new boolean variable, pat_force_disable, to forcibly disable PAT
>>> when the administrator sets the "nopat" option to override the default
>>> behavior of using the PAT configuration that Xen has provided.
>>>
>>> For the new boolean to live in .init.data, init_cache_modes() also needs
>>> moving to .init.text (where it could/should have lived already before).
>>>
>>> Fixes: 99c13b8c8896d7bcb92753bf ("x86/mm/pat: Don't report PAT on CPUs that don't support it")
>>> Co-developed-by: Jan Beulich <jbeulich@...e.com>
>>> Cc: stable@...r.kernel.org
>>> Signed-off-by: Chuck Zmudzinski <brchuckz@....com>
>>> ---
>>> v2: *Add force_pat_disabled variable to fix "nopat" on Xen PV (Jan Beulich)
>>> *Add the necessary code to incorporate the "nopat" fix
>>> *void init_cache_modes(void) -> void __init init_cache_modes(void)
>>> *Add Jan Beulich as Co-developer (Jan has not signed off yet)
>>> *Expand the commit message to include relevant parts of the commit
>>> message of Jan Beulich's proposed patch for this problem
>>> *Fix 'else if ... {' placement and indentation
>>> *Remove indication the backport to stable branches is only back to 5.17.y
>>>
>>> I think these changes address all the comments on the original patch
>>>
>>> I added Jan Beulich as a Co-developer because Juergen Gross asked me to
>>> include Jan's idea for fixing "nopat" that was missing from the first
>>> version of the patch.
>>>
>>> The patch has been tested, it works as expected with and without nopat
>>> in the Xen PV Dom0 environment. That is, "nopat" causes the system to
>>> exhibit the effects and problems that lack of PAT support causes.
>>>
>>> arch/x86/mm/pat/memtype.c | 16 ++++++++++++++--
>>> 1 file changed, 14 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/arch/x86/mm/pat/memtype.c b/arch/x86/mm/pat/memtype.c
>>> index d5ef64ddd35e..10a37d309d23 100644
>>> --- a/arch/x86/mm/pat/memtype.c
>>> +++ b/arch/x86/mm/pat/memtype.c
>>> @@ -62,6 +62,7 @@
>>>
>>> static bool __read_mostly pat_bp_initialized;
>>> static bool __read_mostly pat_disabled = !IS_ENABLED(CONFIG_X86_PAT);
>>> +static bool __initdata pat_force_disabled = !IS_ENABLED(CONFIG_X86_PAT);
>>> static bool __read_mostly pat_bp_enabled;
>>> static bool __read_mostly pat_cm_initialized;
>>>
>>> @@ -86,6 +87,7 @@ void pat_disable(const char *msg_reason)
>>> static int __init nopat(char *str)
>>> {
>>> pat_disable("PAT support disabled via boot option.");
>>> + pat_force_disabled = true;
>>> return 0;
>>> }
>>> early_param("nopat", nopat);
>>> @@ -272,7 +274,7 @@ static void pat_ap_init(u64 pat)
>>> wrmsrl(MSR_IA32_CR_PAT, pat);
>>> }
>>>
>>> -void init_cache_modes(void)
>>> +void __init init_cache_modes(void)
>>> {
>>> u64 pat = 0;
>>>
>>> @@ -292,7 +294,7 @@ void init_cache_modes(void)
>>> rdmsrl(MSR_IA32_CR_PAT, pat);
>>> }
>>>
>>> - if (!pat) {
>>> + if (!pat || pat_force_disabled) {
>>
>> Can we just remove this modification and ...
>>
>>> /*
>>> * No PAT. Emulate the PAT table that corresponds to the two
>>> * cache bits, PWT (Write Through) and PCD (Cache Disable).
>>> @@ -313,6 +315,16 @@ void init_cache_modes(void)
>>> */
>>> pat = PAT(0, WB) | PAT(1, WT) | PAT(2, UC_MINUS) | PAT(3, UC) |
>>> PAT(4, WB) | PAT(5, WT) | PAT(6, UC_MINUS) | PAT(7, UC);
>>> + } else if (!pat_bp_enabled) {
>>
>> ... use
>>
>> + } else if (!pat_bp_enabled && !pat_force_disabled) {
>>
>> here?
>>
>> This will result in the desired outcome in all cases IMO: If PAT wasn't
>> disabled via "nopat" and the PAT MSR has a non-zero value (from BIOS or
>> Hypervisor) and PAT has been disabled implicitly (e.g. due to lack of
>> MTRR), then PAT will be set to "enabled" again.
>
> With that, you can also completely remove the new Boolean - it
> will be a meaningless variable wasting memory. This will also make
No, it is making a difference with "nopat" having been specified.
In the Xen PV case we will have pat_bp_enabled == false due to the
lack of MTRR. We don't want to set it to true if "nopat" has been
specified on the command line, so pat_force_disabled should not be
true when we are setting pat_bp_enabled to true again.
> my patch more or less do what Jan's patch does - the "nopat" option
> will not cause the situation when the PAT MSR does not match the
> software view. So you are basically proposing just going back to
> my original patch, after fixing the style problems, of course. That
> also would solve the problem of needing Jan's S-o-b. I am inclined,
> however, to wait for a maintainer who has power to actually do the
> commit, to make a comment. Your R-b to my v2 did not have much clout
> with the actual maintainers, as far as I can tell. I am somewhat annoyed
> that it was at your suggestion that my v2 ended up confusing the
> main issue, the regression, with the red herring of the "nopat"
> option.
I'm sorry for that.
Juergen
Download attachment "OpenPGP_0xB0DE9DD628BF132F.asc" of type "application/pgp-keys" (3099 bytes)
Download attachment "OpenPGP_signature" of type "application/pgp-signature" (496 bytes)
Powered by blists - more mailing lists