lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 12 Jul 2022 17:30:02 +0200
From:   Juergen Gross <jgross@...e.com>
To:     Chuck Zmudzinski <brchuckz@...scape.net>,
        Jan Beulich <jbeulich@...e.com>
Cc:     Andrew Lutomirski <luto@...nel.org>,
        "dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
        Peter Zijlstra <peterz@...radead.org>,
        lkml <linux-kernel@...r.kernel.org>,
        "xen-devel@...ts.xenproject.org" <xen-devel@...ts.xenproject.org>,
        Borislav Petkov <bp@...en8.de>
Subject: Re: [PATCH] x86/PAT: have pat_enabled() properly reflect state when
 running on e.g. Xen

On 12.07.22 17:09, Chuck Zmudzinski wrote:
> On 7/12/2022 9:32 AM, Juergen Gross wrote:
>> On 12.07.22 15:22, Chuck Zmudzinski wrote:
>>> On 7/12/2022 2:04 AM, Jan Beulich wrote:
>>>> On 11.07.2022 19:41, Chuck Zmudzinski wrote:
>>>>> Moreover... (please move to the bottom of the code snippet
>>>>> for more information about my tests in the Xen PV environment...)
>>>>>
>>>>> void init_cache_modes(void)
>>>>> {
>>>>>       u64 pat = 0;
>>>>>
>>>>>       if (pat_cm_initialized)
>>>>>           return;
>>>>>
>>>>>       if (boot_cpu_has(X86_FEATURE_PAT)) {
>>>>>           /*
>>>>>            * CPU supports PAT. Set PAT table to be consistent with
>>>>>            * PAT MSR. This case supports "nopat" boot option, and
>>>>>            * virtual machine environments which support PAT without
>>>>>            * MTRRs. In specific, Xen has unique setup to PAT MSR.
>>>>>            *
>>>>>            * If PAT MSR returns 0, it is considered invalid and emulates
>>>>>            * as No PAT.
>>>>>            */
>>>>>           rdmsrl(MSR_IA32_CR_PAT, pat);
>>>>>       }
>>>>>
>>>>>       if (!pat) {
>>>>>           /*
>>>>>            * No PAT. Emulate the PAT table that corresponds to the two
>>>>>            * cache bits, PWT (Write Through) and PCD (Cache Disable).
>>>>>            * This setup is also the same as the BIOS default setup.
>>>>>            *
>>>>>            * PTE encoding:
>>>>>            *
>>>>>            *       PCD
>>>>>            *       |PWT  PAT
>>>>>            *       ||    slot
>>>>>            *       00    0    WB : _PAGE_CACHE_MODE_WB
>>>>>            *       01    1    WT : _PAGE_CACHE_MODE_WT
>>>>>            *       10    2    UC-: _PAGE_CACHE_MODE_UC_MINUS
>>>>>            *       11    3    UC : _PAGE_CACHE_MODE_UC
>>>>>            *
>>>>>            * NOTE: When WC or WP is used, it is redirected to UC- per
>>>>>            * the default setup in __cachemode2pte_tbl[].
>>>>>            */
>>>>>           pat = PAT(0, WB) | PAT(1, WT) | PAT(2, UC_MINUS) | PAT(3, UC) |
>>>>>                 PAT(4, WB) | PAT(5, WT) | PAT(6, UC_MINUS) | PAT(7, UC);
>>>>>       }
>>>>>
>>>>>       else if (!pat_bp_enabled) {
>>>>>           /*
>>>>>            * In some environments, specifically Xen PV, PAT
>>>>>            * initialization is skipped because MTRRs are
>>>>>            * disabled even though PAT is available. In such
>>>>>            * environments, set PAT to initialized and enabled to
>>>>>            * correctly indicate to callers of pat_enabled() that
>>>>>            * PAT is available and prevent PAT from being disabled.
>>>>>            */
>>>>>           pat_bp_enabled = true;
>>>>>           pr_info("x86/PAT: PAT enabled by init_cache_modes\n");
>>>>>       }
>>>>>
>>>>>       __init_cache_modes(pat);
>>>>> }
>>>>>
>>>>> This function, patched with the extra 'else if' block, fixes the
>>>>> regression on my Xen worksatation, and the pr_info message
>>>>> "x86/PAT: PAT enabled by init_cache_modes" appears in the logs
>>>>> when running this patched kernel in my Xen Dom0. This means
>>>>> that in the Xen PV environment on my Xen Dom0 workstation,
>>>>> rdmsrl(MSR_IA32_CR_PAT, pat) successfully tested for the presence
>>>>> of PAT on the virtual CPU that Xen exposed to the Linux kernel on my
>>>>> Xen Dom0 workstation. At least that is what I think my tests prove.
>>>>>
>>>>> So why is this not a valid way to test for the existence of
>>>>> PAT in the Xen PV environment? Are the existing comments
>>>>> in init_cache_modes() about supporting both the case when
>>>>> the "nopat" boot option is set and the specific case of Xen and
>>>>> MTRR disabled wrong? My testing confirms those comments are
>>>>> correct.
>>>>
>>>> At the very least this ignores the possible "nopat" an admin may
>>>> have passed to the kernel.
>>>
>>> I realize that. The patch I proposed here only fixes the regression. It
>>> would be easy to also modify the patch to also observe the 'nopat"
>>> setting. I think your patch had a force_pat_disable local variable that
>>> is set if pat is disabled by the administrator with "nopat." With that
>>> variable available, modifying the patch so in init_cache_modes we have:
>>>
>>>        if (!pat || force_pat_disable) {
>>>            /*
>>>             * No PAT. Emulate the PAT table that corresponds to the two
>>>
>>> Instead of:
>>>
>>>        if (!pat) {
>>>            /*
>>>             * No PAT. Emulate the PAT table that corresponds to the two
>>>
>>> would cause the kernel to respect the "nopat" setting by the administrator
>>> in the Xen PV Dom0 environment.
>>
>> Chuck, could you please send out a proper patch with your initial fix
>> (setting pat_bp_enabled) and the fix above?
>>
>> I've chatted with Boris Petkov on IRC and he is fine with that.
> 
> That's great, I will submit a formal patch later today.
> 
>>
>>> I agree this needs to be fixed up, because currently the code is very
>>> confusing and the current variable names and function names do not
>>> always accurately describe what they actually do in the code. That is
>>> why I am working on a patch to do some re-factoring, which only consists
>>> of function and variable name changes and comment changes to fix
>>> the places where the comments in the code are misleading or incomplete.
>>
>> Boris and I agreed to pursue my approach further by removing the
>> dependency between PAT and MTRR and to make this whole mess more
>> clear.
>>
>> I will start to work on this as soon as possible, which will
>> probably be some time in September.
> 
> Good, I will look for your patches and try them out.
> 
>>
>>> I think perhaps the most misnamed variable here is the  local
>>> variable pat_disabled in memtypes.c and the most misnamed function is the
>>> pat_disable function in memtypes.c. They should be named pat_init_disabled
>>> and pat_init_disable, respectively, because they do not really disable
>>> PAT in
>>> the code but only prevent execution of the pat_init function. That
>>> leaves open
>>> the possibility for PAT to be enabled by init_cache_modes, which actually
>>> occurs in the current code in the Xen PV Dom0 environment, but the current
>>> code neglects to set pat_bp_enabled to true in that case. So we need a patch
>>> to fix that in order to fix the regression.
>>
>> In principle I agree, but you should be aware of my refactoring plans.
> 
> I will defer to you and stop working on this re-factoring effort, but I
> will prepare a formal patch to fix the regression later today.
> 
> I do think Jan's point about respecting the administrator's "nopat" setting
> should also be considered. AFAICT, the "nopat" option in current code

Yes, please add that, too. This was what I meant with "the fix above".

> is also not being respected on the bare metal, and the patch to
> init_cache_modes with a force_no_pat variable is also needed to
> ensure "nopat" is respected on the bare metal, AFAICT.

Hmm, I don't see how the PAT MSR will be written on bare metal when "nopat"
has been specified.

I just tried it with a 5.19 kernel and it booted with the correct PAT
settings:

[    0.000000] x86/PAT: PAT support disabled via boot option.
[    0.000986] x86/PAT: Configuration [0-7]: WB  WT  UC- UC  WB  WT  UC- UC


Juergen

Download attachment "OpenPGP_0xB0DE9DD628BF132F.asc" of type "application/pgp-keys" (3099 bytes)

Download attachment "OpenPGP_signature" of type "application/pgp-signature" (496 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ