lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <45fd7ade-61fe-18fa-aacc-ed80fdf3fb8d@suse.com>
Date:   Tue, 12 Jul 2022 15:32:37 +0200
From:   Juergen Gross <jgross@...e.com>
To:     Chuck Zmudzinski <brchuckz@...scape.net>,
        Jan Beulich <jbeulich@...e.com>
Cc:     Andrew Lutomirski <luto@...nel.org>,
        "dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
        Peter Zijlstra <peterz@...radead.org>,
        lkml <linux-kernel@...r.kernel.org>,
        "xen-devel@...ts.xenproject.org" <xen-devel@...ts.xenproject.org>,
        Borislav Petkov <bp@...en8.de>
Subject: Re: [PATCH] x86/PAT: have pat_enabled() properly reflect state when
 running on e.g. Xen

On 12.07.22 15:22, Chuck Zmudzinski wrote:
> On 7/12/2022 2:04 AM, Jan Beulich wrote:
>> On 11.07.2022 19:41, Chuck Zmudzinski wrote:
>>> Moreover... (please move to the bottom of the code snippet
>>> for more information about my tests in the Xen PV environment...)
>>>
>>> void init_cache_modes(void)
>>> {
>>>      u64 pat = 0;
>>>
>>>      if (pat_cm_initialized)
>>>          return;
>>>
>>>      if (boot_cpu_has(X86_FEATURE_PAT)) {
>>>          /*
>>>           * CPU supports PAT. Set PAT table to be consistent with
>>>           * PAT MSR. This case supports "nopat" boot option, and
>>>           * virtual machine environments which support PAT without
>>>           * MTRRs. In specific, Xen has unique setup to PAT MSR.
>>>           *
>>>           * If PAT MSR returns 0, it is considered invalid and emulates
>>>           * as No PAT.
>>>           */
>>>          rdmsrl(MSR_IA32_CR_PAT, pat);
>>>      }
>>>
>>>      if (!pat) {
>>>          /*
>>>           * No PAT. Emulate the PAT table that corresponds to the two
>>>           * cache bits, PWT (Write Through) and PCD (Cache Disable).
>>>           * This setup is also the same as the BIOS default setup.
>>>           *
>>>           * PTE encoding:
>>>           *
>>>           *       PCD
>>>           *       |PWT  PAT
>>>           *       ||    slot
>>>           *       00    0    WB : _PAGE_CACHE_MODE_WB
>>>           *       01    1    WT : _PAGE_CACHE_MODE_WT
>>>           *       10    2    UC-: _PAGE_CACHE_MODE_UC_MINUS
>>>           *       11    3    UC : _PAGE_CACHE_MODE_UC
>>>           *
>>>           * NOTE: When WC or WP is used, it is redirected to UC- per
>>>           * the default setup in __cachemode2pte_tbl[].
>>>           */
>>>          pat = PAT(0, WB) | PAT(1, WT) | PAT(2, UC_MINUS) | PAT(3, UC) |
>>>                PAT(4, WB) | PAT(5, WT) | PAT(6, UC_MINUS) | PAT(7, UC);
>>>      }
>>>
>>>      else if (!pat_bp_enabled) {
>>>          /*
>>>           * In some environments, specifically Xen PV, PAT
>>>           * initialization is skipped because MTRRs are
>>>           * disabled even though PAT is available. In such
>>>           * environments, set PAT to initialized and enabled to
>>>           * correctly indicate to callers of pat_enabled() that
>>>           * PAT is available and prevent PAT from being disabled.
>>>           */
>>>          pat_bp_enabled = true;
>>>          pr_info("x86/PAT: PAT enabled by init_cache_modes\n");
>>>      }
>>>
>>>      __init_cache_modes(pat);
>>> }
>>>
>>> This function, patched with the extra 'else if' block, fixes the
>>> regression on my Xen worksatation, and the pr_info message
>>> "x86/PAT: PAT enabled by init_cache_modes" appears in the logs
>>> when running this patched kernel in my Xen Dom0. This means
>>> that in the Xen PV environment on my Xen Dom0 workstation,
>>> rdmsrl(MSR_IA32_CR_PAT, pat) successfully tested for the presence
>>> of PAT on the virtual CPU that Xen exposed to the Linux kernel on my
>>> Xen Dom0 workstation. At least that is what I think my tests prove.
>>>
>>> So why is this not a valid way to test for the existence of
>>> PAT in the Xen PV environment? Are the existing comments
>>> in init_cache_modes() about supporting both the case when
>>> the "nopat" boot option is set and the specific case of Xen and
>>> MTRR disabled wrong? My testing confirms those comments are
>>> correct.
>>
>> At the very least this ignores the possible "nopat" an admin may
>> have passed to the kernel.
> 
> I realize that. The patch I proposed here only fixes the regression. It
> would be easy to also modify the patch to also observe the 'nopat"
> setting. I think your patch had a force_pat_disable local variable that
> is set if pat is disabled by the administrator with "nopat." With that
> variable available, modifying the patch so in init_cache_modes we have:
> 
>       if (!pat || force_pat_disable) {
>           /*
>            * No PAT. Emulate the PAT table that corresponds to the two
> 
> Instead of:
> 
>       if (!pat) {
>           /*
>            * No PAT. Emulate the PAT table that corresponds to the two
> 
> would cause the kernel to respect the "nopat" setting by the administrator
> in the Xen PV Dom0 environment.

Chuck, could you please send out a proper patch with your initial fix
(setting pat_bp_enabled) and the fix above?

I've chatted with Boris Petkov on IRC and he is fine with that.

> I agree this needs to be fixed up, because currently the code is very
> confusing and the current variable names and function names do not
> always accurately describe what they actually do in the code. That is
> why I am working on a patch to do some re-factoring, which only consists
> of function and variable name changes and comment changes to fix
> the places where the comments in the code are misleading or incomplete.

Boris and I agreed to pursue my approach further by removing the
dependency between PAT and MTRR and to make this whole mess more
clear.

I will start to work on this as soon as possible, which will
probably be some time in September.

> I think perhaps the most misnamed variable here is the  local
> variable pat_disabled in memtypes.c and the most misnamed function is the
> pat_disable function in memtypes.c. They should be named pat_init_disabled
> and pat_init_disable, respectively, because they do not really disable
> PAT in
> the code but only prevent execution of the pat_init function. That
> leaves open
> the possibility for PAT to be enabled by init_cache_modes, which actually
> occurs in the current code in the Xen PV Dom0 environment, but the current
> code neglects to set pat_bp_enabled to true in that case. So we need a patch
> to fix that in order to fix the regression.

In principle I agree, but you should be aware of my refactoring plans.


Juergen


Download attachment "OpenPGP_0xB0DE9DD628BF132F.asc" of type "application/pgp-keys" (3099 bytes)

Download attachment "OpenPGP_signature" of type "application/pgp-signature" (496 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ