Message-ID: <e5431d53-fa43-d6e6-9af3-4313c466e991@suse.com>
Date:   Wed, 17 Aug 2022 11:17:01 +0200
From:   Juergen Gross <jgross@...e.com>
To:     Borislav Petkov <bp@...en8.de>
Cc:     xen-devel@...ts.xenproject.org, x86@...nel.org,
        linux-kernel@...r.kernel.org, brchuckz@...scape.net,
        jbeulich@...e.com, Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        "H. Peter Anvin" <hpa@...or.com>,
        Andy Lutomirski <luto@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Boris Ostrovsky <boris.ostrovsky@...cle.com>,
        stable@...r.kernel.org
Subject: Re: [PATCH 3/3] x86: decouple pat and mtrr handling

On 19.07.22 17:15, Borislav Petkov wrote:
> On Fri, Jul 15, 2022 at 04:25:49PM +0200, Juergen Gross wrote:
>> Today PAT is usable only with MTRR being active, with some nasty tweaks
>> to make PAT usable when running as Xen PV guest, which doesn't support
>> MTRR.
>>
>> The reason for this coupling is, that both, PAT MSR changes and MTRR
>> changes, require a similar sequence and so full PAT support was added
>> using the already available MTRR handling.
>>
>> Xen PV PAT handling can work without MTRR, as it just needs to consume
>> the PAT MSR setting done by the hypervisor without the ability and need
>> to change it. This in turn has resulted in a convoluted initialization
>> sequence and wrong decisions regarding cache mode availability due to
>> misguiding PAT availability flags.
>>
>> Fix all of that by allowing to use PAT without MTRR and by adding an
>> environment dependent PAT init function.
> 
> Aha, there's the explanation I was looking for.
> 
>> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
>> index 0a1bd14f7966..3edfb779dab5 100644
>> --- a/arch/x86/kernel/cpu/common.c
>> +++ b/arch/x86/kernel/cpu/common.c
>> @@ -2408,8 +2408,8 @@ void __init cache_bp_init(void)
>>   {
>>   	if (IS_ENABLED(CONFIG_MTRR))
>>   		mtrr_bp_init();
>> -	else
>> -		pat_disable("PAT support disabled because CONFIG_MTRR is disabled in the kernel.");
>> +
>> +	pat_cpu_init();
>>   }
>>   
>>   void cache_ap_init(void)
>> @@ -2417,7 +2417,8 @@ void cache_ap_init(void)
>>   	if (cache_aps_delayed_init)
>>   		return;
>>   
>> -	mtrr_ap_init();
>> +	if (!mtrr_ap_init())
>> +		pat_ap_init_nomtrr();
>>   }
> 
> So I'm reading this as: if it couldn't init AP's MTRRs, init its PAT.
> 
> But currently, the code sets the MTRRs for the delayed case or when the
> CPU is not online by doing ->set_all and in there it sets first MTRRs
> and then PAT.
> 
> I think the code above should simply try the two things, one after the
> other, independently from one another.
> 
> And I see you've added another stomp machine call for PAT only.
> 
> Now, what I think the design of all this should be, is:
> 
> you have a bunch of things you need to do at each point:
> 
> * cache_ap_init
> 
> * cache_aps_init
> 
> * ...
> 
> Now, in each those, you look at whether PAT or MTRR is supported and you
> do only those which are supported.
> 
> Also, the rendezvous handler should do:
> 
> 	if MTRR:
> 		do MTRR specific stuff
> 
> 	if PAT:
> 		do PAT specific stuff
> 
> This way you have clean definitions of what needs to happen when and you
> also do *only* the things that the platform supports, by keeping the
> proper order of operations - I believe MTRRs first and then PAT.
> 
> This way we'll get rid of that crazy maze of who calls what and when.
> 
> But first we need to define those points where stuff needs to happen and
> then for each point define what stuff needs to happen.
> 
> How does that sound?

This calls for some more cleanup in the MTRR code:

mtrr_if->set_all() is the relevant callback, and it will only ever be called
for the generic case (use_intel() == true), so I think we want to:

- remove the cyrix specific set_all() function
- split the set_all() callback case from mtrr_rendezvous_handler() into a
   dedicated rendezvous handler
- remove the set_all() member from struct mtrr_ops and directly call
   generic_set_all() from the new rendezvous handler
- optional: rename use_intel() to use_generic(), or even just introduce
   a static bool for that purpose

Then the new rendezvous handler can be modified as you suggested.
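
To make the intended shape concrete, here is a rough user-space sketch of such a dedicated handler. It is only an illustration under assumptions, not actual kernel code: struct cache_state, the two *_stub() helpers and the name cache_rendezvous_handler() are hypothetical, and the stubs merely record that they ran instead of touching any MSRs. The ordering (MTRR first, then PAT, each only if supported) follows your suggestion above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state; the real kernel would consult its own flags. */
struct cache_state {
	bool mtrr_enabled;	/* assumption: MTRR is usable on this platform */
	bool pat_enabled;	/* assumption: PAT is usable on this platform */
	char log[8];		/* order of the executed steps, for illustration */
	size_t len;
};

/* Append one step marker to the log, keeping it NUL-terminated. */
static void record(struct cache_state *s, char step)
{
	if (s->len + 1 < sizeof(s->log)) {
		s->log[s->len++] = step;
		s->log[s->len] = '\0';
	}
}

/* Stand-in for generic_set_all(); only records that it ran. */
static void generic_set_all_stub(struct cache_state *s)
{
	record(s, 'M');
}

/* Stand-in for the PAT MSR setup; only records that it ran. */
static void pat_init_stub(struct cache_state *s)
{
	record(s, 'P');
}

/* Dedicated rendezvous handler: MTRR first, then PAT, each only if supported. */
static int cache_rendezvous_handler(void *data)
{
	struct cache_state *s = data;

	if (s->mtrr_enabled)
		generic_set_all_stub(s);

	if (s->pat_enabled)
		pat_init_stub(s);

	return 0;
}
```

With both flags set the sketch runs the MTRR step followed by the PAT step; with only pat_enabled set (the Xen PV like case) it runs the PAT step alone, and with neither set it does nothing.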

Are you okay with that route?


Juergen

