linux-kernel - Re: [RFC 1/2] x86/bugs: Disable coresched on hardware that does not need it

Message-ID: <8d4a522a-4ba7-f9f9-0acd-11b0def561c2@amazon.com>
Date:   Thu, 12 Nov 2020 15:52:32 +0100
From:   Alexander Graf <graf@...zon.com>
To:     Joel Fernandes <joel@...lfernandes.org>
CC:     Nishanth Aravamudan <naravamudan@...italocean.com>,
        Julien Desfossez <jdesfossez@...italocean.com>,
        Peter Zijlstra <peterz@...radead.org>,
        "Tim Chen" <tim.c.chen@...ux.intel.com>,
        Vineeth Pillai <viremana@...ux.microsoft.com>,
        Aaron Lu <aaron.lwe@...il.com>,
        Aubrey Li <aubrey.intel@...il.com>,
        Thomas Glexiner <tglx@...utronix.de>,
        LKML <linux-kernel@...r.kernel.org>,
        Ingo Molnar <mingo@...nel.org>,
        "Linus Torvalds" <torvalds@...ux-foundation.org>,
        Frederic Weisbecker <fweisbec@...il.com>,
        Kees Cook <keescook@...omium.org>,
        Greg Kerr <kerrnel@...gle.com>, Phil Auld <pauld@...hat.com>,
        Valentin Schneider <valentin.schneider@....com>,
        Mel Gorman <mgorman@...hsingularity.net>,
        "Pawan Gupta" <pawan.kumar.gupta@...ux.intel.com>,
        Paolo Bonzini <pbonzini@...hat.com>, <vineeth@...byteword.org>,
        Chen Yu <yu.c.chen@...el.com>,
        Christian Brauner <christian.brauner@...ntu.com>,
        Agata Gruza <agata.gruza@...el.com>,
        Antonio Gomez Iglesias <antonio.gomez.iglesias@...el.com>,
        <konrad.wilk@...cle.com>, Dario Faggioli <dfaggioli@...e.com>,
        Paul Turner <pjt@...gle.com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Patrick Bellasi <derkling@...gle.com>,
        benbjiang(蒋彪) <benbjiang@...cent.com>,
        "Alexandre Chartre" <alexandre.chartre@...cle.com>,
        <James.Bottomley@...senpartnership.com>, <OWeisse@...ch.edu>,
        Dhaval Giani <dhaval.giani@...cle.com>,
        Junaid Shahid <junaids@...gle.com>,
        Jesse Barnes <jsbarnes@...gle.com>,
        "Hyser,Chris" <chris.hyser@...cle.com>,
        Ben Segall <bsegall@...gle.com>, Josh Don <joshdon@...gle.com>,
        Hao Luo <haoluo@...gle.com>,
        "Anand K. Mistry" <amistry@...gle.com>,
        Borislav Petkov <bp@...en8.de>,
        Daniel Bristot de Oliveira <bristot@...hat.com>,
        "Dietmar Eggemann" <dietmar.eggemann@....com>,
        "H. Peter Anvin" <hpa@...or.com>, "Ingo Molnar" <mingo@...hat.com>,
        Juri Lelli <juri.lelli@...hat.com>,
        Mel Gorman <mgorman@...e.de>, Mike Rapoport <rppt@...nel.org>,
        Tom Lendacky <thomas.lendacky@....com>,
        Tony Luck <tony.luck@...el.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        "maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)" <x86@...nel.org>
Subject: Re: [RFC 1/2] x86/bugs: Disable coresched on hardware that does not
 need it



On 12.11.20 14:40, Joel Fernandes wrote:
> 
> On Wed, Nov 11, 2020 at 11:29:37PM +0100, Alexander Graf wrote:
>>
>>
>> On 11.11.20 23:15, Joel Fernandes wrote:
>>>
>>> On Wed, Nov 11, 2020 at 5:13 PM Joel Fernandes <joel@...lfernandes.org> wrote:
>>>>
>>>> On Wed, Nov 11, 2020 at 5:00 PM Alexander Graf <graf@...zon.com> wrote:
>>>>> On 11.11.20 22:14, Joel Fernandes wrote:
>>>>>>> Some hardware such as certain AMD variants don't have cross-HT MDS/L1TF
>>>>>>> issues. Detect this and don't enable core scheduling as it can
>>>>>>> needlessly slow the device down.
>>>>>>>
>>>>>>> diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
>>>>>>> index dece79e4d1e9..0e6e61e49b23 100644
>>>>>>> --- a/arch/x86/kernel/cpu/bugs.c
>>>>>>> +++ b/arch/x86/kernel/cpu/bugs.c
>>>>>>> @@ -152,6 +152,14 @@ void __init check_bugs(void)
>>>>>>>     #endif
>>>>>>>     }
>>>>>>>
>>>>>>> +/*
>>>>>>> + * Do not need core scheduling if CPU does not have MDS/L1TF vulnerability.
>>>>>>> + */
>>>>>>> +int arch_allow_core_sched(void)
>>>>>>> +{
>>>>>>> +       return boot_cpu_has_bug(X86_BUG_MDS) || boot_cpu_has_bug(X86_BUG_L1TF);
>>>>>
>>>>> Can we make this more generic and user settable, similar to the L1 cache
>>>>> flushing modes in KVM?
>>>>>
>>>>> I am not 100% convinced that there are no other thread sibling attacks
>>>>> possible without MDS and L1TF. If I'm paranoid, I want to still be able
>>>>> to force enable core scheduling.
>>>>>
>>>>> In addition, we are also using core scheduling as a poor man's mechanism
>>>>> to give customers consistent performance for virtual machine thread
>>>>> siblings. This is important irrespective of CPU bugs. In such a
>>>>> scenario, I want to force enable core scheduling.
>>>>
>>>> Ok, I can make it a new kernel command line option with:
>>>> coresched=on
>>>> coresched=secure (only if HW has MDS/L1TF)
>>>> coresched=off
>>>
>>> Also, I would keep "secure" as the default.  (And probably, we should
>>> modify the informational messages in sysfs to reflect this..)
>>
>> I agree that "secure" should be the default.
> 
> Ok.
> 
>> Can we also integrate into the "mitigations" kernel command line[1] for this?
> 
> Sure, the integration into [1] sounds conceptually fine to me, however it is
> not super straightforward. For example: what if the user wants to force-enable
> core scheduling for the use case you mention, but still wants the cross-HT
> mitigation because they are only tagging VMs (as in your use case) and not
> other tasks? Idk.

Can we roll this backwards from what you would expect as a user? How 
about we make this 2-dimensional?

   coresched=[on|off|secure][,force]

where "on" means "core scheduling can be done if colors are set", "off" 
means "no core scheduling is done" and "secure" means "core scheduling 
can be done on hardware affected by MDS or L1TF if colors are set".

The "force" option would then mean "apply a color to every new task".
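
To make the two dimensions concrete, here is a rough userspace sketch of how such a value could be parsed. All names (coresched_mode, coresched_opt, coresched_parse, coresched_allowed) are made up for illustration and do not exist in the kernel tree; a real implementation would hook into the kernel's own command line handling instead.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical names, for illustration only. */
enum coresched_mode { CORESCHED_OFF, CORESCHED_ON, CORESCHED_SECURE };

struct coresched_opt {
	enum coresched_mode mode;
	bool force;	/* "apply a color to every new task" */
};

/*
 * Parse the value part of "coresched=<on|off|secure>[,force]".
 * Returns 0 on success, -1 on a malformed value. *opt is only
 * written on success.
 */
static int coresched_parse(const char *arg, struct coresched_opt *opt)
{
	const char *comma = strchr(arg, ',');
	size_t len = comma ? (size_t)(comma - arg) : strlen(arg);
	struct coresched_opt tmp = { CORESCHED_OFF, false };

	if (len == 2 && !strncmp(arg, "on", len))
		tmp.mode = CORESCHED_ON;
	else if (len == 3 && !strncmp(arg, "off", len))
		tmp.mode = CORESCHED_OFF;
	else if (len == 6 && !strncmp(arg, "secure", len))
		tmp.mode = CORESCHED_SECURE;
	else
		return -1;

	if (comma) {
		/* The only modifier discussed so far is "force". */
		if (strcmp(comma + 1, "force"))
			return -1;
		tmp.force = true;
	}

	*opt = tmp;
	return 0;
}

/*
 * May core scheduling be done at all? "secure" only allows it on
 * hardware that has the MDS or L1TF bug.
 */
static bool coresched_allowed(const struct coresched_opt *opt,
			      bool cpu_has_mds_or_l1tf)
{
	switch (opt->mode) {
	case CORESCHED_ON:
		return true;
	case CORESCHED_SECURE:
		return cpu_has_mds_or_l1tf;
	default:
		return false;
	}
}
```

With this, "secure,force" parses to mode CORESCHED_SECURE with force set, and coresched_allowed() then only returns true when the CPU is affected. Whether "force" should be accepted together with "off" is left open here; the sketch simply parses it.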

What then happens with mitigations= is easy. "auto" means 
"coresched=secure". "off" means "coresched=off" and if you want to force 
core scheduling for everything if necessary, you just do 
mitigations=auto coresched=auto,force.
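
As a sketch of that mapping (again with made-up names, and assuming that an explicit coresched= on the command line takes precedence over whatever mitigations= implies, which is my assumption and not something settled in this thread):

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical names, for illustration only. */
enum coresched_mode { CORESCHED_OFF, CORESCHED_ON, CORESCHED_SECURE };

/*
 * Default core scheduling mode implied by mitigations=.
 * Only "auto" and "off" are covered by the proposal; treating a
 * missing or any other value like "auto" is an assumption.
 */
static enum coresched_mode coresched_default_from_mitigations(const char *mitigations)
{
	if (mitigations && !strcmp(mitigations, "off"))
		return CORESCHED_OFF;
	return CORESCHED_SECURE;
}

/*
 * Effective mode: an explicit coresched= value (non-NULL pointer)
 * overrides the default derived from mitigations= (assumed precedence).
 */
static enum coresched_mode coresched_effective(const enum coresched_mode *explicit_mode,
					       const char *mitigations)
{
	if (explicit_mode)
		return *explicit_mode;
	return coresched_default_from_mitigations(mitigations);
}
```

So a plain mitigations=off boot would end up with core scheduling off, while mitigations=off coresched=on would still allow it under the assumed precedence.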

Am I missing something obvious? :)

> The best thing to do could be to keep the "auto disable HT" controls and
> logic separate from the "coresched=on" logic and let the user choose. The
> exception being, coresched=secure means that on HW that does not have
> vulnerability, we will not activate the core scheduling.

I'm much more interested in the coresched=off one for mitigations=. It's 
what we have introduced a while back to save people from setting 50 
different command line options.


Alex



Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879