[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87h7sgk41y.fsf@nanos.tec.linutronix.de>
Date: Wed, 02 Sep 2020 09:53:29 +0200
From: Thomas Gleixner <tglx@...utronix.de>
To: Joel Fernandes <joel@...lfernandes.org>
Cc: Julien Desfossez <jdesfossez@...italocean.com>,
Peter Zijlstra <peterz@...radead.org>,
Vineeth Pillai <viremana@...ux.microsoft.com>,
Tim Chen <tim.c.chen@...ux.intel.com>,
Aaron Lu <aaron.lwe@...il.com>,
Aubrey Li <aubrey.intel@...il.com>,
Dhaval Giani <dhaval.giani@...cle.com>,
Chris Hyser <chris.hyser@...cle.com>,
Nishanth Aravamudan <naravamudan@...italocean.com>,
mingo@...nel.org, pjt@...gle.com, torvalds@...ux-foundation.org,
linux-kernel@...r.kernel.org, fweisbec@...il.com,
keescook@...omium.org, kerrnel@...gle.com,
Phil Auld <pauld@...hat.com>,
Valentin Schneider <valentin.schneider@....com>,
Mel Gorman <mgorman@...hsingularity.net>,
Pawan Gupta <pawan.kumar.gupta@...ux.intel.com>,
Paolo Bonzini <pbonzini@...hat.com>, vineeth@...byteword.org,
Chen Yu <yu.c.chen@...el.com>,
Christian Brauner <christian.brauner@...ntu.com>,
Agata Gruza <agata.gruza@...el.com>,
Antonio Gomez Iglesias <antonio.gomez.iglesias@...el.com>,
graf@...zon.com, konrad.wilk@...cle.com, dfaggioli@...e.com,
rostedt@...dmis.org, derkling@...gle.com, benbjiang@...cent.com,
Aubrey Li <aubrey.li@...ux.intel.com>,
Tim Chen <tim.c.chen@...el.com>,
"Paul E . McKenney" <paulmck@...nel.org>
Subject: Re: [RFC PATCH v7 17/23] kernel/entry: Add support for core-wide protection of kernel-mode
Joel,
On Tue, Sep 01 2020 at 21:29, Joel Fernandes wrote:
> On Tue, Sep 01, 2020 at 10:02:10PM +0200, Thomas Gleixner wrote:
>> The generated code for the CONFIG_PRETENT_HT_SECURE=n case is the same
>
> When you say 'pretend', did you mean 'make' ? The point of this patch is to
> protect the kernel from the other hyperthread thus making HT secure for the
> kernel contexts and not merely pretending.
I'm paranoid and don't trust HT at all. There is too much shared state.
>> --- /dev/null
>> +++ b/include/linux/pretend_ht_secure.h
>> @@ -0,0 +1,21 @@
>> +#ifndef _LINUX_PRETEND_HT_SECURE_H
>> +#define _LINUX_PRETEND_HT_SECURE_H
>> +
>> +#ifdef CONFIG_PRETEND_HT_SECURE
>> +static inline void enter_from_user_ht_sucks(void)
>> +{
>> + if (static_branch_unlikely(&pretend_ht_secure_key))
>> + enter_from_user_pretend_ht_is_secure();
>> +}
>> +
>> +static inline void exit_to_user_ht_sucks(void)
>> +{
>> + if (static_branch_unlikely(&pretend_ht_secure_key))
>> + exit_to_user_pretend_ht_is_secure();
>
> We already have similar config and static keys for the core-scheduling
> feature itself. Can we just make it depend on that?
Of course. This was just for illustration. :)
> Or, are you saying users may want 'core scheduling' enabled but may want to
> leave out the kernel protection?
Core scheduling per se without all the protection muck, i.e. a relaxed
version which tries to gang schedule threads of a process on a core if
feasible has advantages to some workloads.
>> @@ -111,6 +113,12 @@ static __always_inline void exit_to_user
>> /* Workaround to allow gradual conversion of architecture code */
>> void __weak arch_do_signal(struct pt_regs *regs) { }
>>
>> +static inline unsigned long exit_to_user_get_work(void)
>> +{
>> + exit_to_user_ht_sucks();
>
> Ok, one issue with your patch is it does not take care of the waiting logic.
> sched_core_unsafe_exit_wait() needs to be called *after* all of the
> exit_to_user_mode_work is processed. This is because
> sched_core_unsafe_exit_wait() also checks for any new exit-to-usermode-work
> that popped up while it is spinning and breaks out of its spin-till-safe loop
> early. This is key to solving the stop-machine issue. If the stopper needs to
> run, then the need-resched flag will be set and we break out of the spin and
> redo the whole exit_to_user_mode_loop() as it should.
And where is the problem?
syscall_entry()
...
sys_foo()
....
return 0;
local_irq_disable();
exit_to_user_mode_prepare()
ti_work = exit_to_user_get_work()
{
if (ht_muck)
syscall_exit_ht_muck() {
....
while (wait) {
local_irq_enable();
while (wait) cpu_relax();
local_irq_disable();
}
}
return READ_ONCE(current_thread_info()->flags);
}
if (unlikely(ti_work & WORK))
ti_work = exit_loop(ti_work)
while (ti_work & MASK) {
local_irq_enable();
.....
local_irq_disable();
ti_work = exit_to_user_get_work()
{
See above
}
}
It covers both the 'no work' and the 'do work' exit path. If that's not
sufficient, then something is fundamentally wrong with your design.
Thanks,
tglx
Powered by blists - more mailing lists