linux-kernel - Re: [RFC PATCH v7 17/23] kernel/entry: Add support for core-wide protection of kernel-mode

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20200902012905.GB1703315@google.com>
Date:   Tue, 1 Sep 2020 21:29:05 -0400
From:   Joel Fernandes <joel@...lfernandes.org>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     Julien Desfossez <jdesfossez@...italocean.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Vineeth Pillai <viremana@...ux.microsoft.com>,
        Tim Chen <tim.c.chen@...ux.intel.com>,
        Aaron Lu <aaron.lwe@...il.com>,
        Aubrey Li <aubrey.intel@...il.com>,
        Dhaval Giani <dhaval.giani@...cle.com>,
        Chris Hyser <chris.hyser@...cle.com>,
        Nishanth Aravamudan <naravamudan@...italocean.com>,
        mingo@...nel.org, pjt@...gle.com, torvalds@...ux-foundation.org,
        linux-kernel@...r.kernel.org, fweisbec@...il.com,
        keescook@...omium.org, kerrnel@...gle.com,
        Phil Auld <pauld@...hat.com>,
        Valentin Schneider <valentin.schneider@....com>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Pawan Gupta <pawan.kumar.gupta@...ux.intel.com>,
        Paolo Bonzini <pbonzini@...hat.com>, vineeth@...byteword.org,
        Chen Yu <yu.c.chen@...el.com>,
        Christian Brauner <christian.brauner@...ntu.com>,
        Agata Gruza <agata.gruza@...el.com>,
        Antonio Gomez Iglesias <antonio.gomez.iglesias@...el.com>,
        graf@...zon.com, konrad.wilk@...cle.com, dfaggioli@...e.com,
        rostedt@...dmis.org, derkling@...gle.com, benbjiang@...cent.com,
        Aubrey Li <aubrey.li@...ux.intel.com>,
        Tim Chen <tim.c.chen@...el.com>,
        "Paul E . McKenney" <paulmck@...nel.org>
Subject: Re: [RFC PATCH v7 17/23] kernel/entry: Add support for core-wide
 protection of kernel-mode

Hi Thomas,

On Tue, Sep 01, 2020 at 10:02:10PM +0200, Thomas Gleixner wrote:
[..] 
> > The reason for that is, the loop can switch into another thread, so we
> > have to do unsafe_exit() for the old thread, and unsafe_enter() for
> > the new one while handling the tif work properly. We could get
> > migrated to another CPU in this loop itself, AFAICS. So the
> > unsafe_enter() / unsafe_exit() calls could also happen on different
> > CPUs.
> 
> That's fine. It still does not justify to make everything slower even if
> that 'pretend that HT is secure' thing is disabled.
> 
> Something like the below should be sufficient to do what you want
> while restricting the wreckage to the 'pretend ht is secure' case.
> 
> The generated code for the CONFIG_PRETENT_HT_SECURE=n case is the same

When you say 'pretend', did you mean 'make' ? The point of this patch is to
protect the kernel from the other hyperthread thus making HT secure for the
kernel contexts and not merely pretending.

> as without the patch. With CONFIG_PRETENT_HT_SECURE=y the impact is
> exactly two NOP-ed out jumps if the muck is not enabled on the command
> line which should be the default behaviour.

I see where you're coming from, I'll try to rework it to be less intrusive
when core-scheduling is disabled. Some more comments below:

> Thanks,
> 
>         tglx
> 
> ---
> --- /dev/null
> +++ b/include/linux/pretend_ht_secure.h
> @@ -0,0 +1,21 @@
> +#ifndef _LINUX_PRETEND_HT_SECURE_H
> +#define _LINUX_PRETEND_HT_SECURE_H
> +
> +#ifdef CONFIG_PRETEND_HT_SECURE
> +static inline void enter_from_user_ht_sucks(void)
> +{
> +	if (static_branch_unlikely(&pretend_ht_secure_key))
> +		enter_from_user_pretend_ht_is_secure();
> +}
> +
> +static inline void exit_to_user_ht_sucks(void)
> +{
> +	if (static_branch_unlikely(&pretend_ht_secure_key))
> +		exit_to_user_pretend_ht_is_secure();

We already have similar config and static keys for the core-scheduling
feature itself. Can we just make it depend on that?

Or, are you saying users may want 'core scheduling' enabled but may want to
leave out the kernel protection?

> +}
> +#else
> +static inline void enter_from_user_ht_sucks(void) { }
> +static inline void exit_to_user_ht_sucks(void) { }
> +#endif
> +
> +#endif
> --- a/kernel/entry/common.c
> +++ b/kernel/entry/common.c
> @@ -17,6 +17,7 @@
>   * 1) Tell lockdep that interrupts are disabled
>   * 2) Invoke context tracking if enabled to reactivate RCU
>   * 3) Trace interrupts off state
> + * 4) Pretend that HT is secure
>   */
>  static __always_inline void enter_from_user_mode(struct pt_regs *regs)
>  {
> @@ -28,6 +29,7 @@ static __always_inline void enter_from_u
>  
>  	instrumentation_begin();
>  	trace_hardirqs_off_finish();
> +	enter_from_user_ht_sucks();
>  	instrumentation_end();
>  }
>  
> @@ -111,6 +113,12 @@ static __always_inline void exit_to_user
>  /* Workaround to allow gradual conversion of architecture code */
>  void __weak arch_do_signal(struct pt_regs *regs) { }
>  
> +static inline unsigned long exit_to_user_get_work(void)
> +{
> +	exit_to_user_ht_sucks();

Ok, one issue with your patch is it does not take care of the waiting logic.
sched_core_unsafe_exit_wait() needs to be called *after* all of the
exit_to_user_mode_work is processed. This is because
sched_core_unsafe_exit_wait() also checks for any new exit-to-usermode-work
that popped up while it is spinning and breaks out of its spin-till-safe loop
early. This is key to solving the stop-machine issue. If the stopper needs to
run, then the need-resched flag will be set and we break out of the spin and
redo the whole exit_to_user_mode_loop() as it should.

I agree with the need to make the ASM suck less if the feature is turned off
though, and I can try to cook something along those lines. Thanks for the idea!

thanks,

 - Joel


> +	return READ_ONCE(current_thread_info()->flags);
> +}
> +
>  static unsigned long exit_to_user_mode_loop(struct pt_regs *regs,
>  					    unsigned long ti_work)
>  {
> @@ -149,7 +157,7 @@ static unsigned long exit_to_user_mode_l
>  		 * enabled above.
>  		 */
>  		local_irq_disable_exit_to_user();
> -		ti_work = READ_ONCE(current_thread_info()->flags);
> +		ti_work = exit_to_user_get_work();
>  	}
>  
>  	/* Return the latest work state for arch_exit_to_user_mode() */
> @@ -158,7 +166,7 @@ static unsigned long exit_to_user_mode_l
>  
>  static void exit_to_user_mode_prepare(struct pt_regs *regs)
>  {
> -	unsigned long ti_work = READ_ONCE(current_thread_info()->flags);
> +	unsigned long ti_work = exit_to_user_get_work();
>  
>  	lockdep_assert_irqs_disabled();
>