linux-kernel - Re: [RFC PATCH v7 17/23] kernel/entry: Add support for core-wide protection of kernel-mode


Message-ID: <20200902151200.GA2474204@google.com>
Date:   Wed, 2 Sep 2020 11:12:00 -0400
From:   Joel Fernandes <joel@...lfernandes.org>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     Julien Desfossez <jdesfossez@...italocean.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Vineeth Pillai <viremana@...ux.microsoft.com>,
        Tim Chen <tim.c.chen@...ux.intel.com>,
        Aaron Lu <aaron.lwe@...il.com>,
        Aubrey Li <aubrey.intel@...il.com>,
        Dhaval Giani <dhaval.giani@...cle.com>,
        Chris Hyser <chris.hyser@...cle.com>,
        Nishanth Aravamudan <naravamudan@...italocean.com>,
        mingo@...nel.org, pjt@...gle.com, torvalds@...ux-foundation.org,
        linux-kernel@...r.kernel.org, fweisbec@...il.com,
        keescook@...omium.org, kerrnel@...gle.com,
        Phil Auld <pauld@...hat.com>,
        Valentin Schneider <valentin.schneider@....com>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Pawan Gupta <pawan.kumar.gupta@...ux.intel.com>,
        Paolo Bonzini <pbonzini@...hat.com>, vineeth@...byteword.org,
        Chen Yu <yu.c.chen@...el.com>,
        Christian Brauner <christian.brauner@...ntu.com>,
        Agata Gruza <agata.gruza@...el.com>,
        Antonio Gomez Iglesias <antonio.gomez.iglesias@...el.com>,
        graf@...zon.com, konrad.wilk@...cle.com, dfaggioli@...e.com,
        rostedt@...dmis.org, derkling@...gle.com, benbjiang@...cent.com,
        Aubrey Li <aubrey.li@...ux.intel.com>,
        Tim Chen <tim.c.chen@...el.com>,
        "Paul E . McKenney" <paulmck@...nel.org>
Subject: Re: [RFC PATCH v7 17/23] kernel/entry: Add support for core-wide
 protection of kernel-mode

Hi Thomas,

On Wed, Sep 02, 2020 at 09:53:29AM +0200, Thomas Gleixner wrote:
[...]
> >> --- /dev/null
> >> +++ b/include/linux/pretend_ht_secure.h
> >> @@ -0,0 +1,21 @@
> >> +#ifndef _LINUX_PRETEND_HT_SECURE_H
> >> +#define _LINUX_PRETEND_HT_SECURE_H
> >> +
> >> +#ifdef CONFIG_PRETEND_HT_SECURE
> >> +static inline void enter_from_user_ht_sucks(void)
> >> +{
> >> +	if (static_branch_unlikely(&pretend_ht_secure_key))
> >> +		enter_from_user_pretend_ht_is_secure();
> >> +}
> >> +
> >> +static inline void exit_to_user_ht_sucks(void)
> >> +{
> >> +	if (static_branch_unlikely(&pretend_ht_secure_key))
> >> +		exit_to_user_pretend_ht_is_secure();
> >
> > We already have similar config and static keys for the core-scheduling
> > feature itself. Can we just make it depend on that?
> 
> Of course. This was just for illustration. :)

Got it. :)

> > Or, are you saying users may want 'core scheduling' enabled but may want to
> > leave out the kernel protection?
> 
> Core scheduling per se without all the protection muck, i.e. a relaxed
> version which tries to gang schedule threads of a process on a core if
> feasible has advantages to some workloads.

Sure. So I will make it depend on the existing core-scheduling
config/static key, so that the kernel protection is in place whenever core
scheduling is enabled (that is, both userspace and, with this patch, the
kernel are protected).
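
As an illustration only, gating the entry/exit hooks on a single existing
switch can be modelled in user space as below. This is a minimal sketch, not
kernel code: the plain bool stands in for the core-scheduling static key, and
every model_* name is hypothetical and does not appear in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the existing core-scheduling static key. */
static bool model_sched_core_enabled;

/* Number of kernel entries currently tracked as "unsafe" by the model. */
static int model_unsafe_depth;

static void model_unsafe_enter(void)
{
	model_unsafe_depth++;
}

static void model_unsafe_exit(void)
{
	/* An exit must always pair with an earlier tracked entry. */
	assert(model_unsafe_depth > 0);
	model_unsafe_depth--;
}

/* Entry hook: does nothing unless core scheduling is switched on. */
static inline void model_enter_from_user(void)
{
	if (model_sched_core_enabled)
		model_unsafe_enter();
}

/* Exit hook: gated on the same switch, so no second config/key is needed. */
static inline void model_exit_to_user(void)
{
	if (model_sched_core_enabled)
		model_unsafe_exit();
}
```

With the switch off, both hooks are no-ops; with it on, every entry is
matched by an exit. The model assumes the switch does not change between an
entry and its matching exit.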

> 
> >> @@ -111,6 +113,12 @@ static __always_inline void exit_to_user
> >>  /* Workaround to allow gradual conversion of architecture code */
> >>  void __weak arch_do_signal(struct pt_regs *regs) { }
> >>  
> >> +static inline unsigned long exit_to_user_get_work(void)
> >> +{
> >> +	exit_to_user_ht_sucks();
> >
> > Ok, one issue with your patch is it does not take care of the waiting logic.
> > sched_core_unsafe_exit_wait() needs to be called *after* all of the
> > exit_to_user_mode_work is processed. This is because
> > sched_core_unsafe_exit_wait() also checks for any new exit-to-usermode-work
> > that popped up while it is spinning and breaks out of its spin-till-safe loop
> > early. This is key to solving the stop-machine issue. If the stopper needs to
> > run, then the need-resched flag will be set and we break out of the spin and
> > redo the whole exit_to_user_mode_loop() as it should.
> 
> And where is the problem?
> 
> syscall_entry()
>   ...
>     sys_foo()
>       ....
>       return 0;
> 
>   local_irq_disable();
>   exit_to_user_mode_prepare()
>     ti_work = exit_to_user_get_work()
>        {
>         if (ht_muck)
>           syscall_exit_ht_muck() {
>             ....
>             while (wait) {
>             	local_irq_enable();
>                 while (wait) cpu_relax();
>                 local_irq_disable();
>             }
>           }
>         return READ_ONCE(current_thread_info()->flags);
>        }
> 
>     if (unlikely(ti_work & WORK))
>     	ti_work = exit_loop(ti_work)
> 
>         while (ti_work & MASK) {
>           local_irq_enable();
>           .....
>           local_irq_disable();
>           ti_work = exit_to_user_get_work()
>             {
>               See above
>             }
>        }
> 
> It covers both the 'no work' and the 'do work' exit path. If that's not
> sufficient, then something is fundamentally wrong with your design.

Yes, you are right; I got confused by your previous patch. This works too
and is exactly what my design does. I will do it this way then. Thank you,
Thomas!
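
For readers following the thread, the exit path sketched in the quoted
pseudocode can be modelled in user space roughly as follows. This is a
simplified, single-threaded sketch under stated assumptions, not the kernel
implementation: struct model_cpu, the WORK_* flags and all function bodies
are hypothetical stand-ins, the sibling leaving the kernel is simulated by a
countdown, and the early break on pending work follows the description of
sched_core_unsafe_exit_wait() given earlier in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical work flags for this model. */
#define WORK_NEED_RESCHED 0x1u
#define WORK_SIGPENDING   0x2u
#define WORK_MASK         (WORK_NEED_RESCHED | WORK_SIGPENDING)

struct model_cpu {
	unsigned int flags;   /* stands in for current_thread_info()->flags */
	int unsafe_ticks;     /* spins left until the siblings leave the kernel */
	int resched_at_tick;  /* raise NEED_RESCHED at this tick, -1 = never */
	bool irqs_enabled;
	int work_runs;        /* how many times pending work was handled */
	int wait_calls;       /* how many times the wait step ran */
};

static bool core_unsafe(const struct model_cpu *c)
{
	return c->unsafe_ticks > 0;
}

/* One spin iteration; new work may pop up while spinning. */
static void model_cpu_relax(struct model_cpu *c)
{
	c->unsafe_ticks--;
	if (c->unsafe_ticks == c->resched_at_tick) {
		c->flags |= WORK_NEED_RESCHED;
		c->resched_at_tick = -1;
	}
}

/* Spin with interrupts on until the core is safe or work is pending. */
static void wait_until_core_safe(struct model_cpu *c)
{
	c->wait_calls++;
	c->irqs_enabled = true;
	while (core_unsafe(c) && !(c->flags & WORK_MASK))
		model_cpu_relax(c);
	c->irqs_enabled = false;
}

/* The wait happens every time the work flags are (re)read. */
static unsigned int exit_to_user_get_work(struct model_cpu *c)
{
	wait_until_core_safe(c);
	return c->flags;
}

static void exit_to_user_mode_prepare(struct model_cpu *c)
{
	unsigned int ti_work = exit_to_user_get_work(c);

	while (ti_work & WORK_MASK) {
		c->irqs_enabled = true;
		c->flags &= ~WORK_MASK;  /* "handle" all pending work */
		c->work_runs++;
		c->irqs_enabled = false;
		ti_work = exit_to_user_get_work(c);
	}

	/* On return: interrupts off, no work pending, core safe. */
	assert(!c->irqs_enabled);
	assert(!core_unsafe(c));
}
```

Because the wait sits inside exit_to_user_get_work(), it runs on both the
'no work' path and after every pass of the work loop, so the function only
returns once the core is safe and no work is pending.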

 - Joel