linux-kernel - Re: [PATCH v3 4/5] prctl: Hook L1D flushing in via prctl

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20201205025622.GA31680@uc18d6cc16d755e.ant.amazon.com>
Date:   Sat, 5 Dec 2020 13:56:23 +1100
From:   Balbir Singh <sblbir@...zon.com>
To:     Thomas Gleixner <tglx@...utronix.de>
CC:     <mingo@...hat.com>, <peterz@...radead.org>,
        <linux-kernel@...r.kernel.org>, <keescook@...omium.org>,
        <jpoimboe@...hat.com>, <tony.luck@...el.com>,
        <benh@...nel.crashing.org>, <x86@...nel.org>,
        <dave.hansen@...el.com>, <thomas.lendacky@....com>,
        <torvalds@...ux-foundation.org>
Subject: Re: [PATCH v3 4/5] prctl: Hook L1D flushing in via prctl

On Fri, Dec 04, 2020 at 11:19:17PM +0100, Thomas Gleixner wrote:
> 
> Balbir,
> 
> On Fri, Nov 27 2020 at 17:59, Balbir Singh wrote:
> > +enum l1d_flush_out_mitigations {
> > +     L1D_FLUSH_OUT_OFF,
> > +     L1D_FLUSH_OUT_ON,
> > +};
> > +
> > +static enum l1d_flush_out_mitigations l1d_flush_out_mitigation __ro_after_init = L1D_FLUSH_OUT_ON;
> 
> Why default on and why stays it on when the CPU is not affected by L1TF ...
> 

Because we don't set the PRCTL is the processor is not affected by the
bug

> >  /* Default mitigation for TAA-affected CPUs */
> >  static enum taa_mitigations taa_mitigation __ro_after_init = TAA_MITIGATION_VERW;
> >  static bool taa_nosmt __ro_after_init;
> > @@ -379,6 +386,18 @@ static void __init taa_select_mitigation(void)
> >       pr_info("%s\n", taa_strings[taa_mitigation]);
> >  }
> >
> > +static int __init l1d_flush_out_parse_cmdline(char *str)
> > +{
> > +     if (!boot_cpu_has_bug(X86_BUG_L1TF))
> > +             return 0;
> 
> ... while here you check for L1TF.
> 
> Also shouldn't it be default off and enabled via command line?
> 

I chose the other way because the prctl is an opt-in as well

> > +static int l1d_flush_out_prctl_get(struct task_struct *task)
> > +{
> > +     int ret;
> > +
> > +     if (l1d_flush_out_mitigation == L1D_FLUSH_OUT_OFF)
> > +             return PR_SPEC_FORCE_DISABLE;
> > +
> > +     ret = test_ti_thread_flag(&task->thread_info, TIF_SPEC_L1D_FLUSH);
> 
> That ret indirection is pointless. Just make it if (test_....)

Sure, will do

> 
> > +static int l1d_flush_out_prctl_set(struct task_struct *task, unsigned long ctrl)
> > +{
> > +
> > +     if (l1d_flush_out_mitigation == L1D_FLUSH_OUT_OFF)
> > +             return -EPERM;
> 
> So here you check for off and then...
> 

Yes

> >  int enable_l1d_flush_for_task(struct task_struct *tsk)
> >  {
> > +     /*
> > +      * Do not enable L1D_FLUSH_OUT if
> > +      * b. The CPU is not affected by the L1TF bug
> > +      * c. The CPU does not have L1D FLUSH feature support
> > +      */
> > +
> > +     if (!boot_cpu_has_bug(X86_BUG_L1TF) ||
> > +                     !boot_cpu_has(X86_FEATURE_FLUSH_L1D))
> > +             return -EINVAL;
> 
> ... you check for the feature bits with a malformatted condition at some
> other place. It's not supported when these conditions are not there. So
> why having this check here?
> 
> > +
> >       set_ti_thread_flag(&tsk->thread_info, TIF_SPEC_L1D_FLUSH);
> >       return 0;
> >  }
> 
> Aside of that, why is this in tlb.c and not in bugs.c? There is nothing
> tlb specific in these enable/disable functions. They just fiddle with
> the TIF bit.
> 

I can move them over.

> > +/*
> > + * Sent to a task that opts into L1D flushing via the prctl interface
> > + * but ends up running on an SMT enabled core.
> > + */
> > +static void l1d_flush_kill(struct callback_head *ch)
> > +{
> > +     force_sig(SIGBUS);
> > +}
> > +
> >  static inline unsigned long mm_mangle_tif_spec_bits(struct task_struct *next)
> >  {
> >       unsigned long next_tif = task_thread_info(next)->flags;
> >       unsigned long spec_bits = (next_tif >> TIF_SPEC_IB) & LAST_USER_MM_SPEC_MASK;
> > +     unsigned long next_mm;
> >
> >       BUILD_BUG_ON(TIF_SPEC_L1D_FLUSH != TIF_SPEC_IB + 1);
> > -     return (unsigned long)next->mm | spec_bits;
> > +     next_mm = (unsigned long)next->mm | spec_bits;
> > +
> > +     if ((next_mm & LAST_USER_MM_L1D_FLUSH) && this_cpu_read(cpu_info.smt_active)) {
> 
> Wheeee. Yet more unconditional checks on every context switch.

A task can only get here if it is affected by the bug (processor has
L1TF and L1D_FLUSH support) and the task opted in, I think what your
suggesting is that we avoid the check for all tasks (the signgle next_mm
& LAST_USER_MM_L1D_FLUSH) check as well?

> 
> > +             clear_ti_thread_flag(&next->thread_info, TIF_SPEC_L1D_FLUSH);
> > +             next->l1d_flush_kill.func = l1d_flush_kill;
> > +             task_work_add(next, &next->l1d_flush_kill, true);
> 
> int task_work_add(struct task_struct *task, struct callback_head *twork,
>                   enum task_work_notify_mode mode);
> 
> true is truly not a valid enum constant ....

:) I might really have added it when we were transitioning from true to
TWA_RESUME, I am surprised the compiler did not catch it, I'll move it
over.

> 
> > +     }
> 
> So you really want to have:
> 
> DEFINE_STATIC_KEY_FALSE(l1dflush_enabled);
> static bool l1dflush_mitigation __init_data;
> 
> and then with the command line option you set l1dflush_mitigation and in
> check_bugs() you invoke l1dflush_select_mitigation() which does:
> 
>        if (!l1dflush_mitigation || !boot_cpu_has_bug(X86_BUG_L1TF) ||
>            !boot_cpu_has(X86_FEATURE_FLUSH_L1D))
>                 return;
> 
>        static_branch_enable(&l1dflush_enabled);
> 
> and then in l1d_flush_out_prctl_set()
> 
>        if (!static_branch_unlikely(&l1dflush_enabled))
>                 return -ENOTSUPP;
> 
> Then make the whole switch machinery do:
> 
>       if (static_branch_unlikely(&l1dflush_enabled)) {
>             if (unlikely((next_mm | prev_mm) & LAST_USER_MM_L1D_FLUSH))
>                  l1dflush_evaluate(next_mm, prev_mm);
>       }
> 
> and l1dflush_evaluate()
> 
>      if (prev_mm & LAST_USER_MM_L1D_FLUSH)
>           l1d_flush();
> 
>      if ((next_mm & LAST_USER_MM_L1D_FLUSH) &&
>          this_cpu_read(cpu_info.smt_active)) {
> 
>           clear_ti_thread_flag(&next->thread_info, TIF_SPEC_L1D_FLUSH);
>           next->l1d_flush_kill.func = l1d_flush_kill;
>           task_work_add(next, &next->l1d_flush_kill, TWA_RESUME);
>      }
> 
> That way the overhead is on systems where the admin decides to enable it
> and if enabled the evaluation of prev_mm and next_mm is pushed out of
> line.
>

OK, I'll rewrite it and see how it looks

Thanks for the review,
Balbir Singh