linux-kernel - Re: [PATCH] x86/speculation: Use Indirect Branch Prediction Barrier in context switch

Message-ID: <20180129122803.GA23664@isilmar-4.linta.de>
Date:   Mon, 29 Jan 2018 13:28:03 +0100
From:   Dominik Brodowski <linux@...inikbrodowski.net>
To:     David Woodhouse <dwmw@...zon.co.uk>
Cc:     arjan@...ux.intel.com, tglx@...utronix.de, karahmed@...zon.de,
        x86@...nel.org, linux-kernel@...r.kernel.org,
        tim.c.chen@...ux.intel.com, bp@...en8.de, peterz@...radead.org,
        pbonzini@...hat.com, ak@...ux.intel.com,
        torvalds@...ux-foundation.org, gregkh@...ux-foundation.org,
        mingo@...nel.org, luto@...nel.org, jcm@...hat.com
Subject: Re: [PATCH] x86/speculation: Use Indirect Branch Prediction Barrier
 in context switch

On Mon, Jan 29, 2018 at 11:33:28AM +0000, David Woodhouse wrote:
> From: Tim Chen <tim.c.chen@...ux.intel.com>
> 
> Flush indirect branches when switching into a process that marked itself
> non dumpable. This protects high value processes like gpg better,
> without having too high performance overhead.
> 
> If done naïvely, we could switch to a kernel idle thread and then back
> to the original process, such as:
> 
>     process A -> idle -> process A
> 
> In such scenario, we do not have to do IBPB here even though the process
> is non-dumpable, as we are switching back to the same process after a
> hiatus.
> 
> To avoid the redundant IBPB, which is expensive, we track the last mm
> user context ID. The cost is to have an extra u64 mm context id to track
> the last mm we were using before switching to the init_mm used by idle.
> Avoiding the extra IBPB is probably worth the extra memory for this
> common scenario.
> 
> For those cases where tlb_defer_switch_to_init_mm() returns true (non
> PCID), lazy tlb will defer switch to init_mm, so we will not be changing
> the mm for the process A -> idle -> process A switch. So IBPB will be
> skipped for this case.
> 
> Thanks to the reviewers and Andy Lutomirski for the suggestion of
> using ctx_id which got rid of the problem of mm pointer recycling.
> 
> Signed-off-by: Tim Chen <tim.c.chen@...ux.intel.com>
> Signed-off-by: David Woodhouse <dwmw@...zon.co.uk>
> ---
> How close are we to done with bikeshedding this one?... 

The commit message says much more about the A -> idle -> A improvement than
about the basic design decision to limit this to non-dumpable processes. And
that decision still seems to be under discussion (see, for example, Jon
Masters' message of today, https://lkml.org/lkml/2018/1/29/34 ). So this design
choice should, at least, be more explicit (if not tunable...).

> @@ -219,6 +220,25 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
>  	} else {
>  		u16 new_asid;
>  		bool need_flush;
> +		u64 last_ctx_id = this_cpu_read(cpu_tlbstate.last_ctx_id);
> +
> +		/*
> +		 * Avoid user/user BTB poisoning by flushing the branch
> +		 * predictor when switching between processes. This stops
> +		 * one process from doing Spectre-v2 attacks on another.
> +		 *
> +                 * As an optimization, flush indirect branches only when
> +                 * switching into processes that disable dumping.
> +                 *
> +                 * This will not flush branches when switching into kernel
> +		 * threads. It will also not flush if we switch to idle

Whitespace damage. And maybe add ", as the kernel depends on retpoline
protection instead" after "threads" here -- I think that was the reason why
you think kernel threads are safe; or did I misunderstand you?

> +		 * thread and back to the same process. It will flush if we
> +		 * switch to a different non-dumpable process.

"process, as that gives additional protection to high value processes like
gpg. Other processes are left unprotected here to reduce the overhead of the
barrier [... maybe add some rationale here ...]"
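
To make the behaviour described in that comment concrete, here is a minimal
userspace sketch of the decision as I read it from the commit message. It is
not the kernel code: struct cpu_state, struct task_model, needs_ibpb() and
model_switch_to() are hypothetical names for illustration only, and the
"dumpable" flag merely stands in for whatever dumpability check the patch
actually performs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-CPU state: ctx_id of the last user mm that ran here. */
struct cpu_state {
	uint64_t last_ctx_id;
};

/* Hypothetical task model; has_mm is false for kernel threads and idle. */
struct task_model {
	bool has_mm;
	uint64_t ctx_id;	/* unique per mm, never recycled */
	bool dumpable;		/* false for processes that disabled dumping */
};

/* Decide whether a barrier would be issued when switching to 'next'. */
static bool needs_ibpb(const struct cpu_state *cpu,
		       const struct task_model *next)
{
	if (!next->has_mm)
		return false;	/* kernel thread or idle: no flush */
	if (next->ctx_id == cpu->last_ctx_id)
		return false;	/* A -> idle -> A: same user context */
	return !next->dumpable;	/* flush only for non-dumpable targets */
}

/* Model one context switch; returns true if a barrier would be issued. */
static bool model_switch_to(struct cpu_state *cpu,
			    const struct task_model *next)
{
	bool flush = needs_ibpb(cpu, next);

	/* Only user contexts are recorded, so idle does not clobber it. */
	if (next->has_mm)
		cpu->last_ctx_id = next->ctx_id;
	return flush;
}
```

In this model, switching from a non-dumpable process A to idle and back to A
issues no barrier, while switching into any dumpable process never does. That
second case is exactly the design choice I would like to see spelled out.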

Thanks,
	Dominik