linux-kernel - Re: [PATCH] x86, nmi: workaround sti; hlt race vs nmi; intr

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <4CA057EA.7000609@redhat.com>
Date:	Mon, 27 Sep 2010 10:38:02 +0200
From:	Avi Kivity <avi@...hat.com>
To:	x86@...nel.org
CC:	linux-kernel@...r.kernel.org, kvm@...r.kernel.org
Subject: Re: [PATCH] x86, nmi: workaround sti; hlt race vs nmi; intr

  On 09/19/2010 06:28 PM, Avi Kivity wrote:
> On machines without monitor/mwait we use an sti; hlt sequence to atomically
> enable interrupts and put the cpu to sleep.  The sequence uses the "interrupt
> shadow" property of the sti instruction: interrupts are enabled only after
> the instruction following sti has been executed.  This means an interrupt
> cannot happen in the middle of the sequence, which would leave us with
> the interrupt processed but the cpu halted.
>
> The interrupt shadow, however, can be broken by an nmi; the following
> sequence
>
>     sti
>       nmi ... iret
>       # interrupt shadow disabled
>       intr ... iret
>     hlt
>
> puts the cpu to sleep, even though the interrupt may need additional processing
> after the hlt (like scheduling a task).
>
> sti is explicitly documented not to force an interrupt shadow; though many
> processors do inhibit nmi immediately after sti.
>
> Avoid the race by checking, during an nmi, if we hit the safe halt sequence.
> If we did, increment the instruction pointer past the hlt instruction; this
> allows an immediately following interrupt to return to a safe place.
>
> Signed-off-by: Avi Kivity<avi@...hat.com>

Ping.

> ---
>   arch/x86/include/asm/irqflags.h |   18 +++++++++++++++++-
>   arch/x86/kernel/traps.c         |   14 ++++++++++++++
>   2 files changed, 31 insertions(+), 1 deletions(-)
>
> diff --git a/arch/x86/include/asm/irqflags.h b/arch/x86/include/asm/irqflags.h
> index 9e2b952..f412167 100644
> --- a/arch/x86/include/asm/irqflags.h
> +++ b/arch/x86/include/asm/irqflags.h
> @@ -44,9 +44,25 @@ static inline void native_irq_enable(void)
>   	asm volatile("sti": : :"memory");
>   }
>
> +extern void safe_halt_addr(void);
> +
>   static inline void native_safe_halt(void)
>   {
> -	asm volatile("sti; hlt": : :"memory");
> +	asm volatile("sti \n\t"
> +		     /*
> +		      * If NMI hits us here, it negates the interrupt shadow
> +		      * induced by STI.  So the NMI handler checks for
> +		      * safe_halt_addr and skips the hlt if it loses the
> +		      * interrupt shadow.
> +		      *
> +		      * If native_safe_halt() is ever instantiated more
> +		      * than once, this will fail to build, and we'll need
> +		      * a list of addresses in a special section.
> +		      */
> +		     ".globl safe_halt_addr \n\t"
> +		     "safe_halt_addr: \n\t"
> +		     "hlt"
> +		     : : :"memory");
>   }
>
>   static inline void native_halt(void)
> diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
> index 60788de..f67da48 100644
> --- a/arch/x86/kernel/traps.c
> +++ b/arch/x86/kernel/traps.c
> @@ -438,6 +438,20 @@ do_nmi(struct pt_regs *regs, long error_code)
>
>   	inc_irq_stat(__nmi_count);
>
> +	/*
> +	 * We hit in the middle of an sti; hlt instruction.  When we return,
> +	 * the interrupt shadow cast by sti will no longer be in effect; then,
> +	 * if an interrupt causes a wakeup, we won't notice it since the hlt
> +	 * will take effect and block the cpu.
> +	 *
> +	 * If we detect this situation, fix it by advancing the instruction
> +	 * pointer past the hlt instruction; if the interrupt doesn't happen,
> +	 * we'll spend a few cycles falling asleep again, but that's better
> +	 * than a missed wakeup.
> +	 */
> +	if (regs->cs == __KERNEL_CS&&  regs->ip == (ulong)safe_halt_addr)
> +		++regs->ip;
> +
>   	if (!ignore_nmis)
>   		default_do_nmi(regs);
>


-- 
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/