linux-kernel - Re: irq_fpu_usable() is irreliable

Message-ID: <alpine.DEB.2.11.1511171438090.3761@nanos>
Date:	Tue, 17 Nov 2015 15:06:20 +0100 (CET)
From:	Thomas Gleixner <tglx@...utronix.de>
To:	"Jason A. Donenfeld" <Jason@...c4.com>
cc:	mingo@...hat.com, hpa@...or.com,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: irq_fpu_usable() is irreliable

Jason,

On Tue, 17 Nov 2015, Jason A. Donenfeld wrote:
> The availability of the FPU in kernel space, as you know, is determined by
> this function:
> 
> bool irq_fpu_usable(void)
> {
>         return !in_interrupt() ||
>                 interrupted_user_mode() ||
>                 interrupted_kernel_fpu_idle();
> }
> 
> My understanding is that the first check is !in_interrupt(), because if
> `current` is valid - if we are in process context - then we have a place to
> store the existing FPU regs in kernel_fpu_begin, to be restored later in
> kernel_fpu_end. Recently I've been tracking down a problem in
> which irq_fpu_usable() returns false, yet a stack trace shows the first
> function is the syscall entry point. This leads me to believe that
> in_interrupt() is not an adequate way of testing for a valid `current`.

This function has absolutely nothing to do with current. current is
always valid. The function checks whether we can use the FPU safely in
kernel context.

> In my particular problematic case, the reason in_interrupt() was
> returning true is that a number of rcu_read_lock_bh()s were
> being held; IOW this is occurring in the ndo_start_xmit path of a
> network driver.
>
> I therefore propose changing the function to this:
> 
> bool irq_fpu_usable(void)
> {
>         return (!in_irq() && !in_nmi()) ||
>                 interrupted_user_mode() ||
>                 interrupted_kernel_fpu_idle();
> }
> 
> What would you think of that?

That's broken. Assume we interrupted a kernel thread which fiddles
with the FPU and then on irq exit we run a softirq which tries to use
the FPU....
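
The scenario above can be sketched as a small userspace model. This is
not kernel code: struct cpu_state_model and every function name below
are hypothetical stand-ins, and each boolean simply mirrors what the
corresponding kernel predicate would report in that situation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of the relevant CPU state; not kernel code. */
struct cpu_state_model {
	bool in_hardirq;       /* what in_irq() would report */
	bool in_nmi;           /* what in_nmi() would report */
	bool in_softirq;       /* softirq running or BH disabled */
	bool interrupted_user; /* what interrupted_user_mode() would report */
	bool kernel_fpu_idle;  /* what interrupted_kernel_fpu_idle() would report */
};

/* Models the existing check: in_interrupt() covers hardirq, softirq and NMI. */
static bool usable_existing(const struct cpu_state_model *s)
{
	bool in_interrupt = s->in_hardirq || s->in_nmi || s->in_softirq;

	return !in_interrupt || s->interrupted_user || s->kernel_fpu_idle;
}

/* Models the proposed check: softirq context is no longer considered. */
static bool usable_proposed(const struct cpu_state_model *s)
{
	return (!s->in_hardirq && !s->in_nmi) ||
		s->interrupted_user || s->kernel_fpu_idle;
}

/*
 * A kernel thread is in the middle of using the FPU, an interrupt hits,
 * and on irq exit a softirq runs on top of that thread.
 */
static struct cpu_state_model softirq_over_kernel_fpu(void)
{
	struct cpu_state_model s = {
		.in_hardirq = false,
		.in_nmi = false,
		.in_softirq = true,
		.interrupted_user = false,
		.kernel_fpu_idle = false, /* the thread's FPU state is live */
	};

	return s;
}

/* Self-check of the model for the scenario above. */
static void check_softirq_scenario(void)
{
	struct cpu_state_model s = softirq_over_kernel_fpu();

	assert(!usable_existing(&s)); /* existing check refuses the FPU */
	assert(usable_proposed(&s));  /* proposed check would allow clobbering */
}
```

In this model the proposed variant says "usable" exactly in the case
where the interrupted kernel thread still has live FPU state.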

The real question in your case is WHY interrupted_kernel_fpu_idle()
returns false. We know for sure that in a syscall with BH disabled the
first two checks are false.
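
As a rough illustration of why the first check is false with BH
disabled, here is a simplified userspace model of preempt_count. The
bit layout is assumed to match include/linux/preempt.h of this era; the
model_* names are hypothetical, and interrupted_user_mode() is
hard-wired to false because the model represents a syscall path.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed preempt_count layout: 8 preempt bits, 8 softirq, 4 hardirq, 1 NMI. */
#define MODEL_SOFTIRQ_OFFSET         0x00000100u
#define MODEL_SOFTIRQ_MASK           0x0000ff00u
#define MODEL_HARDIRQ_MASK           0x000f0000u
#define MODEL_NMI_MASK               0x00100000u
/* Disabling BH adds twice the softirq offset to the count. */
#define MODEL_SOFTIRQ_DISABLE_OFFSET (2 * MODEL_SOFTIRQ_OFFSET)

static unsigned int model_preempt_count;

/* Models local_bh_disable(), which rcu_read_lock_bh() ends up calling. */
static void model_local_bh_disable(void)
{
	model_preempt_count += MODEL_SOFTIRQ_DISABLE_OFFSET;
}

static void model_local_bh_enable(void)
{
	model_preempt_count -= MODEL_SOFTIRQ_DISABLE_OFFSET;
}

/* Models in_interrupt(): any hardirq, softirq or NMI bits set. */
static bool model_in_interrupt(void)
{
	return (model_preempt_count & (MODEL_SOFTIRQ_MASK |
				       MODEL_HARDIRQ_MASK |
				       MODEL_NMI_MASK)) != 0;
}

/* Models in_irq(): only the hardirq bits. */
static bool model_in_irq(void)
{
	return (model_preempt_count & MODEL_HARDIRQ_MASK) != 0;
}

/* Models irq_fpu_usable() on a syscall path; fpu_idle is the third check. */
static bool model_irq_fpu_usable(bool fpu_idle)
{
	bool interrupted_user_mode = false; /* we came in via a syscall */

	return !model_in_interrupt() || interrupted_user_mode || fpu_idle;
}

/* Self-check: BH disabled in process context. */
static void check_bh_disabled_syscall(void)
{
	model_preempt_count = 0;
	model_local_bh_disable();
	assert(model_in_interrupt());        /* so !in_interrupt() is false */
	assert(!model_in_irq());
	assert(!model_irq_fpu_usable(false)); /* result hinges on the third check */
	model_local_bh_enable();
}
```

With BH disabled the result depends only on the third check, which is
why that is the one to investigate.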

Thanks,

	tglx