linux-kernel - Re: Kernel oops with 2.6.26, padlock and ipsec: probably problem with fpu state changes

Message-ID: <20080809185224.GH13158@linux-os.sc.intel.com>
Date:	Sat, 9 Aug 2008 11:52:24 -0700
From:	Suresh Siddha <suresh.b.siddha@...el.com>
To:	"H. Peter Anvin" <hpa@...or.com>
Cc:	Wolfgang Walter <wolfgang.walter@...m.de>,
	Herbert Xu <herbert@...dor.apana.org.au>,
	"Siddha, Suresh B" <suresh.b.siddha@...el.com>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Ingo Molnar <mingo@...e.hu>,
	"viro@...IV.linux.org.uk" <viro@...iv.linux.org.uk>,
	"vegard.nossum@...il.com" <vegard.nossum@...il.com>
Subject: Re: Kernel oops with 2.6.26, padlock and ipsec: probably problem with fpu state changes

On Sat, Aug 09, 2008 at 09:10:05AM -0700, H. Peter Anvin wrote:
> Wolfgang Walter wrote:
> > How could any kernel code use MMX/SSE/FPU when the interrupt case isn't
> > handled?
> 
> I don't think we have ever allowed MMX/SSE/FPU code in interrupt
> handlers.  kernel_fpu_begin()..end() lock out preemption, and so could
> only be interrupted, not preempted.

Yes. In interrupt context the fast handlers fall back to the slow handlers,
which don't touch FP/SSE, and thus avoid nested kernel FPU use.

Hmm, in the padlock interrupt usage scenario (even though it doesn't touch the
FP/SSE registers), kernel_fpu_begin()/kernel_fpu_end() will not solve the
problem: nesting kernel_fpu_begin() is not OK, because we unconditionally
do stts() in kernel_fpu_end(). So the proposed patch is not OK,
as we end up corrupting the first kernel FP usage.
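
To make the nesting hazard concrete, here is a minimal user-space sketch. It is
a model only, not kernel code: struct cpu_model, the model_* helpers and
run_scenario() are hypothetical names, and the "ts" field merely stands in for
CR0.TS. It assumes the begin path simply clears TS and the end path
unconditionally sets it, as described above.

```c
#include <assert.h>
#include <stdbool.h>

/* User-space model only; every name here is hypothetical. */
struct cpu_model {
    bool ts;        /* stands in for CR0.TS: when set, an FP/SSE insn traps */
    int  nm_faults; /* number of modelled device-not-available traps */
};

/* Models the clts() path of kernel_fpu_begin(): FP/SSE becomes usable. */
static void model_fpu_begin(struct cpu_model *c) { c->ts = false; }

/* Models kernel_fpu_end() as described above: an unconditional stts(). */
static void model_fpu_end(struct cpu_model *c) { c->ts = true; }

/* Models executing one FP/SSE instruction in kernel mode. */
static void model_fpu_insn(struct cpu_model *c)
{
    if (c->ts)
        c->nm_faults++; /* the instruction would trap */
}

/* Hypothetical interrupt-time user that brackets its work with begin/end. */
static void model_irq_handler(struct cpu_model *c)
{
    model_fpu_begin(c);
    /* ... work that needs TS clear would run here ... */
    model_fpu_end(c); /* sets TS even though an outer user is still active */
}

/* Returns how many modelled traps the outer (interrupted) user takes. */
static int run_scenario(bool irq_arrives)
{
    struct cpu_model c = { .ts = true, .nm_faults = 0 };

    model_fpu_begin(&c);          /* outer kernel FP user starts */
    assert(!c.ts);                /* begin must leave TS clear */
    model_fpu_insn(&c);
    if (irq_arrives)
        model_irq_handler(&c);    /* nested begin/end inside the outer section */
    model_fpu_insn(&c);           /* outer user resumes */
    model_fpu_end(&c);
    return c.nm_faults;
}
```

In this model, run_scenario(false) reports no traps, while run_scenario(true)
reports one: the nested end leaves TS set underneath the outer user. The sketch
only shows that the flag is flipped; what the real fault path then does to the
outer user's register state is outside its scope.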

> > Or is your argument that its lazy allocation itself is the problem: this
> > nesting could always happen and was a bug but only with lazy allocation it is
> > dangerous (as it may cause a spurious math fault in the race window).
> >
> > If this were right then any kernel code executing SSE may now trigger an oops
> > in __switch_to() under some special circumstances.
> 
> If lazy allocation can cause the RAID code, for example (which executes
> SSE instructions in the kernel, but not at interrupt time) to start
> randomly oopsing, then lazy allocations have to be pulled.

While lazy allocation is not a big thing and can be pulled (with a
very small patch), it has brought two existing security issues to light
so far: one in the lguest code (fixed now) and now one in the padlock usage.
I think that even in 2.6.25, padlock usage can easily cause FPU leakage, as
I mentioned in another response.

Backing out lazy allocation alone is not enough here. Let me think a little
more about this.

thanks,
suresh