linux-kernel - Re: [tip:perfcounters/core] perf_counter: x86: Fix call-chain support to use NMI-safe methods

Message-ID: <alpine.LFD.2.01.0906151029160.3305@localhost.localdomain>
Date:	Mon, 15 Jun 2009 10:37:51 -0700 (PDT)
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Ingo Molnar <mingo@...e.hu>
cc:	mingo@...hat.com, hpa@...or.com, mathieu.desnoyers@...ymtl.ca,
	paulus@...ba.org, acme@...hat.com, linux-kernel@...r.kernel.org,
	a.p.zijlstra@...llo.nl, penberg@...helsinki.fi,
	vegard.nossum@...il.com, efault@....de, jeremy@...p.org,
	npiggin@...e.de, tglx@...utronix.de,
	linux-tip-commits@...r.kernel.org
Subject: Re: [tip:perfcounters/core] perf_counter: x86: Fix call-chain support
 to use NMI-safe methods

On Mon, 15 Jun 2009, Ingo Molnar wrote:
> 
> A simple cr2 corruption would explain all those cc1 SIGSEGVs and 
> other user-space crashes i saw, with sufficiently intense sampling - 
> easily.

Note that we could work around the %cr2 issue, since any corruption is 
always nicely "nested" (ie there are never any SMP issues with async 
writes to the register).

So what we _could_ do is to have a magic value for %cr2, along with a "NMI 
sequence count", and if we see that value, we just return (without doing 
anything) from the page fault handler.

Then, the NMI handler would be changed to always write that value to %cr2 
after it has done the operation that could fault, and do an atomic 
increment of the NMI sequence count. Then, we can do something like this 
in the page fault handler:

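	if (cr2 == MAGIC_CR2) {
		/* Last NMI sequence number we already retried for. */
		static int my_seqno = -1;
		int seqno = atomic_read(&nmi_seqno);

		/* An NMI clobbered %cr2: return so the fault retriggers. */
		if (my_seqno != seqno) {
			my_seqno = seqno;
			return;
		}
	}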

where the whole (and only) point of that "seqno" is to protect against 
user space doing something like

	int i = *(int *)MAGIC_CR2;

and causing infinite faults.

If a real NMI happens, then nmi_seqno will always be different, and we'll 
just retry the fault (the NMI handler would do something like

	write_cr2(MAGIC_CR2);
	atomic_inc(&nmi_seqno);

to set it all up).
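
As a rough illustration of the scheme outlined above, the following is a 
user-space C sketch, not kernel code. Everything prefixed with sim_ is 
hypothetical: sim_cr2 stands in for the %cr2 register, sim_nmi() models the 
write_cr2()/atomic_inc() pair, and the MAGIC_CR2 value is an arbitrary 
placeholder. It only demonstrates the retry decision, under the assumption 
of a single CPU.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Arbitrary placeholder; the mail does not say what value would be used. */
#define MAGIC_CR2 ((uintptr_t)0xdead0000u)

static uintptr_t sim_cr2;          /* stands in for the %cr2 register */
static atomic_ulong nmi_seqno;     /* bumped once per simulated NMI */
static unsigned long my_seqno = (unsigned long)-1;

/* What the NMI handler would do after any operation that could fault. */
static void sim_nmi(void)
{
	sim_cr2 = MAGIC_CR2;             /* models write_cr2(MAGIC_CR2) */
	atomic_fetch_add(&nmi_seqno, 1); /* models atomic_inc(&nmi_seqno) */
}

/* Returns true when the page fault handler should return and let the
 * faulting instruction retrigger the fault with the real address. */
static bool sim_fault_should_retry(void)
{
	if (sim_cr2 == MAGIC_CR2) {
		unsigned long seqno = atomic_load(&nmi_seqno);

		if (my_seqno != seqno) {
			my_seqno = seqno;
			return true;
		}
	}
	return false;
}

/* Walks through the three cases discussed in the mail. */
static void sim_demo(void)
{
	sim_cr2 = 0x1000;                  /* ordinary fault, no NMI involved */
	assert(!sim_fault_should_retry());

	sim_nmi();                         /* NMI clobbers the pending address */
	assert(sim_fault_should_retry());  /* so the fault is retried */

	sim_cr2 = MAGIC_CR2;               /* user space touches MAGIC_CR2 */
	assert(!sim_fault_should_retry()); /* no new NMI: handle it normally */
}
```

The last case in sim_demo() is the one the sequence count exists for: 
without it, a user-space access to MAGIC_CR2 would be retried forever.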

Anyway, I do think that the _correct_ solution is to not do page faults 
from within NMIs, but the above is an outline of how we could _try_ to 
handle it if we really really wanted to. IOW, the fact that cr2 gets 
corrupted is not insurmountable, exactly because we _could_ always just 
retrigger the page fault, and thus "re-create" the corrupted %cr2 value.

Hacky, hacky. And I'm not sure how happy CPUs even are to have %cr2 
written to, so we could hit CPU issues.

			Linus