Message-ID: <20090615180527.GB4201@Krystal>
Date: Mon, 15 Jun 2009 14:05:27 -0400
From: Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Ingo Molnar <mingo@...e.hu>, mingo@...hat.com, hpa@...or.com,
paulus@...ba.org, acme@...hat.com, linux-kernel@...r.kernel.org,
a.p.zijlstra@...llo.nl, penberg@...helsinki.fi,
vegard.nossum@...il.com, efault@....de, jeremy@...p.org,
npiggin@...e.de, tglx@...utronix.de,
linux-tip-commits@...r.kernel.org
Subject: Re: [tip:perfcounters/core] perf_counter: x86: Fix call-chain
support to use NMI-safe methods
* Linus Torvalds (torvalds@...ux-foundation.org) wrote:
>
>
> On Mon, 15 Jun 2009, Ingo Molnar wrote:
> >
> > A simple cr2 corruption would explain all those cc1 SIGSEGVs and
> > other user-space crashes i saw, with sufficiently intense sampling -
> > easily.
>
> Note that we could work around the %cr2 issue, since any corruption is
> always nicely "nested" (ie there are never any SMP issues with async
> writes to the register).
>
> So what we _could_ do is to have a magic value for %cr2, along with a "NMI
> sequence count", and if we see that value, we just return (without doing
> anything) from the page fault handler.
>
> Then, the NMI handler would be changed to always write that value to %cr2
> after it has done the operation that could fault, and do an atomic
> increment of the NMI sequence count. Then, we can do something like this
> in the page fault handler:
>
> 	if (cr2 == MAGIC_CR2) {
> 		static unsigned long my_seqno = -1;
> 		if (my_seqno != nmi_seqno) {
> 			my_seqno = nmi_seqno;
> 			return;
> 		}
> 	}
>
> where the whole (and only) point of that "seqno" is to protect against
> user space doing something like
>
> int i = *(int *)MAGIC_CR2;
>
> and causing infinite faults.
>
> If a real NMI happens, then nmi_seqno will always be different, and we'll
> just retry the fault (the NMI handler would do something like
>
> write_cr2(MAGIC_CR2);
> atomic_inc(&nmi_seqno);
>
> to set it all up).
>
> Anyway, I do think that the _correct_ solution is to not do page faults
> from within NMI's, but the above is an outline of how we could _try_ to
> handle it if we really really wanted to. IOW, the fact that cr2 gets
> corrupted is not insurmountable, exactly because we _could_ always just
> retrigger the page fault, and thus "re-create" the corrupted %cr2 value.
>
> Hacky, hacky. And I'm not sure how happy CPU's even are to have %cr2
> written to, so we could hit CPU issues.
>
Hrm, would it be possible to save the cr2 register upon NMI handler entry
and restore it before iret instead? This would ensure that a page fault
handler interrupted by an NMI continues what it was doing with a
non-corrupted cr2 register after the NMI returns.
Plus, this involves no modification to the page fault handler fast path.
But I fear I might be missing something totally obvious.
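
The save/restore idea above can be modelled in user space. This is only a
sketch under stated assumptions: a plain variable stands in for %cr2, and
every name here (fake_cr2, read_cr2_model, write_cr2_model,
nmi_handler_model, page_fault_model) is hypothetical, not kernel code. It
shows an NMI that faults while a page fault is pending, with and without
the save/restore around the NMI body.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: a plain variable stands in for the %cr2 register. */
static uintptr_t fake_cr2;

static uintptr_t read_cr2_model(void) { return fake_cr2; }
static void write_cr2_model(uintptr_t v) { fake_cr2 = v; }

/* Models a fault taken inside the NMI handler: the CPU overwrites %cr2. */
static void nmi_body_that_faults(uintptr_t nmi_fault_addr)
{
	write_cr2_model(nmi_fault_addr);
}

/* Models the NMI handler; save_restore selects the proposed behaviour. */
static void nmi_handler_model(uintptr_t nmi_fault_addr, bool save_restore)
{
	uintptr_t saved = read_cr2_model();	/* save on NMI entry */

	nmi_body_that_faults(nmi_fault_addr);

	if (save_restore) {
		write_cr2_model(saved);		/* restore before "iret" */
		assert(read_cr2_model() == saved);
	}
}

/*
 * Models a page fault at fault_addr that is interrupted by an NMI before
 * the fault handler has read %cr2. Returns the address the fault handler
 * ends up seeing.
 */
static uintptr_t page_fault_model(uintptr_t fault_addr,
				  uintptr_t nmi_fault_addr,
				  bool save_restore)
{
	write_cr2_model(fault_addr);	/* CPU latches the faulting address */
	nmi_handler_model(nmi_fault_addr, save_restore);	/* NMI arrives */
	return read_cr2_model();	/* fault handler reads %cr2 */
}
```

In this model, page_fault_model(0x1000, 0x2000, true) hands the original
address back to the fault handler, while passing false hands back the
address of the NMI's own fault, which is the corruption being discussed.
The model says nothing about whether real CPUs are happy to have %cr2
written to; that concern applies to this approach as well.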
Mathieu
> Linus
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68