Message-ID: <239ada57a88c69072fc2933a39affe3923c90800.camel@surriel.com>
Date: Fri, 23 Jul 2021 21:38:38 -0400
From: Rik van Riel <riel@...riel.com>
To: Thomas Gleixner <tglx@...utronix.de>, linux-kernel@...r.kernel.org
Cc: Dave Hansen <dave.hansen@...ux.intel.com>,
Andy Lutomirski <luto@...nel.org>, kernel-team@...com,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
x86@...nel.org
Subject: Re: [PATCH] x86,mm: print likely CPU at segfault time
On Wed, 2021-07-21 at 22:36 +0200, Thomas Gleixner wrote:
> Rik,
>
> On Mon, Jul 19 2021 at 15:00, Rik van Riel wrote:
> >
> > Adding a printk to show_signal_msg() achieves that purpose. It isn't
> > perfect since the task might get rescheduled on another CPU between
> > when the fault hit and when the message is printed, but it should be
> > good enough to show correlation between userspace and kernel errors
> > when dealing with a bad CPU.
>
> we could collect the cpu number in do_*_addr_fault() before interrupts
> are enabled and just hand it through. There are only a few callchains
> which end up in __bad_area_nosemaphore().

We could, but do we really want to add that to the hot path
for page faults, when segfaults are so rare?

I suspect the simple patch I sent will be good enough to
identify a bad CPU, even if only 3 out of 4 userspace crashes
get attributed to the right CPU...
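
As a rough illustration of why imperfect attribution may still be good
enough, here is a small simulation sketch. Everything in it is a
hypothetical assumption for illustration only: the 16-CPU machine, the
200 crashes, CPU 5 being the bad one, and misattributed crashes landing
uniformly on the other CPUs. The 0.75 hit rate simply mirrors the
"3 out of 4" figure above; it is not a measurement.

```python
import random
from collections import Counter


def attribute_crashes(n_crashes, bad_cpu, n_cpus, p_correct, seed=0):
    """Return a Counter of the CPU each simulated crash was *reported* on.

    Every crash is assumed to be caused by bad_cpu. With probability
    p_correct the report names bad_cpu; otherwise the task is assumed to
    have migrated and a uniformly chosen other CPU is reported instead.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    others = [cpu for cpu in range(n_cpus) if cpu != bad_cpu]
    reported = Counter()
    for _ in range(n_crashes):
        if rng.random() < p_correct:
            reported[bad_cpu] += 1
        else:
            reported[rng.choice(others)] += 1
    return reported


# Hypothetical numbers: 200 crashes, 16 CPUs, CPU 5 is the faulty one.
counts = attribute_crashes(n_crashes=200, bad_cpu=5, n_cpus=16, p_correct=0.75)
suspect, hits = counts.most_common(1)[0]
print(f"most-reported CPU: {suspect} ({hits} of {sum(counts.values())} crashes)")
```

Under these assumptions the wrong reports are spread thinly over the
other CPUs, so the bad CPU still stands out in the tally.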

I would be happy to write a patch that does what you want
though, so you can compare them side by side :)

--
All Rights Reversed.