Date: Wed, 12 Apr 2017 13:59:03 -0600
From: Vishal Verma <vishal.l.verma@...el.com>
To: Borislav Petkov <bp@...e.de>
Cc: linux-kernel@...r.kernel.org, linux-nvdimm@...ts.01.org, x86@...nel.org,
    Ross Zwisler <ross.zwisler@...ux.intel.com>, Tony Luck <tony.luck@...el.com>,
    Dan Williams <dan.j.williams@...el.com>
Subject: Re: [RFC PATCH] x86, mce: change the mce notifier to 'blocking' from 'atomic'

On 04/12, Borislav Petkov wrote:
> On Tue, Apr 11, 2017 at 04:44:57PM -0600, Vishal Verma wrote:
> > The NFIT MCE handler callback (for handling media errors on NVDIMMs)
> > takes a mutex to add the location of a memory error to a list. But since
> > the notifier call chain for machine checks (x86_mce_decoder_chain) is
> > atomic, we get a lockdep splat like:
> >
> >   BUG: sleeping function called from invalid context at kernel/locking/mutex.c:620
> >   in_atomic(): 1, irqs_disabled(): 0, pid: 4, name: kworker/0:0
> >   [..]
> >   Call Trace:
> >    dump_stack+0x86/0xc3
> >    ___might_sleep+0x178/0x240
> >    __might_sleep+0x4a/0x80
> >    mutex_lock_nested+0x43/0x3f0
> >    ? __lock_acquire+0xcbc/0x1290
> >    nfit_handle_mce+0x33/0x180 [nfit]
> >    notifier_call_chain+0x4a/0x70
> >    atomic_notifier_call_chain+0x6e/0x110
> >    ? atomic_notifier_call_chain+0x5/0x110
> >    mce_gen_pool_process+0x41/0x70
> >
> > Commit 648ed94038c030245a06e4be59744fd5cdc18c40
> >   x86/mce: Provide a lockless memory pool to save error records
> > Changes the mce notifier callbacks to be run in a process context, and
> > this can allow us to use the 'blocking' type notifier, where we can take
> > mutexes etc. in the call chain functions.
> >
> > Reported-by: Ross Zwisler <ross.zwisler@...ux.intel.com>
> > Cc: Borislav Petkov <bp@...e.de>
> > Cc: Tony Luck <tony.luck@...el.com>
> > Cc: Dan Williams <dan.j.williams@...el.com>
> > Signed-off-by: Vishal Verma <vishal.l.verma@...el.com>
> > ---
> >  arch/x86/kernel/cpu/mcheck/mce-genpool.c  | 2 +-
> >  arch/x86/kernel/cpu/mcheck/mce-internal.h | 2 +-
> >  arch/x86/kernel/cpu/mcheck/mce.c          | 8 ++++----
> >  3 files changed, 6 insertions(+), 6 deletions(-)
> >
> > While this patch almost solves the problem, I think it is not quite right.
> > The x86_mce_decoder_chain is also called from print_mce for fatal machine
> > checks, and that is, afaict, still from an atomic context. One thing Tony
> > suggested was splitting the notifier chain into two distinct chains, one
> > for regular logging and recoverable actions that allows blocking, the
> > other from the panic path.
>
> Well, if Mohammad won't come to the mountain...
>
> So the NFIT handler has:
>
>         /* We only care about memory errors */
>         if (!(mce->status & MCACOD))
>                 return NOTIFY_DONE;
>
> what severity are we talking here? Errors which can be reported on the
> panic path, i.e., in atomic context or only AO/AR ones which don't raise
> an #MC exception?

I don't think we can do anything about the panic path errors. The NFIT
handler takes the recoverable machine checks, and essentially, adds the
location to a list.

>
> --
> Regards/Gruss,
> Boris.
>
> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
> --
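
As an illustration of the two-chain idea mentioned in the quoted patch notes (one chain that may block for logging and recoverable errors, one for the panic path), a minimal userspace sketch in C might look like the following. It is not kernel code and not the patch under discussion: every `sketch_`/`SKETCH_` name is hypothetical, the `fatal` flag is an assumed stand-in for however severity would really be decided, and the `0xefff` mask is only an assumed stand-in for `MCACOD`.

```c
/* Userspace sketch only: none of these names are kernel APIs. */
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NOTIFY_DONE   0        /* callback did not act on the record */
#define SKETCH_NOTIFY_OK     1        /* callback handled the record */
#define SKETCH_MCACOD_MASK   0xefffu  /* assumed stand-in for MCACOD */
#define SKETCH_MAX_CALLBACKS 8
#define SKETCH_MAX_BAD       16

/* Hypothetical, heavily reduced error record. */
struct sketch_mce {
    uint64_t status;
    uint64_t addr;
    int fatal;  /* nonzero: would be reported from the panic path */
};

typedef int (*sketch_cb)(const struct sketch_mce *m, void *data);

struct sketch_chain {
    sketch_cb cb[SKETCH_MAX_CALLBACKS];
    void *data[SKETCH_MAX_CALLBACKS];
    size_t n;
};

/* Two distinct chains, as suggested in the thread. */
static struct sketch_chain blocking_chain; /* process context, callbacks may sleep */
static struct sketch_chain atomic_chain;   /* panic path, callbacks must not sleep */

static int sketch_register(struct sketch_chain *c, sketch_cb cb, void *data)
{
    if (c->n == SKETCH_MAX_CALLBACKS)
        return -1;
    c->cb[c->n] = cb;
    c->data[c->n] = data;
    c->n++;
    return 0;
}

/* Run every callback; report the last result that was not "done". */
static int sketch_call_chain(struct sketch_chain *c, const struct sketch_mce *m)
{
    int ret = SKETCH_NOTIFY_DONE;

    for (size_t i = 0; i < c->n; i++) {
        int r = c->cb[i](m, c->data[i]);
        if (r != SKETCH_NOTIFY_DONE)
            ret = r;
    }
    return ret;
}

/* Route a record to exactly one chain based on the assumed severity flag. */
static int sketch_dispatch(const struct sketch_mce *m)
{
    return sketch_call_chain(m->fatal ? &atomic_chain : &blocking_chain, m);
}

/* List of bad locations kept by the NFIT-like handler below. */
struct sketch_badlist {
    uint64_t addr[SKETCH_MAX_BAD];
    size_t n;
};

/* Mirrors the filter quoted in the thread, then records the location. */
static int sketch_nfit_handler(const struct sketch_mce *m, void *data)
{
    struct sketch_badlist *list = data;

    /* We only care about memory errors */
    if (!(m->status & SKETCH_MCACOD_MASK))
        return SKETCH_NOTIFY_DONE;

    /* A real handler would take a mutex here, which is why it can only
     * sit on a chain whose callbacks are allowed to sleep. */
    if (list->n < SKETCH_MAX_BAD)
        list->addr[list->n++] = m->addr;
    return SKETCH_NOTIFY_OK;
}
```

In this sketch a handler that needs to sleep is registered only on `blocking_chain`, so a record with `fatal` set never reaches it; that matches the reply above, where the NFIT handler only deals with recoverable machine checks and does nothing on the panic path.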