Date:	Thu, 29 Nov 2012 17:41:12 -0800
From:	Greg Kroah-Hartman <gregkh@...uxfoundation.org>
To:	Ben Hutchings <ben@...adent.org.uk>
Cc:	linux-kernel@...r.kernel.org, stable@...r.kernel.org,
	kernel-team@...ts.ubuntu.com,
	Gavin Shan <shangw@...ux.vnet.ibm.com>,
	Benjamin Herrenschmidt <benh@...nel.crashing.org>,
	Herton Ronaldo Krzesinski <herton.krzesinski@...onical.com>
Subject: Re: [PATCH 026/270] powerpc/eeh: Lock module while handling EEH event

On Tue, Nov 27, 2012 at 02:18:34AM +0000, Ben Hutchings wrote:
> On Mon, 2012-11-26 at 14:55 -0200, Herton Ronaldo Krzesinski wrote:
> > 3.5.7u1 -stable review patch.  If anyone has any objections, please let me know.
> > 
> > ------------------
> > 
> > From: Gavin Shan <shangw@...ux.vnet.ibm.com>
> > 
> > commit feadf7c0a1a7c08c74bebb4a13b755f8c40e3bbc upstream.
> > 
> > The EEH core talks with the PCI device driver to determine the
> > action (a plain reset, or PCI device removal). During that period, the
> > driver might be unloaded, which in turn causes a kernel crash as follows:
> > 
> > EEH: Detected PCI bus error on PHB#4-PE#10000
> > EEH: This PCI device has failed 3 times in the last hour
> > lpfc 0004:01:00.0: 0:2710 PCI channel disable preparing for reset
> > Unable to handle kernel paging request for data at address 0x00000490
> > Faulting instruction address: 0xd00000000e682c90
> > cpu 0x1: Vector: 300 (Data Access) at [c000000fc75ffa20]
> >     pc: d00000000e682c90: .lpfc_io_error_detected+0x30/0x240 [lpfc]
> >     lr: d00000000e682c8c: .lpfc_io_error_detected+0x2c/0x240 [lpfc]
> >     sp: c000000fc75ffca0
> >    msr: 8000000000009032
> >    dar: 490
> >  dsisr: 40000000
> >   current = 0xc000000fc79b88b0
> >   paca    = 0xc00000000edb0380	 softe: 0	 irq_happened: 0x00
> >     pid   = 3386, comm = eehd
> > enter ? for help
> > [c000000fc75ffca0] c000000fc75ffd30 (unreliable)
> > [c000000fc75ffd30] c00000000004fd3c .eeh_report_error+0x7c/0xf0
> > [c000000fc75ffdc0] c00000000004ee00 .eeh_pe_dev_traverse+0xa0/0x180
> > [c000000fc75ffe70] c00000000004ffd8 .eeh_handle_event+0x68/0x300
> > [c000000fc75fff00] c0000000000503a0 .eeh_event_handler+0x130/0x1a0
> > [c000000fc75fff90] c000000000020138 .kernel_thread+0x54/0x70
> > 1:mon>
> > 
> > The patch increases the reference count of the corresponding driver
> > modules while the EEH core negotiates with the PCI device driver, so
> > that those modules can't be unloaded during that period and it is safe
> > to refer to their callbacks.
> > 
> > Reported-by: Alexey Kardashevskiy <aik@...abs.ru>
> > Signed-off-by: Gavin Shan <shangw@...ux.vnet.ibm.com>
> > Signed-off-by: Benjamin Herrenschmidt <benh@...nel.crashing.org>
> > [ herton: backported for 3.5, adjusted driver assignments, return 0
> >   instead of NULL, assume dev is not NULL ]
> > Signed-off-by: Herton Ronaldo Krzesinski <herton.krzesinski@...onical.com>
> [...]
> 
> Greg, you probably want this in 3.4 and 3.6.

Many thanks.  Herton, any reason why you didn't forward on this
backported version of the patch?

greg k-h