Date:	Mon, 2 Dec 2013 12:20:36 +0100 (CET)
From:	Thomas Gleixner <tglx@...utronix.de>
To:	Francis Moreau <francis.moro@...il.com>
cc:	Jingoo Han <jg1.han@...sung.com>,
	'Wei WANG' <wei_wang@...lsil.com.cn>,
	'Samuel Ortiz' <sameo@...ux.intel.com>,
	'Chris Ball' <cjb@...top.org>,
	"Rafael J. Wysocki" <rjw@...ysocki.net>,
	'Borislav Petkov' <bp@...en8.de>,
	'LKML' <linux-kernel@...r.kernel.org>
Subject: Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)

On Mon, 2 Dec 2013, Thomas Gleixner wrote:
> On Sat, 30 Nov 2013, Francis Moreau wrote:
> > Hello Thomas,
> > 
> > Sorry for the delay.
> > 
> > On 11/29/2013 10:02 AM, Thomas Gleixner wrote:
> > > On Fri, 29 Nov 2013, Francis Moreau wrote:
> > >> Since it seems to be related to the rtsx driver or its upper layer, could
> > >> the folks involved in this area have a look at this issue, please?
> > > 
> > > I'm not involved, but looking at the debug objects backtrace it's
> > > related to the delayed work in rtsx.
> > > 
> > > Does the untested patch below cure the issue?
> > > 
> > 
> > It seems it does, since I can't see the debug object trace anymore.
> > However, I can see this now:
> 
> <SNIP>
>  
> > So I don't think it completely solves the problem, but it's a good start.
> 
> I kinda expected that, but I wanted to confirm my suspicion, that the
> interrupt hits after the delayed work is canceled and just requeues it
> again, which then leads to an armed timer being freed further down.
> 
> I'm not familiar with that driver and I leave the final fixup to the
> driver maintainers. It's enough data for them to figure out the real
> solution.

I just had a quick look, and the obvious solution is to disable the
interrupts at the device level _BEFORE_ doing anything else in the
teardown path. Updated patch below. That should avoid the "nobody cared"
splat on the other irq line.
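
To make the ordering argument concrete, here is a minimal user-space
sketch in C. It is not driver code and only models the race described
above: struct fake_dev, fake_irq(), fake_cancel_work() and the two
teardown_*() helpers are hypothetical stand-ins invented for this
illustration, and the "late interrupt" is simply called by hand at the
point where it would hurt.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the device state; not the real struct rtsx_pcr. */
struct fake_dev {
	unsigned int irq_enable;	/* models the interrupt enable register */
	bool work_pending;		/* models an armed delayed work timer */
};

/*
 * Models the interrupt handler: it (re)queues the delayed work, but the
 * device can only raise the interrupt while its enable bits are set.
 */
static void fake_irq(struct fake_dev *d)
{
	if (d->irq_enable)
		d->work_pending = true;
}

/* Models canceling the delayed work. */
static void fake_cancel_work(struct fake_dev *d)
{
	d->work_pending = false;
}

/*
 * Old order: cancel the work while the device can still interrupt.
 * Returns true if work is still armed when the device would be freed.
 */
static bool teardown_cancel_only(struct fake_dev *d)
{
	fake_cancel_work(d);
	fake_irq(d);			/* late interrupt requeues the work */
	return d->work_pending;
}

/*
 * New order: silence the device first, then cancel the work.
 * Returns true if work is still armed when the device would be freed.
 */
static bool teardown_mask_first(struct fake_dev *d)
{
	d->irq_enable = 0;		/* disable interrupts at the device level */
	fake_cancel_work(d);
	fake_irq(d);			/* late interrupt can no longer requeue */
	return d->work_pending;
}
```

Starting from a device with irq_enable set and work pending,
teardown_cancel_only() leaves the work armed, while
teardown_mask_first() leaves nothing armed behind. The sketch ignores
locking and an already running work item, which the patch handles with
the spinlock and the _sync cancel variants.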

Thanks,

	tglx

Index: linux-2.6/drivers/mfd/rtsx_pcr.c
===================================================================
--- linux-2.6.orig/drivers/mfd/rtsx_pcr.c
+++ linux-2.6/drivers/mfd/rtsx_pcr.c
@@ -1228,8 +1228,14 @@ static void rtsx_pci_remove(struct pci_d
 
 	pcr->remove_pci = true;
 
-	cancel_delayed_work(&pcr->carddet_work);
-	cancel_delayed_work(&pcr->idle_work);
+	/* Disable interrupts at the pcr level */
+	spin_lock_irq(&pcr->lock);
+	rtsx_pci_writel(pcr, RTSX_BIER, 0);
+	pcr->bier = 0;
+	spin_unlock_irq(&pcr->lock);
+
+	cancel_delayed_work_sync(&pcr->carddet_work);
+	cancel_delayed_work_sync(&pcr->idle_work);
 
 	mfd_remove_devices(&pcidev->dev);
 
