lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Thu, 6 Jun 2013 18:02:25 -0700
From:	Greg KH <gregkh@...uxfoundation.org>
To:	Yanmin Zhang <yanmin_zhang@...ux.intel.com>
Cc:	Thomas Gleixner <tglx@...utronix.de>,
	Liu ShuoX <shuox.liu@...el.com>, linux-kernel@...r.kernel.org,
	fweisbec@...il.com, sedat.dilek@...il.com,
	srivatsa.bhat@...ux.vnet.ibm.com,
	Zhang Yanmin <yanmin.zhang@...el.com>
Subject: Re: [PATCH] irq: add a new function irq_in_progress to check pending
 IRQ handlers

On Fri, Jun 07, 2013 at 08:53:29AM +0800, Yanmin Zhang wrote:
> On Thu, 2013-06-06 at 15:18 +0200, Thomas Gleixner wrote:
> > On Thu, 6 Jun 2013, shuox.liu@...el.com wrote:
> > > From: Zhang Yanmin <yanmin.zhang@...el.com>
> > > 
> > > synchronize_irq waits pending IRQ handlers to be finished. If using this
> > > function while holding a resource, the IRQ handler may cause deadlock.
> > > 
> > > Here we add a new function irq_in_progress which doesn't wait for the handlers
> > > to be finished.
> > > 
> > > A typical use case at suspend-to-ram:
> > > 
> > > device driver's irq handler is complicated and might hold a mutex at rare cases.
> > > Its suspend function is called and a suspended flag is set.
> > > In case its IRQ handler is running, suspend function calls irq_in_progress. if
> > > handler is running, abort suspend.
> > > The irq handler checks the suspended flag. If the device is suspended, irq handler
> > > either ignores the interrupt, or wakes up the whole system, and the driver's
> > > resume function could deal with the delayed interrupt handling.
> > 
> > This is as wrong as it can be. Fix the driver instead of hacking racy
> > functions into the core code.
> > 
> > So your problem looks like this:
> > 
> > CPU 0				CPU 1
> > irq_handler_thread()		suspend()
> >    .....			mutex_lock(&m);
> >    mutex_lock(&m);		synchronize_irq();
> > 
> > So why needs the mutex to be taken before synchronize_irq()? Why not
> > doing the obvious?
> > 
> > suspend()
> >   disable_irq(); (Implies synchronize_irq)
> >   mutex_lock(&m);
> >   ....
> >   mutex_unlock(&m);
> >   enable_irq();
> Thanks for the kind comment.
> 
> We do consider your solution before and it works well indeed with some specific
> simple drivers. For example, some drives use GPIO pin as interrupt source.
> 
> On one specific platform, disable_irq would really disable the irq at RTE entry,
> which means we lose the wakeup capability of this device.
> synchronize_irq can be another solution. But we did hit 'DPM device timeout' issue
> reported by dpm_wd_handler at suspend-to-ram.
> 
> The driver is complicated. We couldn't change too many functions to optimize it.
> In addition, we have to use the driver instead of throwing it away.

What is preventing you from rewriting it to work properly?

> With irq_in_progress, we can resolve this issue and it does work, although it
> looks like ugly.

Don't paper over driver bugs in the core kernel, fix the driver.

greg k-h
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ