lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Thu, 6 Jun 2013 21:19:22 -0700
From:	Greg KH <gregkh@...uxfoundation.org>
To:	Yanmin Zhang <yanmin_zhang@...ux.intel.com>
Cc:	Thomas Gleixner <tglx@...utronix.de>,
	Liu ShuoX <shuox.liu@...el.com>, linux-kernel@...r.kernel.org,
	fweisbec@...il.com, sedat.dilek@...il.com,
	srivatsa.bhat@...ux.vnet.ibm.com,
	Zhang Yanmin <yanmin.zhang@...el.com>
Subject: Re: [PATCH] irq: add a new function irq_in_progress to check pending
 IRQ handlers

On Fri, Jun 07, 2013 at 11:18:06AM +0800, Yanmin Zhang wrote:
> On Thu, 2013-06-06 at 20:08 -0700, Greg KH wrote:
> > On Fri, Jun 07, 2013 at 10:37:52AM +0800, Yanmin Zhang wrote:
> > > On Thu, 2013-06-06 at 18:02 -0700, Greg KH wrote:
> > > > On Fri, Jun 07, 2013 at 08:53:29AM +0800, Yanmin Zhang wrote:
> > > > > On Thu, 2013-06-06 at 15:18 +0200, Thomas Gleixner wrote:
> > > > > > On Thu, 6 Jun 2013, shuox.liu@...el.com wrote:
> > > > > > > From: Zhang Yanmin <yanmin.zhang@...el.com>
> > > > > > > 
> > > > > > > synchronize_irq waits pending IRQ handlers to be finished. If using this
> > > > > > > function while holding a resource, the IRQ handler may cause deadlock.
> > > > > > > 
> > > > > > > Here we add a new function irq_in_progress which doesn't wait for the handlers
> > > > > > > to be finished.
> > > > > > > 
> > > > > > > A typical use case at suspend-to-ram:
> > > > > > > 
> > > > > > > device driver's irq handler is complicated and might hold a mutex at rare cases.
> > > > > > > Its suspend function is called and a suspended flag is set.
> > > > > > > In case its IRQ handler is running, suspend function calls irq_in_progress. if
> > > > > > > handler is running, abort suspend.
> > > > > > > The irq handler checks the suspended flag. If the device is suspended, irq handler
> > > > > > > either ignores the interrupt, or wakes up the whole system, and the driver's
> > > > > > > resume function could deal with the delayed interrupt handling.
> > > > > > 
> > > > > > This is as wrong as it can be. Fix the driver instead of hacking racy
> > > > > > functions into the core code.
> > > > > > 
> > > > > > So your problem looks like this:
> > > > > > 
> > > > > > CPU 0				CPU 1
> > > > > > irq_handler_thread()		suspend()
> > > > > >    .....			mutex_lock(&m);
> > > > > >    mutex_lock(&m);		synchronize_irq();
> > > > > > 
> > > > > > So why needs the mutex to be taken before synchronize_irq()? Why not
> > > > > > doing the obvious?
> > > > > > 
> > > > > > suspend()
> > > > > >   disable_irq(); (Implies synchronize_irq)
> > > > > >   mutex_lock(&m);
> > > > > >   ....
> > > > > >   mutex_unlock(&m);
> > > > > >   enable_irq();
> > > > > Thanks for the kind comment.
> > > > > 
> > > > > We do consider your solution before and it works well indeed with some specific
> > > > > simple drivers. For example, some drives use GPIO pin as interrupt source.
> > > > > 
> > > > > On one specific platform, disable_irq would really disable the irq at RTE entry,
> > > > > which means we lose the wakeup capability of this device.
> > > > > synchronize_irq can be another solution. But we did hit 'DPM device timeout' issue
> > > > > reported by dpm_wd_handler at suspend-to-ram.
> > > > > 
> > > > > The driver is complicated. We couldn't change too many functions to optimize it.
> > > > > In addition, we have to use the driver instead of throwing it away.
> > > > 
> > > > What is preventing you from rewriting it to work properly?
> > > It's complicated.
> > 
> > That sounds like your issue, not ours, so please don't push the problem
> > onto someone else.  Take ownership of the driver, fix it up, and all
> > will be good.
> > 
> > 
> > > > > With irq_in_progress, we can resolve this issue and it does work, although it
> > > > > looks like ugly.
> > > > 
> > > > Don't paper over driver bugs in the core kernel, fix the driver.
> > > It's hard to say it's a driver bug. Could generic kernel provide some
> > > flexibility for such complicated drivers?
> > 
> > Please post the code for the driver, and then we will be glad to
> > continue the dicussion.
> Greg,
> 
> Thanks for your enthusiasm. It's a private driver for products.

What do you mean "private driver"?  All drivers need to be merged into
the mainline kernel, it saves you time and money, and we will fix your
bugs for you.

You know that, your bosses know that, your management knows that, so why
are you trying to hide things?

totally confused,

greg k-h
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ