Message-ID: <ebd40b1d-c4a2-7e39-2841-88e60edddafe@rock-chips.com>
Date:   Tue, 29 Aug 2017 16:08:52 +0800
From:   Shawn Lin <shawn.lin@...k-chips.com>
To:     Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Cc:     shawn.lin@...k-chips.com, Ulf Hansson <ulf.hansson@...aro.org>,
        "Rafael J. Wysocki" <rjw@...ysocki.net>,
        Heiko Stuebner <heiko@...ech.de>,
        Jaehoon Chung <jh80.chung@...sung.com>,
        linux-mmc@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 1/2] driver core: detach device's pm_domain after
 devres_release_all

Hi Greg,

On 2017/8/29 14:42, Greg Kroah-Hartman wrote:
> On Tue, Aug 15, 2017 at 04:36:56PM +0800, Shawn Lin wrote:
>> Move dev_pm_domain_detach after devres_release_all to avoid
>> accessing device's registers with genpd been powered off.
> 
> So, what is this going to break that is working already today?  :)

Thanks for your comment!

The background of this patch is:
(1) Some SoCs, including Rockchip SoCs, can't support accessing
controllers' registers w/o the clk and power domain enabled.
(2) Many common drivers use devm_request_irq to request irqs, whether
shared or non-shared.
(3) So we rely on devres_release_all to free the irq automatically.

So the actual race condition is:
(1) Driver A's probe fails, or its remove is called
(2) The power domain is detached right away
(3) An irq triggers concurrently, just before devm_irq_release is called..
(4) Driver A's ISR reads its registers .. panic..

The issue is exposed by enabling CONFIG_DEBUG_SHIRQ. With it, devres_free_irq
will try to call the ISR, as the comment says: "It's a shared IRQ -- the driver
ought to be prepared for an IRQ event to happen even now it's being
freed". So it calls the driver's ISR w/o the power domain enabled, which
hangs the system... In theory this helps folks make the code
robust enough to deal with the shared case.

But no matter whether the irq is shared or non-shared, the race
condition is there. So we possibly have two choices:
(1) Either use request_irq and free_irq directly
(2) Or move dev_pm_domain_detach after devres_release_all, which
makes sure we free the irq before powering off the power domain.

However, doesn't choice (1) imply that devm_request_irq shouldn't
exist? :) So I tried to fix it the way this patch does.
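
To make the ordering argument concrete, here is a rough user-space model
of the two teardown orders. It is only a sketch and not kernel code: all
model_* names are made up for illustration, and it assumes the
CONFIG_DEBUG_SHIRQ behaviour described above, i.e. the handler gets
invoked once more while the irq is being freed.

```c
/* User-space model of the teardown ordering; NOT kernel code.
 * Every model_* name here is hypothetical, for illustration only. */
#include <stdbool.h>

struct model_dev {
	bool domain_on;      /* power domain attached and powered */
	bool irq_requested;  /* handler still registered via devres */
	int bad_accesses;    /* register reads made with the domain off */
};

/* Stand-in for driver A's ISR: it reads a device register. */
static void model_isr(struct model_dev *dev)
{
	if (!dev->domain_on)
		dev->bad_accesses++; /* on real hardware: hang or panic */
}

/* Stand-in for releasing a devres-managed irq. Assumption: as with
 * CONFIG_DEBUG_SHIRQ, the handler runs once more during the free. */
static void model_devres_release_all(struct model_dev *dev)
{
	if (dev->irq_requested) {
		model_isr(dev);
		dev->irq_requested = false;
	}
}

/* Stand-in for detaching (and powering off) the power domain. */
static void model_pm_domain_detach(struct model_dev *dev)
{
	dev->domain_on = false;
}

/* Returns the number of unpowered register accesses during teardown. */
int model_teardown(bool detach_first)
{
	struct model_dev dev = { .domain_on = true, .irq_requested = true };

	if (detach_first) {
		/* current order: domain goes away before the irq is freed */
		model_pm_domain_detach(&dev);
		model_devres_release_all(&dev);
	} else {
		/* proposed order: free the irq first, then detach */
		model_devres_release_all(&dev);
		model_pm_domain_detach(&dev);
	}
	return dev.bad_accesses;
}
```

In this model, model_teardown(true) reports one register access with the
domain off, while model_teardown(false) reports none, which is the
ordering choice (2) aims for.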

> 
>>
>> Signed-off-by: Shawn Lin <shawn.lin@...k-chips.com>
>> ---

...

> 
> Why is this set to true if you have a driver remove function, but not if
> you only have a bus remove function?  Why the difference?
> 
> 

Sorry, I will fix all of these and always call dev_pm_domain_detach on the
error path.

>> +		}
>>   		devres_release_all(dev);
>> +		if (do_pm_domain)
>> +			dev_pm_domain_detach(dev, true);
>>   		driver_sysfs_remove(dev);
>>   		dev->driver = NULL;
>>   		dev_set_drvdata(dev, NULL);
>> @@ -458,6 +476,8 @@ static int really_probe(struct device *dev, struct device_driver *drv)
>>   pinctrl_bind_failed:
>>   	device_links_no_driver(dev);
>>   	devres_release_all(dev);
>> +	if (do_pm_domain)
>> +		dev_pm_domain_detach(dev, true);
> 
> Can't you just always call this on the error path?
> 
>>   	driver_sysfs_remove(dev);
>>   	dev->driver = NULL;
>>   	dev_set_drvdata(dev, NULL);
>> @@ -818,6 +838,7 @@ int driver_attach(struct device_driver *drv)
>>   static void __device_release_driver(struct device *dev, struct device *parent)
>>   {
>>   	struct device_driver *drv;
>> +	bool do_pm_domain = false;
>>   
>>   	drv = dev->driver;
>>   	if (drv) {
>> @@ -855,15 +876,19 @@ static void __device_release_driver(struct device *dev, struct device *parent)
>>   
>>   		pm_runtime_put_sync(dev);
>>   
>> -		if (dev->bus && dev->bus->remove)
>> +		if (dev->bus && dev->bus->remove) {
>>   			dev->bus->remove(dev);
>> -		else if (drv->remove)
>> +		} else if (drv->remove) {
>> +			do_pm_domain = true;
> 
> Same question here about drivers and bus default functions.
> 
> thanks,
> 
> greg k-h
> 
> 
> 
