Message-ID: <542969E6.1010104@citrix.com>
Date:	Mon, 29 Sep 2014 15:17:10 +0100
From:	David Vrabel <david.vrabel@...rix.com>
To:	Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>,
	Chen Gang <gang.chen.5i5j@...il.com>
CC:	<ian.campbell@...rix.com>, <wei.liu2@...rix.com>,
	<boris.ostrovsky@...cle.com>, <bhelgaas@...gle.com>,
	<jgross@...e.com>, <yongjun_wei@...ndmicro.com.cn>,
	<mukesh.rathor@...cle.com>, <xen-devel@...ts.xenproject.org>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
	<linux-pci@...r.kernel.org>, <linux-scsi@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] xen/xenbus: Use 'void' instead of 'int' for the return
 of xenbus_switch_state()

On 29/09/14 15:02, Konrad Rzeszutek Wilk wrote:
> On Sat, Sep 27, 2014 at 12:36:42AM +0800, Chen Gang wrote:
>> When xenbus_switch_state() fails, it will call xenbus_switch_fatal()
> 
> Only at the first level, not on subsequent ones (that is, if the
> first xenbus_switch_fatal fails, it won't try to call
> xenbus_switch_state again and again).
> 
>> internally, so it need not return any status value. Use 'void' instead
>> of 'int' for xenbus_switch_state() and __xenbus_switch_state().
> 
> When that switch occurs (to XenbusStateConnected) won't the watches
> fire - meaning we MUST make sure that the watch functions - if they
> use the xenbus_switch_state() they MUST not hold any locks - because
> they could be executed once more?
> 
> Oh wait, we don't have to worry about that right now as the callbacks
> that pick up the messages from the XenBus are all gated on one mutex
> anyhow.
> 
> Hm, anyhow, I would add this extra piece of information to the patch:
> 
> 
> diff --git a/drivers/xen/xen-pciback/xenbus.c b/drivers/xen/xen-pciback/xenbus.c
> index c214daa..f7399fd 100644
> --- a/drivers/xen/xen-pciback/xenbus.c
> +++ b/drivers/xen/xen-pciback/xenbus.c
> @@ -661,6 +661,12 @@ static void xen_pcibk_be_watch(struct xenbus_watch *watch,
>  
>  	switch (xenbus_read_driver_state(pdev->xdev->nodename)) {
>  	case XenbusStateInitWait:
> +		/*
> +		 * xenbus_switch_state can call xenbus_switch_fatal which will
> +		 * immediately set the state to XenbusStateClosing which
> +		 * means if we were reading for it here we MUST drop any
> +		 * locks so that we don't dead-lock.
> +		 */

Watches are asynchronous and serialised by the xenwatch thread.  I can't
see what deadlock you're talking about here, particularly since the
backend doesn't watch its own state node (it watches the frontend one).
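
To make the serialisation point concrete, here is a minimal user-space
sketch.  It is a toy model only: fire_watch(), toy_switch_state(),
frontend_changed() and run_watch_thread() are hypothetical names, not
kernel APIs, and the real xenwatch thread is more involved.  It assumes
the pessimistic case where a state switch made inside a callback
triggers another event for that same callback.  The event is only
queued, and the single dispatcher runs it after the current callback
has returned, so the callback is never re-entered.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of serialised watch dispatch; all names are hypothetical. */
#define QUEUE_MAX 16

typedef void (*watch_cb)(void);

static watch_cb queue[QUEUE_MAX];
static size_t head, tail;

int depth;      /* current callback nesting level */
int max_depth;  /* deepest nesting observed */
int handled;    /* number of callbacks run so far */

/* Firing a watch only queues the callback; it never runs it directly. */
void fire_watch(watch_cb cb)
{
	assert(tail < QUEUE_MAX);
	queue[tail++] = cb;
}

void frontend_changed(void);

/* Stand-in for a state switch: the write just queues an event. */
void toy_switch_state(void)
{
	fire_watch(frontend_changed);
}

/* Watch callback that switches state while it is still running. */
void frontend_changed(void)
{
	depth++;
	if (depth > max_depth)
		max_depth = depth;
	handled++;
	if (handled == 1)
		toy_switch_state();
	depth--;
}

/* Single dispatcher: callbacks run strictly one after another. */
void run_watch_thread(void)
{
	while (head < tail)
		queue[head++]();
}
```

Firing one event and then running the dispatcher handles two callbacks
in total, with a nesting depth that never exceeds one.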

>  		xen_pcibk_setup_backend(pdev);
>  		break;
>  
>>
>> Also make sure that every caller which checked the return value now
>> leaves 'err' as 0.
> 
> I am a bit uncomfortable with that, because:
> 
> 
> .. snip..
>> diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
>> index 9c47b89..b5c3d47 100644
>> --- a/drivers/net/xen-netback/xenbus.c
>> +++ b/drivers/net/xen-netback/xenbus.c
>> @@ -337,10 +337,7 @@ static int netback_probe(struct xenbus_device *dev,
>>  	if (err)
>>  		pr_debug("Error writing multi-queue-max-queues\n");
>>  
>> -	err = xenbus_switch_state(dev, XenbusStateInitWait);
>> -	if (err)
>> -		goto fail;
>> -
>> +	xenbus_switch_state(dev, XenbusStateInitWait);
> 
> Which, if it fails, won't call:
> 
> fail:
> 	pr_debug("failed\n");
> 	netback_remove(dev);
> 	return err;
> 
> 
> And since there is no watch on the backend state to go in Closing it won't
> ever call those and we leak memory.

It's not leaking memory.  All resources will be recovered when the
device is removed.
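
As a rough illustration, here is a toy user-space model of that
lifecycle.  Everything in it is an assumption made for the sketch:
toy_device, toy_switch_state(), toy_probe(), toy_remove() and the
live_allocations counter are hypothetical and do not mirror the real
netback code.  The state switch returns void and handles its own
failure by moving to a closing state, probe still returns 0, and the
allocation made in probe is released when remove runs.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model only; none of these names exist in the kernel. */
enum toy_state { TOY_INITIALISING, TOY_INIT_WAIT, TOY_CLOSING };

struct toy_device {
	enum toy_state state;
	void *backend_info;  /* allocated in probe, freed in remove */
	int fail_switch;     /* non-zero simulates a failing state write */
};

int live_allocations;

/* Returns void: a failure is handled here by moving to the closing state. */
void toy_switch_state(struct toy_device *dev, enum toy_state state)
{
	if (dev->fail_switch) {
		dev->state = TOY_CLOSING;
		return;
	}
	dev->state = state;
}

int toy_probe(struct toy_device *dev)
{
	dev->backend_info = malloc(64);
	if (!dev->backend_info)
		return -1;
	live_allocations++;

	/* No return value to check, so there is no error path from here. */
	toy_switch_state(dev, TOY_INIT_WAIT);
	return 0;
}

/* Runs when the device is removed and releases what probe allocated. */
void toy_remove(struct toy_device *dev)
{
	if (!dev->backend_info)
		return;
	free(dev->backend_info);
	dev->backend_info = NULL;
	live_allocations--;
}
```

With fail_switch set, toy_probe() succeeds with one live allocation and
the device in the closing state; toy_remove() then brings the count
back to zero.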

> The same applies to the xen-blkback probe function.

David
