linux-kernel - Re: [PATCH] xen/xenbus: Use 'void' instead of 'int' for the return of xenbus_switch


Message-ID: <20140929154055.GA31952@laptop.dumpdata.com>
Date:	Mon, 29 Sep 2014 11:40:55 -0400
From:	Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>
To:	David Vrabel <david.vrabel@...rix.com>
Cc:	Chen Gang <gang.chen.5i5j@...il.com>, ian.campbell@...rix.com,
	wei.liu2@...rix.com, boris.ostrovsky@...cle.com,
	bhelgaas@...gle.com, jgross@...e.com,
	yongjun_wei@...ndmicro.com.cn, mukesh.rathor@...cle.com,
	xen-devel@...ts.xenproject.org,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
	linux-pci@...r.kernel.org, linux-scsi@...r.kernel.org,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] xen/xenbus: Use 'void' instead of 'int' for the return
 of xenbus_switch_state()

On Mon, Sep 29, 2014 at 03:17:10PM +0100, David Vrabel wrote:
> On 29/09/14 15:02, Konrad Rzeszutek Wilk wrote:
> > On Sat, Sep 27, 2014 at 12:36:42AM +0800, Chen Gang wrote:
> >> When xenbus_switch_state() fails, it will call xenbus_switch_fatal()
> > 
> > Only on the first depth, not on the subsequent ones (that is, if
> > the first xenbus_switch_fatal fails, it won't try to call
> > xenbus_switch_state again and again).
> > 
> >> internally, so it need not return any status value. Use 'void' instead
> >> of 'int' for xenbus_switch_state() and __xenbus_switch_state().
> > 
> > When that switch occurs (to XenbusStateConnected) won't the watches
> > fire? That would mean the watch functions, if they use
> > xenbus_switch_state(), MUST NOT hold any locks, because they could
> > be executed once more.
> > 
> > Oh wait, we don't have to worry about that right now as the callbacks
> > that pick up the messages from the XenBus are all gated on one mutex
> > anyhow.
> > 
> > Hm, anyhow, I would add this extra piece of information to the patch:
> > 
> > 
> > diff --git a/drivers/xen/xen-pciback/xenbus.c b/drivers/xen/xen-pciback/xenbus.c
> > index c214daa..f7399fd 100644
> > --- a/drivers/xen/xen-pciback/xenbus.c
> > +++ b/drivers/xen/xen-pciback/xenbus.c
> > @@ -661,6 +661,12 @@ static void xen_pcibk_be_watch(struct xenbus_watch *watch,
> >  
> >  	switch (xenbus_read_driver_state(pdev->xdev->nodename)) {
> >  	case XenbusStateInitWait:
> > +		/*
> > +		 * xenbus_switch_state can call xenbus_switch_fatal which will
> > +		 * immediately set the state to XenbusStateClosing which
> > +		 * means if we were reading for it here we MUST drop any
> > +		 * locks so that we don't dead-lock.
> > +		 */
> 
> Watches are asynchronous and serialised by the xenwatch thread.  I can't
> see what deadlock you're talking about here.  Particularly since the
> backend doesn't watch its own state node (it watches the frontend one).
> 
> >  		xen_pcibk_setup_backend(pdev);
> >  		break;
> >  
> >>
> >> Also make sure that all callers which checked the return value now
> >> leave 'err' as 0.
> > 
> > I am a bit uncomfortable with that, because of this:
> > 
> > 
> > .. snip..
> >> diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
> >> index 9c47b89..b5c3d47 100644
> >> --- a/drivers/net/xen-netback/xenbus.c
> >> +++ b/drivers/net/xen-netback/xenbus.c
> >> @@ -337,10 +337,7 @@ static int netback_probe(struct xenbus_device *dev,
> >>  	if (err)
> >>  		pr_debug("Error writing multi-queue-max-queues\n");
> >>  
> >> -	err = xenbus_switch_state(dev, XenbusStateInitWait);
> >> -	if (err)
> >> -		goto fail;
> >> -
> >> +	xenbus_switch_state(dev, XenbusStateInitWait);
> > 
> > Which, if it fails, won't reach:
> > 
> > fail:
> >         pr_debug("failed\n");
> >         netback_remove(dev);
> >         return err;
> > 
> > 
> > And since there is no watch on the backend state that would see it go
> > to Closing, those are never called and we leak memory.
> 
> It's not leaking the memory.  All resources will be recovered when the
> device is removed.

I presume you mean when the XenBus entries are torn down? It does look
like that would call the .remove function, which should take care of the
cleanup.

In which case we can just drop all of the 'netback_remove()' calls and
also remove some of the labels.
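
To illustrate the shape that would leave, here is a small user-space sketch.
It is only a toy model under stated assumptions: every toy_* name is
hypothetical and none of this is the real xenbus API. It assumes a state
switch that returns void and moves the device to a closing state by itself
on failure, and a remove hook that is the single place where probe's
allocations are freed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel types; not the real xenbus API. */
enum toy_state { TOY_INITIALISING, TOY_INIT_WAIT, TOY_CLOSING };

struct toy_device {
	enum toy_state state;
	void *priv;		/* per-device data allocated by probe */
	bool fail_switch;	/* test knob: pretend the state write fails */
};

/*
 * Models a void state switch: on failure it puts the device into the
 * closing state itself instead of handing an error back to the caller.
 */
static void toy_switch_state(struct toy_device *dev, enum toy_state next)
{
	dev->state = dev->fail_switch ? TOY_CLOSING : next;
}

/* Models .remove: the one place that frees what probe allocated. */
static void toy_remove(struct toy_device *dev)
{
	free(dev->priv);
	dev->priv = NULL;
}

/* Probe without a 'fail:' label after the state switch. */
static int toy_probe(struct toy_device *dev)
{
	dev->priv = malloc(64);
	if (!dev->priv)
		return -1;	/* nothing allocated yet, nothing to undo */

	toy_switch_state(dev, TOY_INIT_WAIT);
	return 0;		/* cleanup is left to toy_remove() */
}

/* Walks through the failing case: probe succeeds, remove cleans up. */
static void toy_demo(void)
{
	struct toy_device dev = { TOY_INITIALISING, NULL, true };

	assert(toy_probe(&dev) == 0);
	assert(dev.state == TOY_CLOSING);
	assert(dev.priv != NULL);	/* still held until remove runs */
	toy_remove(&dev);
	assert(dev.priv == NULL);
}
```

The point of the sketch is only that, once the switch cannot report an error,
the probe path has no reason to keep its own teardown; whether the real
drivers can rely on .remove in every case is exactly what needs checking.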


> 
> > The same applies to the probe function in xen-blkback.
> 
> David
> 