Message-ID: <20110524231813.GA2350@neilslaptop.think-freely.org>
Date:	Tue, 24 May 2011 19:18:13 -0400
From:	Neil Horman <nhorman@...driver.com>
To:	Jay Vosburgh <fubar@...ibm.com>
Cc:	Nicolas de Pesloüan <nicolas.2p.debian@...il.com>,
	Andy Gospodarek <andy@...yhouse.net>, netdev@...r.kernel.org,
	"David S. Miller" <davem@...emloft.net>
Subject: Re: [PATCH] bonding: prevent deadlock on slave store with alb mode

On Tue, May 24, 2011 at 02:12:54PM -0700, Jay Vosburgh wrote:
> Nicolas de Pesloüan <nicolas.2p.debian@...il.com> wrote:
> 
> >On 24/05/2011 22:37, Neil Horman wrote:
> >
> >>>>> +		return -EINVAL;
> >>>
> >>> This will turn a warning into an error.
> >>>
> >> Yes, because it should have been an error all along.
> >>
> >>> This warning has existed for a long time, but never caused the bonding setup to
> >>> fail. This patch causes a regression for user space. For example, the
> >>> current ifenslave-2.6 package in Debian doesn't ensure the bond is UP
> >>> before enslaving, because this was never required.
> >>>
> >> That's not a regression, that's the kernel returning an error where it should have
> >> done so all along.  Just because a utility got away with it for a while and it
> >> didn't always cause a lockup doesn't grandfather that application into a
> >> situation where the kernel has to support its broken behavior in perpetuity.
> >>
> >> Besides, iirc, the ifenslave utility still uses the ioctl path, which this
> >> patch doesn't touch, so ifenslave is currently unaffected (although I should
> >> look in the ioctl path to see if we have already added such a check, lest you be
> >> able to deadlock your system as previously indicated using that tool).
> >
> >Unfortunately, no. Recent versions of ifenslave-2.6 on Debian don't use
> >ioctl (ifenslave binary) anymore, but only sysfs.
> >
> >Documentation/bonding.txt should be updated to reflect this change.
> >pr_warning should be changed to pr_err.
> >Bonding version should be bumped.
> >
> >Anyway, I will fix this package, but I suspect there exist many user
> >scripts that don't ensure bond is up before enslaving.
> 
> 	I looked at sysconfig (as supplied with opensuse) and it uses
> sysfs, and does set the master device up first.  The other potential
> user that comes to mind is that OFED at one point had a script to set up
> bonding for Infiniband devices.  I don't know if this is still the case,
> nor do I know if it set the bond device up before enslaving.
> 
> 	Generally speaking, though, in the long run I think it should be
> permissible to change any bonding option when the bond is down (even to
> values that make no sense in context, e.g., setting the primary to a
> device not currently enslaved).  My rationale here is that some options
> are very difficult to modify when the bond is up (e.g., changing the
> mode), and now some other set is precluded when the bond is down.  The
> init scripts already have repeat logic in them; this just makes things
> more complicated.
> 
> 	There should be a state wherein any option can be changed (well,
> maybe not max_bonds), and that should be the down state.  A subset can
> also be changed while up.  I'd be happy to be able to change all options
> while the bond is up, too, but that seems pretty hard to do.
> 
> 	How much harder is it to fix the locking and permit the action
> in question here?
> 
In this case, just hacking something in place is pretty easy: I can
initialize the spinlocks for all cases in the bond_create path.  But doing it in
any sensible way is much harder, since the code is written such that you
initialize the relevant data structures based on the mode of the bond,
which, as you indicated above, you want the right to change up until the point
where you ifup the bonded interface.
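
To make the difference between the two approaches concrete, here is a rough
userspace model, not kernel code.  Every name in it (model_bond, model_lock,
model_create_lazy, model_create_eager, model_open, model_enslave) is made up
for illustration, the "lock" is just a flag, and returning -EDEADLK stands in
for the hang you would actually see when an uninitialized lock is taken.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model only; these are not the bonding driver's types. */
enum model_mode { MODEL_MODE_ROUNDROBIN, MODEL_MODE_ALB };

/* Stand-in for a spinlock: we only track whether it was initialized. */
struct model_lock {
	bool initialized;
};

struct model_bond {
	enum model_mode mode;
	bool up;
	struct model_lock alb_lock;	/* mode-specific lock */
};

static void model_lock_init(struct model_lock *lock)
{
	assert(lock != NULL);
	lock->initialized = true;
}

/* Lazy variant: mode-specific state is left alone until the bond goes up. */
static void model_create_lazy(struct model_bond *bond, enum model_mode mode)
{
	assert(bond != NULL);
	bond->mode = mode;
	bond->up = false;
	bond->alb_lock.initialized = false;
}

/* Eager variant (the quick fix): init every lock at create time. */
static void model_create_eager(struct model_bond *bond, enum model_mode mode)
{
	model_create_lazy(bond, mode);
	model_lock_init(&bond->alb_lock);
}

/* Bringing the bond up initializes whatever the current mode needs. */
static void model_open(struct model_bond *bond)
{
	assert(bond != NULL);
	if (bond->mode == MODEL_MODE_ALB)
		model_lock_init(&bond->alb_lock);
	bond->up = true;
}

/*
 * Enslaving in alb mode takes the mode-specific lock.  In this model,
 * -EDEADLK marks the case where the lock was never initialized.
 */
static int model_enslave(const struct model_bond *bond)
{
	assert(bond != NULL);
	if (bond->mode == MODEL_MODE_ALB && !bond->alb_lock.initialized)
		return -EDEADLK;
	return 0;
}
```

In this model, enslaving to a lazily created alb bond fails until model_open()
has run, while the eagerly created bond accepts the slave right away.  The
eager variant hides the symptom but still leaves the rest of the mode-specific
setup tied to the up transition.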

The whole thing is predicated on the notion that
transitioning from the down to the up state is the gating factor for initializing the
current configuration.  What might work is an in-between state in which you commit
and initialize a bond based on the current configuration.  Doing so would allow
you to (re-)initialize a bond configuration in a safe state.  Only after
committing a configuration could you enslave devices or ifup the bond.  Once up,
further commits would not be permitted until the bond was brought down again.
Of course, this would also require changing the semantics of the user space
tools.
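
As a rough sketch of what such a commit step could look like, here is a small
userspace state machine.  Nothing like this exists in the driver today; all of
the names (sketch_bond, sketch_set_mode, sketch_commit, sketch_enslave,
sketch_open, sketch_close) and the error codes are my own placeholders.  It
also assumes that changing an option after a commit drops the bond back to the
uncommitted state, which is a choice I am making for the sketch, not something
settled above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical states for the proposed scheme; not existing kernel code. */
enum sketch_state { SKETCH_UNCOMMITTED, SKETCH_COMMITTED, SKETCH_UP };

struct sketch_bond {
	enum sketch_state state;
	int mode;		/* requested mode */
	int committed_mode;	/* mode the structures were initialized for */
	int nslaves;
};

static void sketch_init(struct sketch_bond *bond)
{
	assert(bond != NULL);
	bond->state = SKETCH_UNCOMMITTED;
	bond->mode = 0;
	bond->committed_mode = -1;
	bond->nslaves = 0;
}

/* Options may change while down; doing so invalidates a prior commit
 * (assumption made for this sketch). */
static int sketch_set_mode(struct sketch_bond *bond, int mode)
{
	assert(bond != NULL);
	if (bond->state == SKETCH_UP)
		return -EBUSY;
	bond->mode = mode;
	bond->state = SKETCH_UNCOMMITTED;
	return 0;
}

/* Commit: (re-)initialize for the current configuration.  Not allowed
 * while the bond is up. */
static int sketch_commit(struct sketch_bond *bond)
{
	assert(bond != NULL);
	if (bond->state == SKETCH_UP)
		return -EBUSY;
	bond->committed_mode = bond->mode;
	bond->state = SKETCH_COMMITTED;
	return 0;
}

/* Enslaving requires a committed configuration. */
static int sketch_enslave(struct sketch_bond *bond)
{
	assert(bond != NULL);
	if (bond->state == SKETCH_UNCOMMITTED)
		return -EINVAL;
	bond->nslaves++;
	return 0;
}

/* ifup also requires a committed configuration. */
static int sketch_open(struct sketch_bond *bond)
{
	assert(bond != NULL);
	if (bond->state == SKETCH_UNCOMMITTED)
		return -EINVAL;
	bond->state = SKETCH_UP;
	return 0;
}

/* ifdown returns to the committed state, where commits are allowed again. */
static void sketch_close(struct sketch_bond *bond)
{
	assert(bond != NULL);
	if (bond->state == SKETCH_UP)
		bond->state = SKETCH_COMMITTED;
}
```

With this ordering, a script that sets the mode, commits, enslaves and then
brings the bond up succeeds, while one that enslaves before committing gets
-EINVAL instead of a hang.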

This also raises the question: is it or is it not safe to enslave devices while
the bond is down?  Clearly from the bug report it's unsafe, and I don't know what
other conditions (if any) cause problems when doing this (be that a
deadlock, a panic, or simply undefined or unexpected behavior).  If it's really
unsafe, then issuing a warning seems incorrect: we shouldn't allow user space to
cause things like this, and as such, we should return an error.  If it is safe
(generally) and this is an isolated bug, then we should probably remove the
warning.  But to just issue a vague 'This might do bad things' warning seems
wrong in either case.

I'll respin the patch to just initialize the spinlocks in the morning, if that's
what will fix the deadlock, but it really seems like the wrong way to go to me.
If enslaving devices to a bond while it's down has been known to cause problems,
then we shouldn't allow it and we should update the user space tools to
understand and handle that.
Neil

> 	-J
> 
> ---
> 	-Jay Vosburgh, IBM Linux Technology Center, fubar@...ibm.com
> 