lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAE3RKNnrDDh1Y_4+NAKuG3-A9qaC-mPXr7e9vefrNHHUUTE=SA@mail.gmail.com>
Date:	Thu, 7 Mar 2013 15:59:18 +0100
From:	Veaceslav Falico <darkmag@...il.com>
To:	Michael Wang <wangyun@...ux.vnet.ibm.com>
Cc:	Linda Walsh <lkml@...nx.org>,
	Eric Dumazet <eric.dumazet@...il.com>,
	Linux-Kernel <linux-kernel@...r.kernel.org>,
	netdev <netdev@...r.kernel.org>, Jay Vosburgh <fubar@...ibm.com>,
	Jeff Kirsher <jeffrey.t.kirsher@...el.com>,
	Cong Wang <xiyou.wangcong@...il.com>,
	Andy Gospodarek <andy@...yhouse.net>
Subject: Re: upgrade to 3.8.1 : BUG Scheduling while atomic in bonding driver:

On Thu, Mar 7, 2013 at 10:50 AM, Michael Wang
<wangyun@...ux.vnet.ibm.com> wrote:
> On 03/07/2013 04:50 PM, Linda Walsh wrote:
>>
>> I am *not* seeing the bug in 3.8.2 with the 2nd patch applied (in
>> addition to the first)...
>
> So that means bond lock is the reason, nice, but this is really not a
> good fix if we just unlock it...

You're right, we can't unlock bond->lock while in bond_for_each_slave(),
however I think we don't even need that bond_update_speed_duplex() in
bond_miimon_commit() - cause the speed/duplex on interface state change
were already updated via NETDEV_UP/CHANGE event in
bond_slave_netdev_event() - and if not, it's not our fault, but the
driver's that he didn't call the notifiers.

So, maybe something like this would work (any feedback welcome, it's
definitely not the first time we hit the wall with sleeping in
bond_update_speed_duplex):

Subject: [PATCH] bonding: don't call update_speed_duplex() under spinlocks

bond_update_speed_duplex() might sleep while calling underlying slave's
routines. Move it out of atomic context in bond_enslave() and remove it
from bond_miimon_commit() - it was introduced by commit 546add79, however
when the slave interfaces go up/change state it's their responsability to
fire NETDEV_UP/NETDEV_CHANGE events so that bonding can properly update
their speed.

---
 drivers/net/bonding/bond_main.c |    6 ++----
 1 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 7bd068a..c63a33c 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -1746,6 +1746,8 @@ int bond_enslave(struct net_device *bond_dev,
struct net_device *slave_dev)

 	bond_compute_features(bond);

+	bond_update_speed_duplex(new_slave);
+
 	read_lock(&bond->lock);

 	new_slave->last_arp_rx = jiffies -
@@ -1798,8 +1800,6 @@ int bond_enslave(struct net_device *bond_dev,
struct net_device *slave_dev)
 		new_slave->link == BOND_LINK_DOWN ? "DOWN" :
 			(new_slave->link == BOND_LINK_UP ? "UP" : "BACK"));

-	bond_update_speed_duplex(new_slave);
-
 	if (USES_PRIMARY(bond->params.mode) && bond->params.primary[0]) {
 		/* if there is a primary slave, remember it */
 		if (strcmp(bond->params.primary, new_slave->dev->name) == 0) {
@@ -2373,8 +2373,6 @@ static void bond_miimon_commit(struct bonding *bond)
 				bond_set_backup_slave(slave);
 			}

-			bond_update_speed_duplex(slave);
-
 			pr_info("%s: link status definitely up for interface %s, %u Mbps
%s duplex.\n",
 				bond->dev->name, slave->dev->name,
 				slave->speed, slave->duplex ? "full" : "half");
-- 
1.7.1

>
> The better way is to move the cycle wait logical out of the
> bond_update_speed_duplex() IMO, I think we need the folk who work on
> this driver to make the decision ;-)
>
> Regards,
> Michael Wang
>
>>
>>
>> Michael Wang wrote:
>>>
>>>
>>> And both bond_enslave() and bond_mii_monitor() are using bond_update_speed_duplex()
>>> with preempt disabled.
>>>
>>> Along with the changes in bond_enslave(), I think you also need this (untested):
>>>
>>> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>>> index 11d01d6..9af143a 100644
>>> --- a/drivers/net/bonding/bond_main.c
>>> +++ b/drivers/net/bonding/bond_main.c
>>> @@ -2373,7 +2373,9 @@ static void bond_miimon_commit(struct bonding *bond)
>>>                                 bond_set_backup_slave(slave);
>>>                         }
>>>
>>> +                       read_unlock(&bond->lock);
>>>                         bond_update_speed_duplex(slave);
>>> +                       read_lock(&bond->lock);
>>>
>>>                         pr_info("%s: link status definitely up for interface %s, %u Mbps %s duplex.\n",
>>>                                 bond->dev->name, slave->dev->name,
>>>
>>>
>>> Regards,
>>> Michael Wang
>>>
>>>
>>>
>>
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



-- 
Best regards,
Veaceslav Falico
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ