linux-kernel - Re: [Patch] bonding: fix netpoll in active-backup mode

Message-ID: <4D75AD50.7060400@redhat.com>
Date:	Tue, 08 Mar 2011 12:15:12 +0800
From:	Cong Wang <amwang@...hat.com>
To:	Neil Horman <nhorman@...driver.com>
CC:	linux-kernel@...r.kernel.org, Jay Vosburgh <fubar@...ibm.com>,
	"David S. Miller" <davem@...emloft.net>,
	Herbert Xu <herbert@...dor.hengli.com.au>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	"John W. Linville" <linville@...driver.com>,
	Eric Dumazet <eric.dumazet@...il.com>, netdev@...r.kernel.org
Subject: Re: [Patch] bonding: fix netpoll in active-backup mode

On 2011-03-08 02:50, Neil Horman wrote:
> On Mon, Mar 07, 2011 at 10:11:50PM +0800, Amerigo Wang wrote:
>> netconsole doesn't work in active-backup mode, because we don't do anything
>> for nic failover in active-backup mode. This patch fixes the problem by:
>>
>> 1) make slave_enable_netpoll() and slave_disable_netpoll() callable in softirq
>>     context, that is, moving code after synchronize_rcu_bh() into call_rcu_bh()
>>     callback function, teaching kzalloc() to use GFP_ATOMIC.
>>
>> 2) disable netpoll on old slave and enable netpoll on the new slave.
>>
>> Tested by running ifdown on the current active slave and ifup on it again
>> several times; netconsole works well.
>>
>> Signed-off-by: WANG Cong<amwang@...hat.com>
>>
> I may be missing something, but this seems way over-complicated to me.  I presume
> the problem is that in active-backup mode a failover results in the new active
> slave not having netpoll set up on it?  If that's the case, why not just set up
> netpoll on all slaves when ndo_netpoll_setup is called on the bonding interface?
> I don't see anything immediately catastrophic that would happen as a result.


But we would still need to clean up netpoll on the failing slave, which still
means calling slave_disable_netpoll() from the monitor code, so I see no big
difference from the solution I took.
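
For illustration only, here is a minimal userspace sketch of the failover
step, not the actual bonding code. struct model_slave and the model_* helpers
are hypothetical stand-ins for the slave structure and for
slave_enable_netpoll()/slave_disable_netpoll(); the point is only that the
failing slave has to be cleaned up at failover time in either design.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; this is NOT the kernel's struct slave. */
struct model_slave {
	bool link_up;
	bool netpoll_enabled;
};

/* Stand-in for slave_enable_netpoll(). */
static void model_enable_netpoll(struct model_slave *s)
{
	s->netpoll_enabled = true;
}

/* Stand-in for slave_disable_netpoll(). */
static void model_disable_netpoll(struct model_slave *s)
{
	s->netpoll_enabled = false;
}

/*
 * Failover as described in the patch: disable netpoll on the old active
 * slave and enable it on the new one. Either pointer may be NULL.
 */
static void model_failover(struct model_slave *old_active,
			   struct model_slave *new_active)
{
	if (old_active)
		model_disable_netpoll(old_active);
	if (new_active)
		model_enable_netpoll(new_active);
}
```

If netpoll were instead set up on all slaves up front, the enable call would
move to setup time, but the disable call on the failing slave would stay
where it is.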


> And then you wouldn't have to worry about disabling/enabling anything on a
> failover (or during a panic, for that matter).  As for the rcu bits, why are
> they needed?  One would presume that we wouldn't (or at least shouldn't) be able
> to tear down our netpoll setup until all the pending frames for that
> netpoll client have been transmitted.  If we're not blocking on that, RCU isn't
> really going to help.  Seems like the proper fix is to take a reference to the
> appropriate npinfo struct in netpoll_send_skb, and drop it from the skb's
> destructor or some such.

Without the changes to the rcu bits, I saw a "scheduling while atomic" warning.
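
To make the context problem concrete, here is a rough userspace model; it
assumes nothing about the real netpoll internals. All model_* names and the
in_atomic flag are hypothetical. It contrasts a blocking wait (in the spirit
of synchronize_rcu_bh(), which may sleep) with a queued callback (in the
spirit of call_rcu_bh(), which returns immediately and runs the cleanup
later).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical deferred-callback record, loosely modeled on an RCU head. */
struct model_cb {
	struct model_cb *next;
	void (*func)(struct model_cb *cb);
};

static struct model_cb *pending; /* callbacks waiting for a "grace period" */
static bool in_atomic;           /* models running in softirq context */

/*
 * Blocking variant: refuses to run when atomic. Returning false models the
 * "scheduling while atomic" warning instead of actually sleeping.
 */
static bool model_synchronize(void)
{
	return !in_atomic;
}

/* Non-blocking variant: queue the callback and return immediately. */
static void model_call_deferred(struct model_cb *cb,
				void (*func)(struct model_cb *cb))
{
	cb->func = func;
	cb->next = pending;
	pending = cb;
}

/* Models the end of a grace period: run and count all queued callbacks. */
static int model_run_pending(void)
{
	int n = 0;

	while (pending) {
		struct model_cb *cb = pending;

		pending = cb->next;
		cb->func(cb);
		n++;
	}
	return n;
}

/* Hypothetical per-slave netpoll state; cb is first so a cast recovers it. */
struct model_npinfo {
	struct model_cb cb;
	bool freed;
};

/* Cleanup that would otherwise sit after the blocking wait. */
static void model_npinfo_cleanup(struct model_cb *cb)
{
	struct model_npinfo *np = (struct model_npinfo *)cb;

	np->freed = true;
}
```

In this model the cleanup is safe to request from atomic context because
nothing waits; the work just happens later. The same reasoning is why the
allocation on the enable side has to use GFP_ATOMIC there.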

Thanks!