Message-ID: <1780.1416279507@famine>
Date: Mon, 17 Nov 2014 18:58:27 -0800
From: Jay Vosburgh <jay.vosburgh@...onical.com>
To: Eric Dumazet <eric.dumazet@...il.com>
cc: Wengang <wen.gang.wang@...cle.com>, netdev@...r.kernel.org
Subject: Re: [PATCH] [bonding]: clear header_ops when last slave detached
Eric Dumazet <eric.dumazet@...il.com> wrote:
>On Tue, 2014-11-18 at 09:56 +0800, Wengang wrote:
>> Hi Jay,
>>
>> On 2014-11-18 09:38, Jay Vosburgh wrote:
>> > Wengang <wen.gang.wang@...cle.com> wrote:
>> >
>> >> Hi,
>> >>
>> >> Could anybody please review this patch?
>> > I don't see that the original of this ever came through netdev.
>>
>> Oh, that's bad. I sent this to netdev@...r.kernel.org. Is the mail address
>> wrong?
>>
>> >> thanks,
>> >> wengang
>> >>
>> >> On 2014-11-13 10:19, Wengang Wang wrote:
>> >>> When the last slave of a bonding master is removed, the bond no longer works.
>> >>> When packet_snd is called against a master net_device, it accesses
>> >>> header_ops. If header_ops is no longer valid (say, the module was unloaded),
>> >>> it will then access an invalid memory address.
>> >>> This patch tries to fix this issue by clearing header_ops when the last slave
>> >>> is detached.
>> > Am I correct in presuming that this behavior is limited to ipoib
>> > slaves only? I don't see that this could occur with ethernet slaves, as
>> > eth_header_ops isn't part of a module. This needs to be mentioned in
>> > the commit log.
>> Yes, the problem is found with ipoib slaves.
>> >>> Signed-off-by: Wengang Wang <wen.gang.wang@...cle.com>
>> >>> ---
>> >>> drivers/net/bonding/bond_main.c | 2 ++
>> >>> 1 file changed, 2 insertions(+)
>> >>>
>> >>> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>> >>> index c9ac06c..84a34fc 100644
>> >>> --- a/drivers/net/bonding/bond_main.c
>> >>> +++ b/drivers/net/bonding/bond_main.c
>> >>> @@ -1728,6 +1728,8 @@ static int __bond_release_one(struct net_device *bond_dev,
>> >>>      unblock_netpoll_tx();
>> >>>      synchronize_rcu();
>> >>>      bond->slave_cnt--;
>> >>> +    if (!bond->slave_cnt)
>> >>> +        bond->dev->header_ops = NULL;
>> >>>      if (!bond_has_slaves(bond)) {
>> >>>          call_netdevice_notifiers(NETDEV_CHANGEADDR, bond->dev);
>> > I believe your addition could be moved into the block for the
>> > next if, as "!bond->slave_cnt" is essentially "!bond_has_slaves()".
>>
>> Yes, agreed.
>> I will send a second version soon, with the commit message mentioning ipoib.
>
>I really don't like this patch. It's quite racy.
>
>bond_setup_by_slave() kind of assumes slave_dev->header_ops is always
>present.

Isn't the ipoib header_ops implicitly gated by the presence or
absence of the module itself? An ipoib device can't be enslaved unless
ipoib is loaded, and if ipoib is loaded, the ops are present. And ipoib
can't be removed while there are interfaces enslaved to bonding.

I'm not saying it's not ugly, but I'm not seeing why it won't
work or what the race would be.
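
For reference, the lifetime question can be modeled in userspace. This is only a sketch under stated assumptions, not bonding code: the structs are cut down to the fields discussed here, and fake_ipoib_create(), fake_ipoib_header_ops, model_enslave() and model_release() are hypothetical names. model_enslave() stands in for bond_setup_by_slave() borrowing the slave's ops pointer, and model_release() stands in for the tail of __bond_release_one() with the proposed clearing folded into the "no slaves left" branch.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures; not the real layouts. */
struct header_ops {
	int (*create)(void);
};

struct net_device {
	const struct header_ops *header_ops;
};

struct bonding {
	struct net_device *dev;
	int slave_cnt;
};

/* Hypothetical stand-in for ipoib_hard_header(); the value is arbitrary. */
static int fake_ipoib_create(void)
{
	return 4;
}

/* Hypothetical stand-in for ipoib_header_ops, which lives in the module. */
static const struct header_ops fake_ipoib_header_ops = {
	.create = fake_ipoib_create,
};

/* Models bond_setup_by_slave(): the master borrows the slave's pointer. */
static void model_enslave(struct bonding *bond, struct net_device *slave)
{
	bond->dev->header_ops = slave->header_ops;
	bond->slave_cnt++;
}

/* Models the end of __bond_release_one() with the clearing moved into
 * the branch taken when the last slave has gone. */
static void model_release(struct bonding *bond)
{
	assert(bond->slave_cnt > 0);
	bond->slave_cnt--;
	if (bond->slave_cnt == 0) {
		/* Drop the borrowed pointer so it cannot outlive the module. */
		bond->dev->header_ops = NULL;
	}
}
```

After model_enslave() the master's header_ops points at the slave's ops; after model_release() of the only slave it is NULL again, so nothing on the master refers to module memory any longer. The model says nothing about concurrent readers, which is the part the race concern is about.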
>No rcu protection, no module refcount protection for struct header_ops
>
>Considering ipoib_hard_header() is quite small, you might instead move
>ipoib_hard_header() and ipoib_header_ops in static vmlinux, like we do
>for eth_header_ops.

Won't this require including all of the functions referenced by
the ops? The problem here is that packet_snd will call dev_hard_header,
which wants to call header_ops->create.

Ok, now that I check, there's only one op in ipoib_header_ops,
->create, and it's fairly simple.
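
The call path above can be sketched in userspace as follows. This is a simplified model, not the kernel source: the real dev_hard_header() and ->create take an skb, type and address arguments that are dropped here, and model_create(), model_header_ops, model_dev_hard_header() and MODEL_HLEN are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins; the kernel versions carry more fields and arguments. */
struct net_device;

struct header_ops {
	int (*create)(struct net_device *dev, unsigned int len);
};

struct net_device {
	const struct header_ops *header_ops;
};

/* Made-up header length reported by the model ->create below. */
#define MODEL_HLEN 4

/* Hypothetical ->create standing in for ipoib_hard_header(). */
static int model_create(struct net_device *dev, unsigned int len)
{
	(void)dev;
	(void)len;
	return MODEL_HLEN;
}

static const struct header_ops model_header_ops = {
	.create = model_create,
};

/* Mirrors the shape of dev_hard_header(): no ops, or no ->create, means
 * "no header built" (0); otherwise dispatch through the pointer. */
static int model_dev_hard_header(struct net_device *dev, unsigned int len)
{
	assert(dev != NULL);
	if (!dev->header_ops || !dev->header_ops->create)
		return 0;
	return dev->header_ops->create(dev, len);
}
```

The point of the model is that a NULL header_ops is handled by the check, while a stale non-NULL pointer passes the check and is dereferenced. That is why either clearing the pointer or keeping the ops in memory that cannot be unloaded avoids the bad access.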
There was a similar chicken and egg problem with bonding and
ipoib a while back related to the master device having a dangling
pointer into ipoib somewhere; that might have been the header_ops as
well, so there may be a hack or two that could be removed if the ops
cannot disappear.
-J
---
-Jay Vosburgh, jay.vosburgh@...onical.com