Message-ID: <1401515401.9728.171.camel@LTIRV-MCHAN1.corp.ad.broadcom.com>
Date: Fri, 30 May 2014 22:50:01 -0700
From: Michael Chan <mchan@...adcom.com>
To: Neil Horman <nhorman@...driver.com>
CC: <netdev@...r.kernel.org>, "David S. Miller" <davem@...emloft.net>,
<fcoe-devel@...n-fcoe.org>, Robert Love <robert.w.love@...el.com>,
Vasu Dev <vasu.dev@...el.com>
Subject: Re: [PATCH] cnic: don't take the rtnl_read_lock in cnic_rcv_netevent

On Fri, 2014-05-30 at 22:41 -0400, Neil Horman wrote:
> On Fri, May 30, 2014 at 01:58:33PM -0700, Michael Chan wrote:
> > On Fri, 2014-05-30 at 16:38 -0400, Neil Horman wrote:
> > > On Fri, May 30, 2014 at 01:13:40PM -0700, Michael Chan wrote:
> > > > On Fri, 2014-05-30 at 16:03 -0400, Neil Horman wrote:
> > > > > On Fri, May 30, 2014 at 10:58:11AM -0700, Michael Chan wrote:
> > > > > > On Fri, 2014-05-30 at 11:00 -0400, Neil Horman wrote:
> > > > > > > The cnic driver handles lots of ulp operations in its netdevice event handler.
> > > > > > > To do this, it accesses the ulp_ops array, which is an rcu protected array.
> > > > > > > However, some ulp operations (like bnx2fc_indicate_netevent) try to lock
> > > > > > > mutexes, which might sleep (something that you can't do while holding rcu read
> > > > > > > side locks if you've configured non-preemptive rcu).
> > > > > > >
> > > > > > > Fix this by changing the dereference method. All accesses to the ulp_ops array
> > > > > > > for a cnic dev are modified under the protection of the rtnl lock, and so we can
> > > > > > > safely just use rcu_dereference_rtnl, and remove the rcu_read_lock here.
> > > > > >
> > > > > > Because the bnx2fc function can sleep, we need a more complete fix to
> > > > > > prevent the ulp_ops from going away when the device is unregistered.
> > > > > > synchronize_rcu() won't be able to protect it. I'll post the patch
> > > > > > later today. Thanks.
> > > > > >
> > > > > The device can't be unregistered while we hold rtnl, can it? Since we hold it
> > > > > in this path it seems safe to me, even if we sleep, or am I missing something?
> > > > > Neil
> > > > >
> > > > The netdev cannot be unregistered of course, but I am talking about
> > > > bnx2fc unregistering the cnic device. For example if someone does
> > > > fcoeadm -d or bnx2fc gets unloaded.
> > >
> > > I don't think the latter can happen, as creating an fcoe transport places a hold
> > > on the bnx2fc module (see bnx2fc_create), and the former operation (fcoeadm -d)
> > > will block in bnx2fc_destroy as it requires holding the rtnl_lock, which will
> > > already be held by the netevent notifier, and confirmed by the
> > > rcu_dereference_rtnl in my patch.
> > >
> > > I really think we're safe here
> >
> > Take a look at bnx2fc_mod_exit(). It doesn't look safe to me as it goes
> > through the adapter_list unregistering all cnic devices not under
> > rtnl_lock.
> >
> Right, but you can't get into the module removal code at all until all
> transports are unregistered. I suppose if you have no registered transports and
> remove the bnx2fc module while a netdevice event occurs, there might be a
> problem, but I think that problem is bigger than what we're talking about here,
> as you don't want to remove the module at all while running a netdevice
> notifier, as you'll wind up potentially executing garbage.

As long as we take care of the race conditions, I don't think there is a
bigger problem. During bnx2fc module removal, bnx2fc will unregister all
cnic devices. If a netdev event is being handled at that point, the
unregister call will synchronize with it and wait for all pending netdev
event handling to finish before completing. The alternate patch that I
sent out should take care of this condition. Thanks.
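
To illustrate the ordering described above, here is a rough user-space
sketch using pthreads. It is only an analogy, not the cnic code and not the
alternate patch: fake_dev, fake_rcv_netevent, fake_unregister and the demo
helpers are all hypothetical names, and a mutex plus condition variable
stands in for whatever synchronization the driver actually uses. The event
path pins the ops before calling a callback that may sleep, and the
unregister path hides the ops and then waits for pending handlers to drain.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for a ULP ops table; not the real cnic structure. */
struct ulp_ops {
    void (*indicate_netevent)(void *ctx);
};

struct fake_dev {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    struct ulp_ops *ops;   /* NULL once unregistered */
    int in_flight;         /* handlers currently inside a callback */
};

static struct fake_dev demo_dev = {
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, NULL, 0
};
static sem_t demo_started;
static int demo_done;

/* Event path: pin the ops, drop the lock, then call the sleeping callback. */
static void fake_rcv_netevent(struct fake_dev *d, void *ctx)
{
    struct ulp_ops *ops;

    pthread_mutex_lock(&d->lock);
    ops = d->ops;
    if (ops)
        d->in_flight++;
    pthread_mutex_unlock(&d->lock);
    if (!ops)
        return;                     /* already unregistered: nothing to call */

    ops->indicate_netevent(ctx);    /* may sleep; no lock is held here */

    pthread_mutex_lock(&d->lock);
    if (--d->in_flight == 0)
        pthread_cond_broadcast(&d->cond);
    pthread_mutex_unlock(&d->lock);
}

/* Unregister path: hide the ops, then wait for pending handlers to finish. */
static void fake_unregister(struct fake_dev *d)
{
    pthread_mutex_lock(&d->lock);
    d->ops = NULL;
    while (d->in_flight > 0)
        pthread_cond_wait(&d->cond, &d->lock);
    pthread_mutex_unlock(&d->lock);
}

static void slow_cb(void *ctx)
{
    struct timespec ts = { 0, 50 * 1000 * 1000 };   /* 50 ms */

    sem_post(&demo_started);    /* tell the main thread we are inside */
    nanosleep(&ts, NULL);       /* stands in for a mutex that sleeps */
    *(int *)ctx = 1;
}

static void *demo_thread(void *arg)
{
    (void)arg;
    fake_rcv_netevent(&demo_dev, &demo_done);
    return NULL;
}

/* Returns 1 if the callback had completed by the time unregister returned. */
static int demo_run(void)
{
    struct ulp_ops ops = { slow_cb };
    pthread_t t;
    int rc, seen;

    demo_done = 0;
    demo_dev.ops = &ops;
    sem_init(&demo_started, 0, 0);
    rc = pthread_create(&t, NULL, demo_thread, NULL);
    assert(rc == 0);
    (void)rc;

    sem_wait(&demo_started);       /* handler is now inside the callback */
    fake_unregister(&demo_dev);    /* blocks until the callback returns */
    seen = demo_done;
    pthread_join(t, NULL);
    sem_destroy(&demo_started);
    return seen;
}
```

Calling demo_run() starts one handler thread, unregisters while the
callback is still sleeping, and reports whether the callback had finished
by the time fake_unregister() returned. Once it returns, the ops table can
be torn down without a handler still running inside it.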