[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <5515C6C4.4080200@redhat.com>
Date: Fri, 27 Mar 2015 14:08:20 -0700
From: Alexander Duyck <alexander.h.duyck@...hat.com>
To: Cong Wang <cwang@...pensource.com>, Thomas Graf <tgraf@...g.ch>
CC: Cong Wang <xiyou.wangcong@...il.com>,
netdev <netdev@...r.kernel.org>
Subject: Re: [Patch net-next] fib: move fib_rules_cleanup_ops() under rtnl
lock
On 03/27/2015 12:25 PM, Cong Wang wrote:
> On Fri, Mar 27, 2015 at 5:01 AM, Thomas Graf <tgraf@...g.ch> wrote:
>> On 03/26/15 at 04:47pm, Alexander Duyck wrote:
>>> On 03/26/2015 04:05 PM, Cong Wang wrote:
>>>> On Thu, Mar 26, 2015 at 3:32 PM, Cong Wang <cwang@...pensource.com> wrote:
>>>>> On the other hand, the name rules_mod_lock already tells it is
>>>>> just a protection for ops (module) register.
>>>> I even doubt we really need rules_mod_lock, it is per netns,
>>>> which is newly allocated when registering pernet and upper
>>>> layer should guarantee no concurrent unregistering, so we
>>>> probably only need to take rtnl lock.
>>> I'm adding Thomas as he was the original author for the code and might have
>>> a better idea of what needs to be rtnl locked and what doesn't. You should
>>> probably CC him as well on the v2 patch.
>>>
>>> As far as why I am so focused on moving fib4_rules_exit it is because we
>>> don't want to call that delete function until after the table has been
>>> cleared. Otherwise you end up triggering the external_flush and unmerge
>>> code on a full table instead of an empty one. The result is you end up
>>> allocating a bunch of memory before you then turn around and free it. So
>>> even if you retain the rtnl_lock changes it would still be best to move
>>> fib4_rules_exit call to the region after you have freed the FIB tables, but
>>> before you free fib_table_hash.
>> I agree with Alex. Reordering fib4_rules_exit() makes sense. Not only
>> to fix this issue but also for the purpose of correct ordering of
>> allocation and releasing.
> I am surprised you guys only care this one issue, there are more
> for me.
>
> $subject already says this is net-next, nothing needs to worry
> about backport.
Who said anything about backport? I said my concern was the fact that
we were allocating in a shutdown path. We shouldn't be. That was why I
wanted the function moved.
>> It is definitely also safe to move fib_rules_cleanup_ops() out of
>> rules_mod_lock. It is the responsibility of whoever registers the
>> rules that no rule is in use he calls fib_rules_unregister().
>>
>> I don't see why fib_rules should hold rtnl_lock upon delete for the
>> caller. If the caller requires this protection it's up to him to
>> provide it.
> If not rtnl lock, then what prevents the race between netns unregistering
> with rule add/del via netlink?
>
> I know ops is removed from the list at that point, but ops->rules might be
> still being traversed under rtnl lock:
>
> ops = lookup_rules_ops();
> list_del_rcu(&ops->list);
> list_for_each_entry(ops->rules) {
> fib_rules_cleanup_ops(ops);
This locking issue, if present, is separate from the original issue you
reported. I'm going to submit a patch to fix your original issue and
you can chase this locking issue down separately if that is what you
want to do. This way if someone ever decides to backport it they can
actually fix the original issue without pulling in speculative fixes for
the rtnl locking problem since we were already holding the lock for fib4.
- Alex
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists