Message-ID: <87mwnb949b.fsf@xmission.com>
Date: Tue, 17 Sep 2013 02:38:56 -0700
From: ebiederm@...ssion.com (Eric W. Biederman)
To: Francesco Ruggeri <fruggeri@...stanetworks.com>
Cc: "David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jiri Pirko <jiri@...nulli.us>,
Alexander Duyck <alexander.h.duyck@...el.com>,
Cong Wang <amwang@...hat.com>, netdev@...r.kernel.org
Subject: Re: [CFT][PATCH] net: Delay default_device_exit_batch until no devices are unregistering
Francesco Ruggeri <fruggeri@...stanetworks.com> writes:
> On Mon, Sep 16, 2013 at 8:49 PM, Eric W. Biederman
> <ebiederm@...ssion.com> wrote:
>> Francesco could you take a look at this. I am about 99% certain this is
>> right but I am starting to fade. So it is entirely possible I missed
>> something.
>
> Same here ...
> The logic looks right to me and I think it should address the original
> issue I ran into.
> Would it make sense to have netdev_unregistering and
> netdev_unregistering_wait be per-namespace, and have
> default_device_exit_batch only wait for the namespaces in net_list? It
> would require some extra loops and locking, but it may help avoid
> unnecessary waits.
It might be worth it, and it certainly would give a good progress
guarantee, as activity in an exiting network namespace is guaranteed to
wind down. Progress guarantees are always good to have.
The downside of a per netns unregistering count is that we will have
extra wake ups which could result in extra contention on the
rtnl_lock.
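
As a rough illustration of the per-namespace idea (a userspace sketch, not
kernel code), the check made by the exiting batch could look only at the
namespaces on its own list. The names struct model_net, unreg_count,
exit_next and model_exit_list_busy are hypothetical stand-ins, and the
caller is assumed to hold whatever lock protects the counters:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct net; not the kernel definition. */
struct model_net {
	int unreg_count;             /* devices of this namespace still unregistering */
	struct model_net *exit_next; /* stand-in for the exit_list linkage */
};

/*
 * Return true if any namespace on the exit list still has devices
 * unregistering.  Namespaces that are not on the list are never
 * consulted, so their activity cannot delay this batch.
 * Assumption: the caller holds the lock protecting unreg_count.
 */
static bool model_exit_list_busy(const struct model_net *net_list)
{
	const struct model_net *net;

	for (net = net_list; net != NULL; net = net->exit_next) {
		assert(net->unreg_count >= 0);
		if (net->unreg_count > 0)
			return true;
	}
	return false;
}
```

The trade-off described above still applies: every decrement may have to
wake the waiter so it can rescan the list, which is where the extra wake
ups would come from.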
The wait queue definitely makes sense to be global as there only ever
can be a single waiter, because the netns_wq is single threaded and
everything we are doing is happening with the net_mutex held.
The count doesn't necessarily need to be atomic as it can be protected
by the rtnl_lock.
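
To make the non-atomic variant concrete, here is a minimal userspace
sketch, not kernel code. A pthread mutex stands in for the rtnl_lock and a
condition variable stands in for the wait queue; every model_* name is
hypothetical. It assumes the decrement side takes the lock itself, since a
plain int is only safe if every access happens with the lock held:

```c
#include <assert.h>
#include <pthread.h>

/* Userspace stand-ins (assumptions): mutex ~ rtnl_lock, condvar ~ wait queue. */
static pthread_mutex_t model_rtnl = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t model_unreg_wait = PTHREAD_COND_INITIALIZER;

/* Plain int, not atomic: only read or written with model_rtnl held. */
static int model_unregistering;

/* Models net_set_todo(): the caller is assumed to already hold model_rtnl. */
static void model_set_todo(void)
{
	model_unregistering++;
}

/*
 * Models the end of unregistering one device.  The lock is taken here
 * because the counter is no longer atomic.
 */
static void model_run_todo_one(void)
{
	pthread_mutex_lock(&model_rtnl);
	assert(model_unregistering > 0);
	if (--model_unregistering == 0)
		pthread_cond_broadcast(&model_unreg_wait);
	pthread_mutex_unlock(&model_rtnl);
}

/*
 * Models the entry of the exit batch: returns with model_rtnl held and
 * the counter at zero.  The predicate is rechecked after every wakeup.
 */
static void model_lock_when_idle(void)
{
	pthread_mutex_lock(&model_rtnl);
	while (model_unregistering != 0)
		pthread_cond_wait(&model_unreg_wait, &model_rtnl);
}
```

Because the waiter rechecks the count with the lock held, no separate
retry label is needed in this model. Whether retaking the lock on the
decrement side is acceptable in the real code is a separate question.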
Like I said the logic is a little rough as I was tired.
I am heading off to New Orleans for Linux Plumbers Conference in a
couple hours so I don't expect I will be particularly responsive
again until next week.
If you could test this patch and perhaps refine it, I think we are
almost at a final point of fixing this.
Just to be clear, my reason for preferring this approach is that it adds
no extra wait points (we already wait for the rtnl_lock), and the logic
is unconditional and explicit rather than hidden in the loopback
device's reference count. That should allow anyone reading the code
to discover and understand this guarantee. Although a big fat comment
in default_device_exit_batch that we are guaranteeing we don't allow
the network namespace to exit while there are still network devices in
it (or something to that effect) is probably appropriate.
Eric
> Francesco
>
>>
>> net/core/dev.c | 12 ++++++++++++
>> 1 files changed, 12 insertions(+), 0 deletions(-)
>>
>> diff --git a/net/core/dev.c b/net/core/dev.c
>> index 5d702fe..c25e6f3 100644
>> --- a/net/core/dev.c
>> +++ b/net/core/dev.c
>> @@ -5002,10 +5002,13 @@ static int dev_new_index(struct net *net)
>>
>> /* Delayed registration/unregisteration */
>> static LIST_HEAD(net_todo_list);
>> +static atomic_t netdev_unregistering = ATOMIC_INIT(0);
>> +static DECLARE_WAIT_QUEUE_HEAD(netdev_unregistering_wait);
>>
>> static void net_set_todo(struct net_device *dev)
>> {
>> list_add_tail(&dev->todo_list, &net_todo_list);
>> + atomic_inc(&netdev_unregistering);
>> }
>>
>> static void rollback_registered_many(struct list_head *head)
>> @@ -5673,6 +5676,9 @@ void netdev_run_todo(void)
>> if (dev->destructor)
>> dev->destructor(dev);
>>
>> + if (atomic_dec_and_test(&netdev_unregistering))
>> + wake_up(&netdev_unregistering_wait);
>> +
>> /* Free network device */
>> kobject_put(&dev->dev.kobj);
>> }
>> @@ -6369,7 +6375,13 @@ static void __net_exit default_device_exit_batch(struct list_head *net_list)
>> struct net *net;
>> LIST_HEAD(dev_kill_list);
>>
>> +retry:
>> + wait_event(netdev_unregistering_wait, (atomic_read(&netdev_unregistering) == 0));
>> rtnl_lock();
>> + if (atomic_read(&netdev_unregistering) != 0) {
>> + __rtnl_unlock();
>> + goto retry;
>> + }
>> list_for_each_entry(net, net_list, exit_list) {
>> for_each_netdev_reverse(net, dev) {
>> if (dev->rtnl_link_ops)
>> --
>> 1.7.5.4
>>