[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <48EA2C71.6070509@bull.net>
Date: Mon, 06 Oct 2008 17:19:13 +0200
From: Benjamin Thery <benjamin.thery@...l.net>
To: Herbert Xu <herbert@...dor.apana.org.au>
Cc: davem@...emloft.net, netdev@...r.kernel.org, dlezcano@...ibm.com
Subject: Re: [PATCH] net: deadlock during net device unregistration
Herbert Xu wrote:
> Benjamin Thery <benjamin.thery@...l.net> wrote:
>> 1. Unregister a device, the following routines are called:
>>
>> -> unregister_netdev
>> -> rtnl_lock
>> -> unregister_netdevice
>> -> rtnl_unlock
>> -> netdev_run_todo
>> -> netdev_wait_allrefs
>
> OK, this explains lots of dead-locks that people have been seeing.
>
> But I think we can go a step further:
>
> net: Fix netdev_run_todo dead-lock
>
> Benjamin Thery tracked down a bug that explains many instances
> of the error
>
> unregister_netdevice: waiting for %s to become free. Usage count = %d
>
> It turns out that netdev_run_todo can dead-lock with itself if
> a second instance of it is run in a thread that will then free
> a reference to the device waited on by the first instance.
>
> The problem is really quite silly. We were trying to create
> parallelism where none was required. As netdev_run_todo always
> follows a RTNL section, and that todo tasks can only be added
> with the RTNL held, by definition you should only need to wait
> for the very ones that you've added and be done with it.
>
> There is no need for a second mutex or spinlock.
>
> This is exactly what the following patch does.
Herbert, thank you for having looked at the issue too.
When I understood how the dead lock happened, I considered playing
with the locks in net_set_todo()/netdev_run_todo(), but as some comments
in the code in this area sounds a bit too cryptic for my brain, I didn't
dare to change them myself. :)
I guess you know a lot better than me the history behind this piece of
code and why it was done this way.
I tested your patch on my testbed during all the afternoon.
It fixes the dead lock I can easily reproduce here with net namespaces
and I didn't produce any regressions on my setup.
Benjamin
>
> Signed-off-by: Herbert Xu <herbert@...dor.apana.org.au>
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index e8eb2b4..021f531 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -3808,14 +3808,11 @@ static int dev_new_index(struct net *net)
> }
>
> /* Delayed registration/unregisteration */
> -static DEFINE_SPINLOCK(net_todo_list_lock);
> static LIST_HEAD(net_todo_list);
>
> static void net_set_todo(struct net_device *dev)
> {
> - spin_lock(&net_todo_list_lock);
> list_add_tail(&dev->todo_list, &net_todo_list);
> - spin_unlock(&net_todo_list_lock);
> }
>
> static void rollback_registered(struct net_device *dev)
> @@ -4142,33 +4139,24 @@ static void netdev_wait_allrefs(struct net_device *dev)
> * free_netdev(y1);
> * free_netdev(y2);
> *
> - * We are invoked by rtnl_unlock() after it drops the semaphore.
> + * We are invoked by rtnl_unlock().
> * This allows us to deal with problems:
> * 1) We can delete sysfs objects which invoke hotplug
> * without deadlocking with linkwatch via keventd.
> * 2) Since we run with the RTNL semaphore not held, we can sleep
> * safely in order to wait for the netdev refcnt to drop to zero.
> + *
> + * We must not return until all unregister events added during
> + * the interval the lock was held have been completed.
> */
> -static DEFINE_MUTEX(net_todo_run_mutex);
> void netdev_run_todo(void)
> {
> struct list_head list;
>
> - /* Need to guard against multiple cpu's getting out of order. */
> - mutex_lock(&net_todo_run_mutex);
> -
> - /* Not safe to do outside the semaphore. We must not return
> - * until all unregister events invoked by the local processor
> - * have been completed (either by this todo run, or one on
> - * another cpu).
> - */
> - if (list_empty(&net_todo_list))
> - goto out;
> -
> /* Snapshot list, allow later requests */
> - spin_lock(&net_todo_list_lock);
> list_replace_init(&net_todo_list, &list);
> - spin_unlock(&net_todo_list_lock);
> +
> + __rtnl_unlock();
>
> while (!list_empty(&list)) {
> struct net_device *dev
> @@ -4200,9 +4188,6 @@ void netdev_run_todo(void)
> /* Free network device */
> kobject_put(&dev->dev.kobj);
> }
> -
> -out:
> - mutex_unlock(&net_todo_run_mutex);
> }
>
> static struct net_device_stats *internal_stats(struct net_device *dev)
> diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
> index 71edb8b..d6381c2 100644
> --- a/net/core/rtnetlink.c
> +++ b/net/core/rtnetlink.c
> @@ -73,7 +73,7 @@ void __rtnl_unlock(void)
>
> void rtnl_unlock(void)
> {
> - mutex_unlock(&rtnl_mutex);
> + /* This fellow will unlock it for us. */
> netdev_run_todo();
> }
>
> Cheers,
--
B e n j a m i n T h e r y - BULL/DT/Open Software R&D
http://www.bull.com
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists