[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <oth2obo.8109e7bd386ff72c34c0bda4b6945edf@obelix.schillstrom.com>
Date: Tue, 14 Jun 2011 00:09:26 +0200 (CEST)
From: "Hans Schillstrom" <hans@...illstrom.com>
To: "Julian Anastasov" <ja@....bg>
Cc: "Hans Schillstrom" <hans.schillstrom@...csson.com>,
horms@...ge.net.au, wensong@...ux-vs.org,
lvs-devel@...r.kernel.org, netdev@...r.kernel.org,
netfilter-devel@...r.kernel.org
Subject: Re: Re: [RFC PATCH 1/1] IPVS netns shutdown/startup dead-lock
>---- Original Message ----
>From: Julian Anastasov <ja@....bg>
>To: "Hans Schillstrom" <hans.schillstrom@...csson.com>
>Cc: horms@...ge.net.au, wensong@...ux-vs.org, lvs-devel@...r.kernel.org, netdev@...r.kernel.org, netfilter-devel@...r.kernel.org, hans@...illstrom.com
>Sent: Mon, Jun 13, 2011, 23:23 PM
>Subject: Re: [RFC PATCH 1/1] IPVS netns shutdown/startup dead-lock
>
>Hello,
>
>On Mon, 13 Jun 2011, Hans Schillstrom wrote:
>
>> ip_vs_mutext is used by both netns shutdown code and startup
>> and both implicit uses sk_lock-AF_INET mutex.
>>
>> cleanup CPU-1 startup CPU-2
>> ip_vs_dst_event() ip_vs_genl_set_cmd()
>> sk_lock-AF_INET __ip_vs_mutex
>> sk_lock-AF_INET
>> __ip_vs_mutex
>> * DEAD LOCK *
>
> So, sk_lock-AF_INET is locked before calling
>ip_vs_dst_event ? Do you have a backtrace for this case?
Yes plenty this one is with lockdep
Chain exists of:
rtnl_mutex --> __ip_vs_mutex --> sk_lock-AF_INET
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(sk_lock-AF_INET);
lock(__ip_vs_mutex);
lock(sk_lock-AF_INET);
lock(rtnl_mutex);
*** DEADLOCK ***
3 locks held by ipvsadm/993:
#0: (genl_mutex){+.+.+.}, at: [<ffffffff812edc52>] genl_lock+0x17/0x19
#1: (__ip_vs_mutex){+.+.+.}, at: [<ffffffff81307dcb>] ip_vs_genl_set_cmd+0xe1/0x3a3
#2: (sk_lock-AF_INET){+.+.+.}, at: [<ffffffff8130ffc1>] start_sync_thread+0x3ec/0x5ff
>I assume your patch is tested to fix the problem ?
Yes.
>
>> This can be solved by have the ip_vs_mutex per netns
>> or avid locking when starting/stoping sync-threads.
>> i.e. just add a starting/stoping flag.
>
> sk_release_kernel is called in thread
>context, so ip_vs_mutex is not involved there. We
>have a problem only with start_sync_thread, right?
Yes the socket operations
>
>> ip_vs_mutex per name-space seems to be a more future proof solution.
>
> Global mutex protects some global lists such as
>virtual services. If your patch works, better way to fix this problem
>is to use some new mutex. May be we can move the IPVS_CMD_NEW_DAEMON,
>IPVS_CMD_DEL_DAEMON and IP_VS_SO_GET_DAEMON code before the
>__ip_vs_mutex locking. This mutex should be used for start_sync_thread,
>stop_sync_thread, ip_vs_genl_dump_daemons and IP_VS_SO_GET_DAEMON.
>For example, ip_vs_sync_mutex.
I think we should avoid global mutexes as a rule of tumb,
because it's realy hard to keep track of all possible cases
that can occur when multiple netns is alive and/or goes up and down.
There might be more suprises while a netns exits (in terms of locks)...
my gut feeling is, avoid global locks as long as possible.
>
> Note that __ip_vs_sync_cleanup is missing a
>__ip_vs_mutex lock. We have to use the new mutex there.
OK
>
>> Which one should be used ?
>
> For now __ip_vs_mutex should be global ...
I do agree, but in the long term I vote for mutex per netns.
Thanks a lot Julian
Regards
Hans
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists