Message-ID: <55C3743F.1010900@iogearbox.net>
Date: Thu, 06 Aug 2015 16:50:39 +0200
From: Daniel Borkmann <daniel@...earbox.net>
To: Herbert Xu <herbert@...dor.apana.org.au>
CC: Linus Torvalds <torvalds@...ux-foundation.org>,
Jiri Pirko <jiri@...nulli.us>,
Cong Wang <cwang@...pensource.com>,
David Miller <davem@...emloft.net>,
Nicolas Dichtel <nicolas.dichtel@...nd.com>,
Thomas Graf <tgraf@...g.ch>, Scott Feldman <sfeldma@...il.com>,
Network Development <netdev@...r.kernel.org>
Subject: Re: rtnl_mutex deadlock?
On 08/06/2015 02:30 AM, Herbert Xu wrote:
> On Wed, Aug 05, 2015 at 08:59:07PM +0200, Daniel Borkmann wrote:
>>
>> Here's a theory and patch below. Herbert, Thomas, does this make any
>> sense to you resp. sound plausible? ;)
>
> It's certainly possible. Whether it's plausible I'm not so sure.
> The netlink hashtable is unlimited in size. So it should always
> be expanding, not rehashing. The bug you found should only affect
> rehashing.
>
>> I'm not quite sure what's best to return from here, i.e. whether we
>> propagate -ENOMEM or instead retry over and over again hoping that the
>> rehashing completed (and no new rehashing started in the mean time) ...
>
> Please use something other than ENOMEM as it is already heavily
> used in this context. Perhaps EOVERFLOW?

Okay, I'll do that.

> We should probably add a WARN_ON_ONCE in rhashtable_insert_rehash
> since two concurrent rehashings indicates something is going
> seriously wrong.

So, if I didn't miss anything, it looks like the following could have
happened: the worker thread, rht_deferred_worker(), could itself trigger
the first rehashing, e.g. after shrinking or expanding (or even when
neither of the two happens).

Then, in __rhashtable_insert_fast(), if I'm really unlucky and exceed the
ht->elasticity limit of 16, I would end up in rhashtable_insert_rehash(),
only to find that a rehash is already ongoing, and thus get -EBUSY via
__netlink_insert().

Perhaps that is what happened? It seems rare, but then the problem has
also only been seen rarely so far ...