Message-id: <173244033805.1734440.12627345429438896757@noble.neil.brown.name>
Date: Sun, 24 Nov 2024 20:25:38 +1100
From: "NeilBrown" <neilb@...e.de>
To: "Herbert Xu" <herbert@...dor.apana.org.au>
Cc: "Kent Overstreet" <kent.overstreet@...ux.dev>,
"Thomas Graf" <tgraf@...g.ch>, netdev@...r.kernel.org
Subject: Re: rhashtable issue - -EBUSY
On Sun, 24 Nov 2024, Herbert Xu wrote:
> On Sat, Nov 23, 2024 at 12:21:21PM -0500, Kent Overstreet wrote:
> > I'm seeing an issue where rhashtable inserts sporadically return -EBUSY,
> > thrown from rhashtable_insert_rehash() when there's already a rehash in
> > progress, i.e. rehashes aren't keeping up.
>
> EBUSY should never happen *if* you're using the hash table correctly.
> It is expected to happen if you insert multiple identical entries
> into the same hash table (the correct thing to do in that case is
> to use rhltable which is designed to accommodate multiple entries
> with the same key).
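(For anyone following along who hasn't used it, a rough sketch of the
rhltable pattern; the struct and field names below are made up.  The
point is that the object embeds a struct rhlist_head rather than a
struct rhash_head, and duplicate keys then become expected rather than
an error.)

struct my_obj {
	u32			key;
	struct rhlist_head	node;	/* rhlist_head, not rhash_head */
};

static const struct rhashtable_params my_params = {
	.key_len	= sizeof(u32),
	.key_offset	= offsetof(struct my_obj, key),
	.head_offset	= offsetof(struct my_obj, node),
	.automatic_shrinking = true,
};

	/* after rhltable_init(&hlt, &my_params): */
	err = rhltable_insert(&hlt, &obj->node, my_params);
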
"*if* you're using the hash table correctly" seems to mean that you have
chosen a hash function which is a sufficiently good fit for your
incoming data stream. With the best effort in the world you cannot
provide a perfect guarantee of that - and most of us don't have the
expertise to do better than use the default or something easily
available in the library (and while we can mostly trust the libraries
... https://lwn.net/Articles/687494/ ).
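(Concretely, the common case looks something like this, reusing the
made-up my_obj from the sketch above.  Leaving .hashfn and .obj_hashfn
unset means the table falls back to jhash over the key_len bytes,
seeded with a per-table random value that is regenerated whenever a
new bucket table is allocated for a rehash.)

static const struct rhashtable_params example_params = {
	.key_len	= sizeof(u32),
	.key_offset	= offsetof(struct my_obj, key),
	.head_offset	= offsetof(struct my_obj, node),
	/* .hashfn / .obj_hashfn left NULL: default jhash + per-table seed */
};
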
>
> Now assuming that is not the case and you are using rhashtable
> correctly, this is when an EBUSY will occur during an insert:
>
> 1) The hash table elasticity has been violated, meaning that
> more than 16 entries are in a single chain.
>
> 2) The hash table is below 50% capacity (meaning that statistically
> we do not expect 1) to be true).
>
> 3) An existing rehash is already taking place.
>
> The reason we have the EBUSY mechanism in place is to prevent
> the case where we are being actively attacked by a hostile
> actor, who has somehow compromised our hash function.
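(Restating those three conditions as pseudo-C, just to make sure I'm
reading them correctly; the helper names below are invented, not the
real rhashtable internals.)

	if (chain_len > 16 /* elasticity */ && !table_above_50pct(ht)) {
		if (!rehash_in_progress(ht))
			return schedule_rehash(ht);	/* new random seed, new table */
		return -EBUSY;	/* a rehash is already running and hasn't helped */
	}
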
I think that in many cases this "cure" (i.e. returning -EBUSY) is worse
than the disease.
I believe that inserting into a hash table should *always* succeed with
zero possibility of failure. Nil. None. Nought. (or at least it should
be possible to request this behaviour)
This is why I posted
https://lore.kernel.org/all/152210718434.11435.6551477417902631683.stgit@noble/
six years ago, but you wouldn't accept it.
I understand that there might be situations where failure is tolerable
and avoiding pathological hash patterns is preferred. I imagine that in
parts of the network layer, where retry-on-failure is common, that might be
a perfect fit. But in other situations, such as in filesystem internals,
I think failure is intolerable.
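(The tolerant style is roughly the crude sketch below, assuming
process context where sleeping is allowed; code deep inside a
filesystem transaction often has no such luxury.  Here obj->rhead is
an embedded struct rhash_head and the names are again invented.)

	int err;

	do {
		err = rhashtable_insert_fast(&ht, &obj->rhead, params);
		if (err == -EBUSY)
			msleep(1);	/* back off; let the pending rehash finish */
	} while (err == -EBUSY);
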
Failure should not just be extremely unlikely. It should be
mathematically impossible.
thanks,
NeilBrown
>
> The first line of defence is to change our hash function and
> conduct a rehash. This is what would have occurred if 3) is
> false.
>
> Once a rehash is in place, if we hit 1) + 2) again, then it
> means that our defence was futile and the hostile actor is
> still able to create arbitrarily long hash chains (possibly
> by compromising our RNG since we use that for the rehash),
> and as we do not have any defence mechanism for that, it is
> better to just fail.
>
> Of course this is meant to be impossible to hit in practice.
> Whenever it has occurred in the past, it's always been because
> people were trying to insert identically keyed entries into the
> same table (Use rhltable instead).
>
> Theoretically it is also possible to hit this if your hash
> table was immense (expected worst-case hash chain length is
> log N / log log N), but it has never been hit in the past
> with our default elasticity of 16.
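(For scale: taking natural logs, N = 10^6 entries gives ln N of about
13.8 and ln ln N of about 2.6, so the expected worst chain is only
around 5 or 6; that estimate does not reach 16 until N is somewhere
around e^68, i.e. roughly 10^29 entries.)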
>
> Cheers,
> --
> Email: Herbert Xu <herbert@...dor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
>