netdev - Re: rhashtable issue

Message-ID: <pq73vur4sgec4hxjugk5abgpqiftpkkdyvmtcq246jv2vuseok@qn2x6xzfqii7>
Date: Sun, 24 Nov 2024 18:58:12 -0500
From: Kent Overstreet <kent.overstreet@...ux.dev>
To: Herbert Xu <herbert@...dor.apana.org.au>
Cc: NeilBrown <neilb@...e.de>, Thomas Graf <tgraf@...g.ch>, 
	netdev@...r.kernel.org
Subject: Re: rhashtable issue - -EBUSY

On Mon, Nov 25, 2024 at 07:17:25AM +0800, Herbert Xu wrote:
> On Sun, Nov 24, 2024 at 05:35:56PM -0500, Kent Overstreet wrote:
> >
> > That's what I've been describing, but you keep insisting that this must
> > be misuse, even though I'm telling you I've got the error code that
> > shows what is going on.
> 
> Well, please do as I suggested and dump the chain with over 16
> entries when this happens.  If you can prove to me that you've
> got 16 entries with non-identical keys that hashed to the same
> bucket then I will fix this.  Please also dump the table size
> and the total number of entries currently hashed.
> 
> As I said, every single report in the past has turned out to be
> because people were adding multiple entries with identical keys
> to the same hash table, which will obviously breach the limit of
> 16.
> 
> But I think there is one thing that I will do, the rehash check
> is a bit too loose.  It should only fail if the outstanding rehash
> was caused by insertion failure, and not if it was a growth or
> shrink operation.

Hang on, I see what's going on :) It's not duplicate keys, we're doing
something exceptionally weird here.

We're not hashing the full key, because we require that inodes in
different subvolumes hash to the same bucket - we need to be able to
iterate over cached inodes with the same inode number in all subvolumes
so that fsck can check if deleted inodes are still open, and that
requires iterating over all the subvolumes to look for descendants.

(Yes, it's a bit gross, but I've been trying to avoid a two-level lookup
structure.)
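
To make the failure mode concrete, here is a minimal userspace sketch, not the bcachefs or rhashtable code. The struct, the toy mixer, the table size and all function names are assumptions for illustration; only the limit of 16 comes from the discussion above. It shows that when the hash covers only the inode number, every subvolume holding that inode number lands in the same bucket, so the chain grows with the number of subvolumes even though no two keys are identical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NBUCKETS    64u	/* toy table size, an assumption for this sketch */
#define CHAIN_LIMIT 16u	/* the chain-length limit mentioned upthread */

/* Hypothetical key layout: the full key is (subvol, inum). */
struct inode_key {
	uint32_t subvol;
	uint64_t inum;
};

/* Toy 64-bit mixer (splitmix64 finalizer); stands in for the real hash. */
static uint64_t mix64(uint64_t x)
{
	x ^= x >> 30; x *= 0xbf58476d1ce4e5b9ULL;
	x ^= x >> 27; x *= 0x94d049bb133111ebULL;
	x ^= x >> 31;
	return x;
}

/* Partial-key hash: subvol is deliberately ignored. */
static unsigned bucket_partial(const struct inode_key *k)
{
	return (unsigned)(mix64(k->inum) % NBUCKETS);
}

/* Full-key hash: both fields contribute. */
static unsigned bucket_full(const struct inode_key *k)
{
	return (unsigned)(mix64(k->inum ^ mix64(k->subvol)) % NBUCKETS);
}

/*
 * Insert the same inum once per subvolume (subvol ids 1..nr_subvols) and
 * return the length of the longest resulting bucket chain.
 */
static unsigned longest_chain(unsigned nr_subvols, uint64_t inum,
			      unsigned (*hash)(const struct inode_key *))
{
	unsigned counts[NBUCKETS] = { 0 };
	unsigned max = 0;

	for (uint32_t s = 1; s <= nr_subvols; s++) {
		struct inode_key k = { .subvol = s, .inum = inum };
		unsigned b = hash(&k);

		assert(b < NBUCKETS);
		if (++counts[b] > max)
			max = counts[b];
	}
	return max;
}
```

With bucket_partial, longest_chain(20, inum, ...) is 20 for any inum, which is already past CHAIN_LIMIT; with bucket_full the same 20 keys are free to spread across buckets.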

But - your rhltable gives me an idea for a better solution, which would
be to use two different hash tables for this (one indexed by
subvol:inum, for primary lookups, and an rhltable for indexing by inum
for fsck).
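
A rough userspace sketch of that two-index idea follows. It does not use the kernel rhashtable/rhltable API; the struct and function names are hypothetical, and the second array of plain chains only stands in for an rhltable, which as I understand it keeps objects with the same key on a per-key list instead of counting them against the bucket chain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NBUCKETS 64u	/* toy table size, an assumption for this sketch */

/* Hypothetical cached inode, linked into both indexes. */
struct cached_inode {
	uint32_t subvol;
	uint64_t inum;
	struct cached_inode *next_primary;	/* chain in by_subvol_inum */
	struct cached_inode *next_inum;		/* chain in by_inum */
};

struct inode_cache {
	struct cached_inode *by_subvol_inum[NBUCKETS];	/* primary lookups */
	struct cached_inode *by_inum[NBUCKETS];		/* fsck-style walks */
};

/* Toy 64-bit mixer (splitmix64 finalizer); stands in for the real hash. */
static uint64_t mix64(uint64_t x)
{
	x ^= x >> 30; x *= 0xbf58476d1ce4e5b9ULL;
	x ^= x >> 27; x *= 0x94d049bb133111ebULL;
	x ^= x >> 31;
	return x;
}

static unsigned hash_full(uint32_t subvol, uint64_t inum)
{
	return (unsigned)(mix64(inum ^ mix64(subvol)) % NBUCKETS);
}

static unsigned hash_inum(uint64_t inum)
{
	return (unsigned)(mix64(inum) % NBUCKETS);
}

/* Link the inode into both indexes. */
static void cache_insert(struct inode_cache *c, struct cached_inode *i)
{
	unsigned p, s;

	assert(c && i);
	p = hash_full(i->subvol, i->inum);
	s = hash_inum(i->inum);

	i->next_primary = c->by_subvol_inum[p];
	c->by_subvol_inum[p] = i;
	i->next_inum = c->by_inum[s];
	c->by_inum[s] = i;
}

/* Primary lookup: the full key is hashed and compared. */
static struct cached_inode *cache_lookup(const struct inode_cache *c,
					 uint32_t subvol, uint64_t inum)
{
	struct cached_inode *i;

	for (i = c->by_subvol_inum[hash_full(subvol, inum)]; i; i = i->next_primary)
		if (i->subvol == subvol && i->inum == inum)
			return i;
	return NULL;
}

/* fsck-style walk: count cached inodes with this inum in any subvolume. */
static unsigned cache_count_inum(const struct inode_cache *c, uint64_t inum)
{
	struct cached_inode *i;
	unsigned n = 0;

	for (i = c->by_inum[hash_inum(inum)]; i; i = i->next_inum)
		if (i->inum == inum)
			n++;
	return n;
}
```

The point of the split is that the primary index hashes the whole key, so its chains stay short, while the by-inum index is the only place where same-inum entries are expected to pile up.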

Sorry for claiming this was your bug - I do agree with Neil that the
rhashtable code could handle this situation better though, so as to avoid
crazy bughunts.