linux-kernel - Re: [PATCH] rhashtable: use likely/unlikely for rhashtable lookup

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <CADxym3bHM3ykp6nSCNT_8fCVAGk04c1qgoFeAQbrCURSHk3NDg@mail.gmail.com>
Date: Mon, 29 Sep 2025 16:44:50 +0800
From: Menglong Dong <menglong8.dong@...il.com>
To: NeilBrown <neil@...wn.name>
Cc: herbert@...dor.apana.org.au, tgraf@...g.ch, linux-crypto@...r.kernel.org, 
	linux-kernel@...r.kernel.org, jiang.biao@...ux.dev
Subject: Re: [PATCH] rhashtable: use likely/unlikely for rhashtable lookup

On Mon, Sep 29, 2025 at 11:41 AM NeilBrown <neilb@...mail.net> wrote:
>
> On Sun, 28 Sep 2025, Menglong Dong wrote:
> > Sometimes, the result of the rhashtable_lookup() is expected to be found
> > or not found. Therefore, we can use likely() or unlikely() for such cases.
> >
> > Following new functions are introduced, which will use likely or unlikely
> > during the lookup:
> >
> >  rhashtable_lookup_likely
> >  rhashtable_lookup_unlikely
> >  rhltable_lookup_likely
> >  rhltable_lookup_unlikely
> >
> > A micro-benchmark is made for these new functions: lookup a existed entry
> > repeatedly for 100000000 times, and rhashtable_lookup_likely() gets ~30%
> > speedup.
>
> I generally like this patch - it seems well structured and leaves the
> code easy to maintain.
>
> I think you have made a good case for rhashtable_lookup_likely() and it
> seems sensible to optimise that case.
>
> I'm less sure of rhashtable_lookup_unlikely() - you have provided no
> measurements for that.
>
> In general we expect an rhashtable to be between 33% and 75% full.  The
> inevitable hash collisions will mean that the number of used slots in
> the bucket table will be a little less that this.  But let's assume 50%
> of the buckets are in use at any time on average.
>
> If you are using rhashtable_lookup_likely() you expect to find the
> target so you expect the bucket to not be empty, so it is reasonable to
> tell the compiler that it is "likely" that the pointer isn't NULL.
>
> However if you don't expect to find the target, that DOESN'T mean you
> expect the bucket to be empty - it could have another item it in.  All
> you can really say is that the probability of an empty bucket matches
> the degree of utilisation of the table - so about 50% as discussed
> above.
>
> So I don't think it is reasonable to ever tell the compiler that an
> bucket being empty is "likely".  You also use "likely()" for deciding
> whether or not to subtract the key offset from the address before
> returning a pointer.  This is a valid thing to tell the compiler, but we
> would need numbers to confirm whether or not it was worth adding to the
> API.
>
> If, however, you could provide numbers showing that in an rhashtable
> with lots of entries, a lookup for a non-existing key was faster with
> the rhashtable_lookup_unlikely() code, then I would find that
> interesting and worth pursuing.

Ah, you are right, I wasn't aware of this part. I think it makes no sense
to use unlikely() if the possibility of hitting the budget is ~50%. The only
thing that may matter is the using of unlikely() in rhashtable_lookup()
before returning the pointer.

I'll do a bench testing on that part, and I'll remove the unlikely version
directly in the next version if it doesn't help.

Thanks a lot!
Menglong Dong

>
> In general it would be interesting to know the number for both
> successful lookups and unsuccessful lookups across all three proposed
> lookup versions.  I don't know how helpful it would be, but I would find
> it interesting...
>
> Thanks,
> NeilBrown