linux-kernel - Re: [PATCH] rhashtable: use likely/unlikely for rhashtable lookup

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-id: <175911730161.1696783.8081419303155421417@noble.neil.brown.name>
Date: Mon, 29 Sep 2025 13:41:41 +1000
From: NeilBrown <neilb@...mail.net>
To: "Menglong Dong" <menglong8.dong@...il.com>
Cc: herbert@...dor.apana.org.au, tgraf@...g.ch, linux-crypto@...r.kernel.org,
 linux-kernel@...r.kernel.org, jiang.biao@...ux.dev
Subject: Re: [PATCH] rhashtable: use likely/unlikely for rhashtable lookup

On Sun, 28 Sep 2025, Menglong Dong wrote:
> Sometimes, the result of the rhashtable_lookup() is expected to be found
> or not found. Therefore, we can use likely() or unlikely() for such cases.
> 
> Following new functions are introduced, which will use likely or unlikely
> during the lookup:
> 
>  rhashtable_lookup_likely
>  rhashtable_lookup_unlikely
>  rhltable_lookup_likely
>  rhltable_lookup_unlikely
> 
> A micro-benchmark is made for these new functions: lookup a existed entry
> repeatedly for 100000000 times, and rhashtable_lookup_likely() gets ~30%
> speedup.

I generally like this patch - it seems well structured and leaves the
code easy to maintain.

I think you have made a good case for rhashtable_lookup_likely() and it
seems sensible to optimise that case.

I'm less sure of rhashtable_lookup_unlikely() - you have provided no
measurements for that.

In general we expect an rhashtable to be between 33% and 75% full.  The
inevitable hash collisions will mean that the number of used slots in
the bucket table will be a little less that this.  But let's assume 50%
of the buckets are in use at any time on average.

If you are using rhashtable_lookup_likely() you expect to find the
target so you expect the bucket to not be empty, so it is reasonable to
tell the compiler that it is "likely" that the pointer isn't NULL.

However if you don't expect to find the target, that DOESN'T mean you
expect the bucket to be empty - it could have another item it in.  All
you can really say is that the probability of an empty bucket matches
the degree of utilisation of the table - so about 50% as discussed
above.

So I don't think it is reasonable to ever tell the compiler that an
bucket being empty is "likely".  You also use "likely()" for deciding
whether or not to subtract the key offset from the address before
returning a pointer.  This is a valid thing to tell the compiler, but we
would need numbers to confirm whether or not it was worth adding to the
API.

If, however, you could provide numbers showing that in an rhashtable
with lots of entries, a lookup for a non-existing key was faster with
the rhashtable_lookup_unlikely() code, then I would find that
interesting and worth pursuing.

In general it would be interesting to know the number for both
successful lookups and unsuccessful lookups across all three proposed
lookup versions.  I don't know how helpful it would be, but I would find
it interesting...

Thanks,
NeilBrown