Message-ID: <20250912105939.1079014-1-zhanghongru@xiaomi.com>
Date: Fri, 12 Sep 2025 18:59:39 +0800
From: Hongru Zhang <zhanghongru06@...il.com>
To: stephen.smalley.work@...il.com
Cc: linux-kernel@...r.kernel.org,
omosnace@...hat.com,
paul@...l-moore.com,
selinux@...r.kernel.org,
zhanghongru06@...il.com,
zhanghongru@...omi.com
Subject: Re: [PATCH] selinux: Make avc cache slot size configurable during boot
> > > Baseline: 512 nodes, 512 buckets
> > > Comparison: 8192 nodes, 8192 buckets
> > >
> > > Metrics (Average value over 1800 samples):
> > > * Bucket utilization rate (higher -> better, same chain length assumed)
> > > * Baseline: 52.5%
> > > * Comparison: 49.5%
> > > * Max chain length (lower -> better, positive correlation with worst-case latency)
> > > * Baseline: 7.5
> > > * Comparison: 11.4
> > >
> > > Experimental results show that scaling buckets and nodes from 512 to 8192:
> > > 1. The distribution uniformity under the current hash algorithm does not
> > > degrade significantly;
> > > 2. The maximum chain length rises significantly, potentially degrading
> > >    worst-case performance (ignoring other code in the avc_search_node function).
> > >
> > > Details:
> > > url: https://gist.github.com/zhr250/cb7ebca61ff5455098082677d75b1795
> > >
> > > I will modify the hash algorithm in the avc_hash function and collect data
> > > again to see if we can achieve performance improvements.
> >
> > If you look elsewhere in the SELinux code, you'll see that others have
> > been converting other hash tables to using the jhash functions, so you
> > may want to try those here too.
>
> Or you could follow the example of ss/avtab.c which was converted to
> MurmurHash3.
I read the code of jhash and avtab_hash; these algorithms are more complex
and have higher per-call overhead than avc_hash. For cases with a large
number of nodes and a limited number of buckets, the benefits can be
significant. However, for scenarios with a small number of nodes, the
hashing overhead may already outweigh the gains. In my case, 8192 nodes are
sufficient, and I'm not sure whether there are other cases with higher
requirements.
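
For reference, the current avc_hash is just a few shifts and xors, while a
jhash-based variant does dozens of arithmetic operations per call. A minimal
sketch of the comparison (the first function is roughly what mainline has
today; the jhash variant is only illustrative, not from any posted patch):

  #include <linux/jhash.h>

  /* current mainline avc_hash, approximately */
  static inline u32 avc_hash(u32 ssid, u32 tsid, u16 tclass)
  {
  	return (ssid ^ (tsid << 2) ^ (tclass << 4)) & (AVC_CACHE_SLOTS - 1);
  }

  /* hypothetical jhash-based replacement */
  static inline u32 avc_hash_jhash(u32 ssid, u32 tsid, u16 tclass)
  {
  	return jhash_3words(ssid, tsid, tclass, 0) & (AVC_CACHE_SLOTS - 1);
  }
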
Based on the 'Max chain length' data I presented earlier, if we want the
hash operation plus table lookup to yield an overall performance gain, the
extra overhead of the hash operation must not exceed the cost of traversing
4 additional nodes (11.4 - 7.5 ~= 4). If measured by
'Bucket utilization rate' (~50%, i.e. an average of 2 nodes per chain), this
overhead must not exceed the cost of traversing 1 extra node. Otherwise,
even if a more uniform distribution improves lookup performance, the hash
computation overhead could still degrade overall performance.
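
To make the break-even condition explicit (a back-of-the-envelope model, not
measured data): with T_lookup = T_hash + E[probes] * T_node, a new hash
function is only a net win when

  T_hash_new - T_hash_old < (E[probes]_old - E[probes]_new) * T_node

which for the max-chain numbers above gives a budget of roughly
(11.4 - 7.5) * T_node ~= 4 * T_node, and only ~1 * T_node for the
average-chain case.
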
In scenarios requiring a large number of nodes, it seems necessary to
optimize the hash algorithm so that we do not have to over-allocate buckets
to maintain performance, which would otherwise increase memory overhead.
I will first refer to the avtab_hash code to improve the avc_hash function
(a rough sketch of the direction appears after the test plan below), and
then show you the data. I will collect the following information based on
several different configurations using the same test model:
Baseline: 512 nodes, 512 buckets, original avc_hash
A: 512 nodes, 512 buckets
B: 8192 nodes, 8192 buckets
C: 8192 nodes, 8192 buckets, original avc_hash
D: 8192 nodes, 4096 buckets ("large number" of nodes in limited buckets)
1. /sys/fs/selinux/avc/hash_stats
   a. assess the effectiveness of the optimized algorithm based on A, B
      and Baseline. Expected bucket utilization rate: A ~= B > Baseline.
2. total latency of hash operation and table lookup
   a. A vs Baseline: expect no obvious latency increase for A
   b. B vs A: expect B to be close to A
   c. C vs B: expect B to be no worse than C
   d. D vs C: see if we can save some memory with no obvious latency increase
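
As mentioned above, here is a rough sketch of the direction I have in mind,
modeled on avtab_hash in ss/avtab.c (MurmurHash3-style mixing plus the
fmix32 finalizer); this is untested and only illustrates the structure:

  static inline u32 avc_hash(u32 ssid, u32 tsid, u16 tclass)
  {
  	static const u32 c1 = 0xcc9e2d51;
  	static const u32 c2 = 0x1b873593;
  	static const u32 r1 = 15;
  	static const u32 r2 = 13;
  	static const u32 m = 5;
  	static const u32 n = 0xe6546b64;

  	u32 hash = 0;

  /* MurmurHash3-style mixing round, as in avtab_hash */
  #define mix(input) do { \
  		u32 v = (input); \
  		v *= c1; \
  		v = (v << r1) | (v >> (32 - r1)); \
  		v *= c2; \
  		hash ^= v; \
  		hash = (hash << r2) | (hash >> (32 - r2)); \
  		hash = hash * m + n; \
  	} while (0)

  	mix(ssid);
  	mix(tsid);
  	mix(tclass);

  #undef mix

  	/* fmix32 finalizer */
  	hash ^= hash >> 16;
  	hash *= 0x85ebca6b;
  	hash ^= hash >> 13;
  	hash *= 0xc2b2ae35;
  	hash ^= hash >> 16;

  	return hash & (AVC_CACHE_SLOTS - 1);
  }
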