
Message-ID: <c4d90b06-a941-b275-38a9-2a891485ca4d@ssi.bg>
Date: Fri, 1 Dec 2023 20:21:34 +0200 (EET)
From: Julian Anastasov <ja@....bg>
To: Doug Anderson <dianders@...omium.org>
cc: Eric Dumazet <edumazet@...gle.com>, Judy Hsiao <judyhsiao@...omium.org>,
        David Ahern <dsahern@...nel.org>, Simon Horman <horms@...nel.org>,
        Brian Haley <haleyb.dev@...il.com>,
        "David S. Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        Joel Granados <joel.granados@...il.com>,
        Leon Romanovsky <leon@...nel.org>,
        Luis Chamberlain <mcgrof@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
        linux-kernel@...r.kernel.org, netdev@...r.kernel.org
Subject: Re: [PATCH v1] neighbour: Don't let neigh_forced_gc() disable
 preemption for long


	Hello,

On Fri, 1 Dec 2023, Doug Anderson wrote:

> The place we hit this wasn't actually with fuzzers but with normal
> usage in our labs. The only case where it was a really big problem was
> when neigh_forced_gc() was scheduled on a "little" CPU (in a
> big.LITTLE system) and that little CPU happened to be running at the
> lowest CPU frequency. Specifically Judy was testing on sc7180-trogdor
> and the lowest CPU Frequency of the "little" CPUs was 300 MHz. Since
> the littles are less powerful than the bigs, this is roughly the
> equivalent processing power of a big core running at 120 MHz.

	If we are talking about 32-bit systems with a high HZ value, I now
see a small problem in neigh_alloc(): we may start neigh_forced_gc()
late, at gc_thresh3, rather than early, at gc_thresh2, as expected. This
can happen after a long idle period, when last_flush becomes invalid and
'now' is 'time_before' last_flush:

-	time_after(now, tbl->last_flush + 5 * HZ))) {
+	!time_in_range_open(now, tbl->last_flush, tbl->last_flush + 5 * HZ))) {

	With a big gap between gc_thresh2 and gc_thresh3 we
may work on a large gc_list if we react at gc_thresh3 instead of
gc_thresh2. But such storms cannot happen more than once per jiffies wrap.
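
	To illustrate the difference between the two checks, here is a
minimal user-space sketch. It is not kernel code: the *32 helpers are
hypothetical 32-bit stand-ins that mirror the kernel's time_after(),
time_before(), time_after_eq() and time_in_range_open() macros, HZ=1000
is an assumed value, and flush_due_old()/flush_due_new() are names made
up for the old and proposed conditions.

```c
#include <stdbool.h>
#include <stdint.h>

#define HZ 1000u /* assumed tick rate, for illustration only */

/* Hypothetical 32-bit stand-ins for the kernel's jiffies comparisons.
 * The unsigned subtraction wraps modulo 2^32; the cast to int32_t
 * assumes the usual two's complement conversion. */
static bool time_after32(uint32_t a, uint32_t b)
{
	return (int32_t)(b - a) < 0;
}

static bool time_before32(uint32_t a, uint32_t b)
{
	return time_after32(b, a);
}

static bool time_after_eq32(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) >= 0;
}

/* True when b <= a < c, in wrap-safe terms. */
static bool time_in_range_open32(uint32_t a, uint32_t b, uint32_t c)
{
	return time_after_eq32(a, b) && time_before32(a, c);
}

/* Old condition: only true while 'now' still compares as being
 * after last_flush + 5 * HZ. */
static bool flush_due_old(uint32_t now, uint32_t last_flush)
{
	return time_after32(now, last_flush + 5 * HZ);
}

/* Proposed condition: true whenever 'now' is outside the window
 * [last_flush, last_flush + 5 * HZ), including a stale last_flush. */
static bool flush_due_new(uint32_t now, uint32_t last_flush)
{
	return !time_in_range_open32(now, last_flush, last_flush + 5 * HZ);
}
```

	With last_flush = 0 and now = 0x80000000 + 10 * HZ (idle for a
bit more than half of the 32-bit range, about 24.8 days at HZ=1000),
flush_due_old() is false because 'now' compares as being before
last_flush, while flush_due_new() is true. For a recent or a moderately
old last_flush both functions agree.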

> FWIW, we are apparently no longer seeing the bad latency after
> <https://crrev.com/c/4914309>, which does this:
> 
> # Increase kernel neighbor table size.
> echo 1024 > /proc/sys/net/ipv4/neigh/default/gc_thresh1
> echo 4096 > /proc/sys/net/ipv4/neigh/default/gc_thresh2

	We expect to keep the number of entries not much above 4096,
and we should see a small gc_list.

> echo 8192 > /proc/sys/net/ipv4/neigh/default/gc_thresh3

	With an invalid last_flush we will free 8192-4096 => 4096 entries
under the BH lock.
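
	The arithmetic can be written as a small sketch. It assumes that
a forced GC run tries to bring the table back down to gc_thresh2, which
is what the subtraction above implies; forced_gc_batch() is a
hypothetical helper, not a kernel function.

```c
#include <stdint.h>

/* Hypothetical helper: number of entries a forced GC run would try to
 * free when it starts with 'entries' in the table, assuming the target
 * is to get back down to gc_thresh2. */
static uint32_t forced_gc_batch(uint32_t entries, uint32_t gc_thresh2)
{
	return entries > gc_thresh2 ? entries - gc_thresh2 : 0;
}
```

	With the values quoted above, forced_gc_batch(8192, 4096) gives
4096 when the run only starts at gc_thresh3, while a run that starts
just above gc_thresh2, e.g. forced_gc_batch(4100, 4096), only has 4
entries to free.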

Regards

--
Julian Anastasov <ja@....bg>