lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Message-ID: <CAD=FV=XvczUqrYVzDYy=uQ7-LMcRgpJj3iiehzfJNAq=UvJXbg@mail.gmail.com> Date: Fri, 1 Dec 2023 10:40:41 -0800 From: Doug Anderson <dianders@...omium.org> To: Eric Dumazet <edumazet@...gle.com> Cc: Judy Hsiao <judyhsiao@...omium.org>, David Ahern <dsahern@...nel.org>, Simon Horman <horms@...nel.org>, Brian Haley <haleyb.dev@...il.com>, "David S. Miller" <davem@...emloft.net>, Jakub Kicinski <kuba@...nel.org>, Joel Granados <joel.granados@...il.com>, Julian Anastasov <ja@....bg>, Leon Romanovsky <leon@...nel.org>, Luis Chamberlain <mcgrof@...nel.org>, Paolo Abeni <pabeni@...hat.com>, linux-kernel@...r.kernel.org, netdev@...r.kernel.org Subject: Re: [PATCH v1] neighbour: Don't let neigh_forced_gc() disable preemption for long Hi, On Fri, Dec 1, 2023 at 9:35 AM Eric Dumazet <edumazet@...gle.com> wrote: > > > > > > @@ -253,9 +253,11 @@ static int neigh_forced_gc(struct neigh_table *tbl) > > > > > { > > > > > int max_clean = atomic_read(&tbl->gc_entries) - > > > > > READ_ONCE(tbl->gc_thresh2); > > > > > + u64 tmax = ktime_get_ns() + NSEC_PER_MSEC; > > > > > > > > It might be nice to make the above timeout based on jiffies. On a > > > > HZ=100 system it's probably OK to keep preemption disabled for 10 ms > > > > but on a HZ=1000 system you'd want 1 ms. ...so maybe you'd want to use > > > > jiffies_to_nsecs(1)? > > > > > > I do not think so. 10ms would be awfully long. > > > > > > We have nsec based time service, why downgrading to jiffies resolution ??? > > > > Well, the whole issue is that we're disabling preemption, right? > > Unless I'm mistaken, on a 1000 HZ system then a task's timeslice is > > 1ms and on a 100 HZ system then a task's timeslice is 10ms. When we > > disable preemption then the problem is that we can keep going past the > > end of our timeslice. This is bad for whatever task the system is > > trying to schedule instead of us since it will be blocked waiting for > > us to re-enable preemption. > > > > So essentially the problem here is really tied to jiffies resolution, > > right? Specifically, if jiffies is 100 Hz then it's actually > > inefficient to timeout every 1 ms--I think it would be better to use > > up our whole timeslice. > > It is not because a kernel is built with HZ=100 that each thread has > to consume cpu times in 10ms slices. > > Process scheduler does not use jiffies at all, but high resolution time service. > > Keep in mind this code is run from soft-interrupt, not a dedicated processus. Fair enough. I guess my mental model is wrong here. Please disregard my suggestion about using something based on how long "jiffies" is then. Using a fixed 1 ms timeout sounds great, then. > > > Can you tell us in which scenario this gc_list can be so big, other > > > than fuzzers ? > > > > The place we hit this wasn't actually with fuzzers but with normal > > usage in our labs. The only case where it was a really big problem was > > when neigh_forced_gc() was scheduled on a "little" CPU (in a > > big.LITTLE system) and that little CPU happened to be running at the > > lowest CPU frequency. Specifically Judy was testing on sc7180-trogdor > > and the lowest CPU Frequency of the "little" CPUs was 300 MHz. Since > > the littles are less powerful than the bigs, this is roughly the > > equivalent processing power of a big core running at 120 MHz. > > > > FWIW, we are apparently no longer seeing the bad latency after > > <https://crrev.com/c/4914309>, which does this: > > > > # Increase kernel neighbor table size. > > echo 1024 > /proc/sys/net/ipv4/neigh/default/gc_thresh1 > > echo 4096 > /proc/sys/net/ipv4/neigh/default/gc_thresh2 > > echo 8192 > /proc/sys/net/ipv4/neigh/default/gc_thresh3 > > echo 1024 > /proc/sys/net/ipv6/neigh/default/gc_thresh1 > > echo 4096 > /proc/sys/net/ipv6/neigh/default/gc_thresh2 > > echo 8192 > /proc/sys/net/ipv6/neigh/default/gc_thresh3 > > > > However, I still believe that we should land something like Judy's > > patch because, no matter what kernel tunings we have, the kernel > > shouldn't be disabling preemption for so long. > > Sure, and I suggested a refinement, because as I said jiffies can > stick to a value. > > Not sure why a refinement given by a network maintainer is not an option ? > > I must be missing something. Your refinement was good. I was merely trying to answer the question you asked about how we got into it as completely as possible. Sorry if I caused confusion.
Powered by blists - more mailing lists