Message-ID: <CABWYdi1kiu1g1mAq6DpQWczg78tMzaVFnytNMemZATFHqYSqYw@mail.gmail.com>
Date: Thu, 19 Oct 2023 15:35:01 -0700
From: Ivan Babrou <ivan@...udflare.com>
To: Linux Kernel Network Developers <netdev@...r.kernel.org>
Cc: kernel-team <kernel-team@...udflare.com>, Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
linux-kernel <linux-kernel@...r.kernel.org>
Subject: wait_for_unix_gc can cause CPU overload for well behaved programs
Hello,
We have observed this issue twice (2019 and 2023): a well-behaved
service that doesn't pass any file descriptors around starts to spend
a ton of CPU time in wait_for_unix_gc.
The cause is that the unix send path unconditionally calls
wait_for_unix_gc, which triggers or waits for a global garbage
collection. If any misbehaving program exists on a system, it can
force extra work on well-behaved programs.
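
To make the mechanism concrete, here is a toy user-space model of it. It
is not kernel code: the class, its fields and the constant are
hypothetical stand-ins, the threshold value is assumed to correspond to
the "16k" figure in this mail, and the model assumes a collection frees
nothing because the queued descriptors are still reachable.

```python
# Toy user-space model of the behavior described above; NOT kernel code.
# All names here are hypothetical stand-ins for the kernel's state.

INFLIGHT_TRIGGER_GC = 16_000  # assumed to correspond to the "16k threshold"


class UnixGcModel:
    def __init__(self):
        self.tot_inflight = 0  # system-wide count of in-flight unix sockets
        self.gc_runs = 0       # collections triggered from the send path

    def wait_for_gc(self):
        # Runs for every sender, whether or not it passes descriptors.
        if self.tot_inflight > INFLIGHT_TRIGGER_GC:
            # Stands in for a full scan of all in-flight sockets. Assumption:
            # nothing is freed, since the descriptors are still reachable.
            self.gc_runs += 1

    def sendmsg(self, passes_fd=False):
        self.wait_for_gc()     # unconditional check on the send path
        if passes_fd:
            self.tot_inflight += 1


model = UnixGcModel()

# Offender: queues descriptors that nobody ever receives.
for _ in range(INFLIGHT_TRIGGER_GC + 1):
    model.sendmsg(passes_fd=True)
runs_before = model.gc_runs

# Well-behaved sender: plain data only, yet every send pays for a collection.
for _ in range(1000):
    model.sendmsg()
victim_gc_runs = model.gc_runs - runs_before
```

In this model the offender never pays once it goes idle, while each of
the 1000 plain sends triggers a collection.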
This behavior is not new: 9915672d4127 ("af_unix: limit
unix_tot_inflight") is from 2010.
I managed to come up with a repro for this behavior:
* https://gist.github.com/bobrik/82e5722261920c9f23d9402b88a0bb27
It also includes a flamegraph illustrating the issue. It's all in one
program for convenience, but in reality the offender not picking up
SCM_RIGHTS messages and the suffering program just minding its own
business are separate.
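
For readers who do not want to open the gist, the offender's pattern can
be sketched roughly as follows. This is not the code from the gist: the
function name is made up for illustration, and the sketch only queues a
handful of descriptors, so it does not attempt to reach the threshold or
reproduce the CPU overload.

```python
import array
import socket


def queue_unreceived_fds(count):
    """Queue up to `count` SCM_RIGHTS messages that nobody will receive.

    Hypothetical helper for illustration only.
    Returns (sockets_to_keep_open, number_of_messages_queued).
    """
    tx, rx = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    # The descriptor being passed is itself a unix socket.
    passed, peer = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    tx.setblocking(False)  # stop instead of hanging if the queue fills up
    fds = array.array("i", [passed.fileno()])
    queued = 0
    for _ in range(count):
        try:
            tx.sendmsg([b"x"], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, fds)])
        except BlockingIOError:
            break
        queued += 1
    # rx is never read, so the passed descriptors stay in flight for as
    # long as the caller keeps these sockets open.
    return (tx, rx, passed, peer), queued


held, queued = queue_unreceived_fds(8)
```

The process can then sit idle while holding `held` open; closing those
sockets releases the queued descriptors.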
It is also non-trivial to find the offender when this happens, as it
can be completely idle while wreaking havoc on the rest of the
system.
I don't think it's fair to penalize every unix_stream_sendmsg like
this. The 16k threshold also doesn't feel very flexible; surely
computers are bigger these days and can handle more.