Message-ID: <ZC+Tt2WqyFmNEm/w@casper.infradead.org>
Date: Fri, 7 Apr 2023 04:53:27 +0100
From: Matthew Wilcox <willy@...radead.org>
To: Hillf Danton <hdanton@...a.com>
Cc: Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
linux-kernel@...r.kernel.org,
"Eric W. Biederman" <ebiederm@...ssion.com>,
Ingo Molnar <mingo@...hat.com>, Mel Gorman <mgorman@...e.de>,
Oleg Nesterov <oleg@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Thomas Gleixner <tglx@...utronix.de>,
Vincent Guittot <vincent.guittot@...aro.org>,
linux-mm@...ck.org, Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: [PATCH v4] signal: Let tasks cache one sigqueue struct.
On Fri, Apr 07, 2023 at 08:03:06AM +0800, Hillf Danton wrote:
> On Thu, 6 Apr 2023 22:47:21 +0200 Sebastian Andrzej Siewior <bigeasy@...utronix.de> wrote:
> > The sigqueue caching originated in the PREEMPT_RT tree. A few of the
> > applications, that were ported to Linux, were ported from OS-9. Sending
> > notifications from one task to another via a signal was a common
> > communication model there and so the applications are heavy signal
> > users. Removing the allocation reduces the critical path, avoids locks
> > and so lowers the maximal latency of the task while sending a signal.
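> The scheme described above can be modelled in userspace roughly as
> follows (a sketch only; the struct layout and function names here are
> illustrative, not the kernel's actual ones): each task keeps a single
> cached sigqueue slot, allocation tries that slot before falling back to
> the allocator, and free refills the slot when it is empty.
>
> ```c
> #include <stdlib.h>
>
> struct sigqueue { int signo; };
>
> struct task {
> 	/* one-entry cache, NULL when empty (hypothetical field name) */
> 	struct sigqueue *sigqueue_cache;
> };
>
> static struct sigqueue *sigqueue_alloc(struct task *t)
> {
> 	struct sigqueue *q = t->sigqueue_cache;
>
> 	if (q) {			/* fast path: reuse the cached entry */
> 		t->sigqueue_cache = NULL;
> 		return q;
> 	}
> 	/* slow path: go to the general allocator (kmem_cache_alloc in
> 	 * the kernel; plain malloc in this userspace model) */
> 	return malloc(sizeof(*q));
> }
>
> static void sigqueue_free(struct task *t, struct sigqueue *q)
> {
> 	if (!t->sigqueue_cache) {	/* stash for the next signal */
> 		t->sigqueue_cache = q;
> 		return;
> 	}
> 	free(q);			/* slot occupied: really free */
> }
> ```
>
> After one alloc/free round trip, the next allocation is served from the
> cached slot without touching the allocator at all.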
It might lower the _average_ latency, but it certainly doesn't lower
the _maximum_ latency, because you have to assume worst case scenario
for maximum latency. Which is that there's no sigqueue cached, so you
have to go into the slab allocator. And again, worst case scenario is
that you have to go into the page allocator from there, and further that
you have to run reclaim, and ...
What I find odd about the numbers that you quote:
> The numbers of system boot followed by an allmod kernel build:
> Out of 333216 allocations, 194876 (~58%) were served from the cache.
> From all free invocations, 4212 were in a path where caching is not done
> and 329002 sigqueues were cached.
... is that there are no absolute numbers. Does it save 1% of the cost
of sending a signal? 10%? What does perf say about the cost saved
by no longer going into slab? Because the fast path in slab is very
fast. It might even be quicker than your fast path for multithreaded
applications which have threads running on different NUMA nodes.