Message-ID: <m1pn06oeno.fsf@fess.ebiederm.org>
Date: Wed, 10 Mar 2021 15:57:31 -0600
From: ebiederm@...ssion.com (Eric W. Biederman)
To: Thomas Gleixner <tglx@...utronix.de>
Cc: Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Juri Lelli <juri.lelli@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Steven Rostedt <rostedt@...dmis.org>,
Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
Daniel Bristot de Oliveira <bristot@...hat.com>,
Oleg Nesterov <oleg@...hat.com>,
Matt Fleming <matt@...eblueprint.co.uk>
Subject: Re: [PATCH] signal: Allow RT tasks to cache one sigqueue struct

Thomas Gleixner <tglx@...utronix.de> writes:

> On Thu, Mar 04 2021 at 21:58, Thomas Gleixner wrote:
>> On Thu, Mar 04 2021 at 13:04, Eric W. Biederman wrote:
>>> Thomas Gleixner <tglx@...utronix.de> writes:
>>>>
>>>> We could of course do the caching unconditionally for all tasks.
>>>
>>> Is there any advantage to only doing this for realtime tasks?
>>
>> It was mostly to avoid tons of cached entries hanging around all over
>> the place. So I limited it to the case which the RT users deeply cared
>> about. Also related to the accounting question below.
>>
>>> If not it probably makes sense to do the caching for all tasks.
>>>
> >>> I am wondering if we want to count the cached sigqueue structure
> >>> against the user's rt signal rlimit?
>>
>> That makes some sense, but that's a user visible change as a single
> >> signal will up the count for a task's lifetime while today it is removed
>> from accounting again once the signal is delivered. So that needs some
>> thought.
>
> Thought more about it. To make this accounting useful we'd need:
>
> - a separate user::sigqueue_cached counter
> - a separate RLIMIT_SIGQUEUE_CACHED
>
> Then you need to think about the defaults. Any signal heavy application
> will want this enabled and obviously automagically.
>
> Also there is an argument not to have this due to possible pointless
> memory consumption.
>
> But what are we talking about? 80 bytes worth of memory per task in the
> worst case. Which is compared to the rest of a task's memory consumption
> just noise.
>
> Looking at some statistics from a devel system there are less than 10
> items cached when the machine is fully idle after boot. During a kernel
> compile the cache utilization goes up to ~150 at max (make -j128 and 64
> CPUs). What's interesting is the allocation statistics after boot and
> full kernel compile:
>
> from slab: 23996
> from task cache: 52223
>
> A typical pattern there is:
>
> <ls>-58490 [010] d..2 7765.664198: __sigqueue_alloc: 58488 from slab ffff8881132df460 10
> <ls>-58488 [002] d..1 7765.664294: __sigqueue_free.part.35: cache ffff8881132df460 10
> <ls>-58488 [002] d..2 7765.665146: __sigqueue_alloc: 1149 from cache ffff8881103dc550 10
> bash-1149 [000] d..2 7765.665220: exit_task_sighand: free ffff8881132df460 8 9
> bash-1149 [000] d..1 7765.665662: __sigqueue_free.part.35: cache ffff8881103dc550 9
>
> 58488 grabs the sigqueue from bash's task cache and bash sticks it back
> in. Lather, rinse and repeat.
>
> IMO, not bothering with an extra counter and rlimit plus the required
> atomic operations is just fine and having this for all tasks
> unconditionally looks like a clear win.
>
> I'll post an updated version of this soonish.

That looks like a good analysis.

I see that there is a sigqueue_cachep. As I recall there are per cpu
caches and all kinds of other good stuff when using kmem_cache_alloc.
Are those goodies falling down?

I am just a little unclear on why a slab allocation is sufficiently
problematic that we want to avoid it.

Eric
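
For readers following the thread, here is a minimal userspace sketch of the "cache one sigqueue per task" idea discussed above. It is not the code from the patch. The names struct task_model, struct sigqueue_model, model_sigqueue_alloc(), model_sigqueue_free(), model_task_exit() and model_signal_rounds() are hypothetical, malloc()/free() stand in for the slab allocator, and the locking or atomic operations a real kernel implementation would need are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's sigqueue; illustration only. */
struct sigqueue_model {
    int signo;
};

/* Hypothetical per-task state: one cached entry plus allocation statistics. */
struct task_model {
    struct sigqueue_model *sigqueue_cache; /* NULL, or the single cached entry */
    unsigned long from_slab;               /* allocations served by malloc() */
    unsigned long from_cache;              /* allocations served by the cache */
};

/* Take the cached entry if there is one, otherwise fall back to the allocator. */
static struct sigqueue_model *model_sigqueue_alloc(struct task_model *t)
{
    struct sigqueue_model *q = t->sigqueue_cache;

    if (q) {
        t->sigqueue_cache = NULL;
        t->from_cache++;
        return q;
    }

    q = malloc(sizeof(*q));
    if (q)
        t->from_slab++;
    return q;
}

/* Park the entry in the empty cache slot, or free it if the slot is taken. */
static void model_sigqueue_free(struct task_model *t, struct sigqueue_model *q)
{
    assert(q != NULL);

    if (!t->sigqueue_cache) {
        t->sigqueue_cache = q;
        return;
    }
    free(q);
}

/* On task exit the cached entry, if any, goes back to the allocator. */
static void model_task_exit(struct task_model *t)
{
    free(t->sigqueue_cache);
    t->sigqueue_cache = NULL;
}

/* Queue and deliver 'rounds' signals to the task, one at a time. */
static void model_signal_rounds(struct task_model *t, int rounds)
{
    for (int i = 0; i < rounds; i++) {
        struct sigqueue_model *q = model_sigqueue_alloc(t);

        if (!q)
            break;              /* allocation failure: stop the demo */
        q->signo = 10;          /* arbitrary signal number for the demo */
        model_sigqueue_free(t, q);
    }
}
```

In this model only the first signal sent to a task reaches the allocator. Every later queue/deliver cycle reuses the parked entry, which is the same shape as the "from slab" versus "from task cache" counts quoted above. Whether that beats the slab allocator's own per-CPU caches is exactly the open question in this reply, and the sketch does not answer it.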