linux-kernel - Re: [PATCH] signal: Allow RT tasks to cache one sigqueue struct

Message-ID: <m1pn06oeno.fsf@fess.ebiederm.org>
Date:   Wed, 10 Mar 2021 15:57:31 -0600
From:   ebiederm@...ssion.com (Eric W. Biederman)
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
        linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Juri Lelli <juri.lelli@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
        Daniel Bristot de Oliveira <bristot@...hat.com>,
        Oleg Nesterov <oleg@...hat.com>,
        Matt Fleming <matt@...eblueprint.co.uk>
Subject: Re: [PATCH] signal: Allow RT tasks to cache one sigqueue struct

Thomas Gleixner <tglx@...utronix.de> writes:

> On Thu, Mar 04 2021 at 21:58, Thomas Gleixner wrote:
>> On Thu, Mar 04 2021 at 13:04, Eric W. Biederman wrote:
>>> Thomas Gleixner <tglx@...utronix.de> writes:
>>>>
>>>> We could of course do the caching unconditionally for all tasks.
>>>
>>> Is there any advantage to only doing this for realtime tasks?
>>
>> It was mostly to avoid tons of cached entries hanging around all over
>> the place. So I limited it to the case which the RT users deeply cared
>> about. Also related to the accounting question below.
>>
>>> If not it probably makes sense to do the caching for all tasks.
>>>
>>> I am wondering if we want to count the cached sigqueue structure against
>>> the user's RT signal rlimit?
>>
>> That makes some sense, but that's a user-visible change as a single
>> signal will up the count for a task's lifetime while today it is removed
>> from accounting again once the signal is delivered. So that needs some
>> thought.
>
> Thought more about it. To make this accounting useful we'd need:
>
>   - a separate user::sigqueue_cached counter
>   - a separate RLIMIT_SIGQUEUE_CACHED
>
> Then you need to think about the defaults. Any signal heavy application
> will want this enabled and obviously automagically.
>
> Also there is an argument not to have this due to possible pointless
> memory consumption.
>
> But what are we talking about? 80 bytes worth of memory per task in the
> worst case. Which is compared to the rest of a task's memory consumption
> just noise.
>
> Looking at some statistics from a devel system there are fewer than 10
> items cached when the machine is fully idle after boot. During a kernel
> compile the cache utilization goes up to ~150 at max (make -j128 and 64
> CPUs). What's interesting is the allocation statistics after boot and
> full kernel compile:
>
>   from slab:            23996
>   from task cache:      52223
>
> A typical pattern there is:
>
>     <ls>-58490   [010] d..2  7765.664198: __sigqueue_alloc: 58488 from slab ffff8881132df460 10
>     <ls>-58488   [002] d..1  7765.664294: __sigqueue_free.part.35: cache ffff8881132df460 10
>     <ls>-58488   [002] d..2  7765.665146: __sigqueue_alloc: 1149 from cache ffff8881103dc550 10
>      bash-1149   [000] d..2  7765.665220: exit_task_sighand: free ffff8881132df460 8 9
>      bash-1149   [000] d..1  7765.665662: __sigqueue_free.part.35: cache ffff8881103dc550 9
>
> 58488 grabs the sigqueue from bash's task cache and bash sticks it back
> in. Lather, rinse and repeat. 
>
> IMO, not bothering with an extra counter and rlimit plus the required
> atomic operations is just fine and having this for all tasks
> unconditionally looks like a clear win.
>
> I'll post an updated version of this soonish.

That looks like a good analysis.

I see that there is a sigqueue_cachep.  As I recall there are per-CPU
caches and all kinds of other good stuff when using kmem_cache_alloc.

Are those goodies falling short here?

I am just a little unclear on why a slab allocation is sufficiently
problematic that we want to avoid it.

Eric
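
For readers following the thread, here is a minimal userspace sketch of the
one-entry-per-task cache idea discussed in the quoted text. It is not the
patch itself. Every name in it (struct task_model, struct sigqueue_model,
sigqueue_alloc_model(), sigqueue_free_model(), task_exit_model(),
demo_signal_loop()) is hypothetical, malloc()/free() stand in for the slab
allocator, and locking and rlimit accounting are left out entirely.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are NOT the kernel's real structures. */
struct sigqueue_model {
	int signo;
};

struct task_model {
	struct sigqueue_model *sigqueue_cache;	/* at most one cached entry */
};

/* Counters mirroring the "from slab" / "from task cache" statistics. */
static unsigned long from_slab;
static unsigned long from_cache;

/* Take the task's cached entry if there is one, else hit the allocator. */
static struct sigqueue_model *sigqueue_alloc_model(struct task_model *t)
{
	struct sigqueue_model *q = t->sigqueue_cache;

	if (q) {
		t->sigqueue_cache = NULL;
		from_cache++;
		return q;
	}
	from_slab++;
	return malloc(sizeof(*q));	/* stand-in for a slab allocation */
}

/* Park the entry in the task's single slot if empty, else really free it. */
static void sigqueue_free_model(struct task_model *t, struct sigqueue_model *q)
{
	assert(q != NULL);
	if (!t->sigqueue_cache) {
		t->sigqueue_cache = q;
		return;
	}
	free(q);
}

/* On task exit the cached entry, if any, goes back to the allocator. */
static void task_exit_model(struct task_model *t)
{
	free(t->sigqueue_cache);
	t->sigqueue_cache = NULL;
}

/* Queue and deliver n signals back to back against one task. */
static void demo_signal_loop(struct task_model *t, int n)
{
	for (int i = 0; i < n; i++) {
		struct sigqueue_model *q = sigqueue_alloc_model(t);

		if (!q)
			return;
		q->signo = 17;
		sigqueue_free_model(t, q);
	}
}
```

Calling demo_signal_loop() with n = 5 on a zero-initialized task_model takes
one entry from the allocator and four from the task's slot, which is the
same "lather, rinse and repeat" shape as the trace above. For scale, the
quoted statistics work out to 52223 of 76219 allocations, roughly 68%,
being served from the task cache.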