linux-kernel - Re: [PATCH] sched/rt: Rework for_each_process_thread() iterations in tg_has_rt

Message-ID: <54eef2b3-ac54-aeac-9b2c-6cc6f7132170@virtuozzo.com>
Date:   Fri, 20 Apr 2018 14:21:30 +0300
From:   Kirill Tkhai <ktkhai@...tuozzo.com>
To:     Juri Lelli <juri.lelli@...il.com>
Cc:     mingo@...hat.com, peterz@...radead.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] sched/rt: Rework for_each_process_thread() iterations in
 tg_has_rt_tasks()

On 20.04.2018 13:58, Juri Lelli wrote:
> On 20/04/18 12:43, Kirill Tkhai wrote:
>> On 20.04.2018 12:25, Juri Lelli wrote:
> 
> [...]
> 
>>> Isn't this however checking against the current (dynamic) number of
>>> runnable tasks/groups instead of the "static" group membership (which
>>> shouldn't be affected by a task running state)?
>>
>> Ah, you are right. I forgot that rt_nr_running does not contain sleeping tasks.
>> We need to check something else here. I'll try to find another way.
> 
> n/p. Maybe a per rt_rq flag linked to "static" membership (I haven't
> really thought this through, though :).

sched_move_task() does not change any rt_rq fields when moving a dequeued
task, so we definitely can't use rt_rq to detect all the tasks.
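
To make the distinction concrete, here is a minimal toy model in plain C.
It is not kernel code: every toy_* name is hypothetical, and nr_rt_members
is an illustrative "static" membership counter, not something proposed in
this thread. It only shows that a counter of runnable tasks goes to zero
when a task sleeps, and that moving a dequeued task leaves that counter
untouched in both groups.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: these are NOT the kernel's structures. */
struct toy_rt_rq {
	unsigned int rt_nr_running;	/* queued (runnable) RT tasks only */
};

struct toy_group {
	struct toy_rt_rq rt_rq;
	unsigned int nr_rt_members;	/* hypothetical "static" membership */
};

struct toy_task {
	struct toy_group *group;
	bool queued;
};

/* A new task joins a group in the sleeping (dequeued) state. */
static void toy_attach(struct toy_task *t, struct toy_group *g)
{
	t->group = g;
	t->queued = false;
	g->nr_rt_members++;
}

/* Waking up: only now does the task show up in rt_nr_running. */
static void toy_enqueue(struct toy_task *t)
{
	assert(!t->queued);
	t->queued = true;
	t->group->rt_rq.rt_nr_running++;
}

/* Going to sleep: the task leaves rt_nr_running but stays a member. */
static void toy_dequeue(struct toy_task *t)
{
	assert(t->queued && t->group->rt_rq.rt_nr_running > 0);
	t->queued = false;
	t->group->rt_rq.rt_nr_running--;
}

/* Simplification: only the dequeued case is modelled. Membership
 * changes, but no rt_rq counter is touched in either group. */
static void toy_move(struct toy_task *t, struct toy_group *to)
{
	assert(!t->queued);
	t->group->nr_rt_members--;
	to->nr_rt_members++;
	t->group = to;
}

static bool toy_has_rt_by_running(const struct toy_group *g)
{
	return g->rt_rq.rt_nr_running > 0;
}

static bool toy_has_rt_by_membership(const struct toy_group *g)
{
	return g->nr_rt_members > 0;
}
```

In this model a group holding only sleeping tasks is reported as empty by
toy_has_rt_by_running(), while toy_has_rt_by_membership() still sees it.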

> BTW, since you faced this problem, I guess this is on RT_GROUP_SCHED
> enabled boxes, so I'd have a couple of questions (not strictly related
> to the problem at hand):
> 
>  - do such boxes rely on RT throttling being performed at root level?
>  - is RT_RUNTIME_SHARE enabled?

This is a machine with many fair_sched_class tasks and almost no RT tasks,
and there is no "real" real-time with throttling, so these boxes are not
interesting from this point of view. A very small number of task groups
may have rt_runtime != 0 there, and where they do, the rt_runtime value
is small. The problem I'm solving occurred when the rt_period of one such
group was restored: the operation hung for ages because of the enormous
number of child cgroups and tasks (of fair_sched_class) in the system.
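
A rough sketch of why this scales badly, as a toy cost model in plain C.
It is not kernel code: all toy_* names are hypothetical, and it assumes
that the check rescans the whole task list once for every group it
visits. When no task is RT there is no early exit, so the number of
visits is the number of groups times the number of tasks.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy cost model only: NOT the kernel's task or group structures. */
struct toy_task {
	int group_id;
	bool is_rt;
};

/* Counts how many task entries were examined in total. */
static unsigned long toy_visits;

/* Full scan of the task list for one group; stops at the first RT hit. */
static bool toy_group_has_rt(const struct toy_task *tasks, size_t nr_tasks,
			     int group_id)
{
	for (size_t i = 0; i < nr_tasks; i++) {
		toy_visits++;
		if (tasks[i].is_rt && tasks[i].group_id == group_id)
			return true;
	}
	return false;
}

/* Repeats the full scan for every group, as assumed above. */
static size_t toy_count_rt_groups(const struct toy_task *tasks,
				  size_t nr_tasks, int nr_groups)
{
	size_t n = 0;

	for (int g = 0; g < nr_groups; g++)
		if (toy_group_has_rt(tasks, nr_tasks, g))
			n++;
	return n;
}
```

With only fair tasks in the list, every scan runs to the end, which is
the worst case for this model.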

RT_RUNTIME_SHARE is not enabled, as it's an RH7-based 3.10 kernel.

Really, it would be very difficult to provide real-time on machines with
such a big number of tasks when the tasks allocate resources (not all of
the memory is pinned from going to slab, there is some IO, etc.), since
there are problems to solve even in the !rt case.

Kirill