Message-ID: <xm26wolbyfe9.fsf@bsegall-linux.svl.corp.google.com>
Date: Wed, 06 Mar 2019 11:25:02 -0800
From: bsegall@...gle.com
To: Phil Auld <pauld@...hat.com>
Cc: mingo@...hat.com, peterz@...radead.org,
linux-kernel@...r.kernel.org
Subject: Re: [RFC] sched/fair: hard lockup in sched_cfs_period_timer

Phil Auld <pauld@...hat.com> writes:
> On Tue, Mar 05, 2019 at 12:45:34PM -0800 bsegall@...gle.com wrote:
>> Phil Auld <pauld@...hat.com> writes:
>>
>> > Interestingly, if I limit the number of child cgroups to the number of
>> > them I'm actually putting processes into (16 down from 2500) the problem
>> > does not reproduce.
>>
>> That is indeed interesting, and definitely not something we'd want to
>> matter. (Particularly if it's not root->a->b->c...->throttled_cgroup or
>> root->throttled->a->...->thread vs root->throttled_cgroup, which is what
>> I was originally thinking of)
>>
>
> The locking may be a red herring.
>
> The setup is root->throttled->a where a is 1-2500. There are 4 threads in
> each of the first 16 a groups. The parent, throttled, is where the
> cfs_period/quota_us are set.
>
> I wonder if the problem is the walk_tg_tree_from() call in unthrottle_cfs_rq().
>
> The distribute_cfs_runtime looks to be O(n * m) where n is number of
> throttled cfs_rqs and m is the number of child cgroups. But I'm not
> completely clear on how the hierarchical cgroups play together here.
>
> I'll pull on this thread some.
>
> Thanks for your input.
>
>
> Cheers,
> Phil

Yeah, that isn't under the cfs_b lock, but it is still part of
distribute_cfs_runtime() (and under the rq lock, which might also matter).
I was thinking too much about just the cfs_b regions. I'm not sure there's
any good general optimization there.

I suppose cfs_rqs (tgs/cfs_bs?) could have a "nearest ancestor with a
quota" pointer, and ones with a quota could have a "descendants with
quota" list, parallel to the children/parent lists of tgs. Then
throttle/unthrottle would only have to visit these lists, and child
cgroups/cfs_rqs without their own quotas would just check
cfs_rq->nearest_quota_cfs_rq->throttle_count. throttled_clock_task_time
can also probably be tracked there.
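
A rough userspace sketch of that idea, for illustration only. Everything here is hypothetical: struct tg_model, its fields, and the helper functions are made-up names, not kernel code. It ignores per-CPU cfs_rqs, locking, and throttled_clock_task_time, and it assumes the hierarchy is built top-down before anything is throttled. The point is only that throttle/unthrottle walks the "descendants with quota" list, while groups without a quota look up their nearest quota-bearing ancestor.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; these are not kernel structures. */
struct tg_model {
	struct tg_model *parent;
	struct tg_model *quota_anc;   /* nearest strict ancestor with a quota, or NULL */
	struct tg_model *quota_descs; /* head: nearest descendants that have a quota */
	struct tg_model *next_quota;  /* sibling link in quota_anc's quota_descs list */
	int has_quota;
	int throttle_count;           /* meaningful only when has_quota is set */
};

/* Attach tg under parent (NULL for the root) and set up the quota links.
 * Assumes top-down construction: parents are attached before children. */
static void tg_attach(struct tg_model *tg, struct tg_model *parent, int has_quota)
{
	assert(tg != NULL);
	tg->parent = parent;
	tg->has_quota = has_quota;
	tg->quota_descs = NULL;
	tg->next_quota = NULL;
	tg->throttle_count = 0;

	if (!parent)
		tg->quota_anc = NULL;
	else
		tg->quota_anc = parent->has_quota ? parent : parent->quota_anc;

	if (has_quota && tg->quota_anc) {
		/* Inherit the ancestor's state so a late attach stays consistent. */
		tg->throttle_count = tg->quota_anc->throttle_count;
		tg->next_quota = tg->quota_anc->quota_descs;
		tg->quota_anc->quota_descs = tg;
	}
}

/* Apply delta (+1 throttle, -1 unthrottle) to tg and to every quota-bearing
 * descendant. Returns the number of groups visited. */
static int tg_adjust_throttle(struct tg_model *tg, int delta)
{
	int visited = 1;
	struct tg_model *d;

	assert(tg->has_quota);
	tg->throttle_count += delta;
	for (d = tg->quota_descs; d; d = d->next_quota)
		visited += tg_adjust_throttle(d, delta);
	return visited;
}

/* Groups without their own quota just consult the nearest quota ancestor. */
static int tg_throttled(const struct tg_model *tg)
{
	const struct tg_model *q = tg->has_quota ? tg : tg->quota_anc;

	return q != NULL && q->throttle_count > 0;
}

#define NR_CHILDREN 2500

/* Build root->throttled->a[0..2499] as in the setup described in this thread
 * and return how many groups a single throttle has to visit. */
static int demo_throttle_visits(void)
{
	static struct tg_model root, throttled, a[NR_CHILDREN];
	int i, visited;

	tg_attach(&root, NULL, 0);
	tg_attach(&throttled, &root, 1);
	for (i = 0; i < NR_CHILDREN; i++)
		tg_attach(&a[i], &throttled, 0);

	visited = tg_adjust_throttle(&throttled, 1);
	assert(tg_throttled(&a[0]) && tg_throttled(&a[NR_CHILDREN - 1]));
	assert(!tg_throttled(&root));
	return visited;
}
```

In this model, throttling the one group that has a quota touches a single node regardless of how many quota-less children sit below it, which is the behaviour the pointer/list scheme is meant to give. Whether that carries over to the real per-CPU cfs_rq structures is not something this sketch shows.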