linux-kernel - Re: [PATCH] sched/fair: make CFS bandwidth slice per cpu group


Message-ID: <CAM_iQpXi60Ubd_-fzycSZ8fO=QSOc_nE=7GniU_t=QPAfxCbkw@mail.gmail.com>
Date:   Tue, 1 May 2018 11:06:08 -0700
From:   Cong Wang <xiyou.wangcong@...il.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     LKML <linux-kernel@...r.kernel.org>, Paul Turner <pjt@...gle.com>,
        Mike Galbraith <efault@....de>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...nel.org>
Subject: Re: [PATCH] sched/fair: make CFS bandwidth slice per cpu group

On Tue, May 1, 2018 at 12:11 AM, Peter Zijlstra <peterz@...radead.org> wrote:
> On Mon, Apr 30, 2018 at 01:37:16PM -0700, Cong Wang wrote:
>> On Mon, Apr 30, 2018 at 12:42 PM, Peter Zijlstra <peterz@...radead.org> wrote:
>> > On Mon, Apr 30, 2018 at 12:29:25PM -0700, Cong Wang wrote:
>> >> Currently, the sched_cfs_bandwidth_slice_us is a global setting which
>> >> affects all cgroups. Different groups may want different values based
>> >> on their own workload, one size doesn't fit all. The global pool filled
>> >> periodically is per cgroup too, they should have the right to distribute
>> >> their own quota to each local CPU with their own frequency.
>> >
>> > Why.. what happens? This doesn't really tell us anything.
>>
>> We saw tasks in a container get throttled many times even
>> when they apparently don't over-burn the CPUs. I tried reducing
>> sched_cfs_bandwidth_slice_us from the default 5ms to 1ms, and
>> that solved the problem: no tasks got throttled after this change.
>> This is why I want to change it.
>
> The 1ms slice distributes time better at the cost of higher overhead,
> right?


Right, slightly higher. According to this paper [1]:

"While decreasing this value increases the frequency at which CPUs
request for quota from the global pool, we did not notice any measurable
impact on performance."

1. https://landley.net/kdocs/ols/2010/ols2010-pages-245-254.pdf
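
To make the trade-off concrete, here is a minimal sketch in Python of a
simplified model, not the kernel's actual code. It assumes each CPU pulls
one slice at a time from the cgroup's global pool whenever its local
runtime hits zero, and it deliberately ignores any return of unused local
runtime to the pool. The function name simulate_period and the numbers
used (20ms quota, 8 CPUs each needing 2ms) are made up for illustration.

```python
def simulate_period(quota_us, slice_us, demands_us):
    """Simplified model of one bandwidth period (illustrative only).

    Returns (number of throttled CPUs, number of slice requests made
    to the global pool).
    """
    pool = quota_us                      # global pool for the cgroup
    local = [0] * len(demands_us)        # per-CPU local runtime
    remaining = list(demands_us)         # CPU time each CPU still wants
    throttled = set()
    requests = 0
    progress = True
    while progress:
        progress = False
        for cpu in range(len(remaining)):
            if remaining[cpu] == 0 or cpu in throttled:
                continue
            if local[cpu] == 0:
                # Ask the global pool for one slice (or whatever is left).
                grant = min(slice_us, pool)
                if grant == 0:
                    throttled.add(cpu)   # pool is empty: CPU is throttled
                    continue
                pool -= grant
                local[cpu] = grant
                requests += 1
            # Run on local runtime until it or the demand is exhausted.
            used = min(local[cpu], remaining[cpu])
            local[cpu] -= used
            remaining[cpu] -= used
            progress = True
    return len(throttled), requests


if __name__ == "__main__":
    demands = [2000] * 8                 # 8 CPUs, 2ms each: 16ms < 20ms quota
    print(simulate_period(20000, 5000, demands))   # 5ms slice
    print(simulate_period(20000, 1000, demands))   # 1ms slice
```

In this model the 5ms slice lets the first four CPUs drain the pool, so the
other four are throttled although total demand is below the quota. The 1ms
slice throttles nobody, at the cost of four times as many requests to the
global pool.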


>
>> And I don't think 1ms will be good for all containers, so in order to
>> minimize the impact, I would like to keep the slice change within
>> each container. This is why I propose this patch rather than just
>> `sysctl -w`. Do you think otherwise?
>
> Well, I think I don't quite remember everything and a Changelog that
> tells me why you want stuff in a little more detail and helps me
> remember some things is a lot more useful than me having to go dig
> through the code myself (which I'll invariably postpone because I'm a
> busy sort of person).

Sure, just tell me how I should improve my changelog. I didn't
realize I needed to state why I want to change sched_cfs_bandwidth_slice_us,
since it is already a tunable. I will add that in v2.

>
>> BTW, people reported a similar (if not same) issue here before:
>> https://gist.github.com/bobrik/2030ff040fad360327a5fab7a09c4ff1
>
> That's not a report, that's a random person on the interweb posting
> random crap. A report lands in my inbox.

Well, I found it via LKML:
https://marc.info/?l=linux-kernel&m=151270583632566&w=2

(I was too lazy to search for the original LKML link.)

I will update this patch to improve the changelog and address kbuild
warnings, if you don't object.

Thanks.