lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CABk29NvPJ3S1xq5xm+52OoUGDyuMSxGOLJbopPa3+-QmLnVYeQ@mail.gmail.com>
Date:   Tue, 4 Jan 2022 17:48:08 -0800
From:   Josh Don <joshdon@...gle.com>
To:     Cruz Zhao <CruzZhao@...ux.alibaba.com>
Cc:     Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Juri Lelli <juri.lelli@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Benjamin Segall <bsegall@...gle.com>,
        Mel Gorman <mgorman@...e.de>,
        Daniel Bristot de Oliveira <bristot@...hat.com>,
        Alexey Dobriyan <adobriyan@...il.com>,
        Eric Dumazet <edumazet@...gle.com>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        linux-fsdevel <linux-fsdevel@...r.kernel.org>
Subject: Re: [PATCH 1/2] sched/core: Cookied forceidle accounting per cpu

Hi Cruz,

Could you add a bit more background to help me understand what case
this patch solves? Is your issue that you want to be able to
attribute forced idle time to the specific cpus it happens on, or do
you simply want a more convenient way of summing forced idle without
iterating your cookie'd tasks and summing the schedstat manually?

> @@ -190,6 +202,9 @@ static int show_stat(struct seq_file *p, void *v)
>                 seq_put_decimal_ull(p, " ", nsec_to_clock_t(steal));
>                 seq_put_decimal_ull(p, " ", nsec_to_clock_t(guest));
>                 seq_put_decimal_ull(p, " ", nsec_to_clock_t(guest_nice));
> +#ifdef CONFIG_SCHED_CORE
> +               seq_put_decimal_ull(p, " ", nsec_to_clock_t(cookied_forceidle));
> +#endif

IMO it would be better to always print this stat, otherwise it sets a
weird precedent for new stats added in the future (much more difficult
for userspace to reason about which column corresponds with which
field, since it would depend on kernel config).

Also, did you intend to put this in /proc/stat instead of
/proc/schedstat (the latter of which would be more attractive to
prevent calculation of these stats unless schestat was enabled)?

> diff --git a/kernel/sched/core_sched.c b/kernel/sched/core_sched.c
> @@ -260,6 +261,21 @@ void __sched_core_account_forceidle(struct rq *rq)
>
>         rq->core->core_forceidle_start = now;
>
> +       for_each_cpu(i, smt_mask) {
> +               rq_i = cpu_rq(i);
> +               p = rq_i->core_pick ?: rq_i->curr;
> +
> +               if (!rq->core->core_cookie)
> +                       continue;

I see this is temporary given your other patch, but just a note that
if your other patch is dropped, this check can be pulled outside the
loop.

> +               if (p == rq_i->idle && rq_i->nr_running) {
> +                       cpustat = kcpustat_cpu(i).cpustat;
> +                       cpustat[CPUTIME_COOKIED_FORCEIDLE] += delta;
> +               }
> +       }

I don't think this is right. If a cpu was simply idle while some other
SMT sibling on its core was forced idle, and then a task happens to
wake on the idle cpu, that cpu will now be charged the full delta here
as forced idle (when actually it was never forced idle, we just
haven't been through pick_next_task yet). One workaround would be to
add a boolean to struct rq to cache whether the rq was in forced idle
state.

Best,
Josh

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ