Message-ID: <675544de-3369-e26e-65ba-3b28fff5c126@gnuweeb.org>
Date: Tue, 5 Apr 2022 20:13:42 +0700
From: Ammar Faizi <ammarfaizi2@...weeb.org>
To: Dietmar Eggemann <dietmar.eggemann@....com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Cc: Ben Segall <bsegall@...gle.com>,
Daniel Bristot de Oliveira <bristot@...hat.com>,
GNU/Weeb Mailing List <gwml@...r.gnuweeb.org>,
Ingo Molnar <mingo@...hat.com>,
Juri Lelli <juri.lelli@...hat.com>,
Mel Gorman <mgorman@...e.de>,
Peter Zijlstra <peterz@...radead.org>,
Steven Rostedt <rostedt@...dmis.org>,
Vincent Guittot <vincent.guittot@...aro.org>
Subject: Re: [Linux 5.18-rc1] WARNING: CPU: 1 PID: 0 at
kernel/sched/fair.c:3355 update_blocked_averages
On 4/5/22 7:21 PM, Dietmar Eggemann wrote:
> Tried to recreate the issue but no success so far. I used your config
> file, clang-14 and a Xeon CPU E5-2690 v2 (2 sockets 40 CPUs) with 20
> two-level cgroupv1 taskgroups '/X/Y' with 'hackbench (10 groups, 40 fds)
> + idling' running in all '/X/Y/'.
>
> What userspace are you running?
An HP laptop with an Intel i7-1165G7 (8 CPUs) and 16 GB of RAM, running
Ubuntu 21.10. I use it as a daily workstation: compiling kernels, browsing,
and coding.
> There seemed to be some pressure on your machine when it happened?
Yeah, it might be. I don't fully remember what was running at the time it
happened, though.
>> <6>[13420.623334][ C7] perf: interrupt took too long (2530 > 2500),
>> lowering kernel.perf_event_max_sample_rate to 78900
>
> Maybe you could split the SCHED_WARN_ON so we know which signal causes this?
OK, I will apply the diff on top of 5.18-rc1 and start using that kernel for
my daily routine tomorrow morning. Let's see if I can hit this bug again. I
will send an update later...
Thank you.
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index d4bd299d67ab..0d45e09e5bfc 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -3350,9 +3350,9 @@ static inline bool cfs_rq_is_decayed(struct cfs_rq *cfs_rq)
> * Make sure that rounding and/or propagation of PELT values never
> * break this.
> */
> - SCHED_WARN_ON(cfs_rq->avg.load_avg ||
> - cfs_rq->avg.util_avg ||
> - cfs_rq->avg.runnable_avg);
> + SCHED_WARN_ON(cfs_rq->avg.load_avg);
> + SCHED_WARN_ON(cfs_rq->avg.util_avg);
> + SCHED_WARN_ON(cfs_rq->avg.runnable_avg);
>
> return true;
> }
>
> [...]
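
For reference, here is a small userspace sketch of why splitting the check
helps. It is not kernel code: MOCK_WARN_ON, mock_warn(), struct mock_avg,
check_combined() and check_split() are hypothetical stand-ins made up for
illustration, and the real SCHED_WARN_ON prints a full backtrace rather than
just the expression. The point is only that a combined condition cannot say
which PELT signal was non-zero, while three separate checks each report
their own expression (and, in the kernel, their own line number).

```c
#include <stdio.h>

/* Hypothetical userspace stand-in for the three PELT averages. */
struct mock_avg {
	unsigned long load_avg;
	unsigned long util_avg;
	unsigned long runnable_avg;
};

static int warn_count;			/* number of warnings fired so far */
static const char *last_warning;	/* text of the most recent warning */

/* Record and print the failing expression; returns the condition. */
static int mock_warn(int cond, const char *expr)
{
	if (cond) {
		warn_count++;
		last_warning = expr;
		fprintf(stderr, "WARNING: %s\n", expr);
	}
	return cond;
}

/* Assumption: a simplified mock, not the kernel's SCHED_WARN_ON. */
#define MOCK_WARN_ON(cond) mock_warn(!!(cond), #cond)

/* Before the diff: one warning, no hint about which signal is non-zero. */
static int check_combined(const struct mock_avg *avg)
{
	return MOCK_WARN_ON(avg->load_avg || avg->util_avg || avg->runnable_avg);
}

/* After the diff: each signal reports itself. Returns warnings fired. */
static int check_split(const struct mock_avg *avg)
{
	int fired = 0;

	fired += MOCK_WARN_ON(avg->load_avg);
	fired += MOCK_WARN_ON(avg->util_avg);
	fired += MOCK_WARN_ON(avg->runnable_avg);
	return fired;
}
```

With only util_avg left non-zero, check_combined() reports the whole
three-way expression, while check_split() reports just "avg->util_avg".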
--
Ammar Faizi