linux-kernel - Re: [PATCH] sched/fair: Ensure _sum and

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <E16E71B3-E941-4522-AFF1-ABDF918FED19@linux.vnet.ibm.com>
Date:   Thu, 1 Jul 2021 15:17:16 +0530
From:   Sachin Sant <sachinp@...ux.vnet.ibm.com>
To:     Odin Ugedal <odin@...d.al>,
        Vincent Guittot <vincent.guittot@...aro.org>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Naresh Kamboju <naresh.kamboju@...aro.org>,
        Ingo Molnar <mingo@...hat.com>,
        open list <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] sched/fair: Ensure _sum and _avg values stay consistent



> On 24-Jun-2021, at 4:48 PM, Odin Ugedal <odin@...d.al> wrote:
> 
> The _sum and _avg values are in general sync together with the PELT
> divider. They are however not always completely in perfect sync,
> resulting in situations where _sum gets to zero while _avg stays
> positive. Such situations are undesirable.
> 
> This comes from the fact that PELT will increase period_contrib, also
> increasing the PELT divider, without updating _sum and _avg values to
> stay in perfect sync where (_sum == _avg * divider). However, such PELT
> change will never lower _sum, making it impossible to end up in a
> situation where _sum is zero and _avg is not.
> 
> Therefore, we need to ensure that when subtracting load outside PELT,
> that when _sum is zero, _avg is also set to zero. This occurs when
> (_sum < _avg * divider), and the subtracted (_avg * divider) is bigger
> or equal to the current _sum, while the subtracted _avg is smaller than
> the current _avg.
> 
> Reported-by: Sachin Sant <sachinp@...ux.vnet.ibm.com>
> Reported-by: Naresh Kamboju <naresh.kamboju@...aro.org>
> Signed-off-by: Odin Ugedal <odin@...d.al>

Hello Odin, Vincent,

The issue of kernel warning(during boot) seen on Power sever, reported few days back is not completely fixed.
I am able to recreate this issue with latest 5.13 kernel(which has this fix)

# git log --oneline kernel/sched/fair.c
a6eaf3850cb1 Merge tag 'sched-urgent-2021-06-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
54a728dc5e4f Merge tag 'sched-core-2021-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
1c35b07e6d39 sched/fair: Ensure _sum and _avg values stay consistent  <<==


[   65.407631] ------------[ cut here ]------------
[   65.407656] cfs_rq->avg.load_avg || cfs_rq->avg.util_avg || cfs_rq->avg.runnable_avg
[   65.407666] WARNING: CPU: 18 PID: 6642 at kernel/sched/fair.c:3308 update_blocked_averages+0x748/0x7a0
[   65.407693] Modules linked in: dm_mod bonding nft_ct nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set rfkill nf_tables nfnetlink sunrpc xfs libcrc32c pseries_rng xts uio_pdrv_genirq vmx_crypto uio sch_fq_codel ip_tables ext4 mbcache jbd2 sd_mod t10_pi sg ibmvscsi ibmveth scsi_transport_srp fuse
[   65.407776] CPU: 18 PID: 6642 Comm: gcc Not tainted 5.13.0-05357-gdbe69e433722 #1
[   65.407789] NIP:  c0000000001b2e48 LR: c0000000001b2e44 CTR: c00000000072cac0
[   65.407798] REGS: c00000003f2cf660 TRAP: 0700   Not tainted  (5.13.0-05357-gdbe69e433722)
[   65.407809] MSR:  8000000000029033 <SF,EE,ME,IR,DR,RI,LE>  CR: 48088224  XER: 00000005
[   65.407837] CFAR: c00000000014d520 IRQMASK: 1 
               GPR00: c0000000001b2e44 c00000003f2cf900 c0000000029bc700 0000000000000048 
               GPR04: 00000000ffff7fff c00000003f2cf5c0 0000000000000027 c0000008baa07e18 
               GPR08: 0000000000000023 0000000000000001 0000000000000027 c000000002876888 
               GPR12: 0000000000008000 c00000001ec35a80 c0000008baa92580 0000000f3a993c3c 
               GPR16: 0000000000000000 c0000008baa92580 c00000001a53b400 0000000000000001 
               GPR20: 0000000000000000 c0000000029fdfe0 0000000000000000 0000000000000012 
               GPR24: 0000000000000000 c0000008baa92f90 0000000000000001 c0000008baa92600 
               GPR28: 0000000000000012 0000000f3a993c3c c00000001a53b5c0 0000000000000001 
[   65.407984] NIP [c0000000001b2e48] update_blocked_averages+0x748/0x7a0
[   65.407995] LR [c0000000001b2e44] update_blocked_averages+0x744/0x7a0
[   65.408005] Call Trace:
[   65.408010] [c00000003f2cf900] [c0000000001b2e44] update_blocked_averages+0x744/0x7a0 (unreliable)
[   65.408025] [c00000003f2cfa20] [c0000000001bdd78] newidle_balance+0x258/0x5c0
[   65.408038] [c00000003f2cfab0] [c0000000001be1bc] pick_next_task_fair+0x7c/0x4c0
[   65.408051] [c00000003f2cfb10] [c000000000cf223c] __schedule+0x15c/0x17b0
[   65.408063] [c00000003f2cfc50] [c0000000001a6044] do_task_dead+0x64/0x70
[   65.408076] [c00000003f2cfc80] [c0000000001567d8] do_exit+0x868/0xce0
[   65.408089] [c00000003f2cfd50] [c000000000156d24] do_group_exit+0x64/0xe0
[   65.408101] [c00000003f2cfd90] [c000000000156dc4] sys_exit_group+0x24/0x30
[   65.408115] [c00000003f2cfdb0] [c0000000000307b0] system_call_exception+0x150/0x2d0
[   65.408128] [c00000003f2cfe10] [c00000000000cc5c] system_call_common+0xec/0x278
[   65.408141] --- interrupt: c00 at 0x7fffa163b80c
[   65.408150] NIP:  00007fffa163b80c LR: 00007fffa1597004 CTR: 0000000000000000
[   65.408159] REGS: c00000003f2cfe80 TRAP: 0c00   Not tainted  (5.13.0-05357-gdbe69e433722)
[   65.408168] MSR:  800000000280f033 <SF,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE>  CR: 28000222  XER: 00000000
[   65.408199] IRQMASK: 0 
               GPR00: 00000000000000ea 00007fffdf5cd220 00007fffa1757300 0000000000000000 
               GPR04: 0000000000000000 00007fffdf5cd078 0000000000000000 0000000000000000 
               GPR08: 0000000000000001 0000000000000000 0000000000000000 0000000000000000 
               GPR12: 0000000000000000 00007fffa192a930 0000000000000000 0000000000000000 
               GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
               GPR20: 0000000000000000 00007fffa1754ee0 0000000000000004 0000000000000001 
               GPR24: 00007fffa17508a0 0000000000000000 0000000000000000 0000000000000001 
               GPR28: 0000000000000000 0000000000000000 00007fffa1923950 0000000000000000 
[   65.408308] NIP [00007fffa163b80c] 0x7fffa163b80c
[   65.408317] LR [00007fffa1597004] 0x7fffa1597004
[   65.408324] --- interrupt: c00
[   65.408331] Instruction dump:
[   65.408339] 2f890000 409efa3c 4bffa119 4bfffa34 60000000 60000000 e9210070 e8610088 
[   65.408361] 39400001 99490003 4bf9a679 60000000 <0fe00000> e95201ba 2faa0000 4bfffc50 
[   65.408383] ---[ end trace 14708e73bf91cf1c ]—


Let me know if any additional information is required.

I have attached the dmesg log captured after the failure.

Thanks
-Sachin

> ---
> 
> Reports and discussion can be found here:
> 
> https://lore.kernel.org/lkml/2ED1BDF5-BC0C-47CD-8F33-9A46C738F8CF@linux.vnet.ibm.com/
> https://lore.kernel.org/lkml/CA+G9fYsMXELmjGUzw4SY1bghTYz_PeR2diM6dRp2J37bBZzMSA@mail.gmail.com/
> 
> kernel/sched/fair.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index bfaa6e1f6067..def48bc2e90b 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -3688,15 +3688,15 @@ update_cfs_rq_load_avg(u64 now, struct cfs_rq *cfs_rq)
> 
> 		r = removed_load;
> 		sub_positive(&sa->load_avg, r);
> -		sub_positive(&sa->load_sum, r * divider);
> +		sa->load_sum = sa->load_avg * divider;
> 
> 		r = removed_util;
> 		sub_positive(&sa->util_avg, r);
> -		sub_positive(&sa->util_sum, r * divider);
> +		sa->util_sum = sa->util_avg * divider;
> 
> 		r = removed_runnable;
> 		sub_positive(&sa->runnable_avg, r);
> -		sub_positive(&sa->runnable_sum, r * divider);
> +		sa->runnable_sum = sa->runnable_avg * divider;
> 
> 		/*
> 		 * removed_runnable is the unweighted version of removed_load so we
> -- 
> 2.32.0
> 



View attachment "5.13-dmesg-log.txt" of type "text/plain" (24685 bytes)