linux-kernel - Re: [PATCH] timers/nohz: Update nohz load even if tick already stopped

Message-ID: <813ed21938aa47b15f35f8834ffd98ad4dd27771.camel@redhat.com>
Date:   Fri, 01 Nov 2019 00:11:09 -0500
From:   Scott Wood <swood@...hat.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Frederic Weisbecker <frederic@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...nel.org>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] timers/nohz: Update nohz load even if tick already
 stopped

On Wed, 2019-10-30 at 14:31 +0100, Peter Zijlstra wrote:
> On Wed, Oct 30, 2019 at 03:48:26AM -0500, Scott Wood wrote:
> > On Tue, 2019-10-29 at 11:05 +0100, Peter Zijlstra wrote:
> > > @@ -3686,6 +3688,7 @@ static void sched_tick_remote(struct work_struct *work)
> > >  	curr->sched_class->task_tick(rq, curr, 0);
> > >  
> > >  out_unlock:
> > > +	calc_load_nohz_remote(cpu);
> > >  	rq_unlock_irq(rq, &rf);
> > 
> > This gets skipped when the cpu is idle, so it still misses the update.
> 
> Oh argh! That's a bit radical of the remote tick. The normal tick runs
> just fine on idle CPUs, so let's mirror that.
> 
> How's this then?
> 
> ---
> diff --git a/include/linux/sched/nohz.h b/include/linux/sched/nohz.h
> index 1abe91ff6e4a..6d67e9a5af6b 100644
> --- a/include/linux/sched/nohz.h
> +++ b/include/linux/sched/nohz.h
> @@ -15,9 +15,11 @@ static inline void nohz_balance_enter_idle(int cpu) { }
>  
>  #ifdef CONFIG_NO_HZ_COMMON
>  void calc_load_nohz_start(void);
> +void calc_load_nohz_remote(struct rq *rq);
>  void calc_load_nohz_stop(void);
>  #else
>  static inline void calc_load_nohz_start(void) { }
> +static inline void calc_load_nohz_remote(struct rq *rq) { }
>  static inline void calc_load_nohz_stop(void) { }
>  #endif /* CONFIG_NO_HZ_COMMON */
>  
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index eb42b71faab9..d02d1b8f40af 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -3660,21 +3660,17 @@ static void sched_tick_remote(struct work_struct *work)
>  	u64 delta;
>  	int os;
>  
> -	/*
> -	 * Handle the tick only if it appears the remote CPU is running in full
> -	 * dynticks mode. The check is racy by nature, but missing a tick or
> -	 * having one too much is no big deal because the scheduler tick updates
> -	 * statistics and checks timeslices in a time-independent way, regardless
> -	 * of when exactly it is running.
> -	 */
> -	if (idle_cpu(cpu) || !tick_nohz_tick_stopped_cpu(cpu))
> +	if (!tick_nohz_tick_stopped_cpu(cpu))
>  		goto out_requeue;
>  
>  	rq_lock_irq(rq, &rf);
> -	curr = rq->curr;
> -	if (is_idle_task(curr) || cpu_is_offline(cpu))
> +	/*
> +	 * We must not call calc_load_nohz_remote() when not in NOHZ mode.
> +	 */
> +	if (cpu_is_offline(cpu) || !tick_nohz_tick_stopped(cpu))
>  		goto out_unlock;

This needs to be tick_nohz_tick_stopped_cpu(cpu); tick_nohz_tick_stopped()
takes no argument and only reports on the local CPU.

After fixing that, I get:

[    7.439068] WARNING: CPU: 20 PID: 7 at /home/root/linux/kernel/sched/core.c:3681 sched_tick_remote+0x132/0x150
[    7.439068] Modules linked in:
[    7.439068] CPU: 20 PID: 7 Comm: kworker/u209:0 Not tainted 5.4.0-rc5.std+ #15
[    7.439068] Hardware name: Intel Corporation S2600BT/S2600BT, BIOS SE5C620.86B.01.00.0763.022420181017 02/24/2018
[    7.439068] Workqueue: events_unbound sched_tick_remote
[    7.446308] pci_bus 0000:9f: resource 1 [mem 0xe6a00000-0xe6bfffff]
[    7.455068] RIP: 0010:sched_tick_remote+0x132/0x150
[    7.455068] Code: 00 e9 b2 fd fe ff 0f 0b e9 46 ff ff ff 83 f8 02 89 c2 74 d3 8d 4a ff 89 d0 f0 0f b1 0e 0f 94 c1 84 c9 0f 85 23 ff ff ff eb e3 <0f> 0b eb 9a 80 3d 9c d6 2c 01 00 0f 1f 00 0f 85 71 ff ff ff e8 05
[    7.455068] RSP: 0000:ffffc9000c683e58 EFLAGS: 00010002
[    7.455068] RAX: 00000000e7061da1 RBX: ffff8897e026e688 RCX: 0000000181f93295
[    7.455068] RDX: 00000000b2d05e00 RSI: ffff8897e0269e50 RDI: 0000000000000004
[    7.455068] RBP: ffff8881004c0000 R08: ffff8e8191a2b423 R09: 0000000000000000
[    7.455068] R10: 0000000000000010 R11: 0000000000000018 R12: ffff8897e0269240
[    7.455068] R13: ffff8897e0240000 R14: 0000000000000000 R15: ffff888107edc2e8
[    7.455068] FS:  0000000000000000(0000) GS:ffff8897e0700000(0000) knlGS:0000000000000000
[    7.455068] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    7.455068] CR2: 0000000000000000 CR3: 000000303e60a001 CR4: 00000000007606e0
[    7.455068] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    7.459338] pci_bus 0000:9f: resource 2 [mem 0x3a0000000000-0x3a00001fffff 64bit pref]
[    7.465068] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[    7.465068] PKRU: 55555554
[    7.465068] Call Trace:
[    7.465068]  process_one_work+0x165/0x3c0
[    7.465068]  worker_thread+0x46/0x3d0
[    7.465068]  kthread+0xf8/0x130
[    7.465068]  ? process_one_work+0x3c0/0x3c0
[    7.476788] pci_bus 0000:a0: resource 1 [mem 0xe6c00000-0xe6dfffff]
[    7.465068]  ? kthread_bind+0x10/0x10
[    7.465068]  ret_from_fork+0x35/0x40

-Scott