[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <20180402220438.GA15885@linux.vnet.ibm.com>
Date: Mon, 2 Apr 2018 15:04:38 -0700
From: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To: linux-kernel@...r.kernel.org
Cc: frederic@...nel.org, cmetcalf@...lanox.com, cl@...ux.com,
lcapitulino@...hat.com, efault@....de, peterz@...radead.org,
riel@...hat.com, tglx@...utronix.de, kernellwp@...il.com,
mingo@...nel.org
Subject: NO_HZ_FULL and tick running within a reasonable amount of time
Hello!
I am hitting the following on today's mainline under rcutorture, but
only on scenarios built with CONFIG_NO_HZ_FULL=y:
------------------------------------------------------------------------
WARNING: CPU: 0 PID: 7 at /home/paulmck/public_git/linux-rcu/kernel/sched/core.c:3124 sched_tick_remote+0x113/0x120
Modules linked in:
CPU: 0 PID: 7 Comm: kworker/u4:0 Not tainted 4.16.0+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
Workqueue: events_unbound sched_tick_remote
RIP: 0010:sched_tick_remote+0x113/0x120
RSP: 0018:ffff94d540103e20 EFLAGS: 00010002
RAX: 000000012e9bb357 RBX: ffff8f95dfd21840 RCX: 000000000000001f
RDX: 00000000b2d05e00 RSI: 00000000ffffffff RDI: ffff8f95dfd21858
RBP: ffff94d540103e48 R08: 00000000f6499019 R09: 00000000f6499000
R10: 00000000b163d33b R11: ffffffffa5c8c212 R12: ffff8f95dfd25518
R13: ffff8f95de9e4200 R14: 0000000000003402 R15: ffff8f95dfd21858
FS: 0000000000000000(0000) GS:ffff8f95dfc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000a015b40 CR3: 000000001de14000 CR4: 00000000000006f0
Call Trace:
process_one_work+0x1d9/0x6a0
worker_thread+0x42/0x420
kthread+0xf3/0x130
? rescuer_thread+0x340/0x340
? kthread_delayed_work_timer_fn+0x80/0x80
ret_from_fork+0x3a/0x50
Code: ff 48 8b 83 80 0b 00 00 48 85 c0 0f 85 41 ff ff ff e9 45 ff ff ff be ff ff ff ff 4c 89 ff e8 55 44 02 00 85 c0 75 87 0f 0b eb 83 <0f> 0b eb 97 66 0f 1f 84 00 00 00 00 00 41 57 41 56 41 55 41 54
---[ end trace fbdcbe529a8ae799 ]--
------------------------------------------------------------------------
The WARN_ON_ONCE() triggering is this guy:
delta = rq_clock_task(rq) - curr->se.exec_start;
WARN_ON_ONCE(delta > (u64)NSEC_PER_SEC * 3);
But given that ->se.exec_start is zeroed from time to time, for example,
in migrate_task_rq_fair(), I am a bit suspicious of this check.
What am I missing here?
Thanx, Paul
Powered by blists - more mailing lists