Message-ID: <20150902215020.GA21505@lerouge>
Date:	Wed, 2 Sep 2015 23:50:22 +0200
From:	Frederic Weisbecker <fweisbec@...il.com>
To:	Tejun Heo <tj@...nel.org>
Cc:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	linux-kernel@...r.kernel.org, Peter Zijlstra <peterz@...radead.org>
Subject: Re: Warning in irq_work_queue_on()

On Wed, Sep 02, 2015 at 03:44:05PM -0400, Tejun Heo wrote:
> (cc'ing peterz)
> 
> Ooh, this is from irq_work which doesn't have much to do with
> workqueue.  Peter?
> 
> On Mon, Aug 24, 2015 at 05:16:11PM -0700, Paul E. McKenney wrote:
> > Hello, Tejun,
> > 
> > As discussed last week, I am getting an occasional warning out of
> > irq_work_queue_on() WARN_ON_ONCE(cpu_is_offline(cpu)).  The repeat-by
> > seems to be a week or so of rcutorture runs on 16-CPU KVM instances
> > on x86.  So please see below on the off-chance that this is of use.
> > I have also attached a .config file.
> > 
> > Thoughts?
> > 
> > 							Thanx, Paul
> > 
> > ------------------------------------------------------------------------
> > 
> > [  875.702254] ------------[ cut here ]------------
> > [  875.703111] WARNING: CPU: 0 PID: 768 at /home/paulmck/public_git/bisect-linux-rcu/kernel/irq_work.c:69 irq_work_queue_on+0xd4/0x110()
> > [  875.703227] Modules linked in:
> > [  875.703227] CPU: 0 PID: 768 Comm: rcu_torture_rea Tainted: G        W       4.1.0-rc4+ #1
> > [  875.703227] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> > [  875.703227]  ffffffff81baadd8 ffff88001dc5fce8 ffffffff81895418 00000000000000aa
> > [  875.703227]  0000000000000000 ffff88001dc5fd28 ffffffff810517d5 0000000000015bc0
> > [  875.703227]  0000000000000004 0000000000000004 ffff88001fc8f980 ffff88001fc8d500
> > [  875.703227] Call Trace:
> > [  875.703227]  [<ffffffff81895418>] dump_stack+0x45/0x57
> > [  875.703227]  [<ffffffff810517d5>] warn_slowpath_common+0x85/0xc0
> > [  875.703227]  [<ffffffff810518b5>] warn_slowpath_null+0x15/0x20
> > [  875.703227]  [<ffffffff811119a4>] irq_work_queue_on+0xd4/0x110
> > [  875.703227]  [<ffffffff810c2d74>] tick_nohz_full_kick_cpu+0x44/0x50

It happens in the nohz full code, but I'm not sure nohz full is the culprit.

The problem here is that wake_up_nohz_cpu() selects a CPU that is offline.
But this shouldn't happen: either it selects a CPU that is in the domain tree,
where I suspect offline CPUs aren't supposed to be, or it selects the current
CPU. And if that CPU had been offlined, it shouldn't be running a kthread...
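
To make the two cases concrete, here is a minimal user-space sketch of the
reasoning. It is not kernel code. The names model_cpu_state,
model_pick_target() and model_queue_on() are hypothetical, and the selection
rule (first non-idle CPU found in the domain, otherwise the current CPU) is
an assumption about how the timer target is chosen, not a copy of the real
implementation. The only part taken from the report is the check itself:
irq_work_queue_on() warns when the target CPU is offline.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_CPUS 16

/* Hypothetical stand-in for the kernel's per-CPU online/idle state. */
struct model_cpu_state {
    bool online[MODEL_NR_CPUS];
    bool idle[MODEL_NR_CPUS];
};

/*
 * Assumed selection rule: return the first non-idle CPU listed in the
 * domain, or fall back to the current CPU if every listed CPU is idle.
 * Note that nothing here looks at online[]; the argument in the mail is
 * that an offline CPU should never be listed in the domain at all.
 */
static int model_pick_target(const struct model_cpu_state *s, int this_cpu,
                             const int *domain, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int cpu = domain[i];

        if (!s->idle[cpu])
            return cpu;
    }
    return this_cpu;
}

/*
 * Models the condition behind WARN_ON_ONCE(cpu_is_offline(cpu)):
 * returns false exactly when the real function would warn.
 */
static bool model_queue_on(const struct model_cpu_state *s, int cpu)
{
    return s->online[cpu];
}
```

In this model the warning can only fire if a stale domain still lists an
offlined CPU, or if the caller itself runs on an offline CPU. Neither is
supposed to be possible, which is why the trace is surprising.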

> > [  875.703227]  [<ffffffff81076384>] wake_up_nohz_cpu+0xb4/0x100
> > [  875.703227]  [<ffffffff810b1196>] internal_add_timer+0x86/0xa0
> > [  875.703227]  [<ffffffff810b30f1>] mod_timer+0xf1/0x1e0
> > [  875.703227]  [<ffffffff810a63a4>] rcu_torture_reader+0x2a4/0x2e0
> > [  875.703227]  [<ffffffff810a63e0>] ? rcu_torture_reader+0x2e0/0x2e0
> > [  875.703227]  [<ffffffff810a6100>] ? rcutorture_trace_dump.part.10+0x20/0x20
> > [  875.703227]  [<ffffffff8106d75d>] kthread+0xcd/0xf0
> > [  875.703227]  [<ffffffff8106d690>] ? kthread_create_on_node+0x180/0x180
> > [  875.703227]  [<ffffffff8189fb92>] ret_from_fork+0x42/0x70
> > [  875.703227]  [<ffffffff8106d690>] ? kthread_create_on_node+0x180/0x180
> > [  875.703227] ---[ end trace 74175128740d0113 ]---