lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20140618130211.GN7959@redhat.com>
Date:	Wed, 18 Jun 2014 09:02:11 -0400
From:	Don Zickus <dzickus@...hat.com>
To:	David Rientjes <rientjes@...gle.com>
Cc:	LKML <linux-kernel@...r.kernel.org>, akpm@...ux-foundation.org,
	peter@...ensteyn.nl, mhocko@...e.cz
Subject: Re: [PATCH] watchdog: Remove preemption restrictions when restarting
 lockup detector

On Tue, Jun 17, 2014 at 05:59:05PM -0700, David Rientjes wrote:
> On Tue, 17 Jun 2014, Don Zickus wrote:
> 
> > Peter Wu noticed the following splat on his machine when updating
> > /proc/sys/kernel/watchdog_thresh:
> > 
> > [    0.676701] BUG: sleeping function called from invalid context at /tmp/linux/mm/slub.c:965
> > [    0.679396] in_atomic(): 1, irqs_disabled(): 0, pid: 1, name: init
> > [    0.681204] 3 locks held by init/1:
> > [    0.682371]  #0:  (sb_writers#3){.+.+.+}, at: [<ffffffff8117b663>] vfs_write+0x143/0x180
> > [    0.685887]  #1:  (watchdog_proc_mutex){+.+.+.}, at: [<ffffffff810e02d3>] proc_dowatchdog+0x33/0x110
> > [    0.689631]  #2:  (cpu_hotplug.lock){.+.+.+}, at: [<ffffffff810589c2>] get_online_cpus+0x32/0x80
> > [    0.693117] Preemption disabled at:[<ffffffff810e0384>] proc_dowatchdog+0xe4/0x110
> > [    0.695753]
> > [    0.696588] CPU: 0 PID: 1 Comm: init Not tainted 3.16.0-rc1-testing #34
> > [    0.698404] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> > [    0.704622]  ffff88003f250000 ffff88003f1ffd10 ffffffff81624c59 0000000000000000
> > [    0.707749]  ffff88003f1ffd30 ffffffff8108df1d 0000000000000010 ffffffff81c56360
> > [    0.711010]  ffff88003f1ffd78 ffffffff8116a5de ffffffff811159a5 ffffffff810a5f4a
> > [    0.714053] Call Trace:
> > [    0.715015]  [<ffffffff81624c59>] dump_stack+0x4e/0x7a
> > [    0.716619]  [<ffffffff8108df1d>] __might_sleep+0x11d/0x190
> > [    0.718232]  [<ffffffff8116a5de>] kmem_cache_alloc_trace+0x4e/0x1e0
> > [    0.720214]  [<ffffffff811159a5>] ? perf_event_alloc+0x55/0x440
> > [    0.721910]  [<ffffffff810a5f4a>] ? mark_held_locks+0x6a/0x90
> > [    0.723558]  [<ffffffff811159a5>] perf_event_alloc+0x55/0x440
> > [    0.725304]  [<ffffffff810e0050>] ? restart_watchdog_hrtimer+0x50/0x50
> > [    0.727279]  [<ffffffff81116d16>] perf_event_create_kernel_counter+0x26/0xe0
> > [    0.729269]  [<ffffffff810dfe05>] watchdog_nmi_enable+0x75/0x140
> > [    0.730965]  [<ffffffff810dffb3>] update_timers_all_cpus+0x53/0xa0
> > [    0.732953]  [<ffffffff810e0384>] proc_dowatchdog+0xe4/0x110
> > [    0.738408]  [<ffffffff811d8c93>] proc_sys_call_handler+0xb3/0xc0
> > [    0.740266]  [<ffffffff811d8cb4>] proc_sys_write+0x14/0x20
> > [    0.742086]  [<ffffffff8117b5cd>] vfs_write+0xad/0x180
> > [    0.743669]  [<ffffffff810a606d>] ? trace_hardirqs_on_caller+0xfd/0x1c0
> > [    0.745593]  [<ffffffff8117b799>] SyS_write+0x49/0xb0
> > [    0.747101]  [<ffffffff8162fe92>] system_call_fastpath+0x16/0x1b
> > [    0.749069] NMI watchdog: disabled (cpu0): hardware events not enabled
> > 
> > What happened is after updating the watchdog_thresh, the lockup detector is
> > restarted to utilize the new value.  Part of this process involved disabling
> > preemption.  Once preemption was disabled, perf tried to allocate a new event
> > (as part of the restart).  This caused the above BUG_ON as you can't sleep with
> > preemption disabled.
> > 
> > The preemption restriction seemed agressive as we are not doing anything
> > on that particular cpu, but with all the online cpus (which are protected by
> > the get_online_cpus lock).  Remove the restriction and the BUG_ON goes away.
> > 
> > Reported-and-Tested-by: Peter Wu <peter@...ensteyn.nl>
> > Acked-by: Michal Hocko <mhocko@...e.cz>
> > Signed-off-by: Don Zickus <dzickus@...hat.com>
> 
> Acked-by: David Rientjes <rientjes@...gle.com>
> 
> I think this deserves a Cc: stable@...r.kernel.org # 3.13+

Agreed. :-)

Thanks!

Cheers,
Don

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ