Date:	Thu, 22 Sep 2011 06:46:22 +0200
From:	Mike Galbraith <efault@....de>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	linux-rt-users <linux-rt-users@...r.kernel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	LKML <linux-kernel@...r.kernel.org>,
	Oleg Nesterov <oleg@...hat.com>,
	Miklos Szeredi <miklos@...redi.hu>, mingo <mingo@...hat.com>
Subject: Re: rt14: strace ->  migrate_disable_atomic imbalance

On Wed, 2011-09-21 at 20:50 +0200, Peter Zijlstra wrote:
> On Wed, 2011-09-21 at 19:01 +0200, Peter Zijlstra wrote:
> > On Wed, 2011-09-21 at 12:17 +0200, Mike Galbraith wrote:
> > > [  144.212272] ------------[ cut here ]------------
> > > [  144.212280] WARNING: at kernel/sched.c:6152 migrate_disable+0x1b6/0x200()
> > > [  144.212282] Hardware name: MS-7502
> > > [  144.212283] Modules linked in: snd_pcm_oss snd_mixer_oss snd_seq snd_seq_device edd nfsd lockd parport_pc parport nfs_acl auth_rpcgss sunrpc bridge ipv6 stp cpufreq_conservative microcode cpufreq_ondemand cpufreq_userspace cpufreq_powersave acpi_cpufreq mperf nls_iso8859_1 nls_cp437 vfat fat fuse ext3 jbd dm_mod usbmouse usb_storage usbhid snd_hda_codec_realtek usb_libusual uas sr_mod cdrom hid snd_hda_intel e1000e snd_hda_codec kvm_intel snd_hwdep sg snd_pcm kvm i2c_i801 snd_timer snd firewire_ohci firewire_core soundcore snd_page_alloc crc_itu_t button ext4 mbcache jbd2 crc16 uhci_hcd sd_mod ehci_hcd usbcore rtc_cmos ahci libahci libata scsi_mod fan processor thermal
> > > [  144.212317] Pid: 6215, comm: strace Not tainted 3.0.4-rt14 #2052
> > > [  144.212319] Call Trace:
> > > [  144.212323]  [<ffffffff8104662f>] warn_slowpath_common+0x7f/0xc0
> > > [  144.212326]  [<ffffffff8104668a>] warn_slowpath_null+0x1a/0x20
> > > [  144.212328]  [<ffffffff8103f606>] migrate_disable+0x1b6/0x200
> > > [  144.212331]  [<ffffffff8105a2a8>] ptrace_stop+0x128/0x240
> > > [  144.212334]  [<ffffffff81057b9b>] ? recalc_sigpending+0x1b/0x50
> > > [  144.212337]  [<ffffffff8105b6f1>] get_signal_to_deliver+0x211/0x530
> > > [  144.212340]  [<ffffffff81001835>] do_signal+0x75/0x7a0
> > > [  144.212342]  [<ffffffff8105ae68>] ? kill_pid_info+0x58/0x80
> > > [  144.212344]  [<ffffffff8105c34c>] ? sys_kill+0xac/0x1e0
> > > [  144.212347]  [<ffffffff81001fe5>] do_notify_resume+0x65/0x80
> > > [  144.212350]  [<ffffffff8135978b>] int_signal+0x12/0x17
> > > [  144.212352] ---[ end trace 0000000000000002 ]---
> > 
> > 
> > Right, that's because of 
> > 53da1d9456fe7f87a920a78fdbdcf1225d197cb7, I think we simply want a full
> > revert of that for -rt.
> 
> This also made me stare at the trainwreck called wait_task_inactive(),
> how about something like the below, it survives a boot and simple
> strace.

There's a missing hunklet, but...

@@ -8325,9 +8290,7 @@ void __init sched_init(void)
 
 	set_load_weight(&init_task);
 
-#ifdef CONFIG_PREEMPT_NOTIFIERS
 	INIT_HLIST_HEAD(&init_task.preempt_notifiers);
-#endif
 
 #ifdef CONFIG_SMP
 	open_softirq(SCHED_SOFTIRQ, run_rebalance_domains);

...a perturbation (100% userspace hog) measurement proggy and a jitter
measurement proggy pinned to the same CPU make for a 100% repeatable boom.

Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 3
Pid: 6226, comm: pert Not tainted 3.0.4-rt14 #2053
Call Trace:
 <NMI>  [<ffffffff81355f00>] panic+0xa0/0x1a8
 [<ffffffff8108fe47>] watchdog_overflow_callback+0xe7/0xf0
 [<ffffffff810c1c7c>] __perf_event_overflow+0x9c/0x250
 [<ffffffff810c2734>] perf_event_overflow+0x14/0x20
 [<ffffffff81014c7c>] intel_pmu_handle_irq+0x21c/0x440
 [<ffffffff81010fb9>] perf_event_nmi_handler+0x39/0xc0
 [<ffffffff8106f42c>] notifier_call_chain+0x4c/0x70
 [<ffffffff8106fa6a>] __atomic_notifier_call_chain+0x4a/0x70
 [<ffffffff8106faa6>] atomic_notifier_call_chain+0x16/0x20
 [<ffffffff8106fc2e>] notify_die+0x2e/0x30
 [<ffffffff81002c8a>] do_nmi+0xaa/0x240
 [<ffffffff813592ea>] nmi+0x1a/0x20
 <<EOE>> <0>Rebooting in 60 seconds..[    0.000000]
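
For anyone wanting to set up a similar load, here is a minimal sketch of a
CPU-pinned userspace hog. It is not the "pert" program from the trace above,
and the helper names pin_to_cpu() and burn_cpu_ms() are made up for
illustration. It assumes Linux with glibc, and only shows the pinning and
busy-loop part, not any jitter measurement.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <time.h>

/*
 * Hypothetical helper: pin the calling thread to a single CPU.
 * Returns 0 on success, -1 on error (errno set by sched_setaffinity).
 */
int pin_to_cpu(int cpu)
{
	cpu_set_t set;

	assert(cpu >= 0 && cpu < CPU_SETSIZE);
	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	/* pid 0 means "the calling thread" */
	return sched_setaffinity(0, sizeof(set), &set);
}

/*
 * Hypothetical helper: spin in a busy loop for roughly ms milliseconds
 * without sleeping. Returns the number of loop iterations performed.
 */
unsigned long burn_cpu_ms(long ms)
{
	struct timespec start, now;
	unsigned long spins = 0;
	long elapsed_ms;

	clock_gettime(CLOCK_MONOTONIC, &start);
	do {
		spins++;
		clock_gettime(CLOCK_MONOTONIC, &now);
		elapsed_ms = (now.tv_sec - start.tv_sec) * 1000L +
			     (now.tv_nsec - start.tv_nsec) / 1000000L;
	} while (elapsed_ms < ms);

	return spins;
}
```

A main() that calls pin_to_cpu(3) and then burn_cpu_ms() in a loop would give
a hog on cpu 3; a second process pinned the same way would share that CPU
with it. Whether this alone triggers the lockup on a given kernel is untested.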

