Date:	Tue, 13 Feb 2007 15:29:06 -0500
From:	Andy Gospodarek <andy@...yhouse.net>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	netdev@...r.kernel.org,
	Stephen Hemminger <shemminger@...ux-foundation.org>,
	lpiccilli@...re.com.br,
	"bugme-daemon@...nel-bugs.osdl.org" 
	<bugme-daemon@...zilla.kernel.org>
Subject: Re: [Bugme-new] [Bug 7974] New: BUG: scheduling while atomic: swapper/0x10000100/0

On Fri, Feb 09, 2007 at 01:38:02PM -0800, Andrew Morton wrote:
> 
> cond_resched() called from softirq, amongst other problems.
> 
> On Fri, 9 Feb 2007 08:23:44 -0800
> bugme-daemon@...zilla.kernel.org wrote:
> 
> > http://bugzilla.kernel.org/show_bug.cgi?id=7974
> > 
> >            Summary: BUG: scheduling while atomic: swapper/0x10000100/0
> >     Kernel Version: 2.6.20
> >             Status: NEW
> >           Severity: normal
> >              Owner: acme@...ectiva.com.br
> >          Submitter: lpiccilli@...re.com.br
> > 
> > 
> > The machine hangs in normal boot with 2.6.19 and 2.6.20 after network starts. If
> > I boot in single mode and start the services manually, the machine and network
> > works fine, but I see this on dmesg:
> > 
> > 
> > Call Trace:
> >  <IRQ>  [<ffffffff802607d0>] __sched_text_start+0x60/0xb27
> >  [<ffffffff8802c8a1>] :tg3:tg3_setup_copper_phy+0x9d9/0xad9
> >  [<ffffffff8026f8d5>] smp_local_timer_interrupt+0x34/0x5b
> >  [<ffffffff8802d6d4>] :tg3:tg3_setup_phy+0xd33/0xe16
> >  [<ffffffff8027f572>] __cond_resched+0x1c/0x44
> >  [<ffffffff802613aa>] cond_resched+0x2e/0x39
> >  [<ffffffff80209f5a>] kmem_cache_alloc+0x14/0x58
> >  [<ffffffff8022df6f>] __alloc_skb+0x36/0x134
> >  [<ffffffff804a4ea7>] rtmsg_ifinfo+0x28/0xa1
> >  [<ffffffff804a4f81>] rtnetlink_event+0x61/0x68
> >  [<ffffffff8028ad24>] notifier_call_chain+0x24/0x36
> >  [<ffffffff8049ccad>] dev_set_mac_address+0x53/0x66
> >  [<ffffffff88019648>] :bonding:alb_set_slave_mac_addr+0x4b/0x73
> >  [<ffffffff88019b0c>] :bonding:alb_swap_mac_addr+0x8b/0x15c
> >  [<ffffffff804a02e7>] dev_mc_add+0x137/0x148
> >  [<ffffffff88014338>] :bonding:bond_change_active_slave+0x1ea/0x34d
> >  [<ffffffff8801481d>] :bonding:bond_select_active_slave+0xbe/0xf4
> >  [<ffffffff880165e5>] :bonding:bond_mii_monitor+0x401/0x45b
> >  [<ffffffff880161e4>] :bonding:bond_mii_monitor+0x0/0x45b
> >  [<ffffffff80287ed2>] run_timer_softirq+0x142/0x1c0
> >  [<ffffffff80211f6a>] __do_softirq+0x5c/0xd2
> >  [<ffffffff8025ef4c>] call_softirq+0x1c/0x28
> >  [<ffffffff80268e95>] do_softirq+0x2c/0x87
> >  [<ffffffff8026fdd7>] smp_apic_timer_interrupt+0x57/0x6c
> >  [<ffffffff8025691a>] mwait_idle+0x0/0x45
> >  [<ffffffff8025e9f6>] apic_timer_interrupt+0x66/0x70
> >  <EOI>  [<ffffffff8025695c>] mwait_idle+0x42/0x45
> >  [<ffffffff802484e5>] cpu_idle+0x5b/0x7a
> >  [<ffffffff8065b88a>] start_secondary+0x4e0/0x4f6
> > 
> > RTNL: assertion failed at net/core/fib_rules.c (444)
> > 
> > Call Trace:
> >  <IRQ>  [<ffffffff804a8abb>] fib_rules_event+0x3b/0x120
> >  [<ffffffff8028ad24>] notifier_call_chain+0x24/0x36
> >  [<ffffffff8049ccad>] dev_set_mac_address+0x53/0x66
> >  [<ffffffff88019648>] :bonding:alb_set_slave_mac_addr+0x4b/0x73
> >  [<ffffffff88019b0c>] :bonding:alb_swap_mac_addr+0x8b/0x15c
> >  [<ffffffff804a02e7>] dev_mc_add+0x137/0x148
> >  [<ffffffff88014338>] :bonding:bond_change_active_slave+0x1ea/0x34d
> >  [<ffffffff8801481d>] :bonding:bond_select_active_slave+0xbe/0xf4
> >  [<ffffffff880165e5>] :bonding:bond_mii_monitor+0x401/0x45b
> >  [<ffffffff880161e4>] :bonding:bond_mii_monitor+0x0/0x45b
> >  [<ffffffff80287ed2>] run_timer_softirq+0x142/0x1c0
> >  [<ffffffff80211f6a>] __do_softirq+0x5c/0xd2
> >  [<ffffffff8025ef4c>] call_softirq+0x1c/0x28
> >  [<ffffffff80268e95>] do_softirq+0x2c/0x87
> >  [<ffffffff8026fdd7>] smp_apic_timer_interrupt+0x57/0x6c
> >  [<ffffffff8025691a>] mwait_idle+0x0/0x45
> >  [<ffffffff8025e9f6>] apic_timer_interrupt+0x66/0x70
> >  <EOI>  [<ffffffff8025695c>] mwait_idle+0x42/0x45
> >  [<ffffffff802484e5>] cpu_idle+0x5b/0x7a
> >  [<ffffffff8065b88a>] start_secondary+0x4e0/0x4f6
> > 
> > BUG: scheduling while atomic: swapper/0x10000100/0
> > 
> > Call Trace:
> >  <IRQ>  [<ffffffff802607d0>] __sched_text_start+0x60/0xb27
> >  [<ffffffff8026f8d5>] smp_local_timer_interrupt+0x34/0x5b
> >  [<ffffffff8027f572>] __cond_resched+0x1c/0x44
> >  [<ffffffff802613aa>] cond_resched+0x2e/0x39
> >  [<ffffffff80262029>] mutex_lock+0x9/0x18
> >  [<ffffffff8049ea76>] netdev_run_todo+0x16/0x230
> >  [<ffffffff804bcc75>] dst_rcu_free+0x0/0x3f
> >  [<ffffffff804d9c29>] inetdev_event+0x29/0x2d0
> >  [<ffffffff80263138>] _spin_lock_bh+0x9/0x19
> >  [<ffffffff8023d84a>] rt_run_flush+0x92/0xcc
> >  [<ffffffff8028ad24>] notifier_call_chain+0x24/0x36
> >  [<ffffffff8049ccad>] dev_set_mac_address+0x53/0x66
> >  [<ffffffff88019648>] :bonding:alb_set_slave_mac_addr+0x4b/0x73
> >  [<ffffffff88019b0c>] :bonding:alb_swap_mac_addr+0x8b/0x15c
> >  [<ffffffff804a02e7>] dev_mc_add+0x137/0x148
> >  [<ffffffff88014338>] :bonding:bond_change_active_slave+0x1ea/0x34d
> >  [<ffffffff8801481d>] :bonding:bond_select_active_slave+0xbe/0xf4
> >  [<ffffffff880165e5>] :bonding:bond_mii_monitor+0x401/0x45b
> >  [<ffffffff880161e4>] :bonding:bond_mii_monitor+0x0/0x45b
> >  [<ffffffff80287ed2>] run_timer_softirq+0x142/0x1c0
> >  [<ffffffff80211f6a>] __do_softirq+0x5c/0xd2
> >  [<ffffffff8025ef4c>] call_softirq+0x1c/0x28
> >  [<ffffffff80268e95>] do_softirq+0x2c/0x87
> >  [<ffffffff8026fdd7>] smp_apic_timer_interrupt+0x57/0x6c
> >  [<ffffffff8025691a>] mwait_idle+0x0/0x45
> >  [<ffffffff8025e9f6>] apic_timer_interrupt+0x66/0x70
> >  <EOI>  [<ffffffff8025695c>] mwait_idle+0x42/0x45
> >  [<ffffffff802484e5>] cpu_idle+0x5b/0x7a
> >  [<ffffffff8065b88a>] start_secondary+0x4e0/0x4f6
> > 
> > RTNL: assertion failed at net/ipv4/devinet.c (1055)
> > 
> > Call Trace:
> >  <IRQ>  [<ffffffff804d9c48>] inetdev_event+0x48/0x2d0
> >  [<ffffffff80263138>] _spin_lock_bh+0x9/0x19
> >  [<ffffffff8023d84a>] rt_run_flush+0x92/0xcc
> >  [<ffffffff8028ad24>] notifier_call_chain+0x24/0x36
> >  [<ffffffff8049ccad>] dev_set_mac_address+0x53/0x66
> >  [<ffffffff88019648>] :bonding:alb_set_slave_mac_addr+0x4b/0x73
> >  [<ffffffff88019b0c>] :bonding:alb_swap_mac_addr+0x8b/0x15c
> >  [<ffffffff804a02e7>] dev_mc_add+0x137/0x148
> >  [<ffffffff88014338>] :bonding:bond_change_active_slave+0x1ea/0x34d
> >  [<ffffffff8801481d>] :bonding:bond_select_active_slave+0xbe/0xf4
> >  [<ffffffff880165e5>] :bonding:bond_mii_monitor+0x401/0x45b
> >  [<ffffffff880161e4>] :bonding:bond_mii_monitor+0x0/0x45b
> >  [<ffffffff80287ed2>] run_timer_softirq+0x142/0x1c0
> >  [<ffffffff80211f6a>] __do_softirq+0x5c/0xd2
> >  [<ffffffff8025ef4c>] call_softirq+0x1c/0x28
> >  [<ffffffff80268e95>] do_softirq+0x2c/0x87
> >  [<ffffffff8026fdd7>] smp_apic_timer_interrupt+0x57/0x6c
> >  [<ffffffff8025691a>] mwait_idle+0x0/0x45
> >  [<ffffffff8025e9f6>] apic_timer_interrupt+0x66/0x70
> >  <EOI>  [<ffffffff8025695c>] mwait_idle+0x42/0x45
> >  [<ffffffff802484e5>] cpu_idle+0x5b/0x7a
> >  [<ffffffff8065b88a>] start_secondary+0x4e0/0x4f6
> > 
> > BUG: scheduling while atomic: swapper/0x10000100/0
> > 
> > Call Trace:
> >  <IRQ>  [<ffffffff802607d0>] __sched_text_start+0x60/0xb27
> >  [<ffffffff8802c8a1>] :tg3:tg3_setup_copper_phy+0x9d9/0xad9
> >  [<ffffffff8026f8d5>] smp_local_timer_interrupt+0x34/0x5b
> >  [<ffffffff8802d6d4>] :tg3:tg3_setup_phy+0xd33/0xe16
> >  [<ffffffff8027f572>] __cond_resched+0x1c/0x44
> >  [<ffffffff802613aa>] cond_resched+0x2e/0x39
> >  [<ffffffff80209f5a>] kmem_cache_alloc+0x14/0x58
> >  [<ffffffff8022df6f>] __alloc_skb+0x36/0x134
> >  [<ffffffff804a4ea7>] rtmsg_ifinfo+0x28/0xa1
> >  [<ffffffff804a4f81>] rtnetlink_event+0x61/0x68
> >  [<ffffffff8028ad24>] notifier_call_chain+0x24/0x36
> >  [<ffffffff8049ccad>] dev_set_mac_address+0x53/0x66
> >  [<ffffffff88019648>] :bonding:alb_set_slave_mac_addr+0x4b/0x73
> >  [<ffffffff88019b1e>] :bonding:alb_swap_mac_addr+0x9d/0x15c
> >  [<ffffffff804a02e7>] dev_mc_add+0x137/0x148
> >  [<ffffffff88014338>] :bonding:bond_change_active_slave+0x1ea/0x34d
> >  [<ffffffff8801481d>] :bonding:bond_select_active_slave+0xbe/0xf4
> >  [<ffffffff880165e5>] :bonding:bond_mii_monitor+0x401/0x45b
> >  [<ffffffff880161e4>] :bonding:bond_mii_monitor+0x0/0x45b
> >  [<ffffffff80287ed2>] run_timer_softirq+0x142/0x1c0
> >  [<ffffffff80211f6a>] __do_softirq+0x5c/0xd2
> >  [<ffffffff8025ef4c>] call_softirq+0x1c/0x28
> >  [<ffffffff80268e95>] do_softirq+0x2c/0x87
> >  [<ffffffff8026fdd7>] smp_apic_timer_interrupt+0x57/0x6c
> >  [<ffffffff8025691a>] mwait_idle+0x0/0x45
> >  [<ffffffff8025e9f6>] apic_timer_interrupt+0x66/0x70
> >  <EOI>  [<ffffffff8025695c>] mwait_idle+0x42/0x45
> >  [<ffffffff802484e5>] cpu_idle+0x5b/0x7a
> >  [<ffffffff8065b88a>] start_secondary+0x4e0/0x4f6
> > 
> > RTNL: assertion failed at net/core/fib_rules.c (444)
> > 
> > Call Trace:
> >  <IRQ>  [<ffffffff804a8abb>] fib_rules_event+0x3b/0x120
> >  [<ffffffff8028ad24>] notifier_call_chain+0x24/0x36
> >  [<ffffffff8049ccad>] dev_set_mac_address+0x53/0x66
> >  [<ffffffff88019648>] :bonding:alb_set_slave_mac_addr+0x4b/0x73
> >  [<ffffffff88019b1e>] :bonding:alb_swap_mac_addr+0x9d/0x15c
> >  [<ffffffff804a02e7>] dev_mc_add+0x137/0x148
> >  [<ffffffff88014338>] :bonding:bond_change_active_slave+0x1ea/0x34d
> >  [<ffffffff8801481d>] :bonding:bond_select_active_slave+0xbe/0xf4
> >  [<ffffffff880165e5>] :bonding:bond_mii_monitor+0x401/0x45b
> >  [<ffffffff880161e4>] :bonding:bond_mii_monitor+0x0/0x45b
> >  [<ffffffff80287ed2>] run_timer_softirq+0x142/0x1c0
> >  [<ffffffff80211f6a>] __do_softirq+0x5c/0xd2
> >  [<ffffffff8025ef4c>] call_softirq+0x1c/0x28
> >  [<ffffffff80268e95>] do_softirq+0x2c/0x87
> >  [<ffffffff8026fdd7>] smp_apic_timer_interrupt+0x57/0x6c
> >  [<ffffffff8025691a>] mwait_idle+0x0/0x45
> >  [<ffffffff8025e9f6>] apic_timer_interrupt+0x66/0x70
> >  <EOI>  [<ffffffff8025695c>] mwait_idle+0x42/0x45
> >  [<ffffffff802484e5>] cpu_idle+0x5b/0x7a
> >  [<ffffffff8065b88a>] start_secondary+0x4e0/0x4f6
> > 
> > BUG: scheduling while atomic: swapper/0x10000100/0
> > 
> > Call Trace:
> >  <IRQ>  [<ffffffff802607d0>] __sched_text_start+0x60/0xb27
> >  [<ffffffff8026f8d5>] smp_local_timer_interrupt+0x34/0x5b
> >  [<ffffffff8027f572>] __cond_resched+0x1c/0x44
> >  [<ffffffff802613aa>] cond_resched+0x2e/0x39
> >  [<ffffffff80262029>] mutex_lock+0x9/0x18
> >  [<ffffffff8049ea76>] netdev_run_todo+0x16/0x230
> >  [<ffffffff804bcc75>] dst_rcu_free+0x0/0x3f
> >  [<ffffffff804d9c29>] inetdev_event+0x29/0x2d0
> >  [<ffffffff80263138>] _spin_lock_bh+0x9/0x19
> >  [<ffffffff8023d84a>] rt_run_flush+0x92/0xcc
> >  [<ffffffff8028ad24>] notifier_call_chain+0x24/0x36
> >  [<ffffffff8049ccad>] dev_set_mac_address+0x53/0x66
> >  [<ffffffff88019648>] :bonding:alb_set_slave_mac_addr+0x4b/0x73
> >  [<ffffffff88019b1e>] :bonding:alb_swap_mac_addr+0x9d/0x15c
> >  [<ffffffff804a02e7>] dev_mc_add+0x137/0x148
> >  [<ffffffff88014338>] :bonding:bond_change_active_slave+0x1ea/0x34d
> >  [<ffffffff8801481d>] :bonding:bond_select_active_slave+0xbe/0xf4
> >  [<ffffffff880165e5>] :bonding:bond_mii_monitor+0x401/0x45b
> >  [<ffffffff880161e4>] :bonding:bond_mii_monitor+0x0/0x45b
> >  [<ffffffff80287ed2>] run_timer_softirq+0x142/0x1c0
> >  [<ffffffff80211f6a>] __do_softirq+0x5c/0xd2
> >  [<ffffffff8025ef4c>] call_softirq+0x1c/0x28
> >  [<ffffffff80268e95>] do_softirq+0x2c/0x87
> >  [<ffffffff8026fdd7>] smp_apic_timer_interrupt+0x57/0x6c
> >  [<ffffffff8025691a>] mwait_idle+0x0/0x45
> >  [<ffffffff8025e9f6>] apic_timer_interrupt+0x66/0x70
> >  <EOI>  [<ffffffff8025695c>] mwait_idle+0x42/0x45
> >  [<ffffffff802484e5>] cpu_idle+0x5b/0x7a
> >  [<ffffffff8065b88a>] start_secondary+0x4e0/0x4f6
> > 
> > RTNL: assertion failed at net/ipv4/devinet.c (1055)
> > 
> > Call Trace:
> >  <IRQ>  [<ffffffff804d9c48>] inetdev_event+0x48/0x2d0
> >  [<ffffffff80263138>] _spin_lock_bh+0x9/0x19
> >  [<ffffffff8023d84a>] rt_run_flush+0x92/0xcc
> >  [<ffffffff8028ad24>] notifier_call_chain+0x24/0x36
> >  [<ffffffff8049ccad>] dev_set_mac_address+0x53/0x66
> >  [<ffffffff88019648>] :bonding:alb_set_slave_mac_addr+0x4b/0x73
> >  [<ffffffff88019b1e>] :bonding:alb_swap_mac_addr+0x9d/0x15c
> >  [<ffffffff804a02e7>] dev_mc_add+0x137/0x148
> >  [<ffffffff88014338>] :bonding:bond_change_active_slave+0x1ea/0x34d
> >  [<ffffffff8801481d>] :bonding:bond_select_active_slave+0xbe/0xf4
> >  [<ffffffff880165e5>] :bonding:bond_mii_monitor+0x401/0x45b
> >  [<ffffffff880161e4>] :bonding:bond_mii_monitor+0x0/0x45b
> >  [<ffffffff80287ed2>] run_timer_softirq+0x142/0x1c0
> >  [<ffffffff80211f6a>] __do_softirq+0x5c/0xd2
> >  [<ffffffff8025ef4c>] call_softirq+0x1c/0x28
> >  [<ffffffff80268e95>] do_softirq+0x2c/0x87
> >  [<ffffffff8026fdd7>] smp_apic_timer_interrupt+0x57/0x6c
> >  [<ffffffff8025691a>] mwait_idle+0x0/0x45
> >  [<ffffffff8025e9f6>] apic_timer_interrupt+0x66/0x70
> >  <EOI>  [<ffffffff8025695c>] mwait_idle+0x42/0x45
> >  [<ffffffff802484e5>] cpu_idle+0x5b/0x7a
> >  [<ffffffff8065b88a>] start_secondary+0x4e0/0x4f6
> > 
> > bonding: bond0: first active interface up!
> > NET: Registered protocol family 10
> > lo: Disabled Privacy Extensions
> > ADDRCONF(NETDEV_UP): eth1: link is not ready
> > bond0: no IPv6 routers present
> > eth0: no IPv6 routers present
> > Installing knfsd (copyright (C) 1996 okir@...ad.swb.de).
> > NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
> > NFSD: starting 90-second grace period
> > 

I've been working off and on for a little while to resolve these issues
and even posted a patch not long ago to address some of them by removing
the timers and using workqueues instead.  That resolved quite a few of
the issues with bonding, since the code no longer runs in an atomic
context and can take locks more easily.
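
To illustrate the timer-versus-workqueue point, here is a rough userspace
model in C.  It is not the posted patch and it does not use any kernel
API; every name in it (model_queue_work, model_run_worker, and so on) is
made up for the sketch.  It only assumes that the failover step needs a
lock that may sleep, and shows why running that step from a deferred
worker avoids the "scheduling while atomic" condition that running it
directly from a timer callback triggers.

```c
/* Userspace model only: none of these names are kernel APIs. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_WORK 8

typedef void (*work_fn)(void);

static bool in_atomic_ctx;     /* models running in timer/softirq context */
static int  sleep_violations;  /* models "scheduling while atomic" reports */
static int  failovers_done;    /* how many times the failover step ran */

static work_fn work_queue[MAX_WORK];
static size_t  work_head, work_tail;

/* Models any call that may sleep, such as taking a mutex. */
static void model_might_sleep(void)
{
    if (in_atomic_ctx)
        sleep_violations++;
}

/* Models the failover step, which needs a lock that may sleep. */
static void model_change_active_slave(void)
{
    model_might_sleep();
    failovers_done++;
}

/* Queue work for later. It never sleeps, so atomic callers may use it. */
static bool model_queue_work(work_fn fn)
{
    if (work_tail - work_head == MAX_WORK)
        return false;          /* queue full, caller must retry later */
    work_queue[work_tail++ % MAX_WORK] = fn;
    return true;
}

/* Models the worker thread: drains the queue outside atomic context. */
static void model_run_worker(void)
{
    assert(work_tail - work_head <= MAX_WORK);
    in_atomic_ctx = false;
    while (work_head != work_tail)
        work_queue[work_head++ % MAX_WORK]();
}

/* Timer-style monitor that does the failover directly (the buggy shape). */
static void model_timer_monitor_direct(void)
{
    in_atomic_ctx = true;
    model_change_active_slave();
    in_atomic_ctx = false;
}

/* Timer-style monitor that only defers the failover to the worker. */
static void model_timer_monitor_deferred(void)
{
    in_atomic_ctx = true;
    (void)model_queue_work(model_change_active_slave);
    in_atomic_ctx = false;
}
```

Calling model_timer_monitor_direct() records one violation, while
model_timer_monitor_deferred() followed by model_run_worker() performs
the same failover step without recording any.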

On the side I've also been working to keep the timers and take the rtnl
lock in the correct place to avoid messages like these:

> > RTNL: assertion failed at net/ipv4/devinet.c (1055)
> > 
> > Call Trace:
> >  <IRQ>  [<ffffffff804d9c48>] inetdev_event+0x48/0x2d0
> >  [<ffffffff80263138>] _spin_lock_bh+0x9/0x19
> >  [<ffffffff8023d84a>] rt_run_flush+0x92/0xcc

but I have recently been getting a panic on one of my systems and need
a serial cable to capture the full panic message, so I haven't debugged
that yet.

