linux-kernel - Re: Linux 2.6.21-rc5

Message-ID: <20070326083146.GA11666@elte.hu>
Date:	Mon, 26 Mar 2007 10:31:46 +0200
From:	Ingo Molnar <mingo@...e.hu>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Ayaz Abdulla <aabdulla@...dia.com>,
	Jeff Garzik <jeff@...zik.org>, Adrian Bunk <bunk@...sta.de>,
	Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: Linux 2.6.21-rc5


* Linus Torvalds <torvalds@...ux-foundation.org> wrote:

> There's various fixes here, ranging from some architecture updates 
> (ia64, ARM, MIPS, SH, Sparc64) to KVM, networking and network drivers.

here's a new v2.6.20 -> v2.6.21 forcedeth.c regression:

in the last week or so i've been seeing sporadic under-load forcedeth.c 
crashes (see the full oops further below):

 eth1: too many iterations (6) in nv_nic_irq.
 Unable to handle kernel NULL pointer dereference at 0000000000000088 RIP: 
 [<ffffffff80404587>] nv_tx_done+0xf4/0x1cf

this is line 1906 of drivers/net/forcedeth.c:

    np->stats.tx_bytes += np->get_tx_ctx->skb->len;

struct sk_buff's len field is at offset 0x88, so np->get_tx_ctx->skb is 
NULL. That is an 'impossible' scenario for tx descriptors here - the tx 
ring descriptors are always set up with a valid skb (and a valid dma 
address), and their completion is serialized via np->lock.
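
The address arithmetic behind that deduction can be sketched in user space. The struct below is a hypothetical stand-in, padded so that 'len' lands at 0x88; it is not the real struct sk_buff layout, and the helper name is made up for illustration. It only shows that reading a field through a NULL base pointer touches an address equal to the field's offset, which is why a fault at 0x88 points at a NULL skb:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct sk_buff: only the position of 'len'
 * matters here. The padding size is an assumption, chosen so that 'len'
 * sits at 0x88 like the fault address in the oops. */
struct fake_skb {
	unsigned char pad[0x88];
	unsigned int len;
};

/* Address the CPU would touch when evaluating skb->len. */
static uintptr_t len_access_addr(const struct fake_skb *skb)
{
	return (uintptr_t)skb + offsetof(struct fake_skb, len);
}
```

Calling len_access_addr(NULL) yields 0x88 on the usual platforms where a null pointer converts to 0, matching CR2 in the oops, where RAX (presumably holding the skb pointer) is also 0.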

these crashes are almost instant on the .21-rc5-rt kernel, but extremely 
sporadic on the upstream kernel, where they need very high networking 
loads to trigger. Today i found a good way to trigger it almost instantly on 
upstream kernels too: apply the debug patch attached further below and 
do:

	echo 100 > /proc/sys/kernel/panic

that will inject 100 artificial 'too many iterations' failures and 
provoke a TX timeout - and that TX timeout will crash. (i've used a 
dual-core Athlon64 system in this test)
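
The injection trick works because /proc/sys/kernel/panic is the sysctl view of the kernel's panic_timeout variable, which the debug patch borrows as a countdown. A minimal user-space sketch of that pattern follows; inject_budget, MAX_INTERRUPT_WORK and too_many_iterations() are hypothetical names for illustration, and the limit of 5 is an assumption that is merely consistent with the "(6)" in the log:

```c
#include <stdbool.h>

/* Assumed limit; the driver uses its max_interrupt_work variable. */
#define MAX_INTERRUPT_WORK 5

/* Plays the role of panic_timeout: number of failures still to inject. */
static int inject_budget;

/* Models the limit check inside the irq loop. Returns true when the
 * "too many iterations" path would be taken for loop counter 'i'. */
static bool too_many_iterations(int i)
{
	if (inject_budget > 0) {
		inject_budget--;
		i = MAX_INTERRUPT_WORK + 1;	/* force the failure path */
	}
	return i > MAX_INTERRUPT_WORK;
}
```

With inject_budget set to 100, the next 100 calls report a failure regardless of the real loop counter, after which the check behaves normally again.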

my first quick guess was to extend np->lock locking to the whole of 
nv_start_xmit/nv_start_xmit_optimized - while that appeared to make the 
crash a bit less likely, it did not prevent it. So there must be some 
other, more fundamental problem left as well. At first glance the SMP 
locking looks OK, so maybe the ring indices are messed up somehow and we 
got into a 'ring head bites the tail' scenario?
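
To make the 'ring head bites the tail' idea concrete, here is a toy model of a TX ring. It is a sketch under loud assumptions: the struct, field and function names are invented for illustration and do not mirror forcedeth's actual data structures, and nothing in this report confirms that this is what happens in the driver. It only shows how a producer that overruns the consumer can leave the completion path looking at a slot whose skb is already NULL:

```c
#include <stddef.h>

#define RING_SIZE 4	/* hypothetical; real rings are much larger */

struct slot {
	void *skb;
};

struct ring {
	struct slot slots[RING_SIZE];
	unsigned int put;	/* next slot the xmit path fills */
	unsigned int get;	/* next slot the completion path reaps */
	unsigned int in_flight;	/* filled but not yet reaped */
};

/* Queue a packet. With check_full == 0 this models a driver that has
 * lost track of ring occupancy and lets the head run into the tail. */
static int ring_put(struct ring *r, void *skb, int check_full)
{
	if (check_full && r->in_flight == RING_SIZE)
		return -1;	/* ring full: caller must stop the queue */
	r->slots[r->put].skb = skb;
	r->put = (r->put + 1) % RING_SIZE;
	r->in_flight++;
	return 0;
}

/* Reap one completed packet and clear its slot, as a driver would
 * after freeing the skb. Returns whatever the slot held. */
static void *ring_get(struct ring *r)
{
	void *skb = r->slots[r->get].skb;

	r->slots[r->get].skb = NULL;
	r->get = (r->get + 1) % RING_SIZE;
	r->in_flight--;
	return skb;
}
```

If five packets are queued into this four-slot ring without the full check, the fifth overwrites slot 0 and in_flight reaches 5. The completion side then reaps four valid entries, wraps around, and its fifth ring_get() returns NULL from the slot it has already cleared, which is the kind of pointer that nv_tx_done would dereference.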

i can provide more info if needed.

	Ingo

-------------->
eth1: too many iterations (6) in nv_nic_irq.
Unable to handle kernel NULL pointer dereference at 0000000000000088 RIP: 
 [<ffffffff80404587>] nv_tx_done+0xf4/0x1cf
PGD 34d03067 PUD 34d02067 PMD 0 
Oops: 0000 [1] PREEMPT SMP 
CPU 1 
Modules linked in:
Pid: 0, comm: swapper Not tainted 2.6.21-rc5 #8
RIP: 0010:[<ffffffff80404587>]  [<ffffffff80404587>] nv_tx_done+0xf4/0x1cf
RSP: 0018:ffff81003ff6be40  EFLAGS: 00010206
RAX: 0000000000000000 RBX: ffff810002e26700 RCX: 0000000000000001
RDX: 0000000000000042 RSI: 000000003ef00cbe RDI: ffff81003fbeb070
RBP: ffff81003ff6be60 R08: ffff810002e26a00 R09: 0000000000000003
R10: ffff81003ff4e100 R11: ffff810001e283f8 R12: 000000003ef00cbe
R13: ffff810002e26000 R14: ffff810002e28fc0 R15: 0000000000000000
FS:  00002b6cb57f1db0(0000) GS:ffff81003ff4ad40(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000088 CR3: 0000000034c87000 CR4: 00000000000006e0
Process swapper (pid: 0, threadinfo ffff81003ff64000, task ffff81003ff4e100)
Stack:  ffff810002e26700 0000000000000032 ffffc2000001a000 ffff810002e26000
 ffff81003ff6bea0 ffffffff80406dae ffff810002e26700 ffff810002e26700
 ffff810002e26000 00000000000000ff ffffc2000001a000 ffffffff80749080
Call Trace:
 <IRQ>  [<ffffffff80406dae>] nv_nic_irq+0x76/0x261
 [<ffffffff8040961e>] nv_do_nic_poll+0x200/0x284
 [<ffffffff8040941e>] nv_do_nic_poll+0x0/0x284
 [<ffffffff80241995>] run_timer_softirq+0x167/0x1dd
 [<ffffffff8023de45>] __do_softirq+0x5b/0xc9
 [<ffffffff8020af0c>] call_softirq+0x1c/0x28
 [<ffffffff8020c2b4>] do_softirq+0x31/0x84
 [<ffffffff8023db16>] irq_exit+0x3f/0x50
 [<ffffffff802190c2>] smp_apic_timer_interrupt+0x49/0x5b
 [<ffffffff802087fb>] default_idle+0x0/0x44
 [<ffffffff8020a9b6>] apic_timer_interrupt+0x66/0x70
 <EOI>  [<ffffffff8020882a>] default_idle+0x2f/0x44
 [<ffffffff8020804c>] enter_idle+0x22/0x24
 [<ffffffff802088d0>] cpu_idle+0x91/0xd4
 [<ffffffff80218572>] start_secondary+0x2e3/0x2f5

---
 drivers/net/forcedeth.c |   20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

Index: linux/drivers/net/forcedeth.c
===================================================================
--- linux.orig/drivers/net/forcedeth.c
+++ linux/drivers/net/forcedeth.c
@@ -2908,6 +2908,10 @@ static irqreturn_t nv_nic_irq(int foo, v
 			spin_unlock(&np->lock);
 			break;
 		}
+		if (panic_timeout > 0) {
+			panic_timeout--;
+			i = max_interrupt_work+1;
+		}
 		if (unlikely(i > max_interrupt_work)) {
 			spin_lock(&np->lock);
 			/* disable interrupts on the nic */
@@ -3026,6 +3030,10 @@ static irqreturn_t nv_nic_irq_optimized(
 			break;
 		}
 
+		if (panic_timeout > 0) {
+			panic_timeout--;
+			i = max_interrupt_work+1;
+		}
 		if (unlikely(i > max_interrupt_work)) {
 			spin_lock(&np->lock);
 			/* disable interrupts on the nic */
@@ -3076,6 +3084,10 @@ static irqreturn_t nv_nic_irq_tx(int foo
 			dprintk(KERN_DEBUG "%s: received irq with events 0x%x. Probably TX fail.\n",
 						dev->name, events);
 		}
+		if (panic_timeout > 0) {
+			panic_timeout--;
+			i = max_interrupt_work+1;
+		}
 		if (unlikely(i > max_interrupt_work)) {
 			spin_lock_irqsave(&np->lock, flags);
 			/* disable interrupts on the nic */
@@ -3191,6 +3203,10 @@ static irqreturn_t nv_nic_irq_rx(int foo
 			}
 		}
 
+		if (panic_timeout > 0) {
+			panic_timeout--;
+			i = max_interrupt_work+1;
+		}
 		if (unlikely(i > max_interrupt_work)) {
 			spin_lock_irqsave(&np->lock, flags);
 			/* disable interrupts on the nic */
@@ -3264,6 +3280,10 @@ static irqreturn_t nv_nic_irq_other(int 
 			printk(KERN_DEBUG "%s: received irq with unknown events 0x%x. Please report\n",
 						dev->name, events);
 		}
+		if (panic_timeout > 0) {
+			panic_timeout--;
+			i = max_interrupt_work+1;
+		}
 		if (unlikely(i > max_interrupt_work)) {
 			spin_lock_irqsave(&np->lock, flags);
 			/* disable interrupts on the nic */