Message-ID: <20070601003410.GA10330@gondor.apana.org.au>
Date:	Fri, 1 Jun 2007 10:34:10 +1000
From:	Herbert Xu <herbert@...dor.apana.org.au>
To:	Doug Chapman <doug.chapman@...com>
Cc:	auke-jan.h.kok@...el.com, netdev@...r.kernel.org,
	e1000-devel@...ts.sourceforge.net
Subject: Re: REGRESSION: panic on e1000 driver

On Thu, May 31, 2007 at 06:38:28PM -0400, Doug Chapman wrote:
> 
> I get a backtrace as it probes each e1000 device and I also still get
> the unexpected interrupt message.
> 
> 
> WARNING: at drivers/net/e1000/e1000_main.c:1331 e1000_sw_init()

Thanks for testing!

Although I still don't know what caused the interrupt in your case,
it is clear that we need to be able to deal with interrupts as soon
as the handler is registered: the cause register is not affected by
e1000_irq_disable, and a shared interrupt can easily be mistaken for
our own.

So Auke's solution of doing netif_poll_disable should fix this problem.

In looking at this I've found a couple of other problems:

1) Race between IRQ handler and e1000_open:

A shared/spurious interrupt can cause this:

CPU0				CPU1
e1000_open
	request_irq
				spurious/shared IRQ
				e1000_interrupt
	e1000_irq_enable
		atomic_dec_*
					atomic_inc
					IMC <- ~0
		IMS <- MASK

So we end up with IRQs enabled when they shouldn't be.
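
To make the interleaving above concrete, here is a minimal user-space
sketch that replays it one step at a time. It is only a model, not driver
code: struct fake_adapter, FAKE_IMS_MASK and both functions are
hypothetical names, and the starting state (irq_sem == 1 with the
hardware masked) is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the adapter state; not struct e1000_adapter. */
struct fake_adapter {
	int irq_sem;        /* > 0 means the driver wants IRQs masked */
	unsigned int mask;  /* models the interrupt mask: nonzero = unmasked */
};

#define FAKE_IMS_MASK 0x9du  /* arbitrary illustrative value */

/* Replays the CPU0/CPU1 interleaving sequentially. */
static void replay_open_race(struct fake_adapter *a)
{
	bool cpu0_will_unmask;

	/* Assumed precondition: disabled once, hardware masked. */
	assert(a->irq_sem == 1 && a->mask == 0);

	/* CPU0: e1000_irq_enable, first half (atomic_dec_*). */
	cpu0_will_unmask = (--a->irq_sem == 0);

	/* CPU1: e1000_interrupt does atomic_inc, then IMC <- ~0. */
	a->irq_sem++;
	a->mask = 0;

	/* CPU0: e1000_irq_enable, second half (IMS <- MASK). */
	if (cpu0_will_unmask)
		a->mask = FAKE_IMS_MASK;
}

/* True when the counter says "masked" but the hardware is unmasked. */
static bool state_is_inconsistent(const struct fake_adapter *a)
{
	return a->irq_sem > 0 && a->mask != 0;
}
```

Starting from { .irq_sem = 1, .mask = 0 }, the replay leaves irq_sem at 1
while the mask is nonzero, which is the state described above.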

2) Race between IRQ handler and e1000_clean (and other mgmt functions):

Again shared/spurious interrupts may cause problems:

CPU0				CPU1
e1000_clean
	do work
				spurious/shared IRQ
				e1000_interrupt
					clear ICR
					netif_rx_schedule_prep fails
					e1000_irq_enable
	netif_rx_complete
	e1000_irq_enable

At this point IRQs are on but we've lost an interrupt.
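
The same kind of user-space sketch can replay this second interleaving.
Again this is a model under stated assumptions rather than driver code:
all names are hypothetical, the handler is assumed to do the same
atomic_inc as in the first race before it calls e1000_irq_enable, and the
new event arriving just after e1000_clean finishes its work is my guess
at where the lost interrupt comes from.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of adapter state; not struct e1000_adapter. */
struct fake_adapter {
	int irq_sem;          /* > 0: driver wants IRQs masked */
	bool unmasked;        /* models the IMS/IMC state */
	bool cause_pending;   /* models a cause latched in ICR */
	bool poll_scheduled;  /* set while the poll is scheduled */
	int work_pending;     /* events nobody has processed yet */
};

/* Models e1000_irq_enable: unmask only when the count reaches zero. */
static void fake_irq_enable(struct fake_adapter *a)
{
	if (--a->irq_sem == 0)
		a->unmasked = true;
}

/* Replays the CPU0/CPU1 interleaving sequentially. */
static void replay_clean_race(struct fake_adapter *a)
{
	/* Assumed precondition: polling in progress, IRQs masked once. */
	assert(a->irq_sem == 1 && a->poll_scheduled);

	/* CPU0: e1000_clean does its work. */
	a->work_pending = 0;

	/* Assumed: a new event arrives and latches a cause. */
	a->work_pending = 1;
	a->cause_pending = true;

	/* CPU1: e1000_interrupt runs for a shared/spurious IRQ. */
	a->cause_pending = false;    /* clear ICR */
	a->irq_sem++;                /* assumed atomic_inc + IMC <- ~0 */
	a->unmasked = false;
	if (a->poll_scheduled)       /* netif_rx_schedule_prep fails */
		fake_irq_enable(a);  /* count goes 2 -> 1, still masked */

	/* CPU0: netif_rx_complete, then e1000_irq_enable. */
	a->poll_scheduled = false;
	fake_irq_enable(a);          /* count goes 1 -> 0, unmask */
}

/* IRQs on, work outstanding, but nothing left to trigger a poll. */
static bool interrupt_was_lost(const struct fake_adapter *a)
{
	return a->unmasked && a->work_pending > 0 &&
	       !a->cause_pending && !a->poll_scheduled;
}
```

After the replay, interrupt_was_lost() holds: IRQs are on, one event is
outstanding, and neither a latched cause nor a scheduled poll remains to
pick it up.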

We can fix this by

1) Ignoring IRQs when irq_sem > 0.
2) Always generating an IRQ after e1000_irq_enable.
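
As a rough illustration of how the two changes might fit together, here
is a user-space sketch that replays the e1000_clean scenario with both
applied. This is one possible reading, not a patch: every name is
hypothetical, and modelling the generated IRQ as setting a latched cause
is an assumption about how it would be raised.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of adapter state; not struct e1000_adapter. */
struct fake_adapter {
	int irq_sem;          /* > 0: driver wants IRQs masked */
	bool unmasked;        /* models the IMS/IMC state */
	bool cause_pending;   /* models a cause latched in ICR */
	bool poll_scheduled;  /* set while the poll is scheduled */
};

/* Change 1: the handler ignores IRQs while irq_sem > 0. */
static bool fake_interrupt(struct fake_adapter *a)
{
	if (a->irq_sem > 0)
		return false;         /* not ours: leave the cause latched */

	a->cause_pending = false;     /* reading ICR clears it */
	a->irq_sem++;                 /* mask until the poll completes */
	a->unmasked = false;
	a->poll_scheduled = true;
	return true;
}

/* Change 2: enabling IRQs always raises one, forcing a re-check. */
static void fake_irq_enable(struct fake_adapter *a)
{
	if (--a->irq_sem == 0) {
		a->unmasked = true;
		a->cause_pending = true;  /* assumed software-generated IRQ */
	}
}

/* Replays the scenario; returns true if polling was rescheduled. */
static bool replay_clean_with_fixes(struct fake_adapter *a)
{
	bool handled;

	/* CPU0: e1000_clean has done its work; a new event then latches. */
	a->cause_pending = true;

	/* CPU1: a shared IRQ arrives while irq_sem > 0 and is ignored. */
	handled = fake_interrupt(a);
	assert(!handled);
	(void)handled;

	/* CPU0: netif_rx_complete, then e1000_irq_enable. */
	a->poll_scheduled = false;
	fake_irq_enable(a);

	/* The pending cause now fires as a real interrupt. */
	if (a->unmasked && a->cause_pending)
		return fake_interrupt(a);
	return false;
}
```

In this model the shared IRQ no longer clears the cause behind the
poller's back, and the interrupt raised on enable causes the handler to
run again and reschedule polling.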

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@...dor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt