lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <200707192113.26878.olaf.kirch@oracle.com>
Date:	Thu, 19 Jul 2007 21:13:25 +0200
From:	Olaf Kirch <olaf.kirch@...cle.com>
To:	Ingo Molnar <mingo@...e.hu>
Cc:	"Kok, Auke" <auke-jan.h.kok@...el.com>,
	Jarek Poplawski <jarkao2@...pl>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	linux-kernel@...r.kernel.org, davem@...emloft.net
Subject: Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

On Thursday 19 July 2007 18:07, Ingo Molnar wrote:
> because i dont seem to be able to trigger Olaf's WARN_ON(), can you see 
> anything in the ethtool output that i sent in the previous mail(s)?

If the WARN_ON doesn't trigger, I cannot see how my patch would affect
your system.

 -	IF we enter the if() branch in poll_napi, then the following
	must hold:
	 -	an interrupt from the e1000 arrived
	 -	e1000_intr disables interrupts, increments irq_sem
		(which is now 1) and schedules rx_action

 -	Now, inside poll_napi, the following happens:
	 -	poll_napi is called, finds the device is marked for
		polling, invokes dev->poll
	 -	dev->poll calls netif_rx_complete (which does *not*
		remove the device from the poll list), and re-enables
		interrupts. irq_sem is now 0.

 -	Finally, the rx_action softirq is run, which calls dev->poll
	again, which ends up invoking netif_rx_complete once more,
	and tries to re-enable interrupts. The latter doesn't do
	anything except decrementing irq_sem once more, which now
	goes negative.

	Which would trigger the WARN_ON.

Now, as you say the WARN_ON is never triggered, it follows that
we never end up in the if() branch of poll_napi. But that is
where the only substantial modification of my patch is.

Here's a somewhat drastic modification that should not change any
timing, but just verifies whether my patch is to blame at all. Can
you give it a try?

Thanks,
Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
okir@....de |    / | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax
---
Test patch
---
 include/linux/netdevice.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: build-2.6/include/linux/netdevice.h
===================================================================
--- build-2.6.orig/include/linux/netdevice.h
+++ build-2.6/include/linux/netdevice.h
@@ -1027,7 +1027,7 @@ static inline void netif_rx_complete(str
 	 * But at least it doesn't penalize the non-netpoll
 	 * code path. */
 	if (test_bit(__LINK_STATE_POLL_LIST_FROZEN, &dev->state))
-		return;
+		BUG();
 #endif
 
 	local_irq_save(flags);
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ