Message-Id: <200709071137.02801.ossthema@de.ibm.com>
Date:	Fri, 7 Sep 2007 11:37:02 +0200
From:	Jan-Bernd Themann <ossthema@...ibm.com>
To:	Stephen Hemminger <shemminger@...ux-foundation.org>
Cc:	netdev <netdev@...r.kernel.org>, themann@...ibm.com,
	Christoph Raisch <raisch@...ibm.com>
Subject: new NAPI interface broken

Hi Stephen,

I saw that you developed most of the new NAPI interface.
I already addressed this issue a while ago. Please correct me if I got
it wrong. I think there is still a serious problem with the NAPI
changes to make NAPI polling independent of struct net_device objects.
It is about the question of who inserts and removes devices from the poll list.

netif_rx_schedule: sets NAPI_STATE_SCHED, inserts the device into the poll list.
netif_rx_complete: clears NAPI_STATE_SCHED.
netif_rx_reschedule: sets NAPI_STATE_SCHED, inserts the device into the poll list.
net_rx_action:
 - removes the device from the poll list
 - calls the poll function
 - adds the device to the poll list if NAPI_STATE_SCHED is still set

1) netif_rx_complete and netif_rx_reschedule don't work together
2) On SMP systems: after netif_rx_complete has been called on CPU1
   (and interrupts are enabled again), netif_rx_schedule could be called
   on CPU2 (irq handler) before net_rx_action on CPU1 has checked
   NAPI_STATE_SCHED. In that case the device would be added to the poll
   lists of both CPU1 and CPU2, as net_rx_action on CPU1 would see
   NAPI_STATE_SCHED set.
   This must not happen. It will be caught when netif_rx_complete is
   called the second time (BUG() is called).
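
To make the interleaving in 2) concrete, here is a minimal user-space sketch in C. It is not kernel code: struct model_napi and all model_* functions are hypothetical stand-ins that follow only the behaviour summarised above (schedule sets the flag and queues, complete clears the flag, net_rx_action dequeues, polls, and requeues if the flag is still set). CPU1 and CPU2 from the text are indices 0 and 1 here, and the BUG() is modelled as a counter.

```c
#include <stdbool.h>

#define MODEL_NR_CPUS 2

/* Hypothetical stand-in for a NAPI context; not the kernel structure. */
struct model_napi {
	bool sched;                  /* models NAPI_STATE_SCHED */
	int on_list[MODEL_NR_CPUS];  /* entries on each CPU's poll list */
	int bug_hits;                /* how often the modelled BUG() fired */
};

/* netif_rx_schedule as summarised: set SCHED, queue on this CPU. */
static void model_schedule(struct model_napi *n, int cpu)
{
	if (!n->sched) {
		n->sched = true;
		n->on_list[cpu]++;
	}
}

/* netif_rx_complete as summarised: clear SCHED, BUG() if it was clear. */
static void model_complete(struct model_napi *n)
{
	if (!n->sched)
		n->bug_hits++;
	n->sched = false;
}

/* net_rx_action, first step: remove the device from this CPU's list. */
static void model_rx_action_dequeue(struct model_napi *n, int cpu)
{
	n->on_list[cpu]--;
}

/* net_rx_action, last step: requeue if SCHED is (still) set. */
static void model_rx_action_requeue(struct model_napi *n, int cpu)
{
	if (n->sched)
		n->on_list[cpu]++;
}

/* Replays the interleaving; returns how many poll lists hold the device. */
int model_replay_race(struct model_napi *n)
{
	*n = (struct model_napi){0};
	model_schedule(n, 0);           /* irq on CPU1 schedules the device */
	model_rx_action_dequeue(n, 0);  /* net_rx_action on CPU1 starts */
	model_complete(n);              /* poll on CPU1 is done, irqs back on */
	model_schedule(n, 1);           /* irq on CPU2 arrives in the window */
	model_rx_action_requeue(n, 0);  /* CPU1 sees SCHED set, requeues */
	return n->on_list[0] + n->on_list[1];
}

/* Lets both CPUs poll to completion; returns the modelled BUG() count. */
int model_drain_both(struct model_napi *n)
{
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		model_rx_action_dequeue(n, cpu);
		model_complete(n);
		model_rx_action_requeue(n, cpu);
	}
	return n->bug_hits;
}
```

In this model, model_replay_race() leaves the device on both poll lists, and a following model_drain_both() trips the modelled BUG() on the second completion, which matches the sequence described in 2).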

This would mean we have a problem on all SMP machines right now.

If I got all this right then we probably need a further flag to tell
net_rx_action whether to poll again or to stop (with the possibility
that the device has been scheduled on a different CPU in between).
The "old" NAPI interface uses the return value of poll to determine
if the device has to be polled again or not. 
We can either switch back to that, or, if we want to stick to
the new return value, we might have to add something similar to
the NAPI_STATE_SCHED flag or a new parameter...
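
As a rough illustration of the first option (poll's return value decides whether to poll again), here is a minimal user-space sketch in C. It is only one possible reading of the idea, not a proposed patch: struct sketch_napi, enum sketch_poll_result and all sketch_* functions are hypothetical names, and the BUG() in the completion path is modelled as a counter. The point is that net_rx_action no longer looks at the shared flag after poll returns, so a schedule on another CPU in between cannot cause a second insertion.

```c
#include <stdbool.h>

#define SKETCH_NR_CPUS 2

/* Hypothetical stand-in for a NAPI context; not the kernel structure. */
struct sketch_napi {
	bool sched;                   /* models NAPI_STATE_SCHED */
	int on_list[SKETCH_NR_CPUS];  /* entries on each CPU's poll list */
	int bug_hits;                 /* how often the modelled BUG() fired */
};

/* Assumed return convention: poll tells the caller whether to requeue. */
enum sketch_poll_result { SKETCH_POLL_DONE, SKETCH_POLL_AGAIN };

/* Schedule: set SCHED and queue on this CPU if it was not set. */
static void sketch_schedule(struct sketch_napi *n, int cpu)
{
	if (!n->sched) {
		n->sched = true;
		n->on_list[cpu]++;
	}
}

/* Poll: completes (clears SCHED) only when no work is left. */
static enum sketch_poll_result sketch_poll(struct sketch_napi *n, bool more_work)
{
	if (more_work)
		return SKETCH_POLL_AGAIN;
	if (!n->sched)
		n->bug_hits++;        /* completion without SCHED: modelled BUG() */
	n->sched = false;
	return SKETCH_POLL_DONE;
}

/* net_rx_action tail: requeue based on poll's answer, not on SCHED. */
static void sketch_rx_action_finish(struct sketch_napi *n, int cpu,
				    enum sketch_poll_result res)
{
	if (res == SKETCH_POLL_AGAIN)
		n->on_list[cpu]++;
}

/* Replays the same interleaving; returns how many lists hold the device. */
int sketch_replay(struct sketch_napi *n)
{
	enum sketch_poll_result res;

	*n = (struct sketch_napi){0};
	sketch_schedule(n, 0);               /* irq on the first CPU */
	n->on_list[0]--;                     /* net_rx_action dequeues */
	res = sketch_poll(n, false);         /* poll finishes, SCHED cleared */
	sketch_schedule(n, 1);               /* irq on the second CPU */
	sketch_rx_action_finish(n, 0, res);  /* poll said done: no requeue */
	return n->on_list[0] + n->on_list[1];
}

/* One net_rx_action pass on the given CPU; returns the BUG() count. */
int sketch_drain(struct sketch_napi *n, int cpu)
{
	n->on_list[cpu]--;
	sketch_rx_action_finish(n, cpu, sketch_poll(n, false));
	return n->bug_hits;
}
```

With this convention the replay leaves the device on the second CPU's list only, and draining that CPU completes without hitting the modelled BUG().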

Regards,
Jan-Bernd