lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Wed, 11 Aug 2010 17:05:45 -0700
From:	Greg KH <gregkh@...e.de>
To:	linux-kernel@...r.kernel.org, stable@...nel.org
Cc:	stable-review@...nel.org, torvalds@...ux-foundation.org,
	akpm@...ux-foundation.org, alan@...rguk.ukuu.org.uk,
	Milton Miller <miltonm@....com>,
	Anton Blanchard <anton@...ba.org>,
	Sonny Rao <sonnyrao@...ibm.com>,
	Jeff Kirsher <jeffrey.t.kirsher@...el.com>,
	"David S. Miller" <davem@...emloft.net>
Subject: [30/67] e100/e1000*/igb*/ixgb*: Add missing read memory barrier

2.6.35-stable review patch.  If anyone has any objections, please let us know.

------------------

From: Jeff Kirsher <jeffrey.t.kirsher@...el.com>

commit 2d0bb1c1f4524befe9f0fcf0d0cd3081a451223f upstream.

Based on patches from Sonny Rao and Milton Miller...

Combined the patches to fix up clean_tx_irq and clean_rx_irq.

The PowerPC architecture does not require loads to independent bytes
to be ordered without adding an explicit barrier.

In ixgbe_clean_rx_irq we load the status bit then load the packet data.
With packet split disabled if these loads go out of order we get a
stale packet, but we will notice the bad sequence numbers and drop it.

The problem occurs with packet split enabled where the TCP/IP header
and data are in different descriptors. If the reads go out of order
we may have data that doesn't match the TCP/IP header. Since we use
hardware checksumming this bad data is never verified and it makes it
all the way to the application.

This bug was found during stress testing and adding this barrier has
been shown to fix it.  The bug can manifest as a data integrity issue
(bad payload data) or as a BUG in skb_pull().

This was a nasty bug to hunt down, if people agree with the fix I think
it's a candidate for stable.

Previously Submitted to e1000-devel only for ixgbe

http://marc.info/?l=e1000-devel&m=126593062701537&w=3

We've now seen this problem hit with other device drivers (e1000e mostly)
So I'm resubmitting with fixes for other Intel Device Drivers with
similar issues.

CC: Milton Miller <miltonm@....com>
CC: Anton Blanchard <anton@...ba.org>
CC: Sonny Rao <sonnyrao@...ibm.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@...el.com>
Signed-off-by: David S. Miller <davem@...emloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@...e.de>

---
 drivers/net/e100.c                 |    2 ++
 drivers/net/e1000/e1000_main.c     |    3 +++
 drivers/net/e1000e/netdev.c        |    4 ++++
 drivers/net/igb/igb_main.c         |    2 ++
 drivers/net/igbvf/netdev.c         |    2 ++
 drivers/net/ixgb/ixgb_main.c       |    2 ++
 drivers/net/ixgbe/ixgbe_main.c     |    1 +
 drivers/net/ixgbevf/ixgbevf_main.c |    2 ++
 8 files changed, 18 insertions(+)

--- a/drivers/net/e100.c
+++ b/drivers/net/e100.c
@@ -1779,6 +1779,7 @@ static int e100_tx_clean(struct nic *nic
 	for (cb = nic->cb_to_clean;
 	    cb->status & cpu_to_le16(cb_complete);
 	    cb = nic->cb_to_clean = cb->next) {
+		rmb(); /* read skb after status */
 		netif_printk(nic, tx_done, KERN_DEBUG, nic->netdev,
 			     "cb[%d]->status = 0x%04X\n",
 			     (int)(((void*)cb - (void*)nic->cbs)/sizeof(struct cb)),
@@ -1927,6 +1928,7 @@ static int e100_rx_indicate(struct nic *
 
 	netif_printk(nic, rx_status, KERN_DEBUG, nic->netdev,
 		     "status=0x%04X\n", rfd_status);
+	rmb(); /* read size after status bit */
 
 	/* If data isn't ready, nothing to indicate */
 	if (unlikely(!(rfd_status & cb_complete))) {
--- a/drivers/net/e1000/e1000_main.c
+++ b/drivers/net/e1000/e1000_main.c
@@ -3448,6 +3448,7 @@ static bool e1000_clean_tx_irq(struct e1
 	while ((eop_desc->upper.data & cpu_to_le32(E1000_TXD_STAT_DD)) &&
 	       (count < tx_ring->count)) {
 		bool cleaned = false;
+		rmb();	/* read buffer_info after eop_desc */
 		for ( ; !cleaned; count++) {
 			tx_desc = E1000_TX_DESC(*tx_ring, i);
 			buffer_info = &tx_ring->buffer_info[i];
@@ -3637,6 +3638,7 @@ static bool e1000_clean_jumbo_rx_irq(str
 		if (*work_done >= work_to_do)
 			break;
 		(*work_done)++;
+		rmb(); /* read descriptor and rx_buffer_info after status DD */
 
 		status = rx_desc->status;
 		skb = buffer_info->skb;
@@ -3843,6 +3845,7 @@ static bool e1000_clean_rx_irq(struct e1
 		if (*work_done >= work_to_do)
 			break;
 		(*work_done)++;
+		rmb(); /* read descriptor and rx_buffer_info after status DD */
 
 		status = rx_desc->status;
 		skb = buffer_info->skb;
--- a/drivers/net/e1000e/netdev.c
+++ b/drivers/net/e1000e/netdev.c
@@ -774,6 +774,7 @@ static bool e1000_clean_rx_irq(struct e1
 		if (*work_done >= work_to_do)
 			break;
 		(*work_done)++;
+		rmb();	/* read descriptor and rx_buffer_info after status DD */
 
 		status = rx_desc->status;
 		skb = buffer_info->skb;
@@ -984,6 +985,7 @@ static bool e1000_clean_tx_irq(struct e1
 	while ((eop_desc->upper.data & cpu_to_le32(E1000_TXD_STAT_DD)) &&
 	       (count < tx_ring->count)) {
 		bool cleaned = false;
+		rmb(); /* read buffer_info after eop_desc */
 		for (; !cleaned; count++) {
 			tx_desc = E1000_TX_DESC(*tx_ring, i);
 			buffer_info = &tx_ring->buffer_info[i];
@@ -1080,6 +1082,7 @@ static bool e1000_clean_rx_irq_ps(struct
 			break;
 		(*work_done)++;
 		skb = buffer_info->skb;
+		rmb();	/* read descriptor and rx_buffer_info after status DD */
 
 		/* in the packet split case this is header only */
 		prefetch(skb->data - NET_IP_ALIGN);
@@ -1279,6 +1282,7 @@ static bool e1000_clean_jumbo_rx_irq(str
 		if (*work_done >= work_to_do)
 			break;
 		(*work_done)++;
+		rmb();	/* read descriptor and rx_buffer_info after status DD */
 
 		status = rx_desc->status;
 		skb = buffer_info->skb;
--- a/drivers/net/igb/igb_main.c
+++ b/drivers/net/igb/igb_main.c
@@ -5344,6 +5344,7 @@ static bool igb_clean_tx_irq(struct igb_
 
 	while ((eop_desc->wb.status & cpu_to_le32(E1000_TXD_STAT_DD)) &&
 	       (count < tx_ring->count)) {
+		rmb();	/* read buffer_info after eop_desc status */
 		for (cleaned = false; !cleaned; count++) {
 			tx_desc = E1000_TX_DESC_ADV(*tx_ring, i);
 			buffer_info = &tx_ring->buffer_info[i];
@@ -5549,6 +5550,7 @@ static bool igb_clean_rx_irq_adv(struct
 		if (*work_done >= budget)
 			break;
 		(*work_done)++;
+		rmb(); /* read descriptor and rx_buffer_info after status DD */
 
 		skb = buffer_info->skb;
 		prefetch(skb->data - NET_IP_ALIGN);
--- a/drivers/net/igbvf/netdev.c
+++ b/drivers/net/igbvf/netdev.c
@@ -248,6 +248,7 @@ static bool igbvf_clean_rx_irq(struct ig
 		if (*work_done >= work_to_do)
 			break;
 		(*work_done)++;
+		rmb(); /* read descriptor and rx_buffer_info after status DD */
 
 		buffer_info = &rx_ring->buffer_info[i];
 
@@ -780,6 +781,7 @@ static bool igbvf_clean_tx_irq(struct ig
 
 	while ((eop_desc->wb.status & cpu_to_le32(E1000_TXD_STAT_DD)) &&
 	       (count < tx_ring->count)) {
+		rmb();	/* read buffer_info after eop_desc status */
 		for (cleaned = false; !cleaned; count++) {
 			tx_desc = IGBVF_TX_DESC_ADV(*tx_ring, i);
 			buffer_info = &tx_ring->buffer_info[i];
--- a/drivers/net/ixgb/ixgb_main.c
+++ b/drivers/net/ixgb/ixgb_main.c
@@ -1816,6 +1816,7 @@ ixgb_clean_tx_irq(struct ixgb_adapter *a
 
 	while (eop_desc->status & IXGB_TX_DESC_STATUS_DD) {
 
+		rmb(); /* read buffer_info after eop_desc */
 		for (cleaned = false; !cleaned; ) {
 			tx_desc = IXGB_TX_DESC(*tx_ring, i);
 			buffer_info = &tx_ring->buffer_info[i];
@@ -1976,6 +1977,7 @@ ixgb_clean_rx_irq(struct ixgb_adapter *a
 			break;
 
 		(*work_done)++;
+		rmb();	/* read descriptor and rx_buffer_info after status DD */
 		status = rx_desc->status;
 		skb = buffer_info->skb;
 		buffer_info->skb = NULL;
--- a/drivers/net/ixgbe/ixgbe_main.c
+++ b/drivers/net/ixgbe/ixgbe_main.c
@@ -748,6 +748,7 @@ static bool ixgbe_clean_tx_irq(struct ix
 	while ((eop_desc->wb.status & cpu_to_le32(IXGBE_TXD_STAT_DD)) &&
 	       (count < tx_ring->work_limit)) {
 		bool cleaned = false;
+		rmb(); /* read buffer_info after eop_desc */
 		for ( ; !cleaned; count++) {
 			struct sk_buff *skb;
 			tx_desc = IXGBE_TX_DESC_ADV(*tx_ring, i);
--- a/drivers/net/ixgbevf/ixgbevf_main.c
+++ b/drivers/net/ixgbevf/ixgbevf_main.c
@@ -231,6 +231,7 @@ static bool ixgbevf_clean_tx_irq(struct
 	while ((eop_desc->wb.status & cpu_to_le32(IXGBE_TXD_STAT_DD)) &&
 	       (count < tx_ring->work_limit)) {
 		bool cleaned = false;
+		rmb(); /* read buffer_info after eop_desc */
 		for ( ; !cleaned; count++) {
 			struct sk_buff *skb;
 			tx_desc = IXGBE_TX_DESC_ADV(*tx_ring, i);
@@ -518,6 +519,7 @@ static bool ixgbevf_clean_rx_irq(struct
 			break;
 		(*work_done)++;
 
+		rmb(); /* read descriptor and rx_buffer_info after status DD */
 		if (adapter->flags & IXGBE_FLAG_RX_PS_ENABLED) {
 			hdr_info = le16_to_cpu(ixgbevf_get_hdr_info(rx_desc));
 			len = (hdr_info & IXGBE_RXDADV_HDRBUFLEN_MASK) >>


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ