lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Fri, 16 Dec 2011 05:27:56 +0100
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	Rick Jones <rick.jones2@...com>
Cc:	Stephen Hemminger <shemminger@...tta.com>,
	Vijay Subramanian <subramanian.vijay@...il.com>,
	tcpdump-workers@...ts.tcpdump.org, netdev@...r.kernel.org,
	Matthew Vick <matthew.vick@...el.com>,
	Jeff Kirsher <jeffrey.t.kirsher@...el.com>
Subject: Re: twice past the taps, thence out to net?

Le jeudi 15 décembre 2011 à 14:22 -0800, Rick Jones a écrit :
> On 12/15/2011 11:00 AM, Eric Dumazet wrote:
> >> Device's work better if the driver proactively manages stop_queue/wake_queue.
> >> Old devices used TX_BUSY, but newer devices tend to manage the queue
> >> themselves.
> >>
> >
> > Some 'new' drivers like igb can be fooled in case skb is gso segmented ?
> >
> > Because igb_xmit_frame_ring() needs skb_shinfo(skb)->nr_frags + 4
> > descriptors, igb should stop its queue not at MAX_SKB_FRAGS + 4, but
> > MAX_SKB_FRAGS*4
> >
> > diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
> > index 89d576c..989da36 100644
> > --- a/drivers/net/ethernet/intel/igb/igb_main.c
> > +++ b/drivers/net/ethernet/intel/igb/igb_main.c
> > @@ -4370,7 +4370,7 @@ netdev_tx_t igb_xmit_frame_ring(struct sk_buff *skb,
> >   	igb_tx_map(tx_ring, first, hdr_len);
> >
> >   	/* Make sure there is space in the ring for the next send. */
> > -	igb_maybe_stop_tx(tx_ring, MAX_SKB_FRAGS + 4);
> > +	igb_maybe_stop_tx(tx_ring, MAX_SKB_FRAGS * 4);
> >
> >   	return NETDEV_TX_OK;
> 
> 
> Is there a minimum transmit queue length here?  I get the impression 
> that MAX_SKB_FRAGS is at least 16 and is 18 on a system with 4096 byte 
> pages.  The previous addition then would be OK so long as the TX queue 
> was always at least 22 entries in size, but now it would have to always 
> be at least 72?
> 
> I guess things are "OK" at the moment:
> 
> raj@...dy:~/net-next/drivers/net/ethernet/intel/igb$ grep IGB_MIN_TXD *.[ch]
> igb_ethtool.c:	new_tx_count = max_t(u16, new_tx_count, IGB_MIN_TXD);
> igb.h:#define IGB_MIN_TXD                       80
> 
> but is that getting a little close?
> 
> rick jones

Sure !

I only pointed out a possible problem, and not gave a full patch, since
we also need to change the opposite threshold (when we XON the queue at
TX completion)

You can see its not even consistent with the minimum for a single TSO
frame ! Most probably your high requeue numbers come from this too low
value given the real requirements of the hardware (4 + nr_frags
descriptors per skb)

/* How many Tx Descriptors do we need to call netif_wake_queue ? */ 
#define IGB_TX_QUEUE_WAKE   16


Maybe we should CC Intel guys

Could you try following patch ?

Thanks !

diff --git a/drivers/net/ethernet/intel/igb/igb.h b/drivers/net/ethernet/intel/igb/igb.h
index c69feeb..93ce118 100644
--- a/drivers/net/ethernet/intel/igb/igb.h
+++ b/drivers/net/ethernet/intel/igb/igb.h
@@ -51,8 +51,8 @@ struct igb_adapter;
 /* TX/RX descriptor defines */
 #define IGB_DEFAULT_TXD                  256
 #define IGB_DEFAULT_TX_WORK		 128
-#define IGB_MIN_TXD                       80
-#define IGB_MAX_TXD                     4096
+#define IGB_MIN_TXD		max_t(unsigned, 80U, IGB_TX_QUEUE_WAKE * 2)
+#define IGB_MAX_TXD             4096
 
 #define IGB_DEFAULT_RXD                  256
 #define IGB_MIN_RXD                       80
@@ -121,8 +121,11 @@ struct vf_data_storage {
 #define IGB_RXBUFFER_16384 16384
 #define IGB_RX_HDR_LEN     IGB_RXBUFFER_512
 
-/* How many Tx Descriptors do we need to call netif_wake_queue ? */
-#define IGB_TX_QUEUE_WAKE	16
+/* How many Tx Descriptors should be available
+ * before calling netif_wake_subqueue() ?
+ */
+#define IGB_TX_QUEUE_WAKE	(MAX_SKB_FRAGS * 4)
+
 /* How many Rx Buffers do we bundle into one write to the hardware ? */
 #define IGB_RX_BUFFER_WRITE	16	/* Must be power of 2 */
 


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists