Date:	Sat, 8 Mar 2014 19:30:40 -0500
From:	Ming Chen <>
To:	Eric Dumazet <>
Cc:	Erez Zadok <>,
	Dean Hildebrand <>,
	Geoff Kuenning <>
Subject: Re: [BUG?] ixgbe: only num_online_cpus() of the tx queues are enabled

Hi Eric,

Thanks for the suggestion. I believe "tc qdisc replace dev eth0 root
fq" can achieve fairness if we have only one queue. My understanding
is that we cannot directly apply FQ to a multiqueue device; is that
right? If we apply FQ separately to each tx queue, what happens if we
have one flow (fl-0) in tx-0, but two flows (fl-1 and fl-2) in tx-1?
With FQ, the two flows in tx-1 should get the same bandwidth. But what
about fl-0 versus fl-1?
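To make the concern concrete, here is a toy model (a sketch of my own,
not kernel code): assume the link serves every tx queue at the same
rate, and FQ only equalizes the flows hashed to the same queue.

```python
# Toy model of per-queue FQ on a multiqueue device (illustrative only).
# Assumption: each tx queue gets an equal share of the link, and FQ
# then splits a queue's share equally among the flows in that queue.
def flow_shares(queues):
    """queues: one list of flow names per tx queue."""
    per_queue = 1.0 / len(queues)              # equal service per queue
    return {flow: per_queue / len(flows)       # FQ share within a queue
            for flows in queues
            for flow in flows}

# fl-0 alone in tx-0; fl-1 and fl-2 hashed together into tx-1.
print(flow_shares([["fl-0"], ["fl-1", "fl-2"]]))
```

Under these assumptions fl-0 ends up with half the link while fl-1 and
fl-2 get a quarter each, which is exactly the unfairness I am worried
about.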


On Sat, Mar 8, 2014 at 10:27 AM, Eric Dumazet <> wrote:
> On Sat, 2014-03-08 at 01:13 -0500, Ming Chen wrote:
>> Hi,
>> We have an Intel 82599EB dual-port 10GbE NIC, which has 128 tx queues
>> (64 per port and we used only one port). We found only 12 of the tx
>> queues are enabled, where 12 is number of CPUs of our system.
>> We realized that, in the driver code, adapter->num_tx_queues (which
>> decides netdev->real_num_tx_queues) is indirectly set to "min_t(int,
>> IXGBE_MAX_RSS_INDICES, num_online_cpus())". It looks like the limit
>> is meant for RSS. But why is the number of tx queues also capped at
>> the same value as the rx queues?
>> The problem with having a small number of tx queues is a high
>> probability of hash collisions in skb_tx_hash(). If we have a small
>> number of long-lived, data-intensive TCP flows, the hash collisions
>> can cause unfairness. We found this problem while benchmarking NFS,
>> when identical NFS clients were getting very different throughput
>> reading a big file from the server. We call this problem Hash-Cast.
>> If interested, you can take a look at this poster:
>> Can anybody take a look at this? It would be better to have all tx
>> queues enabled by default. If this is unlikely to happen, is there a
>> way to reconfigure the NIC so that we can use all tx queues if we
>> want?
>> FYI, our kernel version is 3.12.0, but I found the same limit on tx
>> queues in the code of the latest kernel. I am counting the number of
>> enabled queues using "ls /sys/class/net/p3p1/queues | grep -c tx-"
>> Best,
> Quite frankly, with a 1GbE link, I would just use FQ and your problem
> would disappear.
> (I also use FQ with 40GbE links, if that matters)
> For a 1GbE link, the following command is more than enough.
> tc qdisc replace dev eth0 root fq
> Also, the following patch would probably help fairness. I'll submit
> an official and more complete patch later.
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index bc0fb0fc7552..296c201516d1 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -1911,8 +1911,7 @@ static bool tcp_write_xmit(struct sock *sk, unsigned int mss_now, int nonagle,
>                  * of queued bytes to ensure line rate.
>                  * One example is wifi aggregation (802.11 AMPDU)
>                  */
> -               limit = max_t(unsigned int, sysctl_tcp_limit_output_bytes,
> -                             sk->sk_pacing_rate >> 10);
> +               limit = 2 * skb->truesize;
>                 if (atomic_read(&sk->sk_wmem_alloc) > limit) {
>                         set_bit(TSQ_THROTTLED, &tp->tsq_flags);
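To put rough numbers on the hash-collision point above (a
back-of-the-envelope sketch of my own, assuming skb_tx_hash() places
each flow uniformly at random over the enabled queues):

```python
# Birthday-style estimate: probability that at least two of `flows`
# concurrent flows hash into the same tx queue, assuming a uniform
# random hash over `queues` queues (an idealization of skb_tx_hash()).
def collision_prob(flows, queues):
    p_distinct = 1.0
    for i in range(flows):
        p_distinct *= (queues - i) / queues    # i-th flow avoids the earlier ones
    return 1.0 - p_distinct

for q in (12, 64):
    print(f"{q} queues, 5 flows: P(collision) ~= {collision_prob(5, q):.2f}")
```

Under this idealization, with 12 queues (one per CPU) five flows
collide more often than not (~0.62), while with all 64 queues the
probability falls to ~0.15 — which is why enabling all tx queues by
default seems attractive to us.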