Message-ID: <21439.11207.651173.426859@fisica.ufpr.br>
Date:	Thu, 10 Jul 2014 21:11:51 -0300
From:	Carlos Carvalho <carlos@...ica.ufpr.br>
To:	Flavio Leitner <fbl@...hat.com>
Cc:	"Skidmore\, Donald C" <donald.c.skidmore@...el.com>,
	Tom Herbert <therbert@...gle.com>,
	Linux Netdev List <netdev@...r.kernel.org>
Subject: Re: RSS is not efficient when forwarding (ixgbe)

Flavio Leitner (fbl@...hat.com) wrote on 9 July 2014 22:14:
 >On Wed, Jul 09, 2014 at 09:08:27PM -0300, Carlos Carvalho wrote:
 >> Flavio Leitner (fbl@...hat.com) wrote on 9 July 2014 02:22:
 >>  >On Tue, Jul 08, 2014 at 02:32:43PM -0300, Carlos Carvalho wrote:
 >>  >> Flavio Leitner (fbl@...hat.com) wrote on 8 July 2014 14:21:
 >>  >>  >On Tue, Jul 08, 2014 at 02:09:13PM -0300, Carlos Carvalho wrote:
 >>  >>  >> Flavio Leitner (fbl@...hat.com) wrote on 7 July 2014 21:28:
 >>  >>  >>  >On Mon, Jul 07, 2014 at 04:33:24PM +0000, Skidmore, Donald C wrote:
 >>  >>  >>  >> 
 >>  >>  >>  >> 
 >>  >>  >>  >> > 
 >>  >>  >>  >> > It's a router forwarding traffic from one interface to another, so I guess it's
 >>  >>  >>  >> > only the kernel. BTW, no firewall.
 >>  >>  >>  >> > 
 >>  >>  >>  >> > Flow Director needs to be enabled and I am using defaults.
 >>  >>  >>  >> 
 >>  >>  >>  >> Flow Director in ATR mode is on by default for ixgbe.  So, like Tom mentioned, the driver will create hash buckets for egress packets.  You could try disabling ATR and just use RSS, which would probably be the right thing to do anyway since Flow Director isn't very useful for routing scenarios.
 >>  >>  >>  >
 >>  >>  >>  >That was it.
 >>  >>  >> 
 >>  >>  >> We have a similar setup and similar problem. How do we disable ATR? I
 >>  >>  >> tried to set ntuple off but this almost zeroed traffic. I also tried
 >>  >>  >> to change rx-flow-hash but ethtool says it's not possible. The docs
 >>  >>  >> say that one can disable ATR by setting AtrSampleRate to 0 but this
 >>  >>  >> parameter doesn't exist in 3.14.10.
 >>  >>  >> 
 >>  >>  >> So, how do we disable ATR and keep RSS?
 >>  >>  > 
 >>  >>  >Keep in mind that these are actually two problems. One is enabling the
 >>  >>  >NIC to receive the streams in all queues for this scenario (setting
 >>  >>  >ntuple off and restarting the traffic works for me). The second problem
 >>  >>  >is having all the queue interrupts spread among the CPUs. That's
 >>  >>  >what irqbalance, tuna, etc. do.
 >>  >> 
 >>  >> Spreading the interrupts among the CPUs is not the issue for us. The
 >>  >> problem is that the number of interrupts is *very* different among the
 >>  >> IRQs, so no matter how I distribute them among cores there will
 >>  >> always be a few that get saturated while 70% of the machine capacity
 >>  >> remains idle. Your case seems to be the extreme of ours, where all
 >>  >> the traffic goes to a single IRQ.
 >>  >> 
 >>  >> The problem is in how the NIC distributes traffic among the IRQs in
 >>  >> the router. Traffic comes almost entirely from a single machine and
 >>  >> spreads across several thousand destinations on the internet. That's
 >>  >> why I tried to set the receive hash mode to the destination IP, but
 >>  >> the NIC or driver refuses. So how do I even out the IRQ rates?
 >>  >
 >>  >So you see the traffic going to a few queues only and the rest is
 >>  >idle, is that correct?  If so, then RSS seems to be working, but
 >>  >since all the traffic comes from one server and likely one port,
 >>  >maybe the hash is not good enough to distribute among all queues.
 >>  >I'd try using software hashing instead of hw hashing to see if it
 >>  >helps:
 >>  ># ethtool -K <iface> rxhash off
 >> 
 >> I'm all for using software instead of hardware. However, distributing
 >> the load among cores is a fundamental function of the NIC; if we do it
 >> via software, a single core (or a subset of them) will have to do all
 >> the work. So in this particular case I think the correct way is to try
 >> to do it in the NIC.
 >
 >Actually no, you can use RPS with software hashing to distribute the
 >workload.  Take a look at Documentation/networking/scaling.txt for
 >more details.

Thanks for the pointer. I'll try that if there's no way to do it on
the NIC.
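
For the record, from a quick look at scaling.txt it seems RPS boils
down to writing a per-queue CPU mask in sysfs, roughly

  # echo ffffffffff > /sys/class/net/<iface>/queues/rx-0/rps_cpus

repeated for each rx-N queue, where the mask selects the CPUs allowed
to do the packet processing for that queue (ffffffffff being my guess
at "all 40 CPUs" on this box; <iface> is a placeholder, of course).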

 >>  >BTW, there was a typo in my previous post: I had to turn on ntuple to
 >>  >disable ATR.
 >> 
 >> Ah. Here it was off by default, which contradicts what Donald said
 >> above... I turned it on, and nothing changed(?!). The NIC was reset
 >> but the distribution among queues is the same.
 >
 >ntuple is about Perfect Filters and it's OFF by default, which leaves
 >ATR mode ON. In my case, ATR mode was responsible for directing all
 >the packets to a single queue. Once I enable ntuple, the driver
 >disables ATR and that works for me.

There does seem to be some confusion about this. However, this thread
is getting long and, since it makes no difference for us, I'd rather
focus on our problem.
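
For concreteness, toggling it here amounts to the usual

  # ethtool -K <iface> ntuple on
  # ethtool -k <iface> | grep ntuple

(the second line just to check that the setting took); as said above,
the distribution among queues stayed the same either way.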

 >> I checked now and in fact the distribution is almost uniform across
 >> 16 IRQs. That's the problem, because it leaves the other 24
 >> cores idle. Not completely, but the difference is 4 orders of
 >> magnitude: 5.41e+08 interrupts in the 16 active IRQs versus
 >> 6.36e+04 in the others. So 60% of the machine is just contributing to
 >> global warming and, importantly, limiting our performance :-(
 >
 >How many NIC queues do you have?  It sounds like you have only 16,
 >so you're limited by the number of queues which maps to 16 CPUs.

40 for each NIC. The driver seems to set the number of queues equal to
the number of cores; in another machine with the same NIC but
32 cores there are 32 queues.
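
In case it helps, those counts are easy to double-check with something
like

  # ethtool -l <iface>
  # grep <iface> /proc/interrupts | wc -l

the first showing the channel counts the driver negotiated, the second
roughly how many per-queue IRQs it actually registered.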

Does hyperthreading make a difference? There are actually only 20
physical cores. It seems that hyperthreading doesn't help for NIC
interrupt processing, so the card/kernel could just be ignoring the
virtual cores. However, in that case it should not allocate queues for
the virtual cores. Also, the number of interrupts on the virtual cores
should be zero, but it isn't.
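
If the virtual cores really don't help, I suppose we could at least
keep them out of the picture by pinning each queue IRQ to a physical
core only, along the lines of

  # cat /sys/devices/system/cpu/cpu0/topology/thread_siblings_list
  # echo <physical-cpu> > /proc/irq/<irq>/smp_affinity_list

the first to see which logical CPUs are hyperthread siblings, the
second to restrict a given queue IRQ to one of them (<physical-cpu>
and <irq> are placeholders).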

Further, why doesn't the NIC use all of the 20 real cores? Is it
limited to a power-of-2 number of queues?
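
Related to that, can the queue count simply be forced? If ixgbe
supports setting the channel count, something like

  # ethtool -L <iface> combined 20

should tell us whether the driver accepts 20 or rounds it to a power
of 2.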