Date: Fri, 15 Aug 2014 11:49:02 -0700
From: Tom Herbert <therbert@...gle.com>
To: Alexander Duyck <alexander.h.duyck@...el.com>
Cc: David Miller <davem@...emloft.net>, Eric Dumazet <eric.dumazet@...il.com>,
	Linux Netdev List <netdev@...r.kernel.org>
Subject: Re: Performance regression on kernels 3.10 and newer

On Fri, Aug 15, 2014 at 10:15 AM, Alexander Duyck
<alexander.h.duyck@...el.com> wrote:
> On 08/14/2014 04:20 PM, David Miller wrote:
>> From: Alexander Duyck <alexander.h.duyck@...el.com>
>> Date: Thu, 14 Aug 2014 16:16:36 -0700
>>
>>> Are you sure about each socket having its own DST? Everything I see
>>> seems to indicate it is somehow associated with the IP.
>>
>> Right, it should be, unless you have exception entries created by path
>> MTU or redirects.
>>
>> WRT the prequeue, it does the right thing for dumb apps that block in
>> receive. But because it causes the packet to cross domains as it
>> does, we can't do a lot of the tricks we normally can, and that's
>> why the refcounting on the dst is there now.
>>
>> Perhaps we can find a clever way to elide that refcount, who knows.
>
> Actually I would consider the refcount issue just the coffin nail in all
> of this. It seems like there are multiple issues that have been there
> for some time, and they are just getting worse with the refcount change
> in 3.10.
>
> With the prequeue disabled, what happens is that the frames make it
> up and hit tcp_rcv_established before being pushed into the backlog
> queues and coalesced there. I believe the lack of coalescing on the
> prequeue path is one of the reasons why it is twice as expensive as the
> non-prequeue path CPU-wise, even if you eliminate the refcount issue.
>
> I realize most of my data is anecdotal, as I only have the ixgbe/igb
> adapters and netperf to work with. This is one of the reasons why I
> keep asking if someone can tell me what the use case is for this where
> it performs well. From what I can tell it might have had some value
> back in the day, before the introduction of things such as RPS/RFS, where
> some of the socket processing would be offloaded to other CPUs for a
> single-queue device, but even that use case is now deprecated since
> RPS/RFS are there and function better than this. What I am basically
> looking for is a way to weigh the gain against the penalties to
> determine if this code is even viable anymore.
>
Alex, I tried to reproduce your problem by running your script (on
bnx2x). I didn't see the issue, and in fact ip_dest_check did not show
up among the top functions in perf. I assume this is more related to
the steering configuration than to the device (although flow director
might be a fundamental difference).

> In the meantime I think I will put together a patch to default
> tcp_low_latency to 1 for net and stable, and if we cannot find a good
> reason for keeping it, then I can submit a patch to net-next that will
> strip it out, since I don't see any benefit to having this code.
>
> Thanks,
>
> Alex
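
For reference, the knob discussed above is the net.ipv4.tcp_low_latency
sysctl; when it is set to 1 the receive path skips the prequeue, so
packets are processed on the normal receive/backlog path instead. Below
is a minimal userspace sketch of flipping it, assuming root privileges
and the standard /proc/sys location for ipv4 sysctls (equivalent to
running "sysctl -w net.ipv4.tcp_low_latency=1"); it is an illustration,
not part of the patch Alex proposes.

/*
 * Sketch: set net.ipv4.tcp_low_latency to 1 via procfs, which
 * disables the TCP prequeue path discussed in this thread.
 * Must be run as root; error handling kept minimal for brevity.
 */
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
	const char *path = "/proc/sys/net/ipv4/tcp_low_latency";
	FILE *f = fopen(path, "w");

	if (!f) {
		fprintf(stderr, "open %s: %s\n", path, strerror(errno));
		return EXIT_FAILURE;
	}
	if (fputs("1\n", f) == EOF || fclose(f) != 0) {
		fprintf(stderr, "write %s: %s\n", path, strerror(errno));
		return EXIT_FAILURE;
	}
	printf("tcp_low_latency set to 1 (prequeue disabled)\n");
	return EXIT_SUCCESS;
}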