Date:	Sat, 28 Apr 2012 13:55:56 +0200
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	Deng-Cheng Zhu <dczhu@...s.com>
Cc:	davem@...emloft.net, therbert@...gle.com, netdev@...r.kernel.org
Subject: Re: [PATCH] RPS: Sparse connection optimizations

On Sat, 2012-04-28 at 18:10 +0800, Deng-Cheng Zhu wrote:
> From: Deng-Cheng Zhu <dczhu@...s.com>
> 
> Currently, choosing target CPU to process the incoming packet is based on
> skb->rxhash. In the case of sparse connections, this could lead to
> relatively low and inconsistent bandwidth while doing network throughput
> tests -- CPU selection in the RPS map is imbalanced. Even with the same
> hash value, 2 packets could come from different devices. Besides, on
> architectures like MIPS with multi-threaded cores, siblings of CPU0 should
> not be selected when others are not saturated.
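
To illustrate the imbalance described in the quoted paragraph, here is a minimal userspace sketch of hash-based CPU selection. It assumes the RPS map slot is chosen by scaling the 32-bit rxhash into the map length (multiply, then shift right by 32). The names rps_map_index and count_flows_per_slot are hypothetical and are not taken from the patch or from the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Scale a 32-bit flow hash into the range [0, len) without a modulo.
 * Assumption: this mirrors how an RPS map slot is picked from rxhash.
 */
static inline uint32_t rps_map_index(uint32_t rxhash, uint32_t len)
{
    return (uint32_t)(((uint64_t)rxhash * len) >> 32);
}

/*
 * Count how many of the given flow hashes land on each map slot.
 * counts[] must have room for len entries.
 */
static void count_flows_per_slot(const uint32_t *hashes, size_t nflows,
                                 uint32_t len, unsigned int *counts)
{
    assert(len > 0);

    for (uint32_t i = 0; i < len; i++)
        counts[i] = 0;

    for (size_t i = 0; i < nflows; i++)
        counts[rps_map_index(hashes[i], len)]++;
}
```

With only a handful of flows nothing forces the hashes to spread out: several flows can land on the same slot while other slots stay idle, which is the sparse-connection case the changelog refers to.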

What is CPU0 doing that is so special that you have to mention it in this
changelog?

> 
> This patch introduces a feature that allows some flows to select their CPUs
> by looping the RPS CPU maps. Some tests were performed on the MIPS Malta
> 1004K platform (2 cores, each with 2 VPEs) at 25Mhz with 2 Intel Pro/1000
> NICs. The Malta board works as a router between 2 PCs. Using iperf, here
> are results:


RPS on a router? That's not a very good idea, unless maybe you perform a
crazy amount of work in iptables rules?

When a packet comes in, it's better to handle it right away and send it
right away on the same CPU. No IPI cost, no cache line misses...


RPS is better suited to TCP handling on the local host, because the
stack has a big memory footprint and high latencies. It is not meant for
a forwarding workload.

I suspect you can reach more throughput with appropriate tuning
(correct interrupt affinities). This sounds like a bad config from the
very beginning.
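
As a rough illustration of the interrupt affinity tuning mentioned above, the sketch below pins one IRQ to one CPU by writing a hexadecimal CPU mask to /proc/irq/<irq>/smp_affinity. The helper names format_cpu_mask and set_irq_affinity are hypothetical. Which IRQ numbers belong to the two NICs is not stated in this thread and would have to be looked up in /proc/interrupts. The sketch only handles CPUs 0 to 31 and needs root privileges to write the file.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Format the hex affinity mask that selects a single CPU.
 * Sketch limitation: only CPUs 0..31 are supported.
 * Returns 0 on success, -1 on error.
 */
static int format_cpu_mask(unsigned int cpu, char *buf, size_t buflen)
{
    int n;

    if (cpu >= 32)
        return -1;

    n = snprintf(buf, buflen, "%x", 1u << cpu);
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}

/*
 * Pin the given IRQ to the given CPU through procfs.
 * Returns 0 on success, -1 on error (bad CPU, missing IRQ, no permission).
 */
static int set_irq_affinity(unsigned int irq, unsigned int cpu)
{
    char path[64];
    char mask[16];
    FILE *f;
    int ok;

    if (format_cpu_mask(cpu, mask, sizeof(mask)) != 0)
        return -1;

    snprintf(path, sizeof(path), "/proc/irq/%u/smp_affinity", irq);

    f = fopen(path, "w");
    if (!f)
        return -1;

    ok = fputs(mask, f) >= 0;
    if (fclose(f) != 0)
        ok = 0;

    return ok ? 0 : -1;
}
```

On a forwarding box with two NICs, one plausible setup is to call set_irq_affinity() once per NIC interrupt with a different CPU each time, so that each packet is received and transmitted on the CPU that took the interrupt.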


> +};
> +
> +static struct cpu_flow flow[CONFIG_NR_RPS_MAP_LOOPS][NR_CPUS];

Adding a [NR_CPUS] array anywhere in the Linux kernel is absolutely not
allowed in 2012.
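
The objection is presumably that NR_CPUS is a compile-time maximum that can be far larger than the number of CPUs actually present. In the kernel, per-CPU data would normally go through the per-CPU facilities (DEFINE_PER_CPU, alloc_percpu) instead. As a userspace analogy only, and not kernel code, the sketch below sizes the table at run time from the configured CPU count. struct cpu_flow_stat and both helper names are hypothetical; the fields of the patch's struct cpu_flow are not shown in this message. _SC_NPROCESSORS_CONF is a common extension and not strictly POSIX.

```c
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical per-CPU record, standing in for the patch's struct. */
struct cpu_flow_stat {
    unsigned long packets;
};

/* Number of CPUs configured on this system, or -1 if unknown. */
static long configured_cpus(void)
{
    return sysconf(_SC_NPROCESSORS_CONF);
}

/*
 * Allocate one zeroed record per CPU instead of a fixed-size array.
 * Returns NULL if ncpus is not positive or the allocation fails.
 * The caller releases the table with free().
 */
static struct cpu_flow_stat *alloc_flow_stats(long ncpus)
{
    if (ncpus <= 0)
        return NULL;

    return calloc((size_t)ncpus, sizeof(struct cpu_flow_stat));
}
```

A caller would typically do alloc_flow_stats(configured_cpus()) once at start-up, so memory use follows the real machine rather than a build-time ceiling.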


> +/*
> + * We've got CONFIG_SMP to do RPS, so only arch define is needed here to access
> + * sibling specific information.
> + */
> +#if defined(CONFIG_MIPS)

Adding a CONFIG_somearch test in net/core/dev.c is not allowed.



