Date:	Sun, 29 Dec 2013 18:11:54 +0100
From:	Nicholas Mc Guire <der.herr@...r.at>
To:	linux-rt-users@...r.kernel.org
Cc:	Sami Pietikainen <Sami.Pietikainen@...ice.com>,
	Jouko Haapaluoma <jouko.haapaluoma@...ice.com>,
	Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
	LKML <linux-kernel@...r.kernel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Steven Rostedt <rostedt@...dmis.org>
Subject: [PATCH RT] add missing local serialization in ip_output.c


In response to the oops in ip_output.c:ip_send_unicast_reply under high
network load with CONFIG_PREEMPT_RT_FULL=y, reported by Sami Pietikainen
<Sami.Pietikainen@...ice.com>, this patch adds local serialization in
ip_send_unicast_reply.

from ip_output.c:
/*
 *      Generic function to send a packet as reply to another packet.
 *      Used to send some TCP resets/acks so far.
 *
 *      Use a fake percpu inet socket to avoid false sharing and contention.
 */
static DEFINE_PER_CPU(struct inet_sock, unicast_sock) = {
...

which was added in commit be9f4a44 in linux-stable. The log of that commit,
which introduced the PER_CPU unicast_sock, states:
<snip>
commit be9f4a44e7d41cee50ddb5f038fc2391cbbb4046
Author: Eric Dumazet <edumazet@...gle.com>
Date:   Thu Jul 19 07:34:03 2012 +0000

    ipv4: tcp: remove per net tcp_sock
    
    tcp_v4_send_reset() and tcp_v4_send_ack() use a single socket
    per network namespace.
    
    This leads to bad behavior on multiqueue NICS, because many cpus
    contend for the socket lock and once socket lock is acquired, extra
    false sharing on various socket fields slow down the operations.
    
    To better resist to attacks, we use a percpu socket. Each cpu can
    run without contention, using appropriate memory (local node)
<snip>

The per-CPU socket thus assumes exclusive access, serialized per CPU. The
use of get_cpu_light, introduced in
net-use-cpu-light-in-ip-send-unicast-reply.patch, which dropped the
preempt_disable in favor of a migrate_disable, is therefore probably wrong:
it only handles referential consistency, not serialization. To avoid a
preempt_disable here, a local lock is needed.
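
The distinction between referential consistency and serialization can be
illustrated with a small userspace model. This is only a sketch and not
kernel code: the names struct slot, worker and run_model are hypothetical,
each slot stands in for one CPU's unicast_sock, and a pthread mutex plays
the role of the local lock. Every task keeps the same slot pointer for its
whole lifetime (the analogue of migrate_disable), yet two tasks share each
slot, so only the lock makes the write-then-read sequence exclusive.

```c
#include <assert.h>
#include <pthread.h>

#define NSLOTS     2        /* stand-in for the number of CPUs */
#define PER_SLOT   2        /* tasks that can interleave on one "CPU" */
#define ITERATIONS 100000

/* Hypothetical stand-in for the per-CPU unicast_sock. */
struct slot {
	pthread_mutex_t lock;   /* plays the role of the local lock */
	volatile long owner;    /* scratch field, written then read back */
	long uses;              /* completed critical sections */
	long clobbered;         /* times another task changed owner under us */
};

static struct slot slots[NSLOTS];

struct task {
	long id;
	struct slot *slot;      /* fixed for the task's lifetime: no migration */
};

static void *worker(void *arg)
{
	struct task *t = arg;
	struct slot *s = t->slot;

	for (int i = 0; i < ITERATIONS; i++) {
		pthread_mutex_lock(&s->lock);   /* analogue of get_locked_var() */
		s->owner = t->id;               /* fill in the shared scratch state */
		if (s->owner != t->id)          /* later use of that state */
			s->clobbered++;
		s->uses++;
		pthread_mutex_unlock(&s->lock); /* analogue of put_locked_var() */
	}
	return NULL;
}

/* Runs the model; returns how often a task saw its slot clobbered. */
long run_model(long *total_uses)
{
	pthread_t threads[NSLOTS * PER_SLOT];
	struct task tasks[NSLOTS * PER_SLOT];
	long clobbered = 0;

	*total_uses = 0;
	for (int i = 0; i < NSLOTS; i++) {
		pthread_mutex_init(&slots[i].lock, NULL);
		slots[i].owner = -1;
		slots[i].uses = 0;
		slots[i].clobbered = 0;
	}
	for (int i = 0; i < NSLOTS * PER_SLOT; i++) {
		tasks[i].id = i;
		tasks[i].slot = &slots[i % NSLOTS];
		int rc = pthread_create(&threads[i], NULL, worker, &tasks[i]);
		assert(rc == 0);
		(void)rc;
	}
	for (int i = 0; i < NSLOTS * PER_SLOT; i++)
		pthread_join(threads[i], NULL);
	for (int i = 0; i < NSLOTS; i++) {
		clobbered += slots[i].clobbered;
		*total_uses += slots[i].uses;
		pthread_mutex_destroy(&slots[i].lock);
	}
	return clobbered;
}
```

With the lock in place, run_model reports no clobbered slots and the use
count matches the number of iterations. If the lock/unlock pair is removed,
each task still dereferences a valid slot, but the counters are no longer
reliable, which is the situation described above for migrate_disable alone.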

Fix:
 * add a local lock
 * and re-introduce local serialization
 
Tested on x86 with high network load using the testcase from Sami Pietikainen:
  while : ; do wget -O - ftp://LOCAL_SERVER/empty_file > /dev/null 2>&1; done

Link: http://www.spinics.net/lists/linux-rt-users/msg11007.html
Signed-off-by: Nicholas Mc Guire <der.herr@...r.at>
---
 net/ipv4/ip_output.c |    7 ++++++-
 1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index e9fa68c..1e68f65 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -79,6 +79,7 @@
 #include <linux/mroute.h>
 #include <linux/netlink.h>
 #include <linux/tcp.h>
+#include <linux/locallock.h>
 
 int sysctl_ip_default_ttl __read_mostly = IPDEFTTL;
 EXPORT_SYMBOL(sysctl_ip_default_ttl);
@@ -1468,6 +1469,9 @@ static DEFINE_PER_CPU(struct inet_sock, unicast_sock) = {
 	.uc_ttl		= -1,
 };
 
+/* serialize concurrent calls on the same CPU to ip_send_unicast_reply */
+static DEFINE_LOCAL_IRQ_LOCK(unicast_lock);
+
 void ip_send_unicast_reply(struct net *net, struct sk_buff *skb, __be32 daddr,
 			   __be32 saddr, const struct ip_reply_arg *arg,
 			   unsigned int len)
@@ -1506,7 +1510,7 @@ void ip_send_unicast_reply(struct net *net, struct sk_buff *skb, __be32 daddr,
 		return;
 
 	get_cpu_light();
-	inet = &__get_cpu_var(unicast_sock);
+	inet = &get_locked_var(unicast_lock, unicast_sock);
 
 	inet->tos = arg->tos;
 	sk = &inet->sk;
@@ -1530,6 +1534,7 @@ void ip_send_unicast_reply(struct net *net, struct sk_buff *skb, __be32 daddr,
 		ip_push_pending_frames(sk, &fl4);
 	}
 
+	put_locked_var(unicast_lock, unicast_sock);
 	put_cpu_light();
 
 	ip_rt_put(rt);
-- 
1.7.2.5

