lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <20190906172941.25136-1-simon.horman@netronome.com>
Date:   Fri,  6 Sep 2019 19:29:41 +0200
From:   Simon Horman <simon.horman@...ronome.com>
To:     David Miller <davem@...emloft.net>
Cc:     Jakub Kicinski <jakub.kicinski@...ronome.com>,
        netdev@...r.kernel.org, oss-drivers@...ronome.com,
        Fred Lotter <frederik.lotter@...ronome.com>,
        Simon Horman <simon.horman@...ronome.com>
Subject: [PATCH net] nfp: flower: cmsg rtnl locks can timeout reify messages

From: Fred Lotter <frederik.lotter@...ronome.com>

Flower control message replies are handled in different locations. The truly
high priority replies are handled in the BH (tasklet) context, while the
remaining replies are handled in a predefined Linux work queue. The work
queue handler orders replies into high and low priority groups, and always
start servicing the high priority replies within the received batch first.

Reply Type:			Rtnl Lock:	Handler:

CMSG_TYPE_PORT_MOD		no		BH tasklet (mtu)
CMSG_TYPE_TUN_NEIGH		no		BH tasklet
CMSG_TYPE_FLOW_STATS		no		BH tasklet
CMSG_TYPE_PORT_REIFY		no		WQ high
CMSG_TYPE_PORT_MOD		yes		WQ high (link/mtu)
CMSG_TYPE_MERGE_HINT		yes		WQ low
CMSG_TYPE_NO_NEIGH		no		WQ low
CMSG_TYPE_ACTIVE_TUNS		no		WQ low
CMSG_TYPE_QOS_STATS		no		WQ low
CMSG_TYPE_LAG_CONFIG		no		WQ low

A subset of control messages can block waiting for an rtnl lock (from both
work queue priority groups). The rtnl lock is heavily contended for by
external processes such as systemd-udevd, systemd-network and libvirtd,
especially during netdev creation, such as when flower VFs and representors
are instantiated.

Kernel netlink instrumentation shows that external processes (such as
systemd-udevd) often use successive rtnl_trylock() sequences, which can result
in an rtnl_lock() blocked control message to starve for longer periods of time
during rtnl lock contention, i.e. netdev creation.

In the current design a single blocked control message will block the entire
work queue (both priorities), and introduce a latency which is
nondeterministic and dependent on system wide rtnl lock usage.

In some extreme cases, one blocked control message at exactly the wrong time,
just before the maximum number of VFs are instantiated, can block the work
queue for long enough to prevent VF representor REIFY replies from getting
handled in time for the 40ms timeout.

The firmware will deliver the total maximum number of REIFY message replies in
around 300us.

Only REIFY and MTU update messages require replies within a timeout period (of
40ms). The MTU-only updates are already done directly in the BH (tasklet)
handler.

Move the REIFY handler down into the BH (tasklet) in order to resolve timeouts
caused by a blocked work queue waiting on rtnl locks.

Signed-off-by: Fred Lotter <frederik.lotter@...ronome.com>
Signed-off-by: Simon Horman <simon.horman@...ronome.com>
---
 drivers/net/ethernet/netronome/nfp/flower/cmsg.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

 Hi Dave,

 I am handling maintenance of the nfp diver in Jakub's absence while he is
 on vacation this week and next, and I am sending this patchset in that
 capacity.

diff --git a/drivers/net/ethernet/netronome/nfp/flower/cmsg.c b/drivers/net/ethernet/netronome/nfp/flower/cmsg.c
index d5bbe3d6048b..05981b54eaab 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/cmsg.c
+++ b/drivers/net/ethernet/netronome/nfp/flower/cmsg.c
@@ -260,9 +260,6 @@ nfp_flower_cmsg_process_one_rx(struct nfp_app *app, struct sk_buff *skb)
 
 	type = cmsg_hdr->type;
 	switch (type) {
-	case NFP_FLOWER_CMSG_TYPE_PORT_REIFY:
-		nfp_flower_cmsg_portreify_rx(app, skb);
-		break;
 	case NFP_FLOWER_CMSG_TYPE_PORT_MOD:
 		nfp_flower_cmsg_portmod_rx(app, skb);
 		break;
@@ -328,8 +325,7 @@ nfp_flower_queue_ctl_msg(struct nfp_app *app, struct sk_buff *skb, int type)
 	struct nfp_flower_priv *priv = app->priv;
 	struct sk_buff_head *skb_head;
 
-	if (type == NFP_FLOWER_CMSG_TYPE_PORT_REIFY ||
-	    type == NFP_FLOWER_CMSG_TYPE_PORT_MOD)
+	if (type == NFP_FLOWER_CMSG_TYPE_PORT_MOD)
 		skb_head = &priv->cmsg_skbs_high;
 	else
 		skb_head = &priv->cmsg_skbs_low;
@@ -368,6 +364,10 @@ void nfp_flower_cmsg_rx(struct nfp_app *app, struct sk_buff *skb)
 	} else if (cmsg_hdr->type == NFP_FLOWER_CMSG_TYPE_TUN_NEIGH) {
 		/* Acks from the NFP that the route is added - ignore. */
 		dev_consume_skb_any(skb);
+	} else if (cmsg_hdr->type == NFP_FLOWER_CMSG_TYPE_PORT_REIFY) {
+		/* Handle REIFY acks outside wq to prevent RTNL conflict. */
+		nfp_flower_cmsg_portreify_rx(app, skb);
+		dev_consume_skb_any(skb);
 	} else {
 		nfp_flower_queue_ctl_msg(app, skb, cmsg_hdr->type);
 	}
-- 
2.11.0

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ