linux-kernel - [PATCH 3.16 210/328] IB/ipoib: Avoid a race condition between start_xmit and cm_rep

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <lsq.1544392233.367793181@decadent.org.uk>
Date:   Sun, 09 Dec 2018 21:50:33 +0000
From:   Ben Hutchings <ben@...adent.org.uk>
To:     linux-kernel@...r.kernel.org, stable@...r.kernel.org
CC:     akpm@...ux-foundation.org, "Ira Weiny" <ira.weiny@...el.com>,
        "Aaron Knister" <aaron.s.knister@...a.gov>,
        "Jason Gunthorpe" <jgg@...lanox.com>
Subject: [PATCH 3.16 210/328] IB/ipoib: Avoid a race condition between
 start_xmit and cm_rep_handler

3.16.62-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Aaron Knister <aaron.s.knister@...a.gov>

commit 816e846c2eb9129a3e0afa5f920c8bbc71efecaa upstream.

Inside of start_xmit() the call to check if the connection is up and the
queueing of the packets for later transmission is not atomic which leaves
a window where cm_rep_handler can run, set the connection up, dequeue
pending packets and leave the subsequently queued packets by start_xmit()
sitting on neigh->queue until they're dropped when the connection is torn
down. This only applies to connected mode. These dropped packets can
really upset TCP, for example, and cause multi-minute delays in
transmission for open connections.

Here's the code in start_xmit where we check to see if the connection is
up:

       if (ipoib_cm_get(neigh)) {
               if (ipoib_cm_up(neigh)) {
                       ipoib_cm_send(dev, skb, ipoib_cm_get(neigh));
                       goto unref;
               }
       }

The race occurs if cm_rep_handler execution occurs after the above
connection check (specifically if it gets to the point where it acquires
priv->lock to dequeue pending skb's) but before the below code snippet in
start_xmit where packets are queued.

       if (skb_queue_len(&neigh->queue) < IPOIB_MAX_PATH_REC_QUEUE) {
               push_pseudo_header(skb, phdr->hwaddr);
               spin_lock_irqsave(&priv->lock, flags);
               __skb_queue_tail(&neigh->queue, skb);
               spin_unlock_irqrestore(&priv->lock, flags);
       } else {
               ++dev->stats.tx_dropped;
               dev_kfree_skb_any(skb);
       }

The patch acquires the netif tx lock in cm_rep_handler for the section
where it sets the connection up and dequeues and retransmits deferred
skb's.

Fixes: 839fcaba355a ("IPoIB: Connected mode experimental support")
Signed-off-by: Aaron Knister <aaron.s.knister@...a.gov>
Tested-by: Ira Weiny <ira.weiny@...el.com>
Reviewed-by: Ira Weiny <ira.weiny@...el.com>
Signed-off-by: Jason Gunthorpe <jgg@...lanox.com>
Signed-off-by: Ben Hutchings <ben@...adent.org.uk>
---
 drivers/infiniband/ulp/ipoib/ipoib_cm.c | 2 ++
 1 file changed, 2 insertions(+)

--- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
@@ -996,12 +996,14 @@ static int ipoib_cm_rep_handler(struct i

 	skb_queue_head_init(&skqueue);

+	netif_tx_lock_bh(p->dev);
 	spin_lock_irq(&priv->lock);
 	set_bit(IPOIB_FLAG_OPER_UP, &p->flags);
 	if (p->neigh)
 		while ((skb = __skb_dequeue(&p->neigh->queue)))
 			__skb_queue_tail(&skqueue, skb);
 	spin_unlock_irq(&priv->lock);
+	netif_tx_unlock_bh(p->dev);

 	while ((skb = __skb_dequeue(&skqueue))) {
 		skb->dev = p->dev;