netdev - Re: [WIP] net+mlx4: auto doorbell


Message-ID: <1480611857.18162.319.camel@edumazet-glaptop3.roam.corp.google.com>
Date:   Thu, 01 Dec 2016 09:04:17 -0800
From:   Eric Dumazet <eric.dumazet@...il.com>
To:     Jesper Dangaard Brouer <brouer@...hat.com>
Cc:     Saeed Mahameed <saeedm@....mellanox.co.il>,
        Rick Jones <rick.jones2@....com>,
        Linux Netdev List <netdev@...r.kernel.org>,
        Saeed Mahameed <saeedm@...lanox.com>,
        Tariq Toukan <tariqt@...lanox.com>
Subject: Re: [WIP] net+mlx4: auto doorbell

On Thu, 2016-12-01 at 17:04 +0100, Jesper Dangaard Brouer wrote:

> I think you misunderstood my concept[1].  I don't want to stop the
> queue. The new __QUEUE_STATE_FLUSH_NEEDED does not stop the queue; it
> just indicates that someone needs to flush/ring the doorbell.  Maybe it
> needs another name, because it also indicates that the driver can see
> that its TX queue is so busy that we don't need to call it immediately.
> The qdisc layer can then choose to enqueue instead of doing direct xmit.

But the driver's ndo_start_xmit() does not have a pointer to the qdisc.

Also the concept of 'queue busy' just because we queued one packet is a
bit flaky.

> 
> When the qdisc layer or trafgen/af_packet sees this indication, it knows
> it should/must flush the queue when it doesn't have more work left.  Perhaps
> through net_tx_action(), by registering itself and e.g. if qdisc_run()
> is called and the queue is empty then check if the queue needs a flush. I
> would also allow the driver to flush and clear this bit.

net_tx_action() is not normally called, unless the BQL limit is hit and/or
a qdisc with throttling (HTB, TBF, FQ, ...) is in use.

> 
> I just see it as an extension of your solution, as we still need the
> driver to figure out when the doorbell/flush can be delayed.
> p.s. don't be discouraged by this feedback, I'm just very excited and
> happy that you are working on a solution in this area. This is a
> problem area that I've not been able to solve myself for the last
> approx 2 years. Keep up the good work!

Do not worry, I appreciate the feedback ;)

BTW, if you are doing tests on mlx4 40Gbit, would you check the
following quick/dirty hack, using lots of low-rate flows?

mlx4 has a really hard time transmitting small TSO packets (2 or 3 MSS).

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
index 12ea3405f442..96940666abd3 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
@@ -2631,6 +2631,11 @@ static void mlx4_en_del_vxlan_port(struct  net_device *dev,
        queue_work(priv->mdev->workqueue, &priv->vxlan_del_task);
 }
 
+static unsigned int mlx4_gso_segs_min = 4; /* TSO packets with fewer than 4 segments are segmented in software */
+module_param_named(mlx4_gso_segs_min, mlx4_gso_segs_min, uint, 0644);
+MODULE_PARM_DESC(mlx4_gso_segs_min, "threshold for software segmentation of small TSO packets");
+
+
 static netdev_features_t mlx4_en_features_check(struct sk_buff *skb,
                                                struct net_device *dev,
                                                netdev_features_t features)
@@ -2651,6 +2656,8 @@ static netdev_features_t mlx4_en_features_check(struct sk_buff *skb,
                    (udp_hdr(skb)->dest != priv->vxlan_port))
                        features &= ~(NETIF_F_CSUM_MASK | NETIF_F_GSO_MASK);
        }
+       if (skb_is_gso(skb) && skb_shinfo(skb)->gso_segs < mlx4_gso_segs_min)
+               features &= ~NETIF_F_GSO_MASK;
 
        return features;
 }
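
For readers who want to see the effect of the features_check hook without
a kernel tree, here is a minimal userspace sketch of the idea. It is not
the driver code: the feature bit values, struct pkt, and the names
small_tso_features_check() and gso_segs_min are hypothetical stand-ins
for netdev_features_t, the skb GSO fields, and the module parameter. It
only illustrates that clearing the GSO bits for a small TSO packet is
what makes the stack fall back to software segmentation for that packet.

```c
#include <stdint.h>

/* Hypothetical feature bits; the real NETIF_F_* values differ. */
#define F_CSUM      (1ULL << 0)
#define F_SG        (1ULL << 1)
#define F_TSO       (1ULL << 2)
#define F_TSO6      (1ULL << 3)
#define F_GSO_MASK  (F_TSO | F_TSO6)

/* Hypothetical stand-in for the GSO fields of an skb. */
struct pkt {
	uint16_t gso_size;	/* nonzero means "this is a GSO packet" */
	uint16_t gso_segs;	/* number of segments the packet carries */
};

/* Stand-in for the module parameter in the patch above. */
static unsigned int gso_segs_min = 4;

/*
 * Return the feature set the device may use for this packet.
 * For a GSO packet with fewer than gso_segs_min segments, all GSO
 * bits are cleared, so the caller must segment it in software.
 */
static uint64_t small_tso_features_check(const struct pkt *p,
					 uint64_t features)
{
	if (p->gso_size && p->gso_segs < gso_segs_min)
		features &= ~F_GSO_MASK;
	return features;
}
```

With the default threshold of 4, a 2-segment TSO packet loses the GSO
bits and keeps everything else, while a 4-segment packet and a non-GSO
packet pass through unchanged.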