Message-Id: <20070515.203953.03891900.taka@valinux.co.jp>
Date:	Tue, 15 May 2007 20:39:53 +0900 (JST)
From:	Hirokazu Takahashi <taka@...inux.co.jp>
To:	herbert@...dor.apana.org.au
Cc:	shemminger@...ux-foundation.org, netdev@...r.kernel.org,
	kaber@...sh.net, davem@...emloft.net, linux-net@...r.kernel.org
Subject: [PATCH 1/2] tbf scheduler: TSO support (update 2)

Hi,

> Hirokazu Takahashi <taka@...inux.co.jp> wrote:
> >
> > Uhh, you are right.
> > skb_shinfo(skb)->gso_segs and skb_shinfo(skb)->gso_size should be used.
> 
> Actually forget about gso_segs, it's only filled in for TCP.

I realized it is really hard to determine the actual size of each
packet that will be generated from a TSO packet, which is the size
needed to account for the traffic accurately.
There isn't enough information in the socket buffer to determine
the size of the headers: gso_size only gives the maximum length of
a segment without any headers, and the other members don't help
either.

              split into
 TSO packet  ----------->  packets after being split
+----------+                +----------+
| headers  |                | headers  |
+----------+                +----------+   ----
| segment1 |                | segment1 |     A
|          |                |          |     | gso_size
|          |                |          |     V
+----------+                +----------+   ----
| segment2 |        
|          |                +----------+
|          |                | headers  |
+----------+                +----------+
| segment3 |                | segment2 |
|          |                |          |
+----------+                |          |
                            +----------+

                            +----------+
                            | headers  |
                            +----------+
                            | segment3 |
                            |          |
                            +----------+
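
As a rough illustration of the gap (this is not part of the patch), the
user-space sketch below computes how many bytes go on the wire after the
split. The helper name wire_bytes() and the hdr_len parameter are
assumptions made for the example: the qdisc does not know hdr_len, which
is exactly the problem described above.

```c
#include <assert.h>

/*
 * Hypothetical helper: total bytes on the wire once a TSO packet of
 * skb_len bytes (one header copy plus all payload) has been split.
 * hdr_len is assumed to be known here; the scheduler cannot assume that.
 */
static unsigned int wire_bytes(unsigned int skb_len, unsigned int hdr_len,
			       unsigned int gso_size)
{
	unsigned int payload, nsegs;

	assert(gso_size > 0 && skb_len >= hdr_len);
	payload = skb_len - hdr_len;
	/* each packet carries at most gso_size payload bytes: round up */
	nsegs = (payload + gso_size - 1) / gso_size;
	/* every packet after the split gets its own copy of the headers */
	return payload + nsegs * hdr_len;
}
```

With made-up numbers, wire_bytes(4410, 66, 1448) is 4542 while skb->len is
only 4410, so the difference is one header copy for each packet beyond
the first.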

So I decided to keep the traffic calculation simple:
   - assume all packets generated from the same TSO packet have
     the same length.
   - ignore the length of the additional headers which will be
     added automatically to each packet.

It seems to work pretty well for controlling bandwidth, as I expected,
but I'm not sure everybody will be satisfied with it.
Do you think this approximate calculation is good enough?
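
The two assumptions above can be sketched in user space as follows. The
arithmetic mirrors the segs/len expressions in the patch below, but
struct seg_estimate and estimate_segments() are names invented for this
example and do not exist in the kernel.

```c
#include <assert.h>

struct seg_estimate {
	unsigned int segs;	/* estimated number of packets after the split */
	unsigned int len;	/* assumed length of each packet, rounded up */
};

/* Hypothetical helper mirroring the estimate used in the patch. */
static struct seg_estimate estimate_segments(unsigned int skb_len,
					     unsigned int gso_segs,
					     unsigned int gso_size)
{
	struct seg_estimate e;

	assert(skb_len > 0);
	if (gso_segs)			/* filled in: use it directly */
		e.segs = gso_segs;
	else if (gso_size)		/* GSO packet without gso_segs */
		e.segs = skb_len / gso_size + 1;
	else				/* ordinary packet */
		e.segs = 1;
	/* equal share per segment, rounded up */
	e.len = (skb_len - 1) / e.segs + 1;
	return e;
}
```

For a 4410-byte skb with gso_size 1448 and gso_segs unset this gives 4
segments of 1103 bytes; with gso_segs set to 3 it gives 3 segments of
1470 bytes. So the fallback can count one segment more than the real
split, which is part of the approximation.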


I also realized the CBQ scheduler has to be fixed to handle large
TSO packets, or it may cause an Oops. The next mail contains
the patch for CBQ.



--- linux-2.6.21/net/sched/sch_tbf.c.ORG	2007-05-08 20:59:28.000000000 +0900
+++ linux-2.6.21/net/sched/sch_tbf.c	2007-05-15 19:59:34.000000000 +0900
@@ -9,7 +9,8 @@
  * Authors:	Alexey Kuznetsov, <kuznet@....inr.ac.ru>
  *		Dmitry Torokhov <dtor@...l.ru> - allow attaching inner qdiscs -
  *						 original idea by Martin Devera
- *
+ * Fixes:
+ * 		Hirokazu Takahashi <taka@...inux.co.jp> : TSO support
  */
 
 #include <linux/module.h>
@@ -138,8 +139,12 @@ static int tbf_enqueue(struct sk_buff *s
 {
 	struct tbf_sched_data *q = qdisc_priv(sch);
 	int ret;
+	//unsigned int segs = skb_shinfo(skb)->gso_segs ? : 1;
+	unsigned int segs = skb_shinfo(skb)->gso_segs ? :
+	  skb_shinfo(skb)->gso_size ? skb->len/skb_shinfo(skb)->gso_size + 1 : 1;
+	unsigned int len = (skb->len - 1)/segs + 1;
 
-	if (skb->len > q->max_size) {
+	if (len > q->max_size) {
 		sch->qstats.drops++;
 #ifdef CONFIG_NET_CLS_POLICE
 		if (sch->reshape_fail == NULL || sch->reshape_fail(skb, sch))
@@ -204,22 +209,41 @@ static struct sk_buff *tbf_dequeue(struc
 		psched_time_t now;
 		long toks, delay;
 		long ptoks = 0;
-		unsigned int len = skb->len;
+		/*
+		 * Note: a TSO packet can be larger than the actual mtu.
+		 * Such a packet should be treated as a packet containing
+		 * several ordinary ones. In this case, tokens should
+		 * be held until they cover the length of all of them.
+		 *
+		 * To simplify, we assume each segment in a TSO packet
+		 * has the same length, though this may not be true,
+		 * and we ignore the length of the headers which will be
+		 * added to each segment when the TSO packet is split.
+		 *
+		 * The number of segments is estimated from the segment
+		 * size of the TSO packet if gso_segs isn't set.
+		 */
+		unsigned int segs = skb_shinfo(skb)->gso_segs ? :
+		  skb_shinfo(skb)->gso_size ? skb->len/skb_shinfo(skb)->gso_size + 1 : 1;
+		unsigned int len = (skb->len - 1)/segs + 1;
+		unsigned int expect = L2T(q, len) * segs;
+		long max_toks = max(expect, q->buffer);
+
 
 		PSCHED_GET_TIME(now);
 
-		toks = PSCHED_TDIFF_SAFE(now, q->t_c, q->buffer);
+		toks = PSCHED_TDIFF_SAFE(now, q->t_c, max_toks);
 
 		if (q->P_tab) {
 			ptoks = toks + q->ptokens;
-			if (ptoks > (long)q->mtu)
-				ptoks = q->mtu;
-			ptoks -= L2T_P(q, len);
+			if (ptoks > (long)(q->mtu * segs))
+				ptoks = q->mtu * segs;
+			ptoks -= L2T_P(q, len) * segs;
 		}
 		toks += q->tokens;
-		if (toks > (long)q->buffer)
-			toks = q->buffer;
-		toks -= L2T(q, len);
+		if (toks > max_toks)
+			toks = max_toks;
+		toks -= expect;
 
 		if ((toks|ptoks) >= 0) {
 			q->t_c = now;
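
The token arithmetic in the dequeue hunk above can be tried out in user
space with the sketch below. It is only an illustration under stated
assumptions: len_to_ticks() is a hypothetical linear stand-in for L2T()
(the real macro uses a rate table), the clamp of the elapsed time is my
reading of what PSCHED_TDIFF_SAFE() does with its bound, and the
peak-rate (ptoks) path is left out.

```c
#include <assert.h>

/* Hypothetical stand-in for L2T(): cost of len bytes, linear for simplicity. */
static long len_to_ticks(unsigned int len, unsigned int ticks_per_byte)
{
	return (long)len * (long)ticks_per_byte;
}

/*
 * Tokens left after sending a packet estimated as segs segments of len
 * bytes each. A negative result means the packet has to wait.
 */
static long tokens_after_send(long tokens, long elapsed, long buffer,
			      unsigned int len, unsigned int segs,
			      unsigned int ticks_per_byte)
{
	long expect, max_toks, toks;

	assert(segs > 0);
	/* cost of the whole TSO packet: per-segment cost times segments */
	expect = len_to_ticks(len, ticks_per_byte) * (long)segs;
	/* let the bucket grow beyond buffer when one packet needs more */
	max_toks = expect > buffer ? expect : buffer;

	toks = elapsed < max_toks ? elapsed : max_toks;	/* bounded time diff */
	toks += tokens;
	if (toks > max_toks)
		toks = max_toks;
	return toks - expect;
}
```

With buffer 3000 and a packet estimated as 3 segments of 1470 bytes at one
tick per byte, the cost is 4410. If the bucket were still capped at
buffer, the result could never reach zero and the packet would never be
sent, which is why the cap is raised to max_toks.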