Message-ID: <20080806224248.18266k9ahc5nkk8w@hayate.ip6>
Date: Wed, 06 Aug 2008 22:42:48 +0300
From: "Jussi Kivilinna" <jussi.kivilinna@...et.fi>
To: "Jarek Poplawski" <jarkao2@...il.com>
Cc: "David Miller" <davem@...emloft.net>, kaber@...sh.net,
netdev@...r.kernel.org
Subject: qdisc_enqueue, NET_XMIT_SUCCESS and kfree_skb (Was: Re: [PATCH
take 2] net_sched: Add qdisc __NET_XMIT_BYPASS flag)
Quoting "Jarek Poplawski" <jarkao2@...il.com>:
>>
>> How about making skb shared before passing into qdisc tree?
>> That would make skb usage safe after qdisc enqueues.
>
> It's a bit costly (atomics), so there should be a good reason for this.
> It should be first checked if there is real danger. And if it's only
> for more exact stats, I'm not sure it's worth of it.
>
OK, I went through all the enqueue (and requeue) functions looking for any
case that frees the skb and still returns a full NET_XMIT_SUCCESS without
the new flags, and found only one, in sch_blackhole (qdisc_drop + return
NET_XMIT_SUCCESS). This could be fixed by delaying the kfree_skb until the
exit of qdisc_enqueue_root; here's a (completely untested) patch:
---
diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index a7abfda..ca083c6 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -175,6 +175,7 @@ struct tcf_proto
 struct qdisc_skb_cb {
 	unsigned int pkt_len;
+	__u8 delayed_enqueue_free:1;
 	char data[];
 };
@@ -364,10 +365,23 @@ static inline int qdisc_enqueue(struct sk_buff *skb, struct Qdisc *sch)
 	return sch->enqueue(skb, sch);
 }
+static inline void qdisc_delayed_kfree_skb(struct sk_buff *skb)
+{
+	qdisc_skb_cb(skb)->delayed_enqueue_free = 1;
+}
+
 static inline int qdisc_enqueue_root(struct sk_buff *skb, struct Qdisc *sch)
 {
+	int ret;
+
+	qdisc_skb_cb(skb)->delayed_enqueue_free = 0;
 	qdisc_skb_cb(skb)->pkt_len = skb->len;
-	return qdisc_enqueue(skb, sch) & NET_XMIT_MASK;
+	ret = qdisc_enqueue(skb, sch);
+
+	if (ret == NET_XMIT_SUCCESS && qdisc_skb_cb(skb)->delayed_enqueue_free)
+		kfree_skb(skb);
+
+	return ret & NET_XMIT_MASK;
 }
 static inline int __qdisc_enqueue_tail(struct sk_buff *skb, struct Qdisc *sch,
diff --git a/net/sched/sch_blackhole.c b/net/sched/sch_blackhole.c
index 507fb48..13230bd 100644
--- a/net/sched/sch_blackhole.c
+++ b/net/sched/sch_blackhole.c
@@ -19,7 +19,8 @@
 static int blackhole_enqueue(struct sk_buff *skb, struct Qdisc *sch)
 {
-	qdisc_drop(skb, sch);
+	qdisc_delayed_kfree_skb(skb);
+	sch->qstats.drops++;
 	return NET_XMIT_SUCCESS;
 }
---
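To make the intended control flow of the patch easier to follow, here is a
minimal userspace sketch of the same idea. Everything in it is a hypothetical
stand-in (the mock_* types and functions and the MOCK_XMIT_* constants are not
kernel definitions); the "free" only sets a flag so the behaviour can be
observed. It models a blackhole-style leaf that marks the skb instead of
freeing it, and a root entry point that frees the marked skb after the whole
enqueue chain has returned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel definitions. */
enum { MOCK_XMIT_SUCCESS = 0, MOCK_XMIT_DROP = 1, MOCK_XMIT_MASK = 0xFFFF };

struct mock_skb {
	unsigned int len;
	unsigned int pkt_len;		/* mirrors qdisc_skb_cb.pkt_len */
	bool delayed_enqueue_free;	/* mirrors the proposed flag */
	bool freed;			/* set instead of really freeing */
};

struct mock_qdisc {
	int (*enqueue)(struct mock_skb *skb, struct mock_qdisc *sch);
	unsigned int drops;
};

static void mock_kfree_skb(struct mock_skb *skb)
{
	assert(!skb->freed);	/* catch a double free in the model */
	skb->freed = true;
}

/* Blackhole-style leaf: reports success but only marks the skb for freeing. */
static int mock_blackhole_enqueue(struct mock_skb *skb, struct mock_qdisc *sch)
{
	skb->delayed_enqueue_free = true;
	sch->drops++;
	return MOCK_XMIT_SUCCESS;
}

/* Root entry point: the only place where a marked skb is actually freed. */
static int mock_enqueue_root(struct mock_skb *skb, struct mock_qdisc *sch)
{
	int ret;

	skb->delayed_enqueue_free = false;
	skb->pkt_len = skb->len;
	ret = sch->enqueue(skb, sch);

	/* The skb stays valid for every qdisc on the path until this point. */
	if (ret == MOCK_XMIT_SUCCESS && skb->delayed_enqueue_free)
		mock_kfree_skb(skb);

	return ret & MOCK_XMIT_MASK;
}
```

In this model any parent sitting between the root and the leaf can still read
the skb after its child's enqueue returns, because the free happens only once
control is back in mock_enqueue_root.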
If this isn't a good way to solve this, the use of qdisc_pkt_len for stats
could be fixed either by passing a packet length pointer through the qdisc
tree, or by adding a new qdisc_pkt_len_diff and adding the difference in at
dequeue, as you said (though there the inner dequeue could return NULL and
the difference wouldn't be added after all, but well, it is just stats).
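As a rough illustration of the first alternative (a packet length pointer
passed through the qdisc tree), here is a userspace sketch. The three-argument
enqueue signature and all mock_* names are assumptions made up for this
example, not an existing kernel interface. The child writes the length it
accounted into the out-parameter, and the parent updates its own stats from
that value without touching the skb again.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only. */
enum { MOCK_XMIT_SUCCESS = 0, MOCK_XMIT_DROP = 1 };

struct mock_skb {
	unsigned int len;
};

struct mock_qdisc {
	int (*enqueue)(struct mock_skb *skb, struct mock_qdisc *sch,
		       unsigned int *pkt_len);
	struct mock_qdisc *child;
	unsigned long bytes;
	unsigned long packets;
};

/* Leaf: reports the accounted length, then consumes the skb. */
static int mock_leaf_enqueue(struct mock_skb *skb, struct mock_qdisc *sch,
			     unsigned int *pkt_len)
{
	assert(pkt_len != NULL);
	*pkt_len = skb->len;
	sch->bytes += *pkt_len;
	sch->packets++;
	skb->len = 0;	/* models the skb being gone after enqueue */
	return MOCK_XMIT_SUCCESS;
}

/* Parent: uses only *pkt_len for stats after the child has returned. */
static int mock_parent_enqueue(struct mock_skb *skb, struct mock_qdisc *sch,
			       unsigned int *pkt_len)
{
	int ret = sch->child->enqueue(skb, sch->child, pkt_len);

	if (ret == MOCK_XMIT_SUCCESS) {
		sch->bytes += *pkt_len;
		sch->packets++;
	}
	return ret;
}
```

The cost of this variant is a changed enqueue signature for every qdisc, which
is why the delayed-free approach touches less code.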
As I went through the code I found two cases where the skb pointer is used
after an inner enqueue returns a full NET_XMIT_SUCCESS (other than
qdisc_pkt_len for stats): HTB uses skb_is_gso(), and HFSC uses the packet
length for set_active(). HTB is trivial (for me) to fix, while HFSC isn't.
Because of the HFSC part, it would be easier for me to declare a full
NET_XMIT_SUCCESS a safe zone for the skb pointer.
- Jussi
PS. I noticed something fishy in HTB: it always returns NET_XMIT_DROP if
qdisc_enqueue doesn't return a full NET_XMIT_SUCCESS. Shouldn't it return the
return value from qdisc_enqueue? Same in HTB requeue. That can't be right,
right?
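To spell out the difference in question, here is a small sketch contrasting
the two behaviours. It is a model only, not HTB code: the function names and
the MOCK_XMIT_* constants are hypothetical, and the numeric values are
illustrative.

```c
#include <assert.h>

/* Hypothetical return codes for illustration; values are not authoritative. */
enum { MOCK_XMIT_SUCCESS = 0, MOCK_XMIT_DROP = 1, MOCK_XMIT_CN = 2 };

/* Behaviour described above: any non-success result is reported as a drop. */
static int mock_enqueue_always_drop(int child_ret)
{
	assert(child_ret == MOCK_XMIT_SUCCESS ||
	       child_ret == MOCK_XMIT_DROP ||
	       child_ret == MOCK_XMIT_CN);

	if (child_ret != MOCK_XMIT_SUCCESS)
		return MOCK_XMIT_DROP;
	return MOCK_XMIT_SUCCESS;
}

/* Suggested behaviour: hand the child's result back to the caller unchanged. */
static int mock_enqueue_propagate(int child_ret)
{
	assert(child_ret == MOCK_XMIT_SUCCESS ||
	       child_ret == MOCK_XMIT_DROP ||
	       child_ret == MOCK_XMIT_CN);

	return child_ret;
}
```

With the first variant a congestion-style result from the child is
indistinguishable from a drop by the time it reaches the caller; the second
keeps that information.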