Message-ID: <6AADFAC011213A4C87B956458587ADB401315127@dggemi507-mbx.china.huawei.com>
Date: Wed, 27 Dec 2017 08:02:59 +0000
From: "weiyongjun (A)" <weiyongjun1@...wei.com>
To: John Fastabend <john.fastabend@...il.com>,
Jamal Hadi Salim <jhs@...atatu.com>,
Cong Wang <xiyou.wangcong@...il.com>,
Jiri Pirko <jiri@...nulli.us>
CC: "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: RE: [PATCH net-next v2] net: sched: fix skb leak in
dev_requeue_skb()
On 12/27/2017 1:24 PM, John Fastabend wrote:
> On 12/24/2017 07:49 PM, Wei Yongjun wrote:
> > When dev_requeue_skb() is called with a bulked skb list, only the
> > first skb of the list is requeued to the qdisc layer, and the
> > others are leaked without being freed.
> >
> > TCP is broken by the skb leak, since the skbs that are never freed
> > are considered still in the host queue and are never retransmitted.
> > This happens when dev_requeue_skb() is called from qdisc_restart().
> > qdisc_restart
> > |-- dequeue_skb
> > |-- dev_requeue_skb() <-- skb may bluked
> >
> > Fix dev_requeue_skb() to requeue the full bulked list.
> >
> > Fixes: a53851e2c321 ("net: sched: explicit locking in gso_cpu fallback")
> > Signed-off-by: Wei Yongjun <weiyongjun1@...wei.com>
> > ---
> > v1 -> v2: add net-next prefix
> > ---
>
> First, thanks for tracking this down.
>
> > diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
> > index 981c08f..0df2dbf 100644
> > --- a/net/sched/sch_generic.c
> > +++ b/net/sched/sch_generic.c
> > @@ -111,10 +111,16 @@ static inline void
> qdisc_enqueue_skb_bad_txq(struct Qdisc *q,
> >
> > static inline int __dev_requeue_skb(struct sk_buff *skb, struct Qdisc *q)
> > {
> > - __skb_queue_head(&q->gso_skb, skb);
> > - q->qstats.requeues++;
> > - qdisc_qstats_backlog_inc(q, skb);
> > - q->q.qlen++; /* it's still part of the queue */
> > + while (skb) {
> > + struct sk_buff *next = skb->next;
> > +
> > + __skb_queue_tail(&q->gso_skb, skb);
>
> Was the change from __skb_queue_head to __skb_queue_tail here
> intentional? We should re-queue packets to the head of the list.
The skbs are dequeued from gso_skb in order, so if packets are re-queued
to the head of the list they will be sent in reverse order; I think
__skb_queue_tail is better here. Tcpdump shows those skbs as out-of-order
packets.
>
> > + q->qstats.requeues++;
> > + qdisc_qstats_backlog_inc(q, skb);
> > + q->q.qlen++; /* it's still part of the queue */
> > +
> > + skb = next;
> > + }
> > __netif_schedule(q);
> >
> > return 0;
> > @@ -124,13 +130,20 @@ static inline int dev_requeue_skb_locked(struct
> sk_buff *skb, struct Qdisc *q)
> > {
> > spinlock_t *lock = qdisc_lock(q);
> >
> > - spin_lock(lock);
> > - __skb_queue_tail(&q->gso_skb, skb);
> > - spin_unlock(lock);
> > + while (skb) {
> > + struct sk_buff *next = skb->next;
> > +
> > + spin_lock(lock);
>
> In this case I suspect it's better to move the lock to be around the
> while loop rather than grabbing and dropping it repeatedly. I don't
> have any data at this point, so OK either way, assuming the other
> head/tail comment is addressed.
I will fix it in v3.
>
> > + __skb_queue_tail(&q->gso_skb, skb);
>
> Same here *_tail should be *_head?
>
> > + spin_unlock(lock);
> > +
> > + qdisc_qstats_cpu_requeues_inc(q);
> > + qdisc_qstats_cpu_backlog_inc(q, skb);
> > + qdisc_qstats_cpu_qlen_inc(q);
> > +
> > + skb = next;
> > + }
> >
> > - qdisc_qstats_cpu_requeues_inc(q);
> > - qdisc_qstats_cpu_backlog_inc(q, skb);
> > - qdisc_qstats_cpu_qlen_inc(q);
> > __netif_schedule(q);
> >
> > return 0;
> >