Date:   Tue, 26 Dec 2017 21:24:07 -0800
From:   John Fastabend <john.fastabend@...il.com>
To:     Wei Yongjun <weiyongjun1@...wei.com>,
        Jamal Hadi Salim <jhs@...atatu.com>,
        Cong Wang <xiyou.wangcong@...il.com>,
        Jiri Pirko <jiri@...nulli.us>
Cc:     netdev@...r.kernel.org
Subject: Re: [PATCH net-next v2] net: sched: fix skb leak in dev_requeue_skb()

On 12/24/2017 07:49 PM, Wei Yongjun wrote:
> When dev_requeue_skb() is called with a bulked skb list, only the
> first skb of the list is requeued to the qdisc layer; the others
> are leaked without being freed.
> 
> TCP is broken by this skb leak: a leaked skb is never freed, so it
> is considered still in the host queue and is never retransmitted.
> This happens when dev_requeue_skb() is called from qdisc_restart():
>   qdisc_restart
>   |-- dequeue_skb
>   |-- sch_direct_xmit()
>       |-- dev_requeue_skb() <-- skb may be bulked
> 
> Fix dev_requeue_skb() to requeue the full bulked list.
> 
> Fixes: a53851e2c321 ("net: sched: explicit locking in gso_cpu fallback")
> Signed-off-by: Wei Yongjun <weiyongjun1@...wei.com>
> ---
> v1 -> v2: add net-next prefix
> ---

First, thanks for tracking this down.
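
To restate the leak for anyone skimming the thread: dequeue_skb() can
hand back the head of a bulked list chained through skb->next, and the
pre-patch requeue path stored only that head. Roughly (annotation mine,
not part of the patch):

	/* pre-patch: */
	__skb_queue_head(&q->gso_skb, skb);
	/* Links only the head skb into gso_skb and rewrites skb->next,
	 * so any bulked skbs chained behind it become unreachable:
	 * never freed and never retransmitted.
	 */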

> diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
> index 981c08f..0df2dbf 100644
> --- a/net/sched/sch_generic.c
> +++ b/net/sched/sch_generic.c
> @@ -111,10 +111,16 @@ static inline void qdisc_enqueue_skb_bad_txq(struct Qdisc *q,
>  
>  static inline int __dev_requeue_skb(struct sk_buff *skb, struct Qdisc *q)
>  {
> -	__skb_queue_head(&q->gso_skb, skb);
> -	q->qstats.requeues++;
> -	qdisc_qstats_backlog_inc(q, skb);
> -	q->q.qlen++;	/* it's still part of the queue */
> +	while (skb) {
> +		struct sk_buff *next = skb->next;
> +
> +		__skb_queue_tail(&q->gso_skb, skb);

Was the change from __skb_queue_head to __skb_queue_tail here
intentional? We should re-queue packets to the head of the list.
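If it does need to go back at the head, here is a rough, untested
sketch of one way to keep packet order while inserting at the front
(illustrative only, not a claim about what the final patch should be):

	struct sk_buff *prev = NULL;

	while (skb) {
		struct sk_buff *next = skb->next;

		/* first skb goes to the head, the rest chain after it */
		if (!prev)
			__skb_queue_head(&q->gso_skb, skb);
		else
			__skb_queue_after(&q->gso_skb, prev, skb);

		q->qstats.requeues++;
		qdisc_qstats_backlog_inc(q, skb);
		q->q.qlen++;	/* it's still part of the queue */

		prev = skb;
		skb = next;
	}

Building a temporary sk_buff_head and splicing it in with
skb_queue_splice() would also work; the above just keeps the shape of
the posted patch.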

> +		q->qstats.requeues++;
> +		qdisc_qstats_backlog_inc(q, skb);
> +		q->q.qlen++;	/* it's still part of the queue */
> +
> +		skb = next;
> +	}
>  	__netif_schedule(q);
>  
>  	return 0;
> @@ -124,13 +130,20 @@ static inline int dev_requeue_skb_locked(struct sk_buff *skb, struct Qdisc *q)
>  {
>  	spinlock_t *lock = qdisc_lock(q);
>  
> -	spin_lock(lock);
> -	__skb_queue_tail(&q->gso_skb, skb);
> -	spin_unlock(lock);
> +	while (skb) {
> +		struct sk_buff *next = skb->next;
> +
> +		spin_lock(lock);

In this case I suspect it's better to take the lock once around the
while loop rather than grab and drop it for every skb. I don't have
any data at this point, so I'm OK either way, assuming the other
head/tail comment is addressed.
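Something like this rough sketch (untested, and still using *_tail
pending the head/tail question above):

	spin_lock(lock);
	while (skb) {
		struct sk_buff *next = skb->next;

		__skb_queue_tail(&q->gso_skb, skb);

		qdisc_qstats_cpu_requeues_inc(q);
		qdisc_qstats_cpu_backlog_inc(q, skb);
		qdisc_qstats_cpu_qlen_inc(q);

		skb = next;
	}
	spin_unlock(lock);

Keeping the per-cpu stat updates inside the locked region should be
harmless, since they don't take the qdisc lock themselves.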

> +		__skb_queue_tail(&q->gso_skb, skb);

Same question here: should *_tail be *_head?

> +		spin_unlock(lock);
> +
> +		qdisc_qstats_cpu_requeues_inc(q);
> +		qdisc_qstats_cpu_backlog_inc(q, skb);
> +		qdisc_qstats_cpu_qlen_inc(q);
> +
> +		skb = next;
> +	}
>  
> -	qdisc_qstats_cpu_requeues_inc(q);
> -	qdisc_qstats_cpu_backlog_inc(q, skb);
> -	qdisc_qstats_cpu_qlen_inc(q);
>  	__netif_schedule(q);
>  
>  	return 0;
> 
