Message-ID: <CAM_iQpVOi0FH_quusHHpvREdvpqq6=RjVOQvcjAWGbh1X0_5tA@mail.gmail.com>
Date: Tue, 18 Apr 2017 16:12:44 -0700
From: Cong Wang <xiyou.wangcong@...il.com>
To: Michael Ma <make0818@...il.com>
Cc: Linux Kernel Network Developers <netdev@...r.kernel.org>
Subject: Re: Corrupted SKB
On Mon, Apr 17, 2017 at 5:39 PM, Michael Ma <make0818@...il.com> wrote:
> Hi -
>
> We've implemented a "glue" qdisc similar to mqprio which can associate
> one qdisc with multiple txqs as their root qdisc. The reference count
> of each child qdisc has been adjusted so that it reflects the number
> of txqs it is attached to. However, when sending packets we saw the
> skb from dequeue_skb() corrupted, with the following call stack:
>
> [exception RIP: netif_skb_features+51]
> RIP: ffffffff815292b3 RSP: ffff8817f6987940 RFLAGS: 00010246
>
> #9 [ffff8817f6987968] validate_xmit_skb at ffffffff815294aa
> #10 [ffff8817f69879a0] validate_xmit_skb at ffffffff8152a0d9
> #11 [ffff8817f69879b0] __qdisc_run at ffffffff8154a193
> #12 [ffff8817f6987a00] dev_queue_xmit at ffffffff81529e03
>
> It looks like the skb has already been freed, since its dev pointer
> field is invalid.
>
> Any clue on how this can be investigated further? My current thought
> is to add some instrumentation to the place where the skb is released
> and analyze whether there is any race condition happening there.
Either dropwatch or perf can instrument kfree_skb() for you.
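
For example, with perf (assuming a kernel with the skb:kfree_skb
tracepoint, which any recent kernel has; the 10-second window is just
an illustration):

    perf record -e skb:kfree_skb -a -g -- sleep 10
    perf script

Comparing the call chains of the frees against your dequeue path
should tell you whether the skb is freed while still sitting in the
qdisc.
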
> However, looking through the existing code, the case where one root
> qdisc is associated with multiple txqs already exists (when mqprio is
> not used), so I'm not sure why it doesn't work when we group txqs and
> assign each group a root qdisc. Any insight on this issue would be
> much appreciated!
How do you implement ->attach()? How does it work with netdev_pick_tx()?
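
For comparison, mq and mqprio graft a separate child qdisc onto each
txq in ->attach(), so every txq gets a private qdisc. If you graft one
qdisc onto a whole group of txqs instead, the refcount has to end up
at exactly one reference per txq. A rough sketch of what I'd expect
such an ->attach() to look like on a 4.x kernel (glue_sched,
grp_qdisc and queues_per_group are made-up names, not from your code):

    static void glue_attach(struct Qdisc *sch)
    {
            struct net_device *dev = qdisc_dev(sch);
            struct glue_sched *priv = qdisc_priv(sch);
            unsigned int ntx;

            for (ntx = 0; ntx < dev->num_tx_queues; ntx++) {
                    /* all txqs in a group share one child qdisc */
                    struct Qdisc *q =
                            priv->grp_qdisc[ntx / priv->queues_per_group];
                    struct Qdisc *old;

                    /* the reference taken at creation covers the first
                     * txq; take one extra reference for each additional
                     * txq the shared child serves */
                    if (ntx % priv->queues_per_group)
                            atomic_inc(&q->refcnt);

                    old = dev_graft_qdisc(netdev_get_tx_queue(dev, ntx), q);
                    if (old)
                            qdisc_destroy(old);
            }
    }

If the count is off, the child can be destroyed while another txq
still points at it, freeing skbs that are still queued -- which would
look exactly like the use-after-free above. netdev_pick_tx() matters
because dev_queue_xmit() locks whatever qdisc hangs off the txq it
returns, and the qdisc later transmits on the txq recorded in
skb->queue_mapping, so the two must stay consistent when several txqs
share one root.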