Message-ID: <CAAmHdhy0N2VttXNXL+S+4G=4=mf4ihpW7KsNWUYpiOFXez3B7w@mail.gmail.com>
Date:   Tue, 18 Apr 2017 21:46:26 -0700
From:   Michael Ma <make0818@...il.com>
To:     Cong Wang <xiyou.wangcong@...il.com>
Cc:     Linux Kernel Network Developers <netdev@...r.kernel.org>,
        jin.oyj@...baba-inc.com
Subject: Re: Corrupted SKB

2017-04-18 16:12 GMT-07:00 Cong Wang <xiyou.wangcong@...il.com>:
> On Mon, Apr 17, 2017 at 5:39 PM, Michael Ma <make0818@...il.com> wrote:
>> Hi -
>>
>> We've implemented a "glue" qdisc similar to mqprio which can associate
>> one qdisc to multiple txqs as the root qdisc. The reference count of each
>> child qdisc has been adjusted properly in this case so that it
>> represents the number of txqs it has been attached to. However when
>> sending packets we saw the skb from dequeue_skb() corrupted with the
>> following call stack:
>>
>>     [exception RIP: netif_skb_features+51]
>>     RIP: ffffffff815292b3  RSP: ffff8817f6987940  RFLAGS: 00010246
>>
>>  #9 [ffff8817f6987968] validate_xmit_skb at ffffffff815294aa
>> #10 [ffff8817f69879a0] validate_xmit_skb at ffffffff8152a0d9
>> #11 [ffff8817f69879b0] __qdisc_run at ffffffff8154a193
>> #12 [ffff8817f6987a00] dev_queue_xmit at ffffffff81529e03
>>
>> It looks like the skb has already been released since its dev pointer
>> field is invalid.
>>
>> Any clue on how this can be investigated further? My current thought
>> is to add some instrumentation to the place where skb is released and
>> analyze whether there is any race condition happening there. However
>
> Either dropwatch or perf could do the work to instrument kfree_skb().

Thanks - will try it out.
>
>> by looking through the existing code I think the case where one root
>> qdisc is associated with multiple txqs already exists (when mqprio is
>> not used) so not sure why it won't work when we group txqs and assign
>> each group a root qdisc. Any insight on this issue would be much
>> appreciated!
>
> How do you implement ->attach()? How does it work with netdev_pick_tx()?

attach() essentially grafts the default qdisc (pfifo) onto each "txq
group" represented by a TC class. For netdev_pick_tx() we use the classid
of the socket to select a class, based on a "class id base" and the
class-to-txq mapping defined together with this glue qdisc. It's
pretty much the same as mqprio, except that one class maps to
multiple txqs and the txq is selected through a hash.
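
To make the selection path above concrete, here is a minimal userspace
sketch of it in C. It is not the actual qdisc code: the names
txq_group, class_map, classid_to_class(), scale_hash() and pick_txq(),
the table contents, and the contiguous offset/count layout of each
group are all assumptions made for illustration. The hash scaling
uses the same multiply-and-shift idea as the kernel's
reciprocal_scale().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: a contiguous range of txqs owned by one TC class. */
struct txq_group {
    uint16_t offset; /* index of the first txq in the group */
    uint16_t count;  /* number of txqs in the group */
};

#define NUM_CLASSES 4

/* Hypothetical class-to-txq mapping, configured with the glue qdisc. */
static const struct txq_group class_map[NUM_CLASSES] = {
    { 0, 4 }, { 4, 4 }, { 8, 2 }, { 10, 6 },
};

/* Translate a socket classid into a class index relative to the
 * "class id base"; returns -1 if the classid is outside the table. */
static int classid_to_class(uint32_t classid, uint32_t classid_base)
{
    if (classid < classid_base || classid - classid_base >= NUM_CLASSES)
        return -1;
    return (int)(classid - classid_base);
}

/* Scale a 32-bit hash into [0, n) with a multiply and a shift. */
static uint16_t scale_hash(uint32_t hash, uint16_t n)
{
    return (uint16_t)(((uint64_t)hash * n) >> 32);
}

/* Pick a txq: classid selects the group, the flow hash selects a
 * queue inside that group. Returns -1 when no class matches. */
static int pick_txq(uint32_t classid, uint32_t classid_base,
                    uint32_t flow_hash)
{
    int cls = classid_to_class(classid, classid_base);

    if (cls < 0)
        return -1;

    const struct txq_group *grp = &class_map[cls];

    assert(grp->count > 0); /* an empty group would be a config bug */
    return grp->offset + scale_hash(flow_hash, grp->count);
}
```

With this table, every flow of a given class lands on one of the txqs
in that class's group, so the child qdisc grafted onto the group sees
enqueues and dequeues from several txqs.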
