netdev - Re: net: heap out-of-bounds in fib6_clean_node/rt6_fill_node/fib6_age/fib6_prune


Date:   Mon, 6 Mar 2017 19:51:24 +0100
From:   Dmitry Vyukov <dvyukov@...gle.com>
To:     David Ahern <dsa@...ulusnetworks.com>
Cc:     Eric Dumazet <eric.dumazet@...il.com>,
        Mahesh Bandewar <maheshb@...gle.com>,
        Eric Dumazet <edumazet@...gle.com>,
        David Miller <davem@...emloft.net>,
        Alexey Kuznetsov <kuznet@....inr.ac.ru>,
        James Morris <jmorris@...ei.org>,
        Hideaki YOSHIFUJI <yoshfuji@...ux-ipv6.org>,
        Patrick McHardy <kaber@...sh.net>,
        netdev <netdev@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Cong Wang <xiyou.wangcong@...il.com>,
        syzkaller <syzkaller@...glegroups.com>
Subject: Re: net: heap out-of-bounds in fib6_clean_node/rt6_fill_node/fib6_age/fib6_prune_clone

On Mon, Mar 6, 2017 at 6:31 PM, David Ahern <dsa@...ulusnetworks.com> wrote:
> On 3/4/17 1:15 PM, Eric Dumazet wrote:
>> On Sat, 2017-03-04 at 19:57 +0100, Dmitry Vyukov wrote:
>>> On Fri, Mar 3, 2017 at 8:12 PM, David Ahern <dsa@...ulusnetworks.com> wrote:
>>>> On 3/3/17 6:39 AM, Dmitry Vyukov wrote:
>>>>> I am getting heap out-of-bounds reports in
>>>>> fib6_clean_node/rt6_fill_node/fib6_age/fib6_prune_clone while running
>>>>> syzkaller fuzzer on 86292b33d4b79ee03e2f43ea0381ef85f077c760. They all
>>>>> follow the same pattern: an object of size 216 is allocated from
>>>>> ip_dst_cache slab, and then accessed at offset 272/276 within
>>>>> fib6_walk. Looks like type confusion. Unfortunately this is not
>>>>> reproducible.
>>>>
>>>> I'll take a look this weekend or Monday at the latest.
>>>
>>>
>>> I've got some additional useful info on this. I think this is
>>> use-after-free rather than out-of-bounds. I've collected stack where
>>> the route was disposed with call_rcu, see the last "Disposed" stack.
>>> The crash happens when cmpxchg in rt_cache_route replaces an existing
>>> route. And that route seems to have some existing pointers to it
>>> (rt->dst.rt6_next) which fib6_walk uses to get to it after its
>>> deletion.
>>
>> rt_cache_route() deals with IPv4 routes.
>>
>> We somehow mix IPv4 and IPv6 dsts in IPv6 tree.
>>
>> We need to add type safety at IPV6 route insertions to catch the
>> offender.
>>
>
> I've seen something like this before -- a rt was on the gc list but
> still linked in the tables because of some reference.
>
> Dmitry: you seem to have reproduced this a few times. Can you share how
> to run whatever tests you are using?


We have hit it several thousand times, but we get only a few dozen
crashes per day across ~80 VMs. So if you try to reproduce it on a
single machine, it can take days to get a single crash.
If you are ready to go that route, here are instructions for setting
up syzkaller:
https://github.com/google/syzkaller
You also need a kernel built with CONFIG_KASAN.
I am ready to help with resolving any issues.

Another option is for you to give me a patch with some additional
WARNINGs. I can then deploy it to the bots and collect stacks.
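
To illustrate the kind of check such a patch might add, following
Eric's suggestion of type safety at IPv6 route insertion, here is a
rough userspace model. It is not kernel code: every name in it
(dst_model, fib6_model, fib6_model_insert, the family tag) is
hypothetical, and a real patch would have to identify the route type
from whatever the kernel structures actually provide and would use the
kernel's own warning macros instead of fprintf.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical model only; none of these names are kernel APIs. */
enum dst_family { DST_FAMILY_IPV4 = 4, DST_FAMILY_IPV6 = 6 };

struct dst_model {
    enum dst_family family;  /* which protocol allocated this entry */
    struct dst_model *next;  /* stand-in for the rt->dst.rt6_next chain */
};

struct fib6_model {
    struct dst_model *leaf;  /* head of the route chain */
    unsigned long warnings;  /* number of rejected insertions */
};

/*
 * Insert rt at the head of the chain, but only if it is an IPv6 entry.
 * A foreign entry is rejected and reported together with the caller's
 * name, so the offending insertion path shows up in the log.
 * Returns 0 on success, -1 if the entry was rejected.
 */
static int fib6_model_insert(struct fib6_model *tbl, struct dst_model *rt,
                             const char *caller)
{
    assert(tbl != NULL);

    if (rt == NULL || rt->family != DST_FAMILY_IPV6) {
        tbl->warnings++;
        fprintf(stderr, "WARNING: non-IPv6 dst passed to IPv6 tree by %s\n",
                caller);
        return -1;
    }

    rt->next = tbl->leaf;
    tbl->leaf = rt;
    return 0;
}
```

The point of rejecting at insertion time is that the report then names
the path that linked the wrong object, rather than the later fib6_walk
that merely trips over it.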