Message-ID: <6414B055-F56A-4F58-BA11-7C45F72FABCB@fb.com>
Date: Wed, 5 Sep 2018 18:10:26 +0000
From: Song Liu <songliubraving@...com>
To: Wei Wang <weiwan@...gle.com>
CC: Linux Kernel Network Developers <netdev@...r.kernel.org>,
David Ahern <dsahern@...il.com>,
Eric Dumazet <eric.dumazet@...il.com>
Subject: Re: BUG: unable to handle kernel paging request in fib6_node_lookup_1
> On Sep 5, 2018, at 10:09 AM, Wei Wang <weiwan@...gle.com> wrote:
>
> On Tue, Sep 4, 2018 at 11:11 PM Song Liu <songliubraving@...com> wrote:
>>
>> We are debugging an issue with fib6_node_lookup_1().
>>
>> We use a 4.16-based kernel, and we have backported most upstream
>> patches in ip6_fib.{c,h}. The only major differences I can spot are
>>
>> 8b7f2731bd68d83940714ce92381d1a72596407c
>> c3506372277779fccbffee2475400fcd689d5738
>>
>> I guess the issue is not related to these two fixes.
>>
>> After staring at the call trace and disassembly code (attached below)
>> I guess this is a use-after-free issue in (or right after) the lookup
>> loop:
>>
>> 	for (;;) {
>> 		struct fib6_node *next;
>>
>> 		dir = addr_bit_set(args->addr, fn->fn_bit);
>>
>> 		next = dir ? rcu_dereference(fn->right) :
>> 			     rcu_dereference(fn->left);
>>
>> 		if (next) {
>> 			fn = next;
>> 			continue;
>> 		}
>> 		break;
>> 	}
>>
>> I guess this probably also happens on the latest upstream. I haven't
>> tested this with an upstream kernel (or the net tree) yet, because we
>> can only trigger this about once a week on 100 servers.
>>
>> Does this look familiar? Any comments and/or suggestions are highly
>> appreciated.
>>
> By glancing at the commit logs, I don't think any changes were made
> regarding the core logic of fib6_node handling recently.
> (There were a couple of fixes regarding fib6_info but I don't think it
> is the cause here... But it is still good to check if you have commit
> 9b0a8da8c4c6, e873e4b9cc7e, e70a3aad44cc in your build.)

Looks like we don't have e70a3aad44cc. I think it fixes a memory leak
(rather than a use-after-free)? Let me add it and run some tests anyway.
Thanks a lot for this information.

>
> I also went through the call path and did not find anything obviously wrong...
> I think it's best for you to reproduce it so we can debug further.
> One question is, do you have "CONFIG_IPV6_SUBTREES" enabled and specify
> src IP in the routing table?

We do have CONFIG_IPV6_SUBTREES enabled, but we usually do not specify
a src IP in the routing table.

Let me try to reproduce it.

Thanks again,
Song
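
For reference, the descent loop quoted above can be sketched in user space
roughly as follows. This is only an illustration under stated assumptions:
struct demo_node, demo_addr_bit_set() and demo_descend() are hypothetical
names, not kernel symbols; the node keeps only the fields the loop touches;
and plain pointer loads stand in for rcu_dereference(). The sketch is
single-threaded, so it shows the walk itself and does not reproduce the
suspected use-after-free.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct fib6_node: only what the loop reads. */
struct demo_node {
	int fn_bit;               /* index of the address bit tested here */
	struct demo_node *left;   /* followed when that bit is 0 */
	struct demo_node *right;  /* followed when that bit is 1 */
};

/*
 * Test bit `bit` of a 128-bit address stored in network byte order,
 * counting from the most significant bit of byte 0.
 */
static bool demo_addr_bit_set(const uint8_t addr[16], int bit)
{
	return (addr[bit >> 3] >> (7 - (bit & 7))) & 1;
}

/*
 * Walk down from `fn`, choosing a child by the address bit each node
 * names, and return the deepest node reached. Every iteration reads
 * fn->fn_bit and one child pointer, so a node freed while a walker
 * still holds a pointer to it would be read after free at this point.
 */
static struct demo_node *demo_descend(struct demo_node *fn,
				      const uint8_t addr[16])
{
	for (;;) {
		bool dir = demo_addr_bit_set(addr, fn->fn_bit);
		struct demo_node *next = dir ? fn->right : fn->left;

		if (!next)
			break;
		fn = next;
	}
	return fn;
}
```

With a root that tests bit 0 and has two leaf children, an address whose
first byte is 0x80 ends at the right leaf and an all-zero address ends at
the left leaf.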