Message-ID: <CAEA6p_COwXWP97wSHvzokmm8ckQ2QEd7=dfNH04ttDPm_z3bcg@mail.gmail.com>
Date:   Fri, 11 Aug 2017 17:10:02 -0700
From:   Wei Wang <weiwan@...gle.com>
To:     Cong Wang <xiyou.wangcong@...il.com>,
        John Stultz <john.stultz@...aro.org>,
        Martin KaFai Lau <kafai@...com>
Cc:     lkml <linux-kernel@...r.kernel.org>,
        Network Development <netdev@...r.kernel.org>,
        Linux USB List <linux-usb@...r.kernel.org>,
        "David S. Miller" <davem@...emloft.net>,
        Felipe Balbi <felipe.balbi@...ux.intel.com>
Subject: Re: unregister_netdevice: waiting for eth0 to become free. Usage
 count = 1

> If the issue still happens after Cong's fix, could you help try the
> attached patch and collect all logs when you try to reproduce the
> issue? It would be great to have logs for both the success case and
> the failure case.
>
> Thanks so much for your help.
>

I think we have a potential fix for this issue.
Martin and I found that when addrconf_dst_alloc() creates a rt6, it is
possible that rt6->dst.dev points to the loopback device while
rt6->rt6i_idev->dev points to a real device.
When the real device goes down, the current fib6 cleanup code checks
only rt6->dst.dev and assumes rt6->rt6i_idev->dev is the same.
That leaves an unreleased refcnt on the real device if rt6->dst.dev
points to the loopback dev.
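
To make the leak described above concrete, here is a minimal userspace
sketch in C. It is not kernel code and not the attached patch: the
structs are cut-down stand-ins that reuse the kernel names, the
inet6_dev reference is collapsed into a plain device refcount, and
route_init(), route_release(), cleanup_on_down() and
refs_left_on_real_device() are hypothetical helpers invented for this
illustration. The check_idev flag shows one possible shape of a fix,
assumed only for the sake of the example.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (assumed for illustration). */
struct net_device { const char *name; int refcnt; };
struct inet6_dev  { struct net_device *dev; };
struct dst_entry  { struct net_device *dev; };
struct rt6_info {
	struct dst_entry dst;
	struct inet6_dev *rt6i_idev;
	int released;
};

static void dev_hold(struct net_device *d) { d->refcnt++; }

static void dev_put(struct net_device *d)
{
	assert(d->refcnt > 0);
	d->refcnt--;
}

/* Build the state from the mail: dst.dev is loopback, idev->dev is the
 * real device, and the route pins one reference on each. */
static void route_init(struct rt6_info *rt, struct inet6_dev *idev,
		       struct net_device *lo, struct net_device *real)
{
	idev->dev = real;
	dev_hold(real);
	rt->dst.dev = lo;
	dev_hold(lo);
	rt->rt6i_idev = idev;
	rt->released = 0;
}

/* Drop both references held by the route. */
static void route_release(struct rt6_info *rt)
{
	dev_put(rt->dst.dev);
	dev_put(rt->rt6i_idev->dev);
	rt->released = 1;
}

/* Cleanup when device 'down' goes away. With check_idev == 0 only
 * dst.dev is compared, which models the behaviour described above.
 * With check_idev != 0 the idev's device is compared as well. */
static void cleanup_on_down(struct rt6_info *rt,
			    const struct net_device *down, int check_idev)
{
	int match = (rt->dst.dev == down);

	if (check_idev)
		match = match || (rt->rt6i_idev->dev == down);
	if (match && !rt->released)
		route_release(rt);
}

/* Returns the refcount left on the real device after it goes down. */
int refs_left_on_real_device(int check_idev)
{
	struct net_device lo = { "lo", 0 };
	struct net_device eth0 = { "eth0", 0 };
	struct inet6_dev idev;
	struct rt6_info rt;

	route_init(&rt, &idev, &lo, &eth0);
	cleanup_on_down(&rt, &eth0, check_idev);
	return eth0.refcnt;
}
```

In this model, refs_left_on_real_device(0) leaves one reference pinned
on the real device because the route is skipped when only dst.dev is
compared, while refs_left_on_real_device(1) returns 0.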

Martin has tested the attached potential fix and confirmed that it fixes his issue.

John,
It would be great if you could also give it a try and see whether it
fixes the issue on your side before I submit an official patch.

Thanks very much for the help from everyone.

Wei

On Fri, Aug 11, 2017 at 10:25 AM, Wei Wang <weiwan@...gle.com> wrote:
> On Fri, Aug 11, 2017 at 9:48 AM, Cong Wang <xiyou.wangcong@...il.com> wrote:
>> Hi,
>>
>> On Thu, Aug 10, 2017 at 11:12 AM, John Stultz <john.stultz@...aro.org> wrote:
>>> On Wed, Aug 9, 2017 at 10:41 PM, Wei Wang <weiwan@...gle.com> wrote:
>>>> Hi John,
>>>>
>>>> Is it possible to try the attached patch?
>>>
>>> Thanks so much for the quick turn around!
>>>
>>> So I dropped all the reverts you suggested, and applied this one
>>> against 4.13-rc4, but I'm still seeing the problematic behavior.
>>
>> Does the following one-line fix make a difference?
>>
>> diff --git a/net/ipv6/route.c b/net/ipv6/route.c
>> index a640fbcba15d..c145a35763a0 100644
>> --- a/net/ipv6/route.c
>> +++ b/net/ipv6/route.c
>> @@ -141,7 +141,7 @@ static void rt6_uncached_list_del(struct rt6_info *rt)
>>                 struct uncached_list *ul = rt->rt6i_uncached_list;
>>
>>                 spin_lock_bh(&ul->lock);
>> -               list_del(&rt->rt6i_uncached);
>> +               list_del_init(&rt->rt6i_uncached);
>>                 spin_unlock_bh(&ul->lock);
>>         }
>>  }
>
>
> Thanks a lot Cong for proposing this fix.
>
> For the last few days, John has been helping me run a debug image,
> and we found that the leaked dst is probably in addrconf.c.
> Martin and I are looking through the code and trying to add more debugging.
>
> John,
>
> If the issue still happens after Cong's fix, could you help try the
> attached patch and collect all logs when you try to reproduce the
> issue? It would be great to have logs for both the success case and
> the failure case.
>
> Thanks so much for your help.
>
> Wei

View attachment "0001-potential-fix-for-unregister_netdevice.patch" of type "text/x-patch" (1515 bytes)
