Message-ID: <839f0ad6-83c1-1df6-c34d-b844c52ba771@gmail.com>
Date:   Fri, 11 Dec 2020 09:10:26 -0700
From:   David Ahern <dsahern@...il.com>
To:     stranche@...eaurora.org
Cc:     Wei Wang <weiwan@...gle.com>,
        Eric Dumazet <eric.dumazet@...il.com>,
        Martin KaFai Lau <kafai@...com>,
        Mahesh Bandewar <maheshb@...gle.com>,
        Jakub Kicinski <kuba@...nel.org>,
        Linux Kernel Network Developers <netdev@...r.kernel.org>,
        Subash Abhinov Kasiviswanathan <subashab@...eaurora.org>
Subject: Re: Refcount mismatch when unregistering netdevice from kernel

On 12/10/20 6:12 PM, stranche@...eaurora.org wrote:
>>> BTW, have you tried your previous proposed patch and confirmed it
>>> would fix the issue?
>>>
> 
> Yes, we shared this with the customer and the refcount mismatch still
> occurred, so this doesn't seem sufficient either.
> 
>>> Could we further distinguish between dst added to the uncached list by
>>> icmp6_dst_alloc() and xfrm6_fill_dst(), and confirm which ones are the
>>> ones leaking reference?
>>> I suspect it would be the xfrm ones, but I think it is worth verifying.
>>>
> 
> After digging into the DST allocation/destroy a bit more, it seems that
> there are some cases where the DST's refcount does not hit zero, causing
> them to never be freed and release their references.
> One case comes from here on the IPv6 packet output path (these DST
> structs would hold references to both the inet6_dev and the netdevice)
> ip6_pol_route_output+0x20/0x2c -> ip6_pol_route+0x1dc/0x34c ->
> rt6_make_pcpu_route+0x18/0xf4 -> ip6_rt_pcpu_alloc+0xb4/0x19c

This is the normal data path, and this refers to a per-cpu dst cache.
Delete the route and the cached entries get removed.

> 
> We also see two DSTs where they are stored as the xdst->rt entry on the
> XFRM path that do not get released. One is allocated by the same path as
> above, and the other like this
> xfrm6_esp_err+0x7c/0xd4 -> esp6_err+0xc8/0x100 ->
> ip6_update_pmtu+0xc8/0x100 -> __ip6_rt_update_pmtu+0x248/0x434 ->
> ip6_rt_cache_alloc+0xa0/0x1dc

This entry goes into an exception cache. I have lost track of kernel
versions and features. Try listing the route cache to see these:

  ip -6 ro ls cache
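
For reference, a minimal sketch of that check as a small shell helper. It
assumes iproute2 is installed; "show" is the long form of "ls", and the
optional device argument is an addition for narrowing the output, not
something the thread requires. The function names are made up for
illustration.

```shell
#!/bin/sh
# Build the command that lists IPv6 route cache (exception) entries.
# An optional device name narrows the listing to routes via that device.
v6_cache_cmd() {
    if [ -n "${1:-}" ]; then
        printf 'ip -6 route show cache dev %s' "$1"
    else
        printf 'ip -6 route show cache'
    fi
}

# Print the command, then run it only when the ip tool is available.
show_v6_cache() {
    cmd=$(v6_cache_cmd "${1:-}")
    echo "+ $cmd"
    if command -v ip >/dev/null 2>&1; then
        # Unquoted on purpose so the string splits into command and arguments.
        $cmd || echo "(listing failed)"
    else
        echo "(ip not found; skipping)"
    fi
}
```

Calling show_v6_cache with no argument lists every cached entry; passing
the name of the device that fails to unregister (for example a placeholder
such as dummy0) limits the listing to that device.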
