lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CANn89iKEbCKUxieR298R5-BaFQUDXV0o+J3bWjHqv4LyaYDMYw@mail.gmail.com>
Date: Fri, 29 Nov 2024 10:18:58 +0100
From: Eric Dumazet <edumazet@...gle.com>
To: Jiri Wiesner <jwiesner@...e.de>
Cc: netdev@...r.kernel.org, "David S. Miller" <davem@...emloft.net>, 
	David Ahern <dsahern@...nel.org>, Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>, 
	Xin Long <lucien.xin@...il.com>, yousaf.kaukab@...e.com, andreas.taschner@...e.com
Subject: Re: [PATCH v2 net] net/ipv6: release expired exception dst cached in socket

On Thu, Nov 28, 2024 at 9:59 AM Jiri Wiesner <jwiesner@...e.de> wrote:
>
> Dst objects get leaked in ip6_negative_advice() when this function is
> executed for an expired IPv6 route located in the exception table. There
> are several conditions that must be fulfilled for the leak to occur:
> * an ICMPv6 packet indicating a change of the MTU for the path is received,
>   resulting in an exception dst being created
> * a TCP connection that uses the exception dst for routing packets must
>   start timing out so that TCP begins retransmissions
> * after the exception dst expires, the FIB6 garbage collector must not run
>   before TCP executes ip6_negative_advice() for the expired exception dst
>
> When TCP executes ip6_negative_advice() for an exception dst that has
> expired and if no other socket holds a reference to the exception dst, the
> refcount of the exception dst is 2, which corresponds to the increment
> made by dst_init() and the increment made by the TCP socket for which the
> connection is timing out. The refcount made by the socket is never
> released. The refcount of the dst is decremented in sk_dst_reset() but
> that decrement is counteracted by a dst_hold() intentionally placed just
> before the sk_dst_reset() in ip6_negative_advice(). After
> ip6_negative_advice() has finished, there is no other object tied to the
> dst. The socket lost its reference stored in sk_dst_cache and the dst is
> no longer in the exception table. The exception dst becomes a leaked
> object.
>
> As a result of this dst leak, an unbalanced refcount is reported for the
> loopback device of a net namespace being destroyed under kernels that do
> not contain e5f80fcf869a ("ipv6: give an IPv6 dev to blackhole_netdev"):
> unregister_netdevice: waiting for lo to become free. Usage count = 2
>
> Fix the dst leak by removing the dst_hold() in ip6_negative_advice(). The
> patch that introduced the dst_hold() in ip6_negative_advice() was
> 92f1655aa2b22 ("net: fix __dst_negative_advice() race"). But 92f1655aa2b22
> merely refactored the code with regards to the dst refcount so the issue
> was present even before 92f1655aa2b22. The bug was introduced in
> 54c1a859efd9f ("ipv6: Don't drop cache route entry unless timer actually
> expired.") where the expired cached route is deleted and the sk_dst_cache
> member of the socket is set to NULL by calling dst_negative_advice() but
> the refcount belonging to the socket is left unbalanced.
>
> The IPv4 version - ipv4_negative_advice() - is not affected by this bug.
> When the TCP connection times out ipv4_negative_advice() merely resets the
> sk_dst_cache of the socket while decrementing the refcount of the
> exception dst.
>
> Fixes: 92f1655aa2b22 ("net: fix __dst_negative_advice() race")
> Fixes: 54c1a859efd9f ("ipv6: Don't drop cache route entry unless timer actually expired.")
> Link: https://lore.kernel.org/netdev/20241113105611.GA6723@incl/T/#u
> Signed-off-by: Jiri Wiesner <jwiesner@...e.de>

Reviewed-by: Eric Dumazet <edumazet@...gle.com>

Thanks a lot Jiri !

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ