Message-ID: <CANn89iJne2+k+MJQzu1U7vO6eEbTLjD7QQxSG6hPgZ1i7+AutA@mail.gmail.com>
Date: Thu, 11 Apr 2024 20:42:45 +0200
From: Eric Dumazet <edumazet@...gle.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: davem@...emloft.net, netdev@...r.kernel.org, pabeni@...hat.com, 
	Stefano Brivio <sbrivio@...hat.com>, Ilya Maximets <i.maximets@....org>, dsahern@...nel.org, 
	donald.hunter@...il.com
Subject: Re: [PATCH net] inet: bring NLM_DONE out to a separate recv() again

On Thu, Apr 11, 2024 at 8:02 PM Jakub Kicinski <kuba@...nel.org> wrote:
>
> Commit under Fixes optimized the number of recv() calls
> needed during RTM_GETROUTE dumps, but we got multiple
> reports of applications hanging on recv() calls.
> Applications expect that a route dump will be terminated
> with a recv() reading an individual NLM_DONE message.
>
> Coalescing NLM_DONE is perfectly legal in netlink,
> but even though the reporters fixed the code in their
> respective projects, chances are it will take time for
> those applications to get updated. So revert to the old
> behavior (for now)?
>
> Old kernel (5.19):
>

> Reported-by: Stefano Brivio <sbrivio@...hat.com>
> Link: https://lore.kernel.org/all/20240315124808.033ff58d@elisabeth
> Reported-by: Ilya Maximets <i.maximets@....org>
> Link: https://lore.kernel.org/all/02b50aae-f0e9-47a4-8365-a977a85975d3@ovn.org
> Fixes: 4ce5dc9316de ("inet: switch inet_dump_fib() to RCU protection")
> Signed-off-by: Jakub Kicinski <kuba@...nel.org>
> ---
> CC: dsahern@...nel.org
> CC: donald.hunter@...il.com
> ---
>  net/ipv4/fib_frontend.c | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
> index 48741352a88a..c484b1c0fc00 100644
> --- a/net/ipv4/fib_frontend.c
> +++ b/net/ipv4/fib_frontend.c
> @@ -1050,6 +1050,11 @@ static int inet_dump_fib(struct sk_buff *skb, struct netlink_callback *cb)
>                         e++;
>                 }
>         }
> +
> +       /* Don't let NLM_DONE coalesce into a message, even if it could.
> +        * Some user space expects NLM_DONE in a separate recv().
> +        */
> +       err = skb->len;

My plan was to perform this generically from netlink_dump_done().

This would still avoid calling an RTNL-enabled dump() again if EOF has
already been reached.

A sysctl could let users opt in to the coalescing, if there is interest.

diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index dc8c3c01d51b709c132ff63a0c534c1cc286589a..cad1124393ac74f3d5bfa86556ed9028f5ec8f65 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -2282,7 +2282,12 @@ static int netlink_dump(struct sock *sk, bool lock_taken)
                cb->extack = NULL;
        }

+       /* Don't let NLM_DONE coalesce into a message, even if it could.
+        * Some user space expects NLM_DONE in a separate recv().
+        * Maybe opt-in this coalescing with a sysctl or socket option ?
+        */
        if (nlk->dump_done_errno > 0 ||
+           skb->len ||
            skb_tailroom(skb) < nlmsg_total_size(sizeof(nlk->dump_done_errno))) {
                mutex_unlock(&nlk->nl_cb_mutex);
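
For illustration only, the condition in the hunk above can be modeled as a plain predicate in user space. This is a sketch, not kernel code: the function name and its parameters are hypothetical stand-ins for nlk->dump_done_errno, skb->len, skb_tailroom(skb) and nlmsg_total_size(sizeof(nlk->dump_done_errno)). As the hunk reads, the added skb->len term means NLM_DONE is only placed in an skb that carries no other data.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical user-space model of the test in netlink_dump().
 * Returns true when NLM_DONE must NOT be appended to the current skb:
 *  - a positive dump_done_errno is pending, or
 *  - the skb already holds dump data (the term added by the diff), or
 *  - there is not enough tailroom left for the NLM_DONE message.
 */
static bool done_needs_own_skb(int dump_done_errno, size_t skb_len,
			       size_t tailroom, size_t done_msg_size)
{
	return dump_done_errno > 0 ||
	       skb_len > 0 ||
	       tailroom < done_msg_size;
}
```

With skb_len == 0, no pending errno, and enough tailroom, the predicate is false and NLM_DONE can go out in the (otherwise empty) skb, which is what gives user space its separate recv().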

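On the user-space side, a reader that tolerates both behaviors walks every message in each recv() buffer instead of assuming that NLM_DONE arrives alone. The sketch below shows that idea using the standard NLMSG_OK()/NLMSG_NEXT() macros from <linux/netlink.h>; the enum and the helper name scan_dump_buffer() are hypothetical and not taken from any of the affected projects.

```c
#include <stddef.h>
#include <linux/netlink.h>

/* Hypothetical result codes for this sketch; not part of any kernel API. */
enum dump_status { DUMP_MORE, DUMP_DONE, DUMP_ERROR };

/*
 * Hypothetical helper: scan one recv() buffer of a netlink dump.
 * It stops at NLMSG_DONE or NLMSG_ERROR wherever they sit in the
 * buffer, so it works whether NLM_DONE is coalesced with the last
 * entries or delivered in a recv() of its own.
 * *n_entries is set to the number of dump entries seen before that.
 */
static enum dump_status scan_dump_buffer(void *buf, int len,
					 unsigned int *n_entries)
{
	struct nlmsghdr *nlh = buf;

	*n_entries = 0;
	for (; NLMSG_OK(nlh, len); nlh = NLMSG_NEXT(nlh, len)) {
		if (nlh->nlmsg_type == NLMSG_DONE)
			return DUMP_DONE;
		if (nlh->nlmsg_type == NLMSG_ERROR)
			return DUMP_ERROR;
		(*n_entries)++;	/* a real caller would parse the entry here */
	}
	return DUMP_MORE;	/* no terminator yet: caller should recv() again */
}
```

A caller would keep calling recv() and passing each buffer to the helper until it returns DUMP_DONE or DUMP_ERROR, rather than issuing one extra recv() that blocks when NLM_DONE was already consumed.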