lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CANn89iKGTPHK5wMyP4oRoAuv8f56VY-RrrMPBSb8jRMJSiL5Qg@mail.gmail.com>
Date:   Wed, 17 May 2023 16:44:55 +0200
From:   Eric Dumazet <edumazet@...gle.com>
To:     menglong8.dong@...il.com
Cc:     kuba@...nel.org, davem@...emloft.net, pabeni@...hat.com,
        dsahern@...nel.org, netdev@...r.kernel.org,
        linux-kernel@...r.kernel.org, Menglong Dong <imagedong@...cent.com>
Subject: Re: [PATCH net-next 2/3] net: tcp: send zero-window when no memory

On Wed, May 17, 2023 at 2:42 PM <menglong8.dong@...il.com> wrote:
>
> From: Menglong Dong <imagedong@...cent.com>
>
> For now, skb will be dropped when no memory, which makes client keep
> retrans util timeout and it's not friendly to the users.

Yes, networking needs memory. Trying to deny it is recipe for OOM.

>
> Therefore, now we force to receive one packet on current socket when
> the protocol memory is out of the limitation. Then, this socket will
> stay in 'no mem' status, util protocol memory is available.
>

I think you missed one old patch.

commit ba3bb0e76ccd464bb66665a1941fabe55dadb3ba    tcp: fix
SO_RCVLOWAT possible hangs under high mem pressure



> When a socket is in 'no mem' status, it's receive window will become
> 0, which means window shrink happens. And the sender need to handle
> such window shrink properly, which is done in the next commit.
>
> Signed-off-by: Menglong Dong <imagedong@...cent.com>
> ---
>  include/net/sock.h    |  1 +
>  net/ipv4/tcp_input.c  | 12 ++++++++++++
>  net/ipv4/tcp_output.c |  7 +++++++
>  3 files changed, 20 insertions(+)
>
> diff --git a/include/net/sock.h b/include/net/sock.h
> index 5edf0038867c..90db8a1d7f31 100644
> --- a/include/net/sock.h
> +++ b/include/net/sock.h
> @@ -957,6 +957,7 @@ enum sock_flags {
>         SOCK_XDP, /* XDP is attached */
>         SOCK_TSTAMP_NEW, /* Indicates 64 bit timestamps always */
>         SOCK_RCVMARK, /* Receive SO_MARK  ancillary data with packet */
> +       SOCK_NO_MEM, /* protocol memory limitation happened */
>  };
>
>  #define SK_FLAGS_TIMESTAMP ((1UL << SOCK_TIMESTAMP) | (1UL << SOCK_TIMESTAMPING_RX_SOFTWARE))
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index a057330d6f59..56e395cb4554 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -5047,10 +5047,22 @@ static void tcp_data_queue(struct sock *sk, struct sk_buff *skb)
>                 if (skb_queue_len(&sk->sk_receive_queue) == 0)
>                         sk_forced_mem_schedule(sk, skb->truesize);

I think you missed this part : We accept at least one packet,
regardless of memory pressure,
if the queue is empty.

So your changelog is misleading.

>                 else if (tcp_try_rmem_schedule(sk, skb, skb->truesize)) {
> +                       if (sysctl_tcp_wnd_shrink)

We no longer add global sysctls for TCP. All new sysctls must per net-ns.

> +                               goto do_wnd_shrink;
> +
>                         reason = SKB_DROP_REASON_PROTO_MEM;
>                         NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPRCVQDROP);
>                         sk->sk_data_ready(sk);
>                         goto drop;
> +do_wnd_shrink:
> +                       if (sock_flag(sk, SOCK_NO_MEM)) {
> +                               NET_INC_STATS(sock_net(sk),
> +                                             LINUX_MIB_TCPRCVQDROP);
> +                               sk->sk_data_ready(sk);
> +                               goto out_of_window;
> +                       }
> +                       sk_forced_mem_schedule(sk, skb->truesize);

So now we would accept two packets per TCP socket, and yet EPOLLIN
will not be sent in time ?

packets can consume about 45*4K each, I do not think it is wise to
double receive queue sizes.

What you want instead is simply to send EPOLLIN sooner (when the first
packet is queued instead when the second packet is dropped)
by changing sk_forced_mem_schedule() a bit.

This might matter for applications using SO_RCVLOWAT, but not for
other applications.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ